Problem of non-stationarity in time series analysis in STATA

The previous article discussed the process for setting the ‘Time variable’ while conducting time series analysis in STATA. The purpose of this article is to explain the process of determining and creating stationarity in time series analysis.

Determining the presence of stationarity

Creating a visual plot of data is the first step in time series analysis. Graphical representation of data helps understand it better. To plot a graph, follow these steps:

  1. Click on ‘Statistics’ in the ribbon of the ‘Output’ window.
  2. Select ‘Time series’.
  3. Select ‘Graphs’.
  4. Click on ‘Line plots’.

The figure below shows this step:

Figure 1: Plotting a time series graph in STATA

A dialogue box ‘Time-series lines plot’ will appear as shown in the figure below. Click on ‘Create’ to define a plot; once the plot is defined, click on ‘OK’ to draw it.

Figure 2: Creating time series lines plots in STATA

A new dialogue box will appear as shown in the figure below. Here, select the variable to plot under ‘Y variable’.

Figure 3: Creating time series lines plots in STATA

Graphical Representation of GDP

Select the first variable of interest under ‘Y variable’, i.e. ‘gdp’. Then click on ‘Accept’. The figure below shows this step.

Figure 4: Select variable for time series lines plots in STATA

A graph of the ‘gdp’ variable will appear in a separate window as shown in the figure below.

Figure 5: Graphical representation of ‘gdp’ variable in STATA

The figure below is a clearer picture of the graph.

Figure 6: Graphical representation of ‘gdp’ variable in STATA
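
The menu steps above can also be run from the command window. A minimal sketch, assuming the dataset is already in memory and the time variable has been set with tsset as in the previous article:

```stata
* Assumes the data are loaded and tsset
* 'gdp' is the variable named in this article
tsline gdp
```

tsline plots the variable against the time variable declared by tsset, so no X variable needs to be specified.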

As seen in the graph, the GDP of India trends upward from 1996 to 2016, with only minor fluctuations. A series that keeps rising cannot have a constant mean, so a key assumption of time series analysis, i.e. stationarity, is violated. ‘Stationarity’ implies that the mean and variance of the data remain constant across different time frames. The ‘gdp’ variable here trends upward, with a non-constant mean and variance. Thus the GDP time series appears to be non-stationary. However, the graph is only a preliminary step; a formal test of stationarity is still required.
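
An informal numeric check of the same idea is to compare summary statistics across two time frames. The sketch below splits the sample into two halves by observation number. The split point is an arbitrary choice for illustration, and the sketch assumes the data are sorted by the time variable.

```stata
* Informal check only, not a formal test of stationarity
* Assumes 'gdp' is in memory and the data are sorted by time

* First half of the sample
summarize gdp if _n <= _N/2

* Second half of the sample
summarize gdp if _n > _N/2
```

If the means or standard deviations of the two halves differ markedly, this is consistent with what the graph suggests.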

Augmented Dickey Fuller test

The Dickey Fuller test examines stationarity in time series data. An important assumption of this test is that the error term is serially uncorrelated. The Augmented Dickey Fuller test relaxes this assumption: it adds lagged differences of the series to the test regression to account for correlation in the error term. With zero lagged differences, it reduces to the Dickey Fuller test. To perform the Dickey Fuller test:

  1. Click on ‘Statistics’ in the ribbon.
  2. Select ‘Time series’.
  3. Select ‘Tests’.
  4. Click on ‘Augmented Dickey-Fuller unit root test’.

Figure 7: Performing Augmented Dickey Fuller test for time series analysis in STATA

A dialogue box ‘Augmented Dickey Fuller Unit-root test’ will appear. In the ‘Variable’ option, select the time series variable to be tested for stationarity, i.e. for a unit root, as the figure below shows. If the graph of the variable shows a trend, also select ‘Include Trend term in regression’. First, however, perform the Dickey Fuller test without lags. Therefore, leave the ‘Lagged differences’ option unchanged.

Figure 8: Performing Augmented Dickey Fuller test for time series analysis in STATA

Select the time series variable of interest in the dialogue box for the Augmented Dickey Fuller test. In this case ‘gdp’ is selected. Select ‘Include Trend term in regression’. Then click on ‘OK’. The figure below shows this step.

Figure 9: Dialogue box for Augmented Dickey Fuller test for time series analysis in STATA

OR

Use the equivalent STATA command, where the ‘trend’ option corresponds to ‘Include Trend term in regression’:

Command: dfuller gdp, trend lags(0)

The Dickey Fuller test results will appear as shown in the figure below. To test stationarity, focus on only two values of the result: Z(t) and the MacKinnon p-value for Z(t). For a time series to be stationary, Z(t) should be a large negative number, i.e. more negative than the critical values reported next to it, and the p-value should be significant at least at the 5% level. Neither of these conditions is met in this test. Therefore the null hypothesis, i.e. that the time series has a unit root and is non-stationary, cannot be rejected. Since the GDP time series is non-stationary, further analysis cannot be performed on it in its present form.

Figure 10: Dickey Fuller test results in STATA
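
dfuller also stores Z(t) and the MacKinnon p-value in r(Zt) and r(p), so the decision rule described above can be written out explicitly. The following is a minimal sketch meant to be run from a do-file. The 5% threshold follows the text, and the messages are illustrative wording rather than STATA output.

```stata
* Assumes 'gdp' is in memory and the data are tsset
dfuller gdp, trend lags(0)

* Z(t) and the MacKinnon approximate p-value stored by dfuller
display "Z(t) = " r(Zt) "   p-value = " r(p)

* Decision at the 5% level; the null hypothesis is non-stationarity
if r(p) < 0.05 {
    display "Null of non-stationarity rejected at the 5% level"
}
else {
    display "Null of non-stationarity cannot be rejected at the 5% level"
}
```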

Augmented Dickey Fuller test including lags

In the above Dickey Fuller test, lags were excluded, assuming the error term is uncorrelated. If lags are included, the result for the GDP time series can change. To check this, perform the Augmented Dickey Fuller test again as shown in Figure 7. Select ‘gdp’ and ‘Include trend term in regression’ again. This time, increase the number of lagged differences to 12 as shown in the figure below. Then click on ‘OK’.

Figure 11: Dialogue box for Augmented Dickey Fuller test
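
The same settings can be entered as a command. A sketch, assuming the trend term is selected as in the dialogue box and that ‘gdp’ is in a dataset that has been tsset:

```stata
* Augmented Dickey Fuller test with a trend term and 12 lagged differences
dfuller gdp, trend lags(12)
```

Adding the regress option, i.e. dfuller gdp, trend lags(12) regress, additionally displays the underlying regression table.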

The output window will appear as the figure below shows. To examine stationarity, again focus on only two values of the result: Z(t) and the MacKinnon p-value for Z(t). Here again, Z(t) is not a large negative number, and the p-value is insignificant. Thus the null hypothesis of the test, which states that the time series is non-stationary, again cannot be rejected. Therefore the GDP time series is non-stationary even after including lags to account for correlated error terms.

Figure 12: Dickey Fuller test results in STATA
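
The lag order of 12 is entered directly in the steps above. According to the author's reply in the discussion at the end of this article, it was taken as the optimal lag length from the varsoc command, less one. A hedged sketch of that step follows. The maxlag(16) value is an assumption for illustration, not a setting from the article, and it needs enough observations to be estimable.

```stata
* Lag-order selection statistics for 'gdp' (data must be tsset)
* maxlag(16) is an illustrative upper bound, not taken from the article
varsoc gdp, maxlag(16)
```

varsoc marks the lag chosen by each criterion (FPE, AIC, HQIC, SBIC) with an asterisk. Following the reply, the number passed to lags() in dfuller would be that lag minus one.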

Stationarity is important in order to proceed with the remaining steps in time series analysis. Therefore the next article explains the solution to non-stationarity.

Priya Chetty

Partner at Project Guru
Priya is a master in business administration with majors in marketing and finance. She is fluent with data modelling, time series analysis, various regression models, forecasting and interpretation of the data. She has assisted data scientists, corporates, scholars in the field of finance, banking, economics and marketing.

Discussions

4 Comments.

  1. Great article, it’s nice to see these problems explained clearly and simply. Have you written somewhere on how to best address the issue of non-stationarity?

  2. In the augmented dfuller test, why did you select 12 for the lag? Was this determined by varsoc?

    • Hi Richard,

      Yes, you are right. To specify the lag order for the dfuller first-difference model, the optimal lag length is taken from the varsoc command, less one, because first differencing removes the highest lag from the model. You can also add a time trend to varsoc by treating the time identifier as an exogenous variable.
