Problem of non-stationarity in time series analysis in STATA

The previous article discussed the process for setting the ‘Time variable’ while conducting time series analysis in STATA. The purpose of this article is to explain how to determine whether a time series is stationary, first graphically and then with a formal test.

Determining the presence of stationarity

Creating a visual plot of data is the first step in time series analysis. Graphical representation of data helps understand it better. To plot a graph, follow these steps:

1. Click on ‘Statistics’ in the ribbon of the ‘Output’ window.
2. Select ‘Time series’.
3. Select ‘Graph’.
4. Click on ‘Line plots’.

The figure below shows this step:

A dialogue box ‘Time-series lines plot’ will appear as shown in the figure below. Click on ‘Create’ to start. Then click on ‘OK’.

A new dialogue box will appear as shown in the figure below. Here, select the variable to plot in ‘Y variable’.

Graphical Representation of GDP

Select the first variable of concern in ‘Y variable’, i.e. ‘gdp’. Then click on ‘Accept’. The figure below shows this step.

A graph of the ‘gdp’ variable will appear in a separate window as shown in the figure below.

The figure below is a clearer picture of the graph.
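The same plot can also be produced from the command window. The following is a minimal sketch, assuming the dataset is already in memory and the time variable has been declared with tsset as described in the previous article. The title and axis label text are illustrative choices and are not taken from the original figure.

```stata
* Assumes the data are loaded and tsset has already been run
* Line plot of gdp against the declared time variable
tsline gdp, title("GDP over time") ytitle("gdp")
```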

As seen in the graph, the GDP of India trends upward during 1996–2016, with only minor fluctuations. A series with a persistent upward trend cannot have a constant mean, so the primary assumption of time series analysis, i.e. stationarity, appears to be violated. ‘Stationarity’ implies that the mean and variance of the data remain constant across different time periods. The ‘gdp’ variable here trends upward with a non-constant mean and variance, which suggests that the GDP time series is non-stationary. However, the graph is only a preliminary step; stationarity must be confirmed with a formal test.
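As an informal numerical companion to the graph, the mean and standard deviation of ‘gdp’ can be compared across two halves of the sample. This is only a rough sketch under the assumption that the data are loaded and tsset has been run; splitting the sample in half is an arbitrary choice made here for illustration, and it is not a substitute for the formal test.

```stata
* Assumes the data are loaded and tsset has already been run
* Make sure observations are ordered by the declared time variable
quietly tsset
sort `r(timevar)'

* Split the sample into two halves by observation number
local half = floor(_N/2)
local next = `half' + 1

* First half of the sample
summarize gdp in 1/`half'
local mean1 = r(mean)
local sd1   = r(sd)

* Second half of the sample
summarize gdp in `next'/l
local mean2 = r(mean)
local sd2   = r(sd)

display "First half:  mean = " `mean1' ", sd = " `sd1'
display "Second half: mean = " `mean2' ", sd = " `sd2'
```

Clearly different means or standard deviations in the two halves are consistent with a non-stationary series.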

Augmented Dickey Fuller test

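The Dickey Fuller test examines whether a time series has a unit root, i.e. whether it is non-stationary. An important assumption of this test is that the error term is serially uncorrelated. The Augmented Dickey Fuller test relaxes this assumption by adding lagged differences of the variable to the test regression, which absorb the correlation in the error term. In STATA both tests are run from the same dialogue box; with zero lagged differences, the Augmented Dickey Fuller test reduces to the Dickey Fuller test. To perform the Dickey Fuller test:

1. Click on ‘Statistics’ in the ribbon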
2. Select ‘Time series’
3. Select ‘Tests’
4. Click on ‘Augmented Dickey-Fuller unit root test’

A dialogue box “Augmented Dickey Fuller Unit-root test” will appear. In the ‘Variable’ option, select the time series variable that needs to be tested for stationarity, i.e. for a unit root, as the figure below shows. If the graph of the time series variable shows a trend, also select ‘Include Trend term in regression’. Perform the Dickey Fuller test without lags first; to do so, leave the ‘Lagged differences’ option unchanged.
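For reference, the regression behind these dialogue options is commonly written as below. The symbols (α, β, γ, δ_i, k) are notation introduced here for illustration and do not appear in the STATA output.

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \ \text{(unit root, i.e. non-stationary)}
```

Here the term βt is what ‘Include Trend term in regression’ adds, and k is the number of ‘Lagged differences’. Setting k = 0 gives the plain Dickey Fuller test.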

Select the time series variable of concern in the dialogue box for the Augmented Dickey Fuller test. In this case ‘gdp’ is selected. Select ‘Include Trend term in regression’. Then click on ‘OK’. The figure below shows this step.

OR

`Command: dfuller gdp, trend lags(0)`

The Dickey Fuller test results will appear as shown in the figure below. To test stationarity, focus on two values in the result: Z(t) and the MacKinnon p-value for Z(t). For the time series to be stationary, Z(t) should be a large negative number, more negative than the reported critical values, and the p-value should be significant at the 5% level or better. Neither of these conditions is met in this test. Therefore the null hypothesis, i.e. that the time series is non-stationary, cannot be rejected. And since the GDP time series is non-stationary, further analysis cannot be performed on it as it stands.
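The two values discussed above can also be read programmatically, since dfuller leaves them in r(Zt) and r(p) after it runs. The sketch below assumes the data are loaded and tsset has been run; the 0.05 cut-off mirrors the 5% level used in the text, and the message wording is illustrative.

```stata
* Assumes the data are loaded and tsset has already been run
* Dickey Fuller test with a trend term and no lagged differences
dfuller gdp, trend lags(0)

* dfuller leaves the test statistic and p-value in r()
local zt = r(Zt)
local p  = r(p)
display "Z(t) = " `zt' "   MacKinnon p-value = " `p'

* Decision at the 5% level
if `p' < 0.05 {
    display "Reject the null of a unit root: series looks stationary around a trend"
}
else {
    display "Cannot reject the null of a unit root: series looks non-stationary"
}
```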

Augmented Dickey Fuller test including lags

In the above Dickey Fuller test, lags were excluded on the assumption that the error term is uncorrelated. If lags are included, the conclusion about the stationarity of the GDP time series can change. To check this, perform the Augmented Dickey Fuller test again as shown in Figure 7. Select ‘gdp’ and ‘Include trend term in regression’ again. This time, increase the number of lagged differences to 12 as shown in the figure below. Then click on ‘OK’.
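A command-line equivalent of this step might look like the following sketch, again assuming the data are loaded and tsset has been run. The regress option is an addition made here, not part of the steps above; it displays the underlying regression table so that the lagged-difference terms can be inspected.

```stata
* Assumes the data are loaded and tsset has already been run
* Augmented Dickey Fuller test with a trend term and 12 lagged differences;
* regress also displays the underlying regression table
dfuller gdp, trend lags(12) regress
```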

The output window will appear as the figure below shows. To examine stationarity, again focus on two values in the result: Z(t) and the MacKinnon p-value for Z(t). Here again, Z(t) is not a large negative number, and the p-value is insignificant. Thus the null hypothesis of the Dickey Fuller test, which states that the time series is non-stationary, again cannot be rejected. Therefore the GDP time series is non-stationary even after including lags to account for correlated error terms.
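To see how sensitive the result is to the number of lags, the test can be repeated over a few lag lengths and the two key values printed side by side. This is a sketch only: the lag lengths 0, 4, 8 and 12 are an illustrative choice, and the data are assumed to be loaded and tsset.

```stata
* Assumes the data are loaded and tsset has already been run
* Compare the test across several lag lengths (the list is an illustrative choice)
foreach k of numlist 0 4 8 12 {
    quietly dfuller gdp, trend lags(`k')
    display "lags = " r(lags) "   Z(t) = " %7.3f r(Zt) "   p-value = " %6.4f r(p)
}
```

Each additional lag shortens the estimation sample by one observation, so the rows are not based on exactly the same data.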

Stationarity is required in order to proceed with the remaining steps in time series analysis. Therefore the next article explains the solution to non-stationarity.

Priya Chetty

Partner at Project Guru
Priya is a master in business administration with majors in marketing and finance. She is fluent with data modelling, time series analysis, various regression models, forecasting and interpretation of the data. She has assisted data scientists, corporates, scholars in the field of finance, banking, economics and marketing.
4 thoughts on “Problem of non-stationarity in time series analysis in STATA”
1. Jay 1 year & 11 months ago

Great article, it’s nice to see these problems explained clearly and simply. Have you written somewhere on how to best address the issue of non-stationarity?

2. Richard 4 months & 3 weeks ago

In the augmented dfuller test, why did you select 12 for the lag? Was this determined by varsoc?

• Divya Narang 4 months & 2 weeks ago

Hi Richard,

Yes, you are right. To specify the lag order for the dfuller first-difference model, take the optimal lag length reported by the varsoc command and subtract one, since first differencing removes the highest lag from the model. You can also add a time trend to varsoc by specifying the time identifier as an exogenous variable.
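A rough sketch of that workflow is given below, assuming the data are loaded and tsset has been run. The maximum of 13 lags and the value assigned to varsoc_lag are hypothetical placeholders: the actual lag order has to be read from the varsoc table, where the selected order is marked with an asterisk.

```stata
* Assumes the data are loaded and tsset has already been run
* Recover the name of the declared time variable
quietly tsset
local tvar "`r(timevar)'"

* Lag-order selection statistics for gdp, up to 13 lags (illustrative maximum),
* with the time variable entered as an exogenous trend
varsoc gdp, maxlag(13) exog(`tvar')

* Hypothetical: suppose the table marks 13 as the chosen lag order
local varsoc_lag = 13
local adf_lags = `varsoc_lag' - 1

* Augmented Dickey Fuller test with the varsoc lag order less one
dfuller gdp, trend lags(`adf_lags')
```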