The problem of non-stationarity in time series analysis in STATA

By Divya Narang & Priya Chetty on December 2, 2017

The previous article discussed the process for setting the ‘Time variable’ while conducting time series analysis in STATA. The purpose of this article is to explain how to determine whether a time series is stationary, first with a graph and then with a formal test.

Determining the presence of stationarity

Creating a visual plot of data is the first step in time series analysis. Graphical representation of data helps understand it better. To plot a graph, follow these steps:

1. Click on ‘Statistics’ in the ribbon of the ‘Output’ Window.
2. Select ‘Time series’.
3. Select ‘Graph’.
4. Click on ‘Line plots’.

The figure below shows this step:

A dialogue box ‘Time-series lines plot’ will appear as shown in the figure below. Click on ‘Create’ to start. Then click on ‘OK’.

A new dialogue box will appear as shown in the figure below. Here, select the variable to plot in the ‘Y variable’.

Graphical Representation of GDP

Select the first variable of concern in ‘Y variable’, i.e. ‘gdp’. Then click on ‘Accept’. The figure below shows this step.

A graph of ‘gdp’ variable will appear in a separate window as shown in the figure below.

The figure below is a clearer picture of the graph.
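The same plot can also be produced from the command line instead of the menus. A minimal sketch, assuming the time variable has already been declared with `tsset` as described in the previous article:

```stata
* Line plot of gdp against the time variable declared with tsset
tsline gdp
```

`tsline` only works on data that has been declared as a time series, so it will report an error if the time variable has not been set.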

As seen in the graph, the GDP of India trends upward from 1996 to 2016, with only minor fluctuations. ‘Stationarity’ implies that the mean and variance of the data stay constant across different time frames. A series with a persistent upward trend cannot have a constant mean, so the primary assumption of time series analysis, i.e. stationarity, appears to be violated. The GDP time series therefore looks non-stationary. However, the graph is only a preliminary check; a formal test of stationarity is still needed.
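A rough numerical check of this visual impression is to compare summary statistics over two sub-periods. The sketch below assumes the observations are sorted by time; the half-sample split is an arbitrary choice made here for illustration:

```stata
* Mean and standard deviation of gdp in the first half of the sample
summarize gdp if _n <= _N/2

* The same statistics for the second half of the sample
summarize gdp if _n > _N/2
```

Markedly different means or standard deviations across the two halves are consistent with non-stationarity, but this is only an informal check and does not replace a formal test.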

Augmented Dickey-Fuller test

The Dickey-Fuller test examines whether time series data is stationary. An important assumption of this test is that the error term is serially uncorrelated. The Augmented Dickey-Fuller test relaxes this assumption by adding lagged differences of the variable to the test regression, which absorb the correlation in the error terms. To perform the Dickey-Fuller test:

1. Click on ‘Statistics’ in the ribbon
2. Select ‘Time series’
3. Select ‘Tests’
4. Click on ‘Augmented Dickey-Fuller unit root test’

A dialogue box “Augmented Dickey-Fuller Unit-root test” will appear. In the ‘Variable’ option, select the time series variable to be tested for stationarity, i.e. for a unit root, as the figure below shows. If the graph of the variable shows a trend, also select ‘Include Trend term in regression’. First, perform the Dickey-Fuller test without any lags. To do so, leave the ‘Lagged differences’ option at 0.

Select the concerned time series variable in the dialogue box for the Augmented Dickey-Fuller test. In this case, ‘gdp’ is selected. Select ‘Include Trend term in regression’. Then click on ‘OK’. The figure below shows this step.

OR

`Command: dfuller gdp, trend lags(0)`

The Dickey-Fuller test results will appear as shown in the figure below. To test stationarity, focus on two values in the result: Z(t) and the MacKinnon approximate p-value for Z(t). For the time series to be considered stationary, Z(t) should be a large negative number, i.e. more negative than the reported critical value, and the p-value should be significant at least at the 5% level. Neither of these conditions is met in this test. Therefore the null hypothesis, i.e. that the time series has a unit root and is non-stationary, cannot be rejected. Since the GDP time series is non-stationary, further analysis cannot be performed on it in its current form.
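The two values discussed above can also be read from the results that `dfuller` leaves behind after it runs. A minimal sketch, assuming the trend term is included as selected in the dialogue box, and using the `r(Zt)` and `r(p)` stored results:

```stata
* Dickey-Fuller test with a trend term and no lagged differences
dfuller gdp, trend lags(0)

* Test statistic and MacKinnon approximate p-value stored by dfuller
display "Z(t)    = " r(Zt)
display "p-value = " r(p)
```

A p-value above 0.05 means the null hypothesis of a unit root cannot be rejected at the 5% level.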

Augmented Dickey-Fuller test including lags

In the above Dickey-Fuller test, lags were excluded, assuming the error term is uncorrelated. If lags are included, the result of the stationarity test for the GDP time series can change. To check this, perform the Augmented Dickey-Fuller test again as shown in Figure 7. Select ‘gdp’ and ‘Include trend term in regression’ again. This time, increase the number of lagged differences to 12 as shown in the figure below. Then click on ‘OK’.
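OR, from the command line, the equivalent call can be sketched as follows, assuming the same ‘gdp’ variable, the trend term and 12 lagged differences:

```stata
* Augmented Dickey-Fuller test with a trend term and 12 lagged differences
dfuller gdp, trend lags(12)
```

Adding the `regress` option to this command would also display the underlying regression table. Each lagged difference uses up one observation, so the number of lags should stay small relative to the length of the series.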

The output window will appear as the figure below shows. To examine stationarity, again focus on two values of the result: Z(t) and the MacKinnon p-value for Z(t). Here again, Z(t) is not a large negative number, and the p-value is insignificant. Thus the null hypothesis of the Dickey-Fuller test, which states that the time series data is non-stationary, again cannot be rejected. Therefore the GDP time series is treated as non-stationary even after adding lags to account for correlated error terms.
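For reference, the regression behind both versions of the test can be written as follows. This is the standard textbook form with a constant and a trend term; the symbols are notation chosen here for illustration, not labels taken from the STATA output:

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0\colon \gamma = 0 \quad \text{vs.} \quad H_1\colon \gamma < 0
```

Here $y_t$ is ‘gdp’, $p$ is the number of lagged differences (0 in the first test, 12 in the second), and Z(t) is the t-statistic on $\gamma$. Under $H_0$ the series has a unit root, which is why a sufficiently large negative Z(t) is needed to reject non-stationarity.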

Stationarity is required in order to proceed with the remaining steps in time series analysis. Therefore the next article explains the solution to non-stationarity.

Priya is the co-founder and Managing Partner of Project Guru, a research and analytics firm based in Gurgaon. She is responsible for the human resource planning and operations functions. Her expertise in analytics has been used in a number of service-based industries like education and financial services.

Her foundational education is from St. Xaviers High School (Mumbai). She also holds an MBA degree in Marketing and Finance from the Indian Institute of Planning and Management, Delhi (2008).

Some of the notable projects she has worked on include:

• Using systems thinking to improve sustainability in operations: A study carried out in Malaysia in partnership with Universiti Kuala Lumpur.
• Assessing customer satisfaction with in-house doctors of Jiva Ayurveda (a project executed for the company)
• Predicting the potential impact of green hydrogen microgrids (A project executed for the Government of South Africa)

She is a key contributor to the in-house research platform Knowledge Tank.

She currently holds over 300 citations from her contributions to the platform.

She has also been a guest speaker at various institutes such as JIMS (Delhi), BPIT (Delhi), and SVU (Tirupati).