# How to test time series autocorrelation in STATA?

By Rashmi Sajwan & Priya Chetty on October 22, 2018

The previous article showed how to perform heteroscedasticity tests on time series data in STATA, and how to correct for heteroscedasticity so that the Ordinary Least Squares (OLS) assumption of constant error variance is not violated. This article shows how to test for serial correlation of errors, also called time series autocorrelation, in STATA. An autocorrelation problem arises when the error terms in a regression model are correlated over time, that is, when they depend on each other.

## Why test for autocorrelation?

It is one of the main assumptions of the OLS estimator according to the Gauss-Markov theorem that in a regression model:

$$\operatorname{Cov}(\epsilon_i, \epsilon_j) = 0 \quad \forall\, i, j,\ i \neq j$$

where Cov is the covariance and $\epsilon$ is the residual.

When autocorrelation is present in the data, ϵ_i and ϵ_j correlate with each other and the assumption is violated. The OLS estimator is then no longer efficient and its usual standard errors are unreliable, so significance tests can be misleading. It is therefore important to test for autocorrelation and apply corrective measures if it is present. This article focuses on two common tests for autocorrelation: the Durbin Watson D test and the Breusch Godfrey LM test. As in the previous article (Heteroscedasticity test in STATA for time series data), first run the regression with the same three variables, Gross Domestic Product (GDP), Private Final Consumption (PFC) and Gross Fixed Capital Formation (GFC), for the time period 1997 to 2018.

## Durbin Watson test for autocorrelation

The Durbin Watson test depends on two quantities: the number of observations and the number of explanatory variables in the model. In the dataset, the number of observations is 84 and the number of explanatory variables is 2 (GFC and PFC). The Durbin-Watson table gives two numbers for each such combination: dl and du. These are the “critical values” (figure below).

The Durbin Watson statistic ranges from 0 to 4. As the above scale shows, a value between 0 and dl indicates positive serial correlation. For values between dl and du, or between 4-du and 4-dl, the test is inconclusive. A value between du and 4-du indicates no autocorrelation. Finally, a value between 4-dl and 4 indicates negative serial correlation. These conclusions hold at the 5% significance level.
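For reference, the standard definition of the statistic explains why it is bounded by 0 and 4. In the expression below, $e_t$ denotes the OLS residual at time $t$, $n$ the number of observations and $\hat{\rho}$ the first-order sample autocorrelation of the residuals; these symbols are introduced here for illustration and do not appear in the STATA output.

$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} \approx 2\,(1 - \hat{\rho})$$

With $\hat{\rho}$ close to 1 the statistic approaches 0, with $\hat{\rho}$ close to 0 it is near 2, and with $\hat{\rho}$ close to -1 it approaches 4.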

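Run the following command for the Durbin Watson test after the regression (recent STATA versions provide the same statistic through `estat dwatson`):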

`dwstat`
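To make concrete what the reported d statistic measures, here is a minimal Python sketch that computes it from a list of residuals in time order: the sum of squared changes between consecutive residuals divided by the sum of squared residuals. The function name and the two residual series are hypothetical and chosen only for illustration; this is not STATA's implementation.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson d statistic for residuals given in time order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared changes between consecutive residuals.
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    # Denominator: sum of squared residuals.
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals that flip sign every period (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals that change sign only once (positive autocorrelation).
drifting = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

print(durbin_watson(alternating))  # above 2
print(durbin_watson(drifting))     # below 2
```

Residuals that alternate in sign push d above 2, while residuals that move slowly pull it below 2.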

STATA reports the d statistic but not a corresponding p-value. To conclude whether serial correlation exists, compare the statistic with the critical values from the Durbin Watson D table.

In the above figure, the rows show the number of observations and the columns represent “k”, the number of parameters. Here the number of parameters is 2 and the number of observations is 84.

##### Consequently:

Durbin Watson’s lower limit from the table (dl) = 1.600

Durbin Watson’s upper limit from the table (du) = 1.696

Therefore, when du and dl are plotted on the scale, the results are as follows (figure below).

The Durbin Watson d statistic from the STATA command is 2.494, which lies between 4-dl (2.400) and 4, implying that there is negative serial correlation between the residuals in the model.
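The reading of the scale can be sketched in Python as a simple lookup, using the dl and du values quoted above. The function name is hypothetical, and treating the exact boundary values as inconclusive is an assumption of this sketch rather than something stated in the table.

```python
def classify_dw(d, dl, du):
    """Map a Durbin-Watson statistic to a region of the 0-4 scale."""
    if not 0 <= d <= 4:
        raise ValueError("d must lie between 0 and 4")
    if d < dl:
        return "positive autocorrelation"
    if d <= du:
        return "inconclusive"          # between dl and du
    if d < 4 - du:
        return "no autocorrelation"    # between du and 4-du
    if d <= 4 - dl:
        return "inconclusive"          # between 4-du and 4-dl
    return "negative autocorrelation"  # between 4-dl and 4


# Critical values quoted in the article for 84 observations and k = 2.
DL, DU = 1.600, 1.696

print(classify_dw(2.494, DL, DU))   # statistic reported by dwstat
print(classify_dw(2.0578, DL, DU))  # statistic the article reports after correction
```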

## Breusch-Godfrey LM test for autocorrelation

The Breusch-Godfrey LM test has some advantages over the classical Durbin-Watson D test. The Durbin-Watson test relies on the assumption that the residuals are normally distributed, whereas the Breusch-Godfrey LM test is less sensitive to this assumption. Another advantage is that it allows researchers to test for serial correlation at several lags, that is, correlation between the residuals at time t and t-k (where k is the number of lags). The Durbin-Watson test, in contrast, only covers correlation between t and t-1. With k equal to 1, both tests therefore address the same first-order serial correlation, although their test statistics are computed differently and are not numerically identical.
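In its usual textbook form, the test rests on an auxiliary regression of the OLS residuals on the original regressors and k lagged residuals. The notation below ($e_t$ for residuals, $\mathbf{x}_t$ for the regressors, $R^2$ and $n$ for the fit and sample size of the auxiliary regression) is introduced here as an illustration and may differ in detail from what STATA computes.

$$e_t = \mathbf{x}_t'\boldsymbol{\gamma} + \rho_1 e_{t-1} + \dots + \rho_k e_{t-k} + v_t$$

$$LM = n R^2 \;\overset{a}{\sim}\; \chi^2_k \quad \text{under } H_0: \rho_1 = \dots = \rho_k = 0$$

A large LM value, and hence a small p-value, indicates that the lagged residuals help explain the current residual, which is evidence of serial correlation.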

Use the following command for the Breusch Godfrey LM test in STATA. By default it tests one lag; other lag orders can be requested with the `lags()` option.

`estat bgodfrey`

The results appear as shown below.

The hypotheses in this case are:

• Null hypothesis: There is no serial correlation.
• Alternative hypothesis: There is serial correlation.

Since the p-value (Prob > chi2) in the above table is less than 0.05 or 5%, the null hypothesis can be rejected. In other words, there is serial correlation between the residuals in the model. The violation of the assumption of no serial correlation therefore needs to be corrected.


## Correction for autocorrelation

To correct the autocorrelation problem, use the ‘prais’ command in place of the regression command, with the same list of variables, and add the ‘corc’ option after the comma that follows the variable names. The ‘corc’ option requests the Cochrane-Orcutt procedure.

Below is the command for correcting autocorrelation.

`prais gdp gfcf pfce, corc`

The results appear as shown below.

At the end of the output, STATA reports both the original and the new (transformed) Durbin Watson statistics.

The new D-W statistic is 2.0578, which lies between du (1.696) and 4-du (2.304), implying that there is no autocorrelation now. Thus it has been corrected.
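The idea behind the Cochrane-Orcutt correction can be sketched in a few lines of Python. The procedure estimates the first-order autocorrelation rho from the OLS residuals, replaces every variable by its quasi-difference (the value at time t minus rho times the value at t-1), and re-runs OLS on the transformed data, usually repeating until rho stabilises. The sketch below shows only the two building blocks; the function names and numbers are hypothetical, and STATA's `prais` implementation may differ in detail.

```python
def estimate_rho(residuals):
    """Estimate rho by regressing e_t on e_(t-1) without a constant."""
    num = sum(residuals[t] * residuals[t - 1]
              for t in range(1, len(residuals)))
    den = sum(residuals[t - 1] ** 2 for t in range(1, len(residuals)))
    return num / den


def quasi_difference(series, rho):
    """Return series_t - rho * series_(t-1); the first observation is dropped."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


# Hypothetical OLS residuals with negative first-order autocorrelation.
residuals = [2.0, -1.0, 0.5, -0.25]
rho = estimate_rho(residuals)

# Hypothetical dependent variable; the regressors are transformed the same way.
y = [10.0, 12.0, 11.0, 13.0]
y_star = quasi_difference(y, rho)

print(rho)
print(y_star)
```

Re-estimating the regression on the transformed series is what yields the new Durbin Watson statistic.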

The next article discusses the issue of multicollinearity, which arises when two or more explanatory variables in the regression model are highly correlated with each other.
