How to perform Heteroscedasticity test in STATA for time series data?
The previous articles showed how to perform normality tests on time series data. This article focuses on another important diagnostic test, i.e. the heteroscedasticity test in STATA. The term ‘heteroscedasticity’ (also spelled ‘heteroskedasticity’) means “differing variance” and comes from the Greek words “hetero” (‘different’) and “skedasis” (‘dispersion’). It refers to the situation in which the variance of the error terms in a regression model is not constant across the values of an independent variable.
If heteroscedasticity is present in the data, the error variance differs across the values of the explanatory variables, violating the assumption of homoscedasticity (constant variance) that underlies ordinary least squares (OLS) regression. The OLS coefficient estimates remain unbiased, but they are no longer efficient and the estimated standard errors are biased, which makes the usual inference unreliable. It is therefore imperative to test for heteroscedasticity and apply corrective measures if it is present. Tests that help detect heteroscedasticity include the Breusch-Pagan test and the White test.
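To see what heteroscedasticity looks like in practice, the short simulation below is a constructed illustration (not the article’s data): it generates errors whose spread grows with the explanatory variable and then applies the Breusch-Pagan test discussed later in this article.
* Simulated illustration of heteroscedastic errors (not the article's dataset)
clear
set obs 200
set seed 12345
generate x = runiform()*10
* Error standard deviation increases with x, so the error variance is not constant
generate u = rnormal(0, 1 + x)
generate y = 2 + 0.5*x + u
regress y x
* The Breusch-Pagan test should reject constant variance for data generated this way
estat hettest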
Heteroscedasticity tests are based on the residuals obtained from the regression. Therefore, the first step is to run the regression with the same three variables considered in the previous article, for the same period of 1997-98 to 2017-18.
Regression results
The previous article explained the procedure to run the regression with three variables in STATA. The regression result is as follows.
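For reference, assuming the dataset containing the variables gdp, gfcf and pfce (the variables used in the correction section below) is already loaded, the regression can be reproduced with the following command:
* OLS regression of gdp on gfcf and pfce
regress gdp gfcf pfce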
Now proceed to the heteroscedasticity test in STATA using two approaches.
Breusch-Pagan test for heteroscedasticity
The Breusch-Pagan test checks the null hypothesis that the error variances are all equal (homoscedasticity) against the alternative hypothesis that the error variances are a multiplicative function of one or more variables (heteroscedasticity).
To perform the Breusch-Pagan test use this STATA command:
estat hettest
The below results will appear.
The figure above shows that the probability value of the chi-square statistic is less than 0.05. Therefore the null hypothesis of constant variance can be rejected at the 5% level of significance, implying the presence of heteroscedasticity in the residuals.
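By default, estat hettest computes the Breusch-Pagan/Cook-Weisberg test using the fitted values of the dependent variable. The variants below, which test against the explanatory variables instead, are an optional sketch and are not part of the article’s reported results.
* Breusch-Pagan / Cook-Weisberg test using the fitted values (default)
estat hettest
* Test against all right-hand-side (explanatory) variables instead
estat hettest, rhs
* Test against a single specified variable, e.g. gfcf
estat hettest gfcf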
White test for heteroscedasticity
To check heteroscedasticity using the White test, use the following command in STATA:
estat imtest, white
The below results will appear.
Similar to the results of the Breusch-Pagan test, here too prob > chi2 = 0.000. The null hypothesis of constant variance can be rejected at a 5% level of significance. The implication of the above finding is that there is heteroscedasticity in the residuals.
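To see what estat imtest, white computes, the White test can also be carried out manually as an auxiliary regression of the squared residuals on the regressors, their squares and their cross-product. The sketch below assumes that regress gdp gfcf pfce is the most recent estimation in memory; the variables uhat, uhat2, gfcf2, pfce2 and gfcf_pfce are created purely for this illustration.
* Obtain residuals from the fitted model and square them
predict uhat, residuals
generate uhat2 = uhat^2
* Regressors, their squares and their cross-product for the auxiliary regression
generate gfcf2 = gfcf^2
generate pfce2 = pfce^2
generate gfcf_pfce = gfcf*pfce
regress uhat2 gfcf pfce gfcf2 pfce2 gfcf_pfce
* The White test statistic is n * R-squared, chi-squared with 5 degrees of freedom
display e(N)*e(r2)
display chi2tail(5, e(N)*e(r2))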
Graphical depiction of results from heteroscedasticity test in STATA
Present heteroscedasticity graphically using the following procedure (figure below):
- Go to ‘Graphics’
- Select ‘Regression diagnostic plots’
- Choose ‘Residuals-versus-fitted’.
The rvfplot box will appear (figure below). Click on ‘Reference lines’. Click on ‘OK’.
The ‘Reference lines (y-axis)’ window will appear (figure below). Enter ‘0’ in the box for ‘Add lines to the graph at specified y-axis values’. Then click on ‘Accept’.
The following graph will appear.
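The same residuals-versus-fitted plot can also be produced directly from the command line; the yline(0) option adds the horizontal reference line at zero.
* Residuals-versus-fitted plot with a reference line at y = 0
rvfplot, yline(0)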
The above graph shows that the residuals are somewhat larger near the mean of the distribution than at the extremes. The residuals also follow a systematic pattern against the fitted values rather than being randomly scattered around the zero line.
Presence of heteroscedasticity
Thus heteroscedasticity is present. It can arise from measurement error, model misspecification or differences between subpopulations. The consequence of heteroscedasticity is that the OLS estimates are no longer BLUE (Best Linear Unbiased Estimator). The standard errors become unreliable, which in turn biases hypothesis tests and confidence intervals.
Therefore, correct for heteroscedasticity either by changing the functional form of the model or by using robust standard errors in the regression.
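As an illustration of the first remedy, a log-log specification is a common change of functional form that often stabilises the error variance. This is only a sketch; the log variables ln_gdp, ln_gfcf and ln_pfce are created here for illustration and do not appear in the article’s results.
* Log-log specification as a possible functional-form correction
generate ln_gdp = ln(gdp)
generate ln_gfcf = ln(gfcf)
generate ln_pfce = ln(pfce)
regress ln_gdp ln_gfcf ln_pfce
* Re-check for heteroscedasticity in the transformed model
estat hettest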
Correction for heteroscedasticity
In order to get the robust standard errors, add the ‘vce(robust)’ option to the regression command:
regress gdp gfcf pfce, vce(robust)
This will output the following result (figure below).
The vce(robust) option does not remove the heteroscedasticity from the data; rather, it produces heteroscedasticity-consistent (robust) standard errors so that inference remains valid. These robust standard errors differ from the conventional standard errors in figure 1. For example, the robust standard error for the variable gfcf is 0.1030497, compared with 0.076651 in figure 1. The same holds for the variable pfce.
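To compare the conventional and robust standard errors side by side, the two sets of estimates can be stored and tabulated. A minimal sketch, using the same regression:
* Store the conventional and the robust results and compare their standard errors
regress gdp gfcf pfce
estimates store ols
regress gdp gfcf pfce, vce(robust)
estimates store ols_robust
estimates table ols ols_robust, se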
The presence of autocorrelation or serial correlation is a violation of another important ordinary least squares (OLS) assumption that errors in the regression model are uncorrelated with each other at all points in time.