Understanding Vector Auto-Regression (VAR) analysis

The previous articles on time series analysis showed how to perform Autoregressive Integrated Moving Average (ARIMA) analysis on the GDP of India for the period 1996 – 2016 using STATA. The underlying feature of ARIMA is that it studies the behaviour of a univariate time series, such as GDP, over a specified time period. Based on that behaviour, it recommends an ARIMA equation, which is then used to forecast GDP for further years. However, ARIMA is insufficient for defining an econometric model with more than one variable. For instance, ARIMA is not the correct approach for finding the effect of Gross Fixed Capital Formation (GFC) and Private Final Consumption (PFC) on GDP. That is where multivariate time series analysis using VAR is useful.

Equation of Vector Auto-Regression (VAR)

In multivariate time series, the prominent method of regression analysis is Vector Auto-Regression (VAR). The name is easier to understand in parts. Firstly, the term ‘auto-regression’ is used because lagged values of the dependent variables appear on the right-hand side of the equation. Secondly, the term ‘vector’ is used because the model deals with a vector of two or more variables. The resultant equation will be as follows:

Figure 1: Equation of VAR

In the above VAR equation, all three variables are determined simultaneously and are inter-related. The equation rests on well-known relationships between the variables: since GFC and PFC both enter the calculation of GDP, simultaneity between these variables is to be expected.
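As a rough sketch of what such a system looks like, a three-variable VAR with a single lag can be written as below. The choice of one lag and the symbols (intercepts \alpha_i, coefficients \beta_{ij}, error terms \varepsilon_{it}) are illustrative assumptions and are not taken from Figure 1; the appropriate number of lags has to be chosen from the data.

```latex
\begin{aligned}
GDP_t &= \alpha_1 + \beta_{11}\,GDP_{t-1} + \beta_{12}\,GFC_{t-1} + \beta_{13}\,PFC_{t-1} + \varepsilon_{1t} \\
GFC_t &= \alpha_2 + \beta_{21}\,GDP_{t-1} + \beta_{22}\,GFC_{t-1} + \beta_{23}\,PFC_{t-1} + \varepsilon_{2t} \\
PFC_t &= \alpha_3 + \beta_{31}\,GDP_{t-1} + \beta_{32}\,GFC_{t-1} + \beta_{33}\,PFC_{t-1} + \varepsilon_{3t}
\end{aligned}
```

Each variable has its own equation, and every equation contains the lagged values of all three variables, which is what makes the variables endogenous to the system.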

Performing VAR analysis in STATA

To proceed with VAR analysis in STATA, it is important to recognize all the steps, assumptions and important tests to be performed.

Steps and assumptions

1. Lag selection of variables: It is noticeable in the above equation (Figure 1) that each variable is related to lagged values of the other variables. However, it is unclear over how many lags the variables are interrelated.

Therefore, to begin VAR, it is first imperative to identify the number of lags at which the variables are inter-connected, or endogenously determined.

2. Stationarity: The previous articles showed that the GDP time series is non-stationary, so first differencing was applied. The same could be true of GFC and PFC. Therefore, the second step is to check for, and ensure, stationarity in the data.
3. Test for co-integration: Suppose a regression involves two or more non-stationary variables, and the residuals estimated from that regression turn out to be stationary. That means a combination of two or more non-stationary series can result in a stationary series. This is known as co-integration. The implication of co-integration is that the variables have a long-term causality and, in the long run, might converge towards an equilibrium value. The equilibrium value is steady, meaning it has a constant mean and variance, i.e. it is ‘stationary’. Therefore, before initiating VAR, it is imperative to know whether the present model contains any co-integration or equilibrium state. Co-integration, or co-integrated variables, indicates a long-term association among two or more non-stationary variables.


1. If co-integration is not present, we apply VAR, a technique in which the variables are endogenous and depend on lagged values of the other variables.
2. If co-integration is present, we apply the Vector Error Correction Model (VECM).


The VECM takes into account both long-term and short-term causality dynamics. It makes it possible to apply VAR to integrated multivariate time series.
4. VECM diagnostics, tests and forecasting: After constructing the VECM, check the assumptions of no autocorrelation and normality of the residuals. After that, perform forecasting.
5. ARCH (Autoregressive Conditionally Heteroscedastic model): Time series models that incorporate the effects of volatility.
6. Extensions of ARCH: GARCH (Generalized Autoregressive Conditional Heteroskedasticity) and T-GARCH (Threshold Generalized Autoregressive Conditional Heteroskedasticity).
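The first differencing mentioned in the stationarity step can be sketched in a few lines. Python is used here purely for illustration, since the article itself works in STATA. The function name first_difference and the toy values are hypothetical and are not actual GDP figures.

```python
def first_difference(series):
    """Return the first difference y[t] - y[t-1] for t = 1 .. n-1."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Hypothetical toy series with a steady upward trend (NOT actual GDP data).
levels = [100.0, 104.0, 108.0, 112.0, 116.0]

# The level keeps rising, so its mean is not constant over time.
# The differenced series removes the linear trend.
diffs = first_difference(levels)
print(diffs)  # [4.0, 4.0, 4.0, 4.0]
```

Note that differencing shortens the series by one observation. Whether the differenced series is actually stationary still has to be confirmed with a formal test; this sketch only shows the transformation itself.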


Priya Chetty

Partner at Project Guru
Priya Chetty writes frequently about advertising, media, marketing and finance. In addition to posting daily to Project Guru Knowledge Tank, she is currently on the editorial board of the Research & Analysis wing of Project Guru. She places emphasis on refined content for Project Guru's various paid services. She has also reviewed various insights of the social insider by writing articles about what social media means for the media and marketing industries. She has also worked in outdoor media agencies like MPG and hotel marketing companies like CarePlus.
