# How to perform regression analysis using VAR in STATA?

The previous article on time series analysis showed how to fit an **Autoregressive Integrated Moving Average (ARIMA)** model to the **Gross Domestic Product (GDP)** of India for the period 1996 – 2016 using STATA. The underlying feature of **ARIMA** is that it studies the behavior of a univariate time series, such as **GDP**, over a specified time period. Based on that, it recommends an **ARIMA** equation, which then helps to forecast **GDP** for further years. However, **ARIMA** is insufficient for defining an econometric model with more than one variable. For instance, to find the effect of **Gross Fixed Capital Formation (GFC)** and **Private Final Consumption (PFC)** on **GDP**, **ARIMA** is not the correct approach. That is where multivariate time series analysis is useful. Consequently, this article explains the process of performing a regression analysis using *Vector Auto-Regression (VAR)* in STATA.

## Equation of Vector Auto-Regression (VAR)

In multivariate time series, the prominent method of regression analysis is *Vector Auto-Regression (VAR)*. Two terms in the name are worth unpacking. Firstly, 'auto-regression' refers to the appearance of lagged values of the dependent variables on the right-hand side of each equation. Secondly, 'vector' refers to dealing with a vector of two or more variables. The resultant equation will be as follows:
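As a sketch, for the three variables used in this article and a lag order of $p$, the system can be written in the following general form. The coefficient symbols ($\alpha$, $\beta$, $\gamma$, $\delta$), the error terms $\varepsilon$ and the lag order $p$ are illustrative notation assumed here, not estimated values.

$$
\begin{aligned}
GDP_t &= \alpha_1 + \sum_{i=1}^{p} \beta_{1i}\, GDP_{t-i} + \sum_{i=1}^{p} \gamma_{1i}\, GFC_{t-i} + \sum_{i=1}^{p} \delta_{1i}\, PFC_{t-i} + \varepsilon_{1t} \\
GFC_t &= \alpha_2 + \sum_{i=1}^{p} \beta_{2i}\, GDP_{t-i} + \sum_{i=1}^{p} \gamma_{2i}\, GFC_{t-i} + \sum_{i=1}^{p} \delta_{2i}\, PFC_{t-i} + \varepsilon_{2t} \\
PFC_t &= \alpha_3 + \sum_{i=1}^{p} \beta_{3i}\, GDP_{t-i} + \sum_{i=1}^{p} \gamma_{3i}\, GFC_{t-i} + \sum_{i=1}^{p} \delta_{3i}\, PFC_{t-i} + \varepsilon_{3t}
\end{aligned}
$$

Each variable appears on the left-hand side of one equation and, in lagged form, on the right-hand side of all three.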

In the above *VAR* equation, all three variables are inter-related and jointly determined. Since **GFC** and **PFC** play a role in the calculation of **GDP**, simultaneity among these variables is inherent.

To proceed with *VAR* in STATA, it is important to recognize all the steps, assumptions, and important tests in the process.


## Steps in performing VAR in STATA

| Step | Description |
|---|---|
| 1. Lag selection of variables | As noted in the above equation, the variables are interrelated with lagged values of other variables. However, it is unclear over how many lags this interrelation holds. Therefore, the analysis begins with selecting the lag length. |
| 2. Stationarity | In the previous articles, the time series data showed that GDP is non-stationary, so its first difference was used. The same could hold for GFC and PFC. Therefore, the second step is to check and ensure stationarity in the data. |
| 3. Test for co-integration | Suppose two or more non-stationary variables enter a regression, and the residuals estimated from that regression turn out to be stationary. That means a combination of two or more non-stationary series may yield a stationary series. This is called co-integration. Its implication is that the variables have long-term causality and, in the long run, might converge towards an equilibrium value. The equilibrium is steady, with constant mean and variance, that is, 'stationary'. Co-integration thus indicates a long-term association between two or more non-stationary variables. Therefore, before initiating VAR, find out whether the present model contains any co-integration or equilibrium state. |
| 4. If co-integration is not present: apply VAR | The VAR technique treats the variables as endogenous and dependent on lagged values of the other variables. |
| 5. If co-integration is present: apply the Vector Error Correction Model (VECM) | The VECM takes into account both long-term and short-term causality dynamics. It also offers a possibility to apply VAR to integrated multivariate time series. |
| 6. VECM diagnostics, tests and forecasting | Based on the constructed VECM, review the assumptions of autocorrelation and normality, and then proceed to forecast. |
| 7. ARCH (Autoregressive Conditional Heteroskedasticity) | Time series models incorporating the effects of volatility. |
| 8. Extensions of ARCH | GARCH (Generalized Autoregressive Conditional Heteroskedasticity) and T-GARCH (Threshold Generalized Autoregressive Conditional Heteroskedasticity). |

Table 1: Tests of *VAR* Models
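The steps in Table 1 can be sketched as a STATA do-file. This is only an illustrative outline under stated assumptions: the variable names `gdp`, `gfc`, `pfc` and `year` are hypothetical, and the lag lengths, co-integration rank and forecast horizon are placeholders that should be replaced by whatever the tests actually suggest for the data at hand.

```stata
* Sketch only: assumes an annual dataset in memory with the
* hypothetical variables year, gdp, gfc and pfc.
tsset year

* Step 1: compare information criteria across candidate lag lengths
varsoc gdp gfc pfc, maxlag(3)

* Step 2: augmented Dickey-Fuller tests, in levels and first differences
dfuller gdp, lags(1)
dfuller D.gdp, lags(1)
* (repeat for gfc and pfc)

* Step 3: Johansen test for the number of co-integrating relations
vecrank gdp gfc pfc, lags(2)

* Step 4: if no co-integration is found, fit a VAR on differenced series
var D.gdp D.gfc D.pfc, lags(1/2)

* Step 5: if co-integration is found, fit a VECM instead
* (rank(1) is a placeholder for the rank reported by vecrank)
vec gdp gfc pfc, rank(1) lags(2)

* Step 6: VECM diagnostics (residual autocorrelation, normality), then forecast
veclmar, mlag(2)
vecnorm
tsappend, add(5)            // add future periods for the forecasts to fill
fcast compute f_, step(5)   // forecasts stored in new variables prefixed f_

* Steps 7-8: ARCH, GARCH and threshold variants on a single series
arch D.gdp, arch(1)
arch D.gdp, arch(1) garch(1)
arch D.gdp, arch(1) tarch(1) garch(1)
```

In practice only one of steps 4 and 5 is run, depending on the outcome of the co-integration test. The post-estimation commands in step 6 apply to whichever `vec` model was estimated last.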

The next article shows the lag selection in a *VAR* model involving two variables, **GDP** and **PFC**.

Priya is the co-founder and Managing Partner of Project Guru, a research and analytics firm based in Gurgaon. She is responsible for the human resource planning and operations functions. Her expertise in analytics has been used in a number of service-based industries like education and financial services.

Her foundational education is from St. Xaviers High School (Mumbai). She also holds an MBA degree in Marketing and Finance from the Indian Institute of Planning and Management, Delhi (2008).

Some of the notable projects she has worked on include:

- Using systems thinking to improve sustainability in operations: A study carried out in Malaysia in partnership with Universiti Kuala Lumpur.
- Assessing customer satisfaction with in-house doctors of Jiva Ayurveda (a project executed for the company)
- Predicting the potential impact of green hydrogen microgrids (a project executed for the Government of South Africa)

She is a key contributor to the in-house research platform Knowledge Tank.

She currently holds over 300 citations from her contributions to the platform.

She has also been a guest speaker at various institutes such as JIMS (Delhi), BPIT (Delhi), and SVU (Tirupati).
