# Lag selection and stationarity in VAR with three variables in STATA

The previous article explained how to perform lag selection, the Johansen co-integration test and *Vector Auto Regression (VAR)* with two variables, **Gross Domestic Product (GDP)** and Private Final Consumption (PFC). This article incorporates **Gross Fixed Capital Formation (GFC)** and again performs the lag selection test and checks for stationarity for both **GFC** and __PFC__. Thus this article extends the *VAR* to three variables in STATA. The process also includes the Johansen cointegration test for the model including all three time series.

While running a regression on time series data, it is important to include lagged values of the dependent as well as the independent variables. For instance, last year’s **GDP** may be correlated with this year’s **GDP**, which is why lagged values matter. The further back past values continue to affect today’s values, the more lags are necessary. Therefore, to determine the appropriate number of lags, the first step is to run the lag selection criteria in STATA.

## Lag selection

To start with lag selection, follow the steps below:

- Click on ‘Statistics’ in the main menu
- Choose ‘Multivariate time series’
- Click on ‘VAR diagnostics and tests’
- Select ‘Lag-order selection statistics’.

After selecting the lag-order selection statistics, a ‘varsoc’ window will open in STATA (figure below). Specify two components on the main window: the list of dependent variables (**GDP**, **GFC** and __PFC__) and the maximum lag order.

Select the maximum number of lags to check, for instance “8”. Then, in the ‘Dependent variables’ option, select the three main variables **GDP**, **GFC** and __PFC__.

The results will appear in the output window (figure below). The STATA command for lag selection is also visible here. Alternatively, use the below command to generate the result:

varsoc gdp gfc pfc, maxlag(8)

The first column of the result table shows the number of lags, and the remaining columns show the criteria for choosing the optimal lag: Final Prediction Error (FPE), Akaike Information Criterion (AIC), Hannan-Quinn Information Criterion (HQIC) and Schwarz Bayesian Information Criterion (SBIC). With this command, STATA computes these four information criteria as well as a sequence of likelihood ratio (LR) tests.
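A minimal do-file sketch of this step, assuming the dataset is already in memory and contains a period variable, here hypothetically called `time` (that name is an assumption and does not come from the article). `varsoc` needs the data to be declared as a time series first:

```stata
* Declare the data as a time series.
* "time" is a hypothetical name for the period variable; replace it with your own.
tsset time

* Lag-order selection statistics for the three series, checking up to 8 lags
varsoc gdp gfc pfc, maxlag(8)
```

Raising or lowering the number inside `maxlag()` changes how many candidate lag orders appear in the table.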

## Identify the number of lags

To identify the number of lags, look for the values marked with “*”. For instance, FPE carries the “*” sign at lag 8, so the lag order according to the FPE criterion is 8. To select the optimal number of lags for the *VAR*, follow the majority of the criteria. That means, if three out of four criteria point to the same number of lags, say 8, then take 8 lags. However, in this case, it is not possible to follow the majority since two criteria show 1 and the other two show 8. Therefore, follow the AIC and FPE criteria, since the number of observations is more than 60.
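As a hedged sketch, the selection table can also be inspected programmatically. `varsoc` leaves its statistics in the returned matrix `r(stats)`, which can be copied and listed. The matrix name `S` below is only an illustrative choice, and the data are assumed to be `tsset` already:

```stata
* Re-run the selection quietly and keep the table of statistics
quietly varsoc gdp gfc pfc, maxlag(8)
matrix S = r(stats)

* One row per lag; the columns include the LR test and the four information criteria
matrix list S
```

The “*” markers are part of the displayed table only, so with the matrix the smallest value of each criterion has to be compared by hand.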

## Stationarity

This case follows the same steps for **GFC** and __PFC__ as were used in the ARIMA case, where the stationarity of the **GDP** time series was checked. The figure below shows the time series graph of __PFC__ and **GFC**.
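A graph of this kind can be drawn with `tsline`. This is a small sketch that assumes the data have already been declared as a time series with `tsset`:

```stata
* Plot both series against the time variable set by tsset
tsline gfc pfc
```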

As the graph shows, the trend for both **GFC** and __PFC__ is upward, indicating that the mean and variance of both variables are non-constant. That means both series are non-stationary. For more clarity, use the augmented Dickey-Fuller test. To transform the time series __PFC__ and **GFC** into stationary series, apply differencing using the two commands below:

- generate gfc_d1 = d1.gfc
- generate pfc_d1 = d1.pfc

The first-differenced series of **GFC** (gfc_d1) and __PFC__ (pfc_d1) are generated. To ascertain stationarity, perform the augmented Dickey-Fuller test again. The figure below shows the result for **GFC** (gfc_d1) and __PFC__ (pfc_d1):

The p-values for both differenced series, **GFC** (gfc_d1) and __PFC__ (pfc_d1), are close to zero, which means the null hypothesis of non-stationarity can be rejected and the first-differenced time series are stationary.
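The stationarity checks in this section can be collected in one short do-file. This is a sketch that assumes the data are already `tsset` and that the variables gfc_d1 and pfc_d1 do not exist yet. It uses `dfuller` with its default settings, which may differ from the options behind the figures above:

```stata
* ADF test on the levels (null hypothesis: the series has a unit root)
dfuller gfc
dfuller pfc

* First differences of both series
generate gfc_d1 = d1.gfc
generate pfc_d1 = d1.pfc

* ADF test on the first-differenced series
dfuller gfc_d1
dfuller pfc_d1
```

`dfuller` also accepts a `trend` option and a `lags()` option; which specification is appropriate depends on the data.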

This article shows how to perform lag selection and stationarity tests in a *VAR* with three variables **GDP**, __PFC__ and **GFC**. The next article shows the co-integration test to choose the correct type of *VAR* model.

Priya is the co-founder and Managing Partner of Project Guru, a research and analytics firm based in Gurgaon. She is responsible for the human resource planning and operations functions. Her expertise in analytics has been used in a number of service-based industries like education and financial services.

Her foundational education is from St. Xaviers High School (Mumbai). She also holds an MBA degree in Marketing and Finance from the Indian Institute of Planning and Management, Delhi (2008).

Some of the notable projects she has worked on include:

- Using systems thinking to improve sustainability in operations: A study carried out in Malaysia in partnership with Universiti Kuala Lumpur.
- Assessing customer satisfaction with in-house doctors of Jiva Ayurveda (a project executed for the company)
- Predicting the potential impact of green hydrogen microgrids (a project executed for the Government of South Africa)

She is a key contributor to the in-house research platform Knowledge Tank.

She currently holds over 300 citations from her contributions to the platform.

She has also been a guest speaker at various institutes such as JIMS (Delhi), BPIT (Delhi), and SVU (Tirupati).
