The previous article showed that the three time series **Gross Domestic Product (GDP)**, **Gross Fixed Capital Formation (GFC)** and __Private Final Consumption (PFC)__ are non-stationary. They may therefore have a long-term causal relationship. The general assumption in this case is that consumption (__PFC__) affects **GDP**, so these variables might be cointegrated, meaning that a linear combination of them may be stationary. The Johansen cointegration test in *Vector Auto Regression (VAR)* with two variables helps check this.

To start with lag selection parameters in STATA, follow the steps below:

- Click on ‘Statistics’ in the menu bar
- Choose ‘Multivariate time series’
- Click on ‘*VAR* diagnostics and tests’
- Select ‘Lag-order selection statistics’.

The screen below will appear.

Clicking on ‘Lag-order selection statistics’ opens the varsoc window in STATA, as shown in figure 2. On the main page of this window, fill in two fields: the list of dependent variables (**GDP** and __PFC__) and the maximum lag order. The maximum lag order is the highest lag for which you want results.

In ‘Dependent variables’, select the main variables **GDP** and __PFC__. After filling in both fields, click on ‘OK’.

The results will then appear in the output window (figure 4), along with the STATA command for lag selection. Alternatively, run this command directly to generate the same result.

##### Command

varsoc gdp pfc, maxlag(8)

The results table shows the lag order in the first column, followed by a sequence of likelihood ratio tests and four information criteria used to select the optimal lag: Final Prediction Error (FPE), Akaike Information Criterion (AIC), Hannan-Quinn Information Criterion (HQIC) and Schwarz Bayesian Information Criterion (SBIC).

## Identify the number of lags

To identify the number of lags, look for the values marked with an asterisk.

For instance, among the FPE values, the value at lag 3 carries the asterisk.

Therefore, the lag as per FPE is 3. Following the same rule, the lag as per AIC is also 3, and as per HQIC and SBIC it is 2. To select the optimal lag for VAR, follow the majority: if three or four of the four criteria point to the same number of lags (say 3), take 3 lags. In this case, however, there is no majority, since two criteria show 2 and the other two show 3. Since the number of observations here is more than 60, follow AIC and FPE. Therefore, the number of lags selected for the present case is 3.
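The lag selection step can be tried from the command line with the minimal sketch below. The article's dataset is not available here, so the sketch simulates two random walks as placeholders for `gdp` and `pfc`. The time variable `t`, the seed and the sample size are assumptions made only for illustration, and the output will not match the article's figures.

```stata
* Placeholder data: two simulated random walks standing in for gdp and pfc
clear
set seed 12345
set obs 120
generate t = _n
tsset t
generate gdp = sum(rnormal())
generate pfc = sum(rnormal())

* Lag-order selection statistics for lags 0 to 8;
* the optimal lag under each criterion is marked with an asterisk
varsoc gdp pfc, maxlag(8)

* The same table is stored as a matrix with one row per lag
matrix stats = r(stats)
matrix list stats
```

With real data, only the `varsoc` line is needed once the dataset has been declared as a time series with `tsset`.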

## Johansen cointegration test

The Johansen cointegration test is a likelihood ratio test that comes in two forms: the maximum eigenvalue test and the trace test. For both test statistics, the initial test has the null hypothesis of no cointegration against the alternative of cointegration. The null hypothesis changes with the rank being tested. Here, the Johansen cointegration test is performed for the variables **GDP** and __PFC__. Follow these steps to start (figure below):

- Click on ‘Statistics’ in the menu bar
- Select ‘Multivariate time series’
- Select ‘Cointegrating rank of a VECM’.

The ‘vecrank’ window will open in STATA (figure below). In this window, fill in two fields: the dependent variables and the maximum lag for the underlying *VAR* model.

In the ‘Dependent variables’ option, select the two time series variables **GDP** and __PFC__. Since cointegration analysis uses non-stationary variables to check for causality, take **GDP** and __PFC__ in levels instead of their first differences. Then select the number of lags. The lag order was selected in the previous step, so the number of lags here is 3.

After selecting the lag, click on the ‘Reporting’ tab of the vecrank window and tick ‘Report maximum-eigenvalue statistic’ (figure below). Click on ‘OK’.

The results for Johansen cointegration test will appear in the window (figure below). Here the STATA command of Johansen cointegration test will also appear.

##### Command

vecrank gdp pfc, trend(constant) lags(3) max

The result of the Johansen cointegration test can be interpreted in parts. Focus on three columns: maximum rank, trace statistic or max statistic, and critical value.

### Maximum rank zero

Starting from maximum rank zero, the null and alternative hypotheses are as follows:

**Null Hypothesis:** There is no cointegration

**Alternative Hypothesis:** There is cointegration

As the figure above shows, at maximum rank zero the trace statistic (5.8121) does not exceed the critical value (15.41). Therefore, the null hypothesis cannot be rejected, which suggests that the time series variables **GDP** and __PFC__ are not cointegrated. Similarly, the max statistic of 3.5250 does not exceed its critical value of 14.07, so the null hypothesis again cannot be rejected. Thus, as per maximum rank 0, **GDP** and __PFC__ are not cointegrated. Following these results, apply an unrestricted *VAR* to the time series **GDP** and __PFC__.
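The decision rule at maximum rank zero can be written out as a small worked check. The sketch below uses only the four numbers quoted above from the article's output. The scalar names are hypothetical and chosen for readability.

```stata
* Values copied from the article's vecrank output at maximum rank 0
scalar trace_stat = 5.8121
scalar trace_cv   = 15.41
scalar max_stat   = 3.5250
scalar max_cv     = 14.07

* The null of no cointegration is rejected only if the statistic
* exceeds its critical value (1 = reject, 0 = do not reject)
display "Trace test rejects rank 0: " (trace_stat > trace_cv)
display "Max test rejects rank 0:   " (max_stat > max_cv)
```

Both comparisons evaluate to 0, which matches the conclusion that the null hypothesis of no cointegration cannot be rejected.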

## *VAR* model

To start with the unrestricted *VAR* model in STATA, follow:

- Click on ‘Statistics’
- Select ‘Multivariate time series’
- Select ‘*VAR*’

The figure below will appear.

In the ‘Dependent variables’ option, select the two time series variables **GDP** and __PFC__. Next, select the number of lags (figure below). The number of lags for this case is the same as in the previous analysis, i.e. 3.
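The article does not print the command for this step, so the following is a sketch of what the equivalent command-line call could look like, assuming the variables are named `gdp` and `pfc` as in the earlier commands. The simulated random walks, the time variable `t` and the seed are placeholders for illustration, so the estimates will differ from the article's results.

```stata
* Placeholder data: two simulated random walks standing in for gdp and pfc
clear
set seed 12345
set obs 120
generate t = _n
tsset t
generate gdp = sum(rnormal())
generate pfc = sum(rnormal())

* Unrestricted VAR in levels with lags 1 to 3 of both variables
var gdp pfc, lags(1/3)
```

The `lags(1/3)` option is written out because `var` uses lags 1 and 2 by default.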

The figure below shows the results of the *VAR* estimation. The results are in two parts. The first part treats **GDP** as the dependent variable and the second treats __PFC__ as the dependent variable. Since the aim is to verify the effect of __PFC__ on **GDP**, the first part is more relevant. As per the results:

- Only lag 1 of __PFC__ has a significant effect on **GDP**.
- The R-squared for the **GDP** model is 99%, indicating a good fit.
- The log-likelihood value of 1064 is high, further indicating consistency.
- The constant in the **GDP** model is also significant, with a p-value of 0.

Therefore, the overall result presented in this article reveals two outcomes:

- The Johansen cointegration test for **GDP** and __PFC__ indicates that there is no cointegration between the two time series.
- The *VAR* model indicates that __PFC__ at lag one has a significant effect on **GDP**.

The next article shows the analysis including an additional time series, **GFC**. The aim is to see how the results of the Johansen cointegration test change when adding **GFC** as a variable in the *VAR* model along with **GDP** and __PFC__.


### Divya Dhuria


Greetings, I’d first like to thank the authors for making this kind of tutorial for Stata, with adequate interpretations of the outputs. Second… I’ve got a question.

Why are you using VAR models when the variables in levels are non-stationary? Just because the Johansen test indicated that there are no cointegrating equations?

I’ve read a little about VAR and non-stationary variables, and it seems that it’s not good to use VAR on such variables because of spurious outcomes.

Thank you.

John R

Hi,

I am wondering how I can perform Johansen fisher cointegration test for the panel data including the lag selection. Thank you.

best,

cba

This was very helpful to perform the analysis in my Thesis. Thanks.