# ARIMA modeling for time series analysis in STATA

In the previous article, all possibilities for performing **Autoregressive Integrated Moving Average (ARIMA)** modeling for the time series GDP were identified, as listed below.

| S. No | ARIMA |
|---|---|
| 1 | (1,1,1) |
| 2 | (1,1,2) |
| 3 | (1,1,3) |
| 4 | (1,1,4) |
| 5 | (1,1,5) |
| 6 | (1,1,6) |
| 7 | (1,2,1) |
| 8 | (4,2,1) |
| 9 | (9,2,1) |

Table 1: **ARIMA** models as per ACF and PACF graphs.

## Testing ARIMA models in STATA for time series analysis

The present article tests all these **ARIMA** models and identifies the appropriate one for the process of forecasting time series GDP. To start with testing **ARIMA** models in STATA:

- Click on ‘Statistics’ in the ribbon
- Click on ‘time-series’
- Select ‘**ARIMA** and ARMAX models’ (Figure 1 below)

### Test 1: ARIMA (1,1,1)

A dialogue box will appear as shown in the figure below. Fill in four options here to carry out **ARIMA** testing. First, select the time series variable to which the **ARIMA** model will be fitted. In the present case this is GDP, so select ‘gdp’ in the ‘Dependent variable’ option. Second, enter the **ARIMA** model specification estimated in the previous article. For the first **ARIMA** model, (1, 1, 1) (Table 1 above), select ‘1’ in ‘Autoregressive order (p)’, ‘1’ in ‘Integrated order (d)’, and ‘1’ in ‘Moving-average order (q)’.

After selecting the values for the **ARIMA** model specification, click on ‘OK’ to proceed to the results (Figure 3 below).

Now ARIMA (1, 1, 1) results will appear, as the figure below shows.
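The same estimation can also be run from the Command window instead of the dialogue box. The following is a minimal sketch, assuming the variable is named `gdp` as in the dialogue box; the time variable name `year` is a hypothetical placeholder, not taken from the dataset:

```stata
* Assumption: "year" stands in for the dataset's actual time variable
tsset year

* ARIMA(1,1,1): p = 1 (AR order), d = 1 (differencing order), q = 1 (MA order)
arima gdp, arima(1,1,1)
```

The `arima(p,d,q)` option fits an ARMA(p,q) model to the d-th difference of the dependent variable, which corresponds to the three orders entered in the dialogue box.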

**ARIMA** results can be analyzed through several components.

**Log-likelihood:** The log-likelihood of the **ARIMA** model should be as high as possible. In the present case its value is 554 in absolute terms (the reported figure is negative). A single log-likelihood has no benchmark of its own, so compare the values across the different **ARIMA** models and prefer the one with the highest log-likelihood; for negative values, that is the one closest to zero.
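The log-likelihood can also be read back after estimation rather than copied from the output window. This is a small sketch, assuming the series has already been declared with `tsset` and the variable is named `gdp`:

```stata
* Refit quietly so that only the requested statistics are printed
quietly arima gdp, arima(1,1,1)

* Log-likelihood stored by the last estimation command
display "Log-likelihood: " e(ll)

* Log-likelihood together with AIC and BIC in one table
estat ic
```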

**Coefficient of AR:** The coefficient of *AR* should be less than 1 and significant at the 5% level at least. Here, the coefficient of *AR* is significant at 5% (p-value 0.000) but is close to 1 (0.98967). This suggests that the differenced time series GDP may still be non-stationary. Therefore, compare different **ARIMA** models based on the coefficients of *AR* and *MA*, their values (whether close to zero), and their significance.

**AIC/BIC:** The values of ‘AIC’ and ‘BIC’ should be the lowest in comparison to other **ARIMA** models. AIC and BIC move in the opposite direction to the log-likelihood: a higher log-likelihood gives lower AIC and BIC. Because they also penalize additional parameters, compare different **ARIMA** models on AIC/BIC rather than on log-likelihood alone. The **ARIMA** model with the lowest AIC/BIC value is more appropriate for forecasting.
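The inverse relation between the information criteria and the log-likelihood follows from their standard definitions. In the sketch below, $\ln L$ is the log-likelihood, $k$ the number of estimated parameters, and $N$ the number of observations; these symbols are introduced here for illustration only:

$$
\mathrm{AIC} = -2\ln L + 2k, \qquad \mathrm{BIC} = -2\ln L + k\ln N
$$

With a log-likelihood of about $-554$, as in ARIMA (1,1,1), the $-2\ln L$ term alone is about 1108; the remaining terms depend on how many parameters the model estimates.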

To judge the applicability of ARIMA (1,1,1), estimate the next model, ARIMA (1,1,2), and compare the two.

### Test 2: ARIMA (1,1,2)

Again, fill in the **ARIMA** specification, this time as (1, 1, 2): ‘1’ for p, ‘1’ for d, and ‘2’ for q. Then click on ‘OK’ to proceed to the results (Figure 5).

The figure below shows the results for ARIMA (1,1,2).

**ARIMA** results presented in Figure 6 above can be analyzed through the same components:

**Log-likelihood:** The value of log-likelihood (ignoring the negative sign) is 552, which is similar to the previous **ARIMA** model (1, 1, 1).

**Coefficient of AR:** Not all coefficients pass the test: the *AR* and *MA* terms are significant except for one coefficient, which is insignificant at 5%. This suggests that the differenced time series GDP may still be non-stationary. Therefore, like the previous model, ARIMA (1,1,2) is not appropriate for forecasting.

**AIC/BIC:** The values of AIC and BIC are lower than in the previous model, but only by about 1 point. Therefore, no meaningful difference between ARIMA (1,1,1) and (1,1,2) can be seen. Thus both are inappropriate for forecasting time series GDP.
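The comparison between the two models can be reproduced with stored estimates. This is a sketch only, assuming the series is already declared with `tsset`; the labels `m111` and `m112` are hypothetical names chosen for illustration:

```stata
* Fit and store ARIMA(1,1,1)
quietly arima gdp, arima(1,1,1)
estimates store m111

* Fit and store ARIMA(1,1,2)
quietly arima gdp, arima(1,1,2)
estimates store m112

* Log-likelihood, AIC and BIC of both stored models in one table
estimates stats m111 m112
```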

Test the remaining **ARIMA** models with different specifications following the same procedures (Figures 1, 2, and 3). Then click on ‘OK’ for results.
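Repeating the dialogue box steps nine times is tedious, so the remaining models could instead be fitted in a loop. The sketch below assumes the variable `gdp` and a `tsset` dataset; the naming scheme for the stored results (`m111`, `m112`, and so on) is hypothetical:

```stata
* Candidate (p,d,q) specifications from Table 1
local specs "1,1,1 1,1,2 1,1,3 1,1,4 1,1,5 1,1,6 1,2,1 4,2,1 9,2,1"

foreach s of local specs {
    * Build a label such as m111 from "1,1,1"
    local name = "m" + subinstr("`s'", ",", "", .)

    * capture lets the loop continue if one model fails to estimate
    capture noisily arima gdp, arima(`s')
    if _rc == 0 {
        estimates store `name'
    }
}
```

Each model's output is still printed, so the coefficients and their significance can be inspected one by one as in Test 1 and Test 2.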

## Comparison of all ARIMA Models

This section presents a comparison of all **ARIMA** forecasting models mentioned in Table 1. Values of AR and MA coefficients, their significance, and values of AIC and BIC are evaluated.

As mentioned previously, the variables of interest in appropriate **ARIMA** modeling are the *AR* and *MA* components, AIC/BIC values, and significance level. Table 2 above has been organized as per these variables. The significance level of coefficients is indicated with the sign “*”.

To select the best **ARIMA** model, first identify those models whose *AR* and *MA* coefficients are significant as well as less than 1. In the table above, all the **ARIMA** models have *AR* or *MA* coefficients that are either close to 1 (indicating non-stationarity) or insignificant at 5%. However, in the case of the **ARIMA** model (9, 2, 1), the majority of *AR* and *MA* coefficients are less than 1 and significant at 5%. Therefore, in terms of coefficient selection, the ARIMA model (9, 2, 1) is appropriate.

Second, identify those **ARIMA** models with a minimum value of AIC or BIC. As per Table 2, the ARIMA model (1, 2, 1) and the ARIMA model (9, 2, 1) are the two with the lowest AIC/BIC values. However, in the ARIMA model (1, 2, 1), the coefficient of *MA* is almost 1 and is not significant at the 5% level. Therefore, this model cannot be used for estimating the time series GDP, and ARIMA (9, 2, 1) is the most appropriate one to estimate the GDP time series.
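A side-by-side table of the two finalists can be produced directly in Stata. This is a sketch under the same assumptions as before (variable `gdp`, dataset declared with `tsset`); the labels `m121` and `m921` are hypothetical:

```stata
* Refit and store the two models with the lowest AIC/BIC
quietly arima gdp, arima(1,2,1)
estimates store m121
quietly arima gdp, arima(9,2,1)
estimates store m921

* Coefficients with significance stars (* 5%, ** 1%, *** 0.1%),
* followed by log-likelihood, AIC, BIC and number of observations
estimates table m121 m921, star(.05 .01 .001) stats(ll aic bic N)
```

The stars play the same role as the “*” markers in Table 2, which makes it easy to see which coefficients are both significant and below 1.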

Thus, the ARIMA model (9, 2, 1) is the best of the candidate models: it captures the structural trends in the GDP data and can be used for forecasting GDP. The following article explains prediction and forecasting using **ARIMA** in STATA.
