ARIMA modeling for time series analysis in STATA

By Divya Dhuria & Priya Chetty on March 20, 2018

In the previous article, the candidate Autoregressive Integrated Moving Average (ARIMA) models for the GDP time series were identified from the ACF and PACF graphs. They are listed in Table 1.

S. No.    ARIMA (p,d,q)
1         (1,1,1)
2         (1,1,2)
3         (1,1,3)
4         (1,1,4)
5         (1,1,5)
6         (1,1,6)
7         (1,2,1)
8         (4,2,1)
9         (9,2,1)

 Table 1: ARIMA models as per ACF and PACF graphs.

Testing ARIMA models in STATA for time series analysis

The present article tests all these ARIMA models and identifies the appropriate one for the process of forecasting time series GDP. To start with testing ARIMA models in STATA:

  1. Click on ‘Statistics’ in the menu bar
  2. Click on ‘Time series’
  3. Select ‘ARIMA and ARMAX models’ (Figure 1 below)
Figure 1: Path for ARIMA modeling in STATA

Test 1: ARIMA (1,1,1)

A dialogue box will appear as shown in the figure below. Fill in four options here to carry out ARIMA testing. First, select the time series variable to be modeled. In the present case, the time series variable is GDP, so select ‘gdp’ in the ‘Dependent variable’ option. Next, enter the ARIMA model specification identified in the previous article. For the first ARIMA model, (1,1,1) (Table 1 above), select ‘1’ in ‘Autoregressive order (p)’, ‘1’ in ‘Integrated order (d)’, and ‘1’ in ‘Moving-average order (q)’.

Figure 2: Dialogue box for ARIMA modeling in STATA

After selecting the values for the ARIMA model specification, click on ‘OK’ to proceed to the results (Figure 3 below).

Figure 3: Dialogue box for ARIMA modeling in STATA

Now ARIMA (1, 1, 1) results will appear, as the figure below shows.

Figure 4: ARIMA (1,1,1) results for time series GDP
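
The menu steps above can also be written as a single command in the Command window or a do-file. The sketch below assumes that the dataset is already in memory and has been declared as a time series (for example with tsset in the earlier steps of this series), and that the variable is named gdp as in the dialogue box. Both points are assumptions about the setup, not something this article states.

```stata
* Assumes the data are loaded and already declared as a time series (tsset).
* Fit ARIMA(p,d,q) = (1,1,1) to gdp; arima() applies the differencing itself,
* so gdp is entered in levels.
arima gdp, arima(1,1,1)
```

If the setup matches, the output should have the same layout as the dialogue-based results: the log-likelihood in the header, followed by the AR and MA coefficients with their standard errors and p-values.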

ARIMA results can be analyzed through several components.

Log-likelihood: The log-likelihood of the ARIMA model should be as high as possible. In the present case it is about -554 (554 in absolute terms). The number has little meaning on its own; it is useful for comparison. Compare the log-likelihood values of the different ARIMA models and prefer the one with the highest value, which for negative values means the one closest to zero.

Coefficient of AR: The AR coefficient should be less than 1 in absolute value and significant at the 5% level at least. Here, the AR coefficient is significant at 5% (p-value 0.000) but is close to 1 (0.98967). This suggests that the differenced GDP series may still be non-stationary. Therefore, compare the different ARIMA models on the values of their AR and MA coefficients (whether they are close to 1) and on their significance.
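
Besides reading the coefficient itself, the "less than 1" condition can be checked through the roots of the AR and MA polynomials. The following is a minimal sketch, assuming the same tsset data and gdp variable as above and a Stata release that includes the estat aroots postestimation command for arima.

```stata
* Refit quietly so the block can be run on its own.
quietly arima gdp, arima(1,1,1)

* Lists (and by default plots) the inverse roots of the AR and MA polynomials.
* Roots inside the unit circle satisfy the stationarity/invertibility
* condition; a root with modulus close to 1 is a warning sign.
estat aroots
```

With an AR coefficient of about 0.99, the AR root would be expected to sit very near the unit circle, which is the same concern raised above.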

AIC/BIC: The values of ‘AIC’ and ‘BIC’ should be the lowest among the ARIMA models being compared. Both criteria are built from the negative of the log-likelihood plus a penalty for the number of estimated parameters, so they generally move in the opposite direction to the log-likelihood. Therefore, instead of the log-likelihood, compare the different ARIMA models on their AIC/BIC values. The ARIMA model with the lowest AIC/BIC value is more appropriate for forecasting.
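
The AIC and BIC values can be requested after estimation with estat ic. The sketch below assumes the same tsset data and gdp variable; the name m111 is an arbitrary label chosen here for illustration.

```stata
* Refit quietly so the block can be run on its own.
quietly arima gdp, arima(1,1,1)

* Reports the number of observations, log-likelihood, df, AIC and BIC.
* Stata computes AIC = -2*ll + 2*k and BIC = -2*ll + k*ln(N),
* where ll is the log-likelihood, k the number of estimated parameters
* and N the number of observations.
estat ic

* Keep the results under a name so they can be compared with other models.
estimates store m111
```

Because both criteria start from -2 times the log-likelihood, a higher log-likelihood lowers AIC and BIC, while every additional parameter raises them.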

Next, estimate the second ARIMA model, (1,1,2), in the same way so that the two models can be compared.

Test 2: ARIMA (1,1,2)

Again fill in the values in ARIMA specifications as per (1, 1, 2). After selecting the values for ARIMA model specifications, click on ‘OK’ to proceed with the results (Figure 5).

Figure 5: Dialogue box for ARIMA modeling in STATA

The figure below shows the results for ARIMA (1,1,2).

Figure 6: ARIMA (1,1,2) results for time series GDP

ARIMA results as presented in Figure 6 can be analyzed through several components, as below:

Log-likelihood: The log-likelihood is about -552 (552 in absolute terms), which is similar to that of the previous model, ARIMA (1,1,1).

Coefficients of AR and MA: The MA coefficients are significant, but the AR coefficient is insignificant at 5%. This suggests that the differenced GDP series may still be non-stationary. Therefore, like the previous model, ARIMA (1,1,2) is not appropriate for forecasting.

AIC/BIC: The AIC and BIC values are lower than those of the previous model, but only by about 1 point. There is therefore no meaningful difference between ARIMA (1,1,1) and (1,1,2) on this criterion, and both are treated as inappropriate for forecasting the GDP time series.
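
The two models can also be placed side by side instead of comparing two separate result windows. This is a sketch under the same assumptions as before (tsset data, variable gdp); m111 and m112 are illustrative names.

```stata
* Fit and store both candidate models.
quietly arima gdp, arima(1,1,1)
estimates store m111

quietly arima gdp, arima(1,1,2)
estimates store m112

* One row per model: observations, log-likelihood, df, AIC and BIC.
estimates stats m111 m112
```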

Test the remaining ARIMA models in Table 1 by following the same procedure (Figures 1, 2, and 3), changing the specification each time, and then click on ‘OK’ for the results.
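
Repeating the dialogue for seven more specifications is tedious, so the remaining models can be run in a loop. The sketch below assumes the same tsset data and gdp variable; the stored names spec1, spec2, and so on are made up for this example. Higher-order models such as (9,2,1) may take a while to estimate or may fail to converge, so the loop is written to continue if one model stops with an error.

```stata
* Remaining specifications from Table 1.
local names
local i = 0
foreach spec in "1,1,3" "1,1,4" "1,1,5" "1,1,6" "1,2,1" "4,2,1" "9,2,1" {
    local ++i
    display as text _newline "ARIMA(`spec')"
    * Show the output, but carry on if this model stops with an error.
    capture noisily arima gdp, arima(`spec')
    if _rc == 0 {
        estimates store spec`i'
        local names `names' spec`i'
    }
}

* Log-likelihood, AIC and BIC for every model that was estimated.
estimates stats `names'
```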

Comparison of all ARIMA Models

This section presents a comparison of all ARIMA forecasting models mentioned in Table 1. Values of AR and MA coefficients, their significance, and values of AIC and BIC are evaluated.

Table 2: Comparison of ARIMA models for time series GDP in STATA

As mentioned previously, the variables of interest in appropriate ARIMA modeling are AR and MA components, AIC/BIC values, and significance level. Table 2 above has been organized as per these variables. The significance level of coefficients is indicated with the sign “*”.
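
A comparison table of this kind can be produced in STATA itself with estimates table. The sketch below shows two of the nine models and assumes the same tsset data and gdp variable; m121 and m921 are illustrative names, and the layout will not match Table 2 exactly.

```stata
* Fit and store two of the candidate models.
quietly arima gdp, arima(1,2,1)
estimates store m121

quietly arima gdp, arima(9,2,1)
estimates store m921

* Coefficients side by side; by default the stars mark
* p<.05 (*), p<.01 (**) and p<.001 (***).
* stats() adds the log-likelihood, AIC, BIC and number of observations.
estimates table m121 m921, star stats(ll aic bic N)
```

More stored model names can be listed in the same command to extend the table to all nine specifications.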

To select the best ARIMA model, first identify the models whose AR and MA coefficients are significant and less than 1. In the table above, the other ARIMA models have AR or MA coefficients that are either close to 1 (indicating non-stationarity) or insignificant at 5%. However, in the case of ARIMA (9,2,1), the majority of the AR and MA coefficients are less than 1 and significant at 5%. Therefore, in terms of coefficients, ARIMA (9,2,1) is appropriate.


Second, identify the ARIMA models with the minimum AIC or BIC values. As per Table 2, ARIMA (1,2,1) and ARIMA (9,2,1) have the lowest AIC/BIC values. However, in ARIMA (1,2,1) the MA coefficient is almost 1 and is not significant at 5%. This model therefore cannot be used for estimating the GDP time series, which leaves ARIMA (9,2,1) as the most appropriate one.

Thus, ARIMA (9,2,1) is the most suitable of the models tested: it best captures the structure of the GDP data and can be used for forecasting GDP. The following article explains prediction and forecasting using ARIMA in STATA.
