Forecasting income stocks trend with the ARIMA model

By Riya Jain & Priya Chetty on May 7, 2021

Financial instruments carry different degrees of risk and return. Investors deal with uncertainty in the stock market in order to optimize their financial returns, which includes forecasting stock prices and stock market trends. This article uses the autoregressive integrated moving average (ARIMA) model to predict the movement of income stocks. Among the various technical and fundamental methods for forecasting stock market trends, ARIMA is considered one of the most efficient. It analyses historical data in order to predict future trends (Adebayo & Shangodoyin, 2014).

The previous article examined the application of ARIMA to growth stocks listed on the Bombay Stock Exchange for the period 2000 to 2020. The same three steps are applied here to income stocks. Firstly, stationarity is established using the Augmented Dickey-Fuller (ADF) test; secondly, the presence of partial autocorrelation is determined using the partial correlogram; and lastly, the presence of autocorrelation is examined using the correlogram. All these tests are performed at the 5% or 10% level of significance.

Stationarity test for income stocks

A time series is regarded as stationary if its statistical properties, such as its mean and variance, do not change over time. Establishing the stationarity of the dataset is essential for building an efficient forecasting model for the financial market. The sub-sections below examine the nature of the dataset for the average closing price and the average return of the income stocks over the selected period.

Average closing price

The stationarity of the average closing price is assessed in order to check whether the dataset is stable enough to support a relevant prediction. The results for the average closing price dataset are shown in the table below.

Variable           Test statistic   5% critical value   10% critical value   p-value
AverageClose       -1.29            -2.86               -2.57                0.63
D.AverageClose     -61.62           -2.86               -2.57                0.00
D.DAverageClose    -83.01           -2.86               -2.57                0.00
Table 1: Augmented Dickey-Fuller test for the average closing price

Table 1 shows that the p-value for the level test is 0.63, which is greater than both significance levels (0.05 and 0.10). The absolute value of the test statistic is also less than the absolute critical values, i.e. 1.29 < 2.86 and 1.29 < 2.57. Thus, the null hypothesis that the series has a unit root is not rejected. The trend analysis in Figure 1 below agrees with this result: the closing price shows a time-based trend, with sustained upward and downward movements. The average closing price is therefore non-stationary at level.
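The decision rule applied to Table 1 can be written out as a short Python sketch. The numbers are copied from Table 1; the names `TABLE_1`, `rejects_unit_root` and `adf_decisions` are illustrative and not part of the original analysis. The sketch only reproduces the comparison step, not the ADF regression itself.

```python
# Values copied from Table 1 (average closing price of income stocks).
TABLE_1 = {
    "AverageClose":    {"stat": -1.29,  "cv5": -2.86, "cv10": -2.57, "p": 0.63},
    "D.AverageClose":  {"stat": -61.62, "cv5": -2.86, "cv10": -2.57, "p": 0.00},
    "D.DAverageClose": {"stat": -83.01, "cv5": -2.86, "cv10": -2.57, "p": 0.00},
}


def rejects_unit_root(stat, critical_value):
    """The ADF test is left-tailed: the unit-root null is rejected when the
    test statistic is more negative than the critical value."""
    return stat < critical_value


def adf_decisions(table, level="cv5"):
    """Map each variable to True (unit root rejected, i.e. stationary)
    or False (unit root not rejected) at the chosen critical value."""
    return {
        name: rejects_unit_root(row["stat"], row[level])
        for name, row in table.items()
    }


decisions = adf_decisions(TABLE_1)
for name, stationary in decisions.items():
    print(name, "stationary" if stationary else "non-stationary")
```

Passing `level="cv10"` applies the 10% critical value instead; for the values in Table 1 the conclusions are the same at both levels.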

Figure 1: Stationarity test for the average closing price

As the series is non-stationary at level, the closing price values at the 1st order difference level are considered. Here the p-value is 0.00, which is less than both significance levels (0.05 and 0.10), and the absolute test statistic (61.62) is greater than the absolute critical values (2.86 and 2.57). Thus, the null hypothesis that the series has a unit root is rejected. Figure 2 below also shows the closing price at the 1st order difference: its movement shows no dependence on time. Thus, the closing price is stationary at the 1st order difference.

Figure 2: Stationarity test for average closing price at 1st order difference level

At the 2nd order difference level, the p-value is 0.00, which is less than 0.05 and 0.10, and the absolute test statistic is greater than the absolute critical values, i.e. 83.01 > 2.86 and 83.01 > 2.57. Thus, the null hypothesis that the series has a unit root is rejected. This result is supported by Figure 3 below, in which the movement of the closing price shows no relation to time. Hence, the closing price of income stocks is also stationary at the 2nd order difference.

Figure 3: Stationarity test for average closing price at 2nd order difference level

The average closing price of income stocks is thus non-stationary at level but stationary at the 1st order and 2nd order difference levels.
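For readers unfamiliar with differencing, the following minimal Python sketch shows what the 1st and 2nd order differences of a series are. The price values are made up for illustration and are not taken from the article's dataset; `difference` is a hypothetical helper name.

```python
def difference(series, order=1):
    """Return the order-th difference of a sequence of numbers.

    The 1st order difference is x[t] - x[t-1]; the 2nd order difference
    is the 1st order difference of the 1st order difference.
    """
    values = list(series)
    for _ in range(order):
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


# Made-up closing prices with an accelerating upward trend.
prices = [100.0, 102.0, 105.0, 109.0, 114.0]

first = difference(prices, order=1)   # changes between consecutive prices
second = difference(prices, order=2)  # changes of those changes

print(first)
print(second)
```

Each round of differencing shortens the series by one observation, which is why a differenced series starts one period later than the original.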

Average return

The average return is the earnings that an investor derives from a particular investment. In order to make relevant predictions about the average return, the stability of the dataset is first checked with a stationarity test. The results of the analysis are shown in the table below.

Variable            Test statistic   5% critical value   10% critical value   p-value
AverageReturn       -63.56           -2.86               -2.57                0.00
D.AverageReturn     -90.84           -2.86               -2.57                0.00
D.DaverageReturn    -94.43           -2.86               -2.57                0.00
Table 2: Augmented Dickey-Fuller test for the average return

Table 2 shows that the p-value for the average return at level is 0.00, which is less than both significance levels (0.05 and 0.10). The absolute test statistic is also greater than the absolute critical values, i.e. 63.56 > 2.86 and 63.56 > 2.57. Thus, the null hypothesis that the series has a unit root is rejected. Figure 4 below supports the statistical result: the movement of the average return shows no relation to time. Hence, the average return is stationary at level.

Figure 4: Stationarity test for the average return

At the 1st order difference level, the p-value is 0.00 (less than 0.05 and 0.10) and the absolute test statistic is 90.84 (greater than 2.86 and 2.57), so the null hypothesis is again rejected. Thus, there is no unit root in the average return at the 1st order difference for the selected period. This result is also verified by the trend-based assessment in Figure 5 below, in which the variation shows no link with time. Hence, the dataset is stationary.

Figure 5: Stationarity test for average return at 1st order difference level

Furthermore, at the 2nd order difference level the p-value is 0.00 (less than 0.05 and 0.10) and the absolute test statistic is 94.43 (greater than 2.86 and 2.57). Thus, the null hypothesis that the series has a unit root is rejected. Figure 6 below shows a similar result: the movement of the average return at the 2nd order difference is not related to time. Hence, the dataset is stationary.

Figure 6: Stationarity test for average return at 2nd order difference level

The average return of the income stocks is thus stationary at level, at the 1st order difference, and at the 2nd order difference.

Correlogram test for income stocks

The correlogram is used to examine the presence of autocorrelation in a time series. In the financial market, the current values of stocks are related to their past performance. The sub-sections below therefore present the correlogram test for the average closing price and the average return.

Average closing prices

Figure 7: Average closing price at 1st Diff level correlogram test

In Figure 7 above, the shaded region represents the acceptance region for the autocorrelation values, while the lines depict the autocorrelation at different lags. The autocorrelation falls outside the acceptance region at lags 1, 9, 18, 33 and 34. At the remaining lags it lies within the acceptance region, so those lags are not considered further. Thus, to forecast the performance of income stocks, the candidate moving average orders at the 1st order difference are 1, 9, 18, 33 and 34.
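The idea of reading candidate moving average lags off a correlogram can be sketched in Python as follows. This is a minimal illustration, not the procedure used to produce Figure 7: it assumes the simple white-noise band of ±1.96/√n as the acceptance region (the article's figures may use a different band, for example one based on Bartlett's formula), and it runs on a simulated series rather than on the income stock data. The function names are hypothetical.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag, z=1.96):
    """Lags whose autocorrelation lies outside the band +/- z / sqrt(n).

    Assumption: a white-noise band is used as the acceptance region.
    """
    band = z / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [lag for lag, r in enumerate(acf, start=1) if abs(r) > band]


# Simulated MA(1) series (illustrative data only): x[t] = e[t] + 0.8 * e[t-1].
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(301)]
simulated = [noise[t] + 0.8 * noise[t - 1] for t in range(1, 301)]

print(significant_lags(simulated, max_lag=10))
```

Lags returned by `significant_lags` play the role of the lags outside the shaded region in the figure. With a 5% band, an occasional higher lag can fall outside the region by chance, so flagged lags are candidates rather than confirmed orders.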

Figure 8: Average closing price at 2nd Diff level correlogram test

Figure 8 above shows that, at the 2nd order difference level, the autocorrelation is outside the acceptance region at lags 1, 8, 15, 19, 20 and 28. Thus, the candidate moving average orders for the average closing price at the 2nd order difference are 1, 8, 15, 19, 20 and 28.

Average return

Figure 9: Average return correlogram test

The serial autocorrelation of the average return is shown in Figure 9 above. The autocorrelation is within the acceptance region at all lags except 3, 6, 16 and 35. Thus, to build a forecasting model for the average return at level, the candidate moving average orders are 3, 6, 16 and 35.

Figure 10: Average return at 1st Diff level correlogram test

At the 1st order difference level, the autocorrelation is within the acceptance region at all lags except 1, 6 and 35. Thus, the candidate moving average orders for the forecasting model of the average return at the 1st order difference are 1, 6 and 35.

Figure 11: Average return at 2nd Diff level correlogram test

Figure 11 above shows the serial autocorrelation at the 2nd order difference level. As the autocorrelation at lags 1, 2 and 35 is outside the acceptance region, the candidate moving average orders for the forecasting model are 1, 2 and 35.

Partial correlogram test for income stocks

The partial correlogram examines the presence of partial autocorrelation in a time series, i.e. the correlation between the series and one of its lags after the effect of the intermediate lags has been removed. It is used here to identify the autoregressive order of the forecasting model. In order to predict the future movement of stocks, investors examine historical data. The sub-sections below examine the average closing price and the average return of income stocks.

Average closing price

Figure 12: Partial correlogram test at 1st Diff average closing price

The partial autocorrelation of the average closing price at the 1st order difference is shown in Figure 12 above. The partial autocorrelation at lags 1, 2 and 4 is outside the acceptance region. Thus, the candidate autoregressive orders for the forecasting model of the average closing price at the 1st order difference are 1, 2 and 4.
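A partial correlogram can be computed from the sample autocorrelations with the Durbin-Levinson recursion. The Python sketch below is one possible estimator and may differ numerically from the one used to draw Figure 12 (statistical packages often offer regression-based estimates as well). It is run on a simulated AR(1) series, not on the income stock data, and `sample_pacf` is a hypothetical name.

```python
import random


def sample_pacf(series, max_lag):
    """Sample partial autocorrelations at lags 1..max_lag, computed from
    the sample autocorrelations with the Durbin-Levinson recursion."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    # r[k] is the sample autocorrelation at lag k (r[0] == 1).
    r = [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
        else:
            num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
        # Update the lower-order coefficients, then append phi_kk.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulated AR(1) series (illustrative data only): x[t] = 0.7 * x[t-1] + e[t].
rng = random.Random(7)
simulated = [0.0]
for _ in range(300):
    simulated.append(0.7 * simulated[-1] + rng.gauss(0.0, 1.0))

for lag, value in enumerate(sample_pacf(simulated, 5), start=1):
    print(lag, round(value, 3))
```

For an autoregressive process of order p, the partial autocorrelation is expected to be close to zero beyond lag p, which is why lags outside the acceptance region are read as candidate autoregressive orders.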

Figure 13: Partial correlogram test at 2nd Diff average closing price

Figure 13 shows that, at the 2nd order difference level, the partial autocorrelation is outside the acceptance region at all three lags, i.e. 1, 2 and 3. Thus, the candidate autoregressive orders for the forecasting model of the average closing price at the 2nd order difference are 1, 2 and 3.

Average return

Figure 14: Partial correlogram test of average return

Figure 14 above shows that the partial autocorrelation of the average return is outside the acceptance region at lags 1 to 4. Thus, the forecasting model of the average return at level (zero order of differencing) could have an autoregressive order from 1 to 4.

Figure 15: Partial correlogram test at 1st Diff average return

Figure 15 above shows that the partial autocorrelation is outside the acceptance region at all lags from 1 to 4. Thus, the autoregressive order of the forecasting model at the 1st order difference level could be from 1 to 4.

Figure 16: Partial correlogram test at 2nd Diff average return

The partial autocorrelation of the average return at the 2nd order difference is shown in Figure 16 above. The values at lags 1 to 3 are outside the acceptance region. Thus, the forecasting model of the average return at the 2nd order difference could have an autoregressive order from 1 to 3.

ARIMA-based forecasting model for income stocks

The possible ARIMA models for the average closing price and the average return are shown in the table below.

Average closing price   Average return
(1,1,1)                 (1,0,3)
(2,1,1)                 (2,0,3)
(4,1,1)                 (3,0,3)
(1,1,9)                 (4,0,3)
(2,1,9)                 (1,0,6)
(4,1,9)                 (2,0,6)
(1,1,18)                (3,0,6)
(2,1,18)                (4,0,6)
(4,1,18)                (1,0,16)
(1,1,33)                (2,0,16)
(2,1,33)                (3,0,16)
(4,1,33)                (4,0,16)
(1,1,34)                (1,0,35)
(2,1,34)                (2,0,35)
(4,1,34)                (3,0,35)
(1,2,1)                 (4,0,35)
(2,2,1)                 (1,1,1)
(3,2,1)                 (2,1,1)
(1,2,8)                 (3,1,1)
(2,2,8)                 (4,1,1)
(3,2,8)                 (1,1,6)
(1,2,15)                (2,1,6)
(2,2,15)                (3,1,6)
(3,2,15)                (4,1,6)
(1,2,19)                (1,1,35)
(2,2,19)                (2,1,35)
(3,2,19)                (3,1,35)
(1,2,20)                (4,1,35)
(2,2,20)                (1,2,1)
(3,2,20)                (2,2,1)
(1,2,28)                (3,2,1)
(2,2,28)                (1,2,2)
(3,2,28)                (2,2,2)
                        (3,2,2)
                        (1,2,35)
                        (2,2,35)
                        (3,2,35)
Table 3: Possible ARIMA models

Table 3 lists all the possible ARIMA models for the income stocks. Each model is written as (p,d,q): the autoregressive order p identified from the partial correlogram, the difference order d at which the ADF test indicates stationarity, and the moving average order q identified from the correlogram. These steps are necessary because effective prediction is possible only when the series being modelled is stable and the serial correlation and partial correlation in it are accounted for.
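The candidate list in Table 3 can be reproduced by combining, for each difference order, every autoregressive lag from the partial correlogram with every moving average lag from the correlogram. The Python sketch below does this with the lag values reported in the sections above; the variable and function names are illustrative only.

```python
from itertools import product

# Each entry: (difference order d, AR lags from the partial correlogram,
# MA lags from the correlogram), as reported in the sections above.
CLOSING_PRICE = [
    (1, [1, 2, 4], [1, 9, 18, 33, 34]),
    (2, [1, 2, 3], [1, 8, 15, 19, 20, 28]),
]
AVERAGE_RETURN = [
    (0, [1, 2, 3, 4], [3, 6, 16, 35]),
    (1, [1, 2, 3, 4], [1, 6, 35]),
    (2, [1, 2, 3], [1, 2, 35]),
]


def candidate_orders(spec):
    """Return every (p, d, q) combination for the given specification.

    The MA lag varies slowest so that the output follows the row order
    used in Table 3.
    """
    orders = []
    for d, ar_lags, ma_lags in spec:
        orders.extend((p, d, q) for q, p in product(ma_lags, ar_lags))
    return orders


closing_models = candidate_orders(CLOSING_PRICE)
return_models = candidate_orders(AVERAGE_RETURN)

print(len(closing_models), closing_models[:4])
print(len(return_models), return_models[:4])
```

This yields 33 candidates for the average closing price and 37 for the average return, matching the two columns of Table 3.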

References

  • Adebayo, F. A., & Shangodoyin, K. (2014). Forecasting Stock Market Series with ARIMA Model. Journal of Statistical and Econometric Methods, 3(3), 65–77.
  • Dhyani, B., Kumar, M., Verma, P., & Jain, A. (2020). Stock Market Forecasting Technique using Arima Model. International Journal of Recent Technology and Engineering, 8(6), 2694–2697. https://doi.org/10.35940/ijrte.f8405.038620