Predicting value stocks trend using ARIMA

By Riya Jain & Priya Chetty on May 11, 2021

Stock market investors need to forecast the movement of stocks before making an investment decision, since a reliable forecast helps in building an optimum portfolio with better returns. Forecasting approaches include fundamental, technical, and analytics-based techniques (Almasarweh & Wadi, 2018; Dr C. Viswanatha Reddy, n.d.). Among these, the Autoregressive Integrated Moving Average (ARIMA) model is a linear technical model of prediction. This article identifies the forecasting model for value stocks, using 303 stocks listed on the Bombay Stock Exchange for the period 2000 to 2020.

The previous article assessed the nature of the dataset for income stocks using the ARIMA model. This article carries out that assessment for value stocks. For the average closing price and average return data, the stationarity of the dataset is first assessed with the ADF (Augmented Dickey-Fuller) test. The presence of serial correlation and partial serial correlation is then examined with the correlogram and partial correlogram. In this way, the basic assumptions of the ARIMA model are verified for the value stocks dataset at a 5% or 10% level of significance.

Stationarity test of value stocks

A time-series dataset is said to be stationary if it is stable over a period of time, i.e. its behaviour is unaffected by time. Before proceeding to the ARIMA model, it is important to ensure that the dataset is stationary, because a non-stationary dataset makes the results both spurious and unreliable. Financial markets witness continuous changes, and the prediction of future movement depends on the historical movement of stocks. Thus, for building a forecasting model, it is essential to examine the nature of the dataset.

Average closing price

An investor assesses the average closing price of a stock to predict its closing price in the future. Before making these predictions, however, it is essential to check whether the dataset is stable. The results of the stationarity analysis are shown in the table below.

Variable	Test statistic	5% Critical value	10% Critical Value	p-value
AverageClose	-1.47	-2.86	-2.57	0.55
D.AverageClose	-51.99	-2.86	-2.57	0.00
D.DAverageClose	-75.92	-2.86	-2.57	0.00

Table 1: Augmented Dickey-Fuller test for the average closing price

The p-value for the average closing price is 0.55, which is higher than the significance level of 0.05 or 0.10. The absolute test statistic (1.47) is also less than the absolute critical values (2.86 and 2.57). Hence the null hypothesis of a unit root in the dataset is not rejected. These results are supported by the trend shown in Figure 1, where the upward and downward movement of the closing price is influenced by time. Thus, the average closing price dataset is not stationary at level, and stationarity is next tested at the 1st-order difference.
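The differencing step mentioned above can be sketched in a few lines of Python. This is only an illustration: the `difference` helper and the price values are hypothetical and are not taken from the article's dataset.

```python
def difference(series, order=1):
    """Return the series differenced `order` times (each pass computes x[t] - x[t-1])."""
    result = list(series)
    for _ in range(order):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result

# Hypothetical closing prices with an upward drift (illustrative values only).
prices = [100, 102, 101, 105, 110, 112]

first_diff = difference(prices, order=1)   # analogous to D.AverageClose
second_diff = difference(prices, order=2)  # analogous to the 2nd-order difference

print(first_diff)
print(second_diff)
```

Each pass of differencing shortens the series by one observation and removes one level of trend, which is why a series that is not stationary at level can become stationary after differencing.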

Figure 1: Stationarity test for the average closing price of value stocks

As shown in Table 1 above, the p-value of 0.00 at the 1st-order difference level of the average closing price (D.AverageClose) is less than the significance level of 0.05 or 0.10. The absolute test statistic (51.99) is also higher than the absolute critical values. Hence the null hypothesis of a unit root in the dataset is rejected. Figure 2 below supports the statistical analysis: the movement of the 1st-order differenced average closing price is not influenced by time. Thus, the dataset is stationary at the 1st-order difference.

Figure 2: Stationarity test for the average closing price of value stocks at the 1st-order difference level

The assessment of the average closing price at the 2nd-order difference level shows a p-value of 0.00, which is less than 0.05 or 0.10. The absolute test statistic is greater than the absolute critical value, i.e. 75.92 > 2.86 or 2.57. Thus, the null hypothesis of a unit root in the dataset is rejected. Trend analysis also supports the ADF results: the movement of the average closing price at the 2nd-order difference level is not affected by time. Hence, the average closing price is stationary at the 2nd-order difference level.

Figure 3: Stationarity test for the average closing price of value stocks at the 2nd-order difference level

The stationarity test of the average closing price for value stocks shows that the dataset is stationary at the 1st- and 2nd-order difference levels.
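The decision rule applied to Table 1 can be written out explicitly. The sketch below only re-uses the statistics reported in Table 1 and does not run the ADF test itself; the function name `is_stationary` and the choice of the 5% level are assumptions made for illustration.

```python
# Values copied from Table 1: (test statistic, 5% critical value, p-value).
adf_results = {
    "AverageClose": (-1.47, -2.86, 0.55),
    "D.AverageClose": (-51.99, -2.86, 0.00),
    "D.DAverageClose": (-75.92, -2.86, 0.00),
}

def is_stationary(test_stat, critical_value, p_value, alpha=0.05):
    """Reject the unit-root null when the statistic lies below the critical value and p < alpha."""
    return test_stat < critical_value and p_value < alpha

decisions = {name: is_stationary(*row) for name, row in adf_results.items()}

for name, stationary in decisions.items():
    print(f"{name}: {'stationary' if stationary else 'not stationary'}")
```

Because ADF statistics are negative, "absolute test statistic greater than absolute critical value" in the text corresponds to the statistic being more negative than the critical value in the code.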

Average return

Average return measures the earning capacity of a particular stock. Investors sometimes focus on a stock's capacity to generate returns rather than on its price. It is therefore necessary to know whether the dataset is stable enough to make predictions. The analysis results for the average return dataset are shown below.

Variable	Test statistic	5% Critical value	10% Critical Value
AverageReturn	-63.92	-2.86	-2.57
D.AverageReturn	-89.98	-2.86	-2.57
D.DAverageReturn	-89.64	-2.86	-2.57

Table 2: Augmented Dickey-Fuller test for the average return

For the average return at level, the p-value is 0.00, which is less than 0.05 or 0.10. The above table also shows that the absolute test statistic (63.92) is greater than the absolute critical values (2.86 and 2.57). Thus, the null hypothesis of a unit root in the dataset is rejected. Figure 4 below supports these results: the movement of the average return is not influenced by variation in time. Hence, the dataset is stationary at level.

Figure 4: Stationarity test for the average return of value stocks

The p-value of the average return at the 1st-order difference level is 0.00, which is less than the significance level of 0.05 or 0.10. A comparison of the absolute test statistic in Table 2 with the absolute critical values shows that 89.98 > 2.86 or 2.57. Thus, the null hypothesis of a unit root in the dataset is rejected. Trend assessment of the 1st-order differenced average return also shows that the movement in the return value is not affected by time. Hence, the average return is stationary at the 1st-order difference level.

Figure 5: Stationarity test for the average return of value stocks at the 1st-order difference level

The analysis in Table 2 for the 2nd-order difference level also corresponds to a p-value of 0.00, which is less than 0.05 or 0.10. The absolute test statistic (89.64) is also higher than the absolute critical values, so the null hypothesis that a unit root is present is rejected. Graphical analysis of the data confirms the result, showing no major influence of time on the movement of the average return at the 2nd-order difference level. Thus, the dataset is stationary at the 2nd-order difference level.

Figure 6: Stationarity test for the average return of value stocks at the 2nd-order difference level

Hence, the examination of the average return shows that the dataset is stationary at level as well as at the 1st- and 2nd-order difference levels.

Correlogram test for value stocks

Randomness in the dataset often leads to errors over time. In financial markets, future performance is strongly influenced by past performance, so errors in past values risk carrying over into future ones. To account for this, the serial correlation of the dataset is assessed using a correlogram test. The sub-sections below present the analysis for the average closing price and average return datasets.

Average closing price

After establishing stationarity, it is also essential to check for serial correlation in the closing price dataset. This helps determine whether the current value of the average closing price depends on its historical values. The serial correlation assessment for the variable is shown below.

The above sections show that the average closing price becomes stationary at the 1st- and 2nd-order difference levels, so the presence of serial correlation is assessed at these levels. In Figure 7, the shaded region represents the acceptance region while the straight lines represent the autocorrelation value at different lags. The autocorrelation values at lags 5 and 12 fall outside the acceptance region, so these lags are considered to have serial correlation. The forecasting model will therefore include a moving average term of 5 or 12 at the 1st-order difference level to remove this impact.

Figure 8: Correlogram test for the average closing price of value stocks at the 2nd-order difference level

For the 2nd-order difference of the average closing price, the autocorrelation value is outside the acceptance region only at lag 1. Thus, the moving average order considered for the value stocks forecasting model at the 2nd-order difference level would be 1.
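The reading of a correlogram described above can be reproduced numerically. The following is a minimal sketch, assuming a simple white-noise band of ±1.96/√n as the acceptance region; the shaded region in the article's figures may be computed differently. The simulated series and the function names are hypothetical and serve only to illustrate the mechanics.

```python
import math
import random

def autocorrelation(series, max_lag):
    """Sample autocorrelation at lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance = sum(d * d for d in deviations)
    return [
        sum(deviations[t] * deviations[t - lag] for t in range(lag, n)) / variance
        for lag in range(1, max_lag + 1)
    ]

def lags_outside_band(series, max_lag, z=1.96):
    """Lags whose autocorrelation falls outside the assumed +/- z/sqrt(n) band."""
    band = z / math.sqrt(len(series))
    acf = autocorrelation(series, max_lag)
    return [lag for lag, r in enumerate(acf, start=1) if abs(r) > band]

# Hypothetical series with moving-average dependence: x[t] = e[t] + 0.8 * e[t-1].
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(501)]
series = [noise[t] + 0.8 * noise[t - 1] for t in range(1, 501)]

acf = autocorrelation(series, max_lag=12)
flagged = lags_outside_band(series, max_lag=12)
print(flagged)  # lag 1 is expected to appear; other lags may be flagged by chance
```

Lags flagged in this way are the candidates for the moving average order, in the same way that lags 5 and 12 (1st-order difference) and lag 1 (2nd-order difference) are read off the correlograms above.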

Average return

An effective prediction of the average return can be made by linking it with its historical values. Thus, to assess the presence of serial correlation for the variable, the correlogram test results are shown in the figure below.

The average return dataset is stationary at level and at the 1st- and 2nd-order difference levels. Therefore, the presence of serial correlation is assessed at each of these levels. Figure 9 shows that lags 3 and 12 have autocorrelation values outside the acceptance region. Thus, to build the forecasting model and reduce the influence of serial correlation, the moving average values considered in the model at level are 3 and 12.

Figure 10: Correlogram test for the average return at the 1st-order difference level

In the above figure, the autocorrelation values for lags 1, 4, 35, and 36 are outside the acceptance region. Thus, the moving average values at the 1st-order difference level in the forecasting model would be 1, 4, 35, and 36.

Figure 11: Correlogram test for the average return at the 2nd-order difference level

The above figure shows that the autocorrelation value for the average return is outside the acceptance region only at lag 1. Thus, the moving average value at the 2nd-order difference level in the average return forecasting model is 1.

Partial correlogram test for value stocks

A partial correlogram examines the presence of partial autocorrelation in a time series. Since the movement of stock values in financial markets depends on their historical data, it is essential to assess the existence of partial autocorrelation. The sub-sections below present the analysis for the average closing price and average return.

Average closing price

Similar to the serial correlation, partial autocorrelation presence also needs to be examined to determine the linkage with historical values. The analysis result for the average closing price is shown below.

Since the average closing price is stationary at the 1st- and 2nd-order difference levels, the presence of partial autocorrelation is assessed at these levels. At the 1st-order difference level, the partial autocorrelation value at lag 2 is outside the acceptance region, so the forecasting model of the average closing price would include an autoregressive term of 2.

Figure 13: Partial correlogram test for the average closing price of value stocks at the 2nd-order difference level

The above figure shows that the partial autocorrelation values at lags 1 to 3 are outside the acceptance region. Thus, to remove the influence of partial autocorrelation, autoregressive terms of 1 to 3 would be considered in the forecasting model at the 2nd-order difference level.
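Partial autocorrelations can be derived from ordinary autocorrelations. The sketch below uses the Durbin-Levinson recursion, which is one standard way of doing this and is not necessarily the method behind the article's figures. The function name `pacf_from_acf` and the example autocorrelations are assumptions for illustration.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations at lags 1..K from autocorrelations rho[0..K] (rho[0] == 1),
    computed with the Durbin-Levinson recursion."""
    max_lag = len(rho) - 1
    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1} from the previous step
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi_curr = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi_curr = [
                phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
            ] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi_curr
    return pacf

# Theoretical autocorrelations of a hypothetical AR(1) process with coefficient 0.5.
rho_ar1 = [0.5 ** k for k in range(4)]  # [1.0, 0.5, 0.25, 0.125]
pacf_ar1 = pacf_from_acf(rho_ar1)
print(pacf_ar1)
```

For an autoregressive process the partial autocorrelation cuts off after the true order, which is the reason the lags outside the acceptance region in a partial correlogram are taken as candidate autoregressive orders.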

Average return

For the average return too, the presence of partial autocorrelation is assessed using partial correlogram testing. The results of the analysis are shown below.

For the average return, stationarity was established at level and at the 1st- and 2nd-order difference levels, so the partial correlogram for the average return is assessed at these levels. Figure 14 shows that lags 3 and 5 have partial autocorrelation values outside the acceptance region. Thus, autoregressive terms of 3 and 5 would be considered in the forecasting model at level to eliminate the influence of partial autocorrelation.

Figure 15: Partial correlogram test for the average return at the 1st-order difference level

The above figure shows that for all the lags, i.e. 1 to 4, the partial autocorrelation value is outside the acceptance region. Thus, the forecasting model of the average return at the 1st-order difference level would include autoregressive values of 1 to 4 to reduce the influence of partial autocorrelation.

Figure 16: Partial correlogram test for the average return at the 2nd-order difference level

The above figure shows that for the lag values of 1 to 3, the partial autocorrelation value is outside the acceptance region. Thus, the autoregressive value from 1 to 3 could reduce the influence of partial autocorrelation in the forecasting model of average return at the 2nd-order difference level.

ARIMA-based forecasting model for value stocks

The models listed in Table 3 are identified as candidate forecasting models for the average closing price and the average return.

Average Closing Price	Average Return
(2,1,5)	(3,0,3)
(2,1,12)	(5,0,3)
(1,2,1)	(3,0,12)
(2,2,1)	(5,0,12)
(3,2,1)	(1,1,1)
	(2,1,1)
	(3,1,1)
	(4,1,1)
	(1,1,4)
	(2,1,4)
	(3,1,4)
	(4,1,4)
	(1,1,35)
	(2,1,35)
	(3,1,35)
	(4,1,35)
	(1,1,36)
	(2,1,36)
	(3,1,36)
	(4,1,36)
	(1,2,1)
	(2,2,1)
	(3,2,1)

Table 3: Possible ARIMA models

The above table lists all the possible ARIMA models for the value stocks. These models are derived from the differencing level needed for stability, as identified by the ADF test, and from the lags identified by the correlogram and partial correlogram. This is necessary because effective prediction is possible only when the model is stable and free from serial correlation or partial correlation.
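The way Table 3 follows from the earlier sections can be sketched by combining the identified lags. Each triple is read here as (p, d, q): autoregressive order, differencing order, and moving average order. The dictionaries below restate the lags reported in the text; the function name `candidate_orders` is hypothetical.

```python
from itertools import product

# {differencing order d: (AR lags from the partial correlogram, MA lags from the correlogram)}
closing_price_lags = {
    1: ([2], [5, 12]),
    2: ([1, 2, 3], [1]),
}
average_return_lags = {
    0: ([3, 5], [3, 12]),
    1: ([1, 2, 3, 4], [1, 4, 35, 36]),
    2: ([1, 2, 3], [1]),
}

def candidate_orders(lags_by_d):
    """List every (p, d, q) combination, varying p fastest to follow the row order of Table 3."""
    orders = []
    for d, (ar_lags, ma_lags) in sorted(lags_by_d.items()):
        orders.extend((p, d, q) for q, p in product(ma_lags, ar_lags))
    return orders

closing_models = candidate_orders(closing_price_lags)
return_models = candidate_orders(average_return_lags)

print(closing_models)
print(len(return_models))
```

Each candidate would still need to be estimated and compared before one is chosen as the forecasting model; the enumeration only narrows down the orders worth fitting.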

References

Almasarweh, M., & Wadi, S. AL. (2018). ARIMA Model in Predicting Banking Stock Market Data. Modern Applied Science, 12(11), 309. https://doi.org/10.5539/mas.v12n11p309
Dr. C. Viswanatha Reddy. (n.d.). Predicting the Stock Market Index Using Stochastic Time Series. 2018.
Shah, D., Isah, H., & Zulkernine, F. (2019). Stock market analysis: A review and taxonomy of prediction techniques. International Journal of Financial Studies, 7(2). https://doi.org/10.3390/ijfs7020026

