# Testing the accuracy of the stock market forecasting model

The stock market is uncertain, offering opportunities for both gains and losses. This risk creates the need for a forecasting model that describes how stocks move and where they are likely to stand in the future. Information on future stock movements can help investors make profitable decisions (Patil & Bagodi, 2021).

The previous article, considering the stock market’s uncertain nature, focused on building a forecasting model to understand stock movements based on a time series dataset from 2000 to 2020. Using ARIMA, the most suitable model was identified, and predictions were made from it for the next 5 years, i.e. from 1^{st} April 2020 to 31^{st} March 2025.

However, before the stock market forecasting model is used for decision-making, its accuracy needs to be established (Guo, 2022). This article compares the model's predictions for the first 3 years, i.e. from 1^{st} April 2020 to 31^{st} March 2023, with the actual Bombay Stock Exchange data for the same period.

The examination of model accuracy is based on 2 rules (Kompella & Chilukuri, 2019):

- The values of the **mean square error (MSE)** and *root mean square error (RMSE)* should be close to 0, as these indicators record the deviation of the actual values from the predicted values.
- The paired t-test null hypothesis of no difference between the actual and predicted values should not be rejected.

The hypothesis considered for the testing is:

H_{0}: There is no difference in the average return for the predicted and observed values.

H_{A}: There is a significant difference between the predicted and observed average return values.

The stated hypothesis will be tested at a 5% level of significance (Artman & Artman, 2014; Mindrila & Balentyne, 2013). The model that fulfils both conditions would be accepted as the accurate model and could be used further by the investors for predicting stock market performances.
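The two acceptance rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the results in this article: the function name `evaluate_forecast` and the sample returns are hypothetical, and the p-value uses a normal approximation to the t distribution, which is only reasonable for long series such as several years of daily data.

```python
import math
from statistics import NormalDist, mean, stdev


def evaluate_forecast(actual, predicted, alpha=0.05):
    """Compute MSE, RMSE and a paired t-test summary for two equal-length series."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two series of equal length with at least 2 points")

    # Paired differences between observed and forecast values.
    diffs = [a - p for a, p in zip(actual, predicted)]
    n = len(diffs)

    # Rule 1: error measures that should be close to 0.
    mse = sum(d * d for d in diffs) / n
    rmse = math.sqrt(mse)

    # Rule 2: paired t statistic = mean difference / its standard error.
    std_error = stdev(diffs) / math.sqrt(n)
    if std_error == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t_stat = mean(diffs) / std_error

    # Assumption: two-sided p-value from the standard normal distribution,
    # used here as a large-sample stand-in for the t distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))

    return {
        "mse": mse,
        "rmse": rmse,
        "t_stat": t_stat,
        "p_value": p_value,
        "reject_h0": p_value < alpha,
    }


# Hypothetical daily average returns, for illustration only.
actual = [0.010, -0.020, 0.005, -0.015, 0.000, 0.012]
predicted = [0.008, -0.015, 0.001, -0.010, -0.004, 0.010]
result = evaluate_forecast(actual, predicted)
print(result)
```

A model would pass both rules when `mse` and `rmse` are close to 0 and `reject_h0` is `False` at the 5% level.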

## Accuracy check for the growth stocks

The average return model for growth stocks is an ARIMA (3,2,1) model. It produced daily average return forecasts from 1^{st} April 2020 to 31^{st} March 2025. As actual data is currently available only until 2023, data from 1^{st} April 2020 to 31^{st} March 2023 is used to compare 3 years of growth stock performance.

Two indicators are selected for this: **MSE** and *RMSE*. **MSE** is the average of the squared differences between the actual and predicted values for the given dataset, and it measures the variance of the residuals (Varshini et al., 2021). *RMSE* is the square root of **MSE** and expresses the error as the standard deviation of the residuals (Hodson, 2022). The results of both measures are shown below.
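Written out, the two measures take the following standard form. The symbols are introduced here for illustration and do not appear in the original analysis: $y_t$ is the actual average return on day $t$, $\hat{y}_t$ is the predicted average return for the same day, and $n$ is the number of days compared.

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
```

Because *RMSE* is expressed in the same unit as the returns themselves, it is usually the easier of the two to interpret.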

MSE for an average return | RMSE |
---|---|
0.0001 | 0.0085 |

The above table shows that the **MSE** for the daily average return data is 0.0001 and the *RMSE* is 0.0085. Both indicators are close to 0, which shows that there is only a very slight difference between the predicted and the actual observed values of the average return.

These accuracy indicator values verify that the model is effective and could be used for forecasting the average returns for the growth stocks. The results are further validated by hypothesis testing.

Variable | Mean | Std Error | Std dev | t-stat | p-value |
---|---|---|---|---|---|
Actual | -0.0021 | 0.0003 | 0.0092 | -0.3700 | 0.7114 |
Predicted | -0.0020 | 0.0002 | 0.0044 | | |

The above table shows that the mean of the actual average return of growth stocks is -0.0021, while that of the predicted return is -0.0020, a difference of -0.0001. The standard error values are also close, and the deviation recorded in the predicted values is smaller than the variation in the actual values.

The difference between the actual and predicted values is examined statistically using a paired t-test. The p-value for the test is 0.7114, which is more than the significance level of 0.05. The null hypothesis of no difference between the actual and predicted average return values of the growth stocks is therefore not rejected. Hence, there is no statistically significant difference between the predicted and actual values, and the model is accurate in predicting the average return for growth stocks.

## Accuracy of the stock market forecasting model for income stocks

The income stock model is an ARIMA (3,0,3) model, from which the forecasts for the average return values have been made. To check the accuracy of the model, a comparison with actual data is needed, and as this data is available only until 2023, the comparison runs only until 31^{st} March 2023. For the comparison, **MSE** and *RMSE* are computed, and the results are presented below.

MSE for an average return | RMSE |
---|---|
0.0001 | 0.0099 |

Table 3 above shows that both indicators are close to 0: the **MSE** value is 0.0001 and the *RMSE* is 0.0099. Both measures are within the range required of an accurate model. Thus, the model developed for forecasting the average return of income stocks is appropriate and could be used for predictions.

Though the accuracy measures provide an overview of the data, statistical testing is also essential to know whether any difference exists between the values. The results of the paired t-test for the comparison are shown below.

Variable | Mean | Std Error | Std dev | t-stat | p-value |
---|---|---|---|---|---|
Actual | -0.0023 | 0.0004 | 0.0118 | -1.5427 | 0.1231 |
Predicted | -0.0017 | 0.0001 | 0.0018 | | |

Table 4 above reveals that the mean of the actual values is -0.0023, while that of the predicted values is -0.0017. The difference between the two is -0.0005, which is small.

The standard errors differ by 0.0003. The standard deviations also differ: the actual data shows more variation, while the predicted values show less.

To test the difference, a paired t-test is performed, and the results reveal that the p-value is 0.1231, which is greater than 0.05. The null hypothesis of no difference between the actual and predicted average returns for the income stocks is not rejected. This indicates that there is no statistically significant difference between the values, so the predicted values are close to the actual ones. Hence, the stock market forecasting model for income stocks’ average return is accurate.

## Testing the accuracy of the value stock forecasting model

The value stock forecasting model is an ARIMA (5,0,12) model. The forecasted average return values depict the 5-year performance of the selected value stocks. However, as the comparison can only be made for the data available until 2023, a 3-year assessment has been done using the **MSE** and *RMSE* measures. The results are shown below.

MSE for an average return | RMSE |
---|---|
0.0001 | 0.0117 |

Each of the mentioned indicator values from the above table is close to 0. This shows that the required criteria of having less of a difference between the actual and predicted values are fulfilled.
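All three accuracy tables in this article report the same **MSE** of 0.0001 alongside different *RMSE* values. Since *RMSE* is the square root of **MSE**, this is most likely an effect of rounding to four decimal places. The sketch below checks that reading using only the figures reported in the tables; the helper name `consistent` is a hypothetical choice for illustration.

```python
import math

# (MSE, RMSE) pairs exactly as reported in the three accuracy tables.
reported = {
    "growth": (0.0001, 0.0085),
    "income": (0.0001, 0.0099),
    "value": (0.0001, 0.0117),
}


def consistent(mse, rmse, decimals=4):
    """Return True if RMSE squared rounds to the reported MSE."""
    return math.isclose(round(rmse ** 2, decimals), mse, abs_tol=1e-12)


for name, (mse, rmse) in reported.items():
    # Print the unrounded square of RMSE next to the consistency check.
    print(f"{name}: RMSE^2 = {rmse ** 2:.8f}, matches reported MSE: {consistent(mse, rmse)}")
```

Read this way, *RMSE* is the more informative of the two reported figures, because the rounded **MSE** no longer distinguishes the three models.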

Values close to 0 indicate a small difference, and hence the model is accurate. These findings are further verified statistically using a paired t-test. The results of the test are given below.

Variable | Mean | Std Error | Std dev | t-stat | p-value |
---|---|---|---|---|---|
Actual | -0.0023 | 0.0005 | 0.0139 | 0.9411 | 0.3468 |
Predicted | -0.0027 | 0.0000 | 0.0016 | | |

The above table shows that the mean of the predicted data is -0.0027, while that of the actual data is -0.0023, a difference of -0.0004. These close values indicate that there is little variation between the predicted and observed values.

However, a standard error of 0.0005 is observed in the actual data, while the standard error for the predicted values is 0.0000 at the reported precision. The daily average return values have varied more in the actual data than in the predicted values.

The mean, standard error, and deviation present an overview of the data. But for statistical comparison of both data, the t-test is performed. The p-value of 0.3468 is more than the required significance level of 0.05.

This shows that the null hypothesis of no difference between the predicted and actual average returns for value stocks is not rejected. Hence, the model is effective in measuring the average return for value stocks, and predictions could be made for the future to support effective investment decisions.

## The stock market forecasting model developed using ARIMA is accurate

In the stock market, a well-tuned forecasting model is essential to make accurate forecasts and take profitable investment decisions. This article assessed the models by comparing the actual and predicted values using accuracy measures and statistical hypothesis testing. As the values of **MSE** and *RMSE* were close to 0 for all stocks, the models are accurate.

Furthermore, in support of this conclusion, the null hypothesis was not rejected for any of the stock categories. This shows that there is no statistically significant difference between the actual and predicted values. Hence, investors can consider the defined models for predictions.
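The three test decisions can be cross-checked from the reported t-statistics alone. The sketch below recomputes a two-sided p-value for each and applies the 5% rule. It rests on one assumption: the sample sizes are not stated in this article, so the standard normal distribution is used as a large-sample approximation of the t distribution, and small deviations from the reported p-values are expected. The helper name `two_sided_p` is hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level used throughout the article

# (t-statistic, p-value) pairs as reported in the paired t-test tables.
reported_tests = {
    "growth": (-0.3700, 0.7114),
    "income": (-1.5427, 0.1231),
    "value": (0.9411, 0.3468),
}


def two_sided_p(t_stat):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


for name, (t_stat, p_reported) in reported_tests.items():
    p_approx = two_sided_p(t_stat)
    # H0 is rejected only when the p-value falls below the significance level.
    decision = "reject H0" if p_reported < ALPHA else "do not reject H0"
    print(f"{name}: approx p = {p_approx:.4f}, reported p = {p_reported:.4f}, {decision}")
```

Under this approximation, the recomputed p-values stay close to the reported ones, and none of them falls below 0.05.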

#### References

- Artman, M., & Artman, M. (2014). An empirical study for testing the stock market efficiency and identifying abnormal return opportunities.
- Guo, Y. (2022). Stock Price Prediction Using Machine Learning. In Södertörn University | School of Social ScienceMaster.
- Kompella, S., & Chilukuri, K. C. (2019). Stock market prediction using Machine learning methods. International Journal of Computer Engineering & Technology (IJCET), 10(3), 20–30.
- Mindrila, D., & Balentyne, P. (2013). Tests of Significance. In The Basic Practice of Statistics.
- Patil, S., & Bagodi, V. (2021). A study of factors affecting investment decisions in India: The KANO way. Asia Pacific Management Review, 26(4), 197–214. https://doi.org/10.1016/j.apmrv.2021.02.004
