Introduction to the Autoregressive Integrated Moving Average (ARIMA) model

By Riya Jain and Priya Chetty on September 29, 2020

A time series is a sequence of data recorded at regular time intervals, which may be hourly, daily, monthly, quarterly, or yearly. Analysis of time series data provides meaningful information about the characteristics of the dataset and helps in predicting future movements from past observations. With growing concern about uncertainty in markets, the need for forecasting trends has increased. One market in which an accurate model is essential for tracking future trends is the stock market. Because stock price data is voluminous and changes daily, investors seek the best way to forecast trends in order to maximize profit and minimize risk. Among the various fundamental and technical means of examining the stock market, the Autoregressive Integrated Moving Average (ARIMA) model is a statistical tool with a standard structure that, although simple, provides useful information about the stock market. This article therefore discusses the forecasting capability of the ARIMA model.

Concept of Autoregressive Integrated Moving Average (ARIMA) model

In 1970, Box and Jenkins introduced the Autoregressive Integrated Moving Average (ARIMA) approach as a methodology for identifying, estimating, and diagnosing models of time-series data. The ARIMA model can be used in different fields, such as the prediction of weather or sales, but financial forecasting is its most prominent field of application. The model represents the future value of a variable as a linear combination of its past values and past errors, and predictions based on ARIMA have been reported to outperform complex structural models (Adebiyi et al., 2014). The combined autoregressive moving average model can be represented as

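$$y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}$$

where $y_t$ is the actual value at time $t$, $\varepsilon_t$ is the random error at time $t$, $\phi_i$ and $\theta_j$ are the coefficients, and $p$ and $q$ are the orders of the autoregressive and moving average parts.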
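As a minimal sketch of the equation above, the following Python function generates an artificial series from chosen coefficients. The function name `simulate_arma`, the burn-in length, and the standard normal errors are assumptions made for illustration only.

```python
import random

def simulate_arma(phi0, phis, thetas, n, seed=0, burn_in=200):
    """Generate n values of
    y_t = phi0 + sum(phi_i * y_{t-i}) + eps_t - sum(theta_j * eps_{t-j})."""
    rng = random.Random(seed)
    p, q = len(phis), len(thetas)
    y = [0.0] * max(p, 1)    # start-up values, discarded with the burn-in
    eps = [0.0] * max(q, 1)  # start-up errors
    for _ in range(n + burn_in):
        e_t = rng.gauss(0.0, 1.0)  # assumed: standard normal errors
        ar_part = sum(phis[i] * y[-1 - i] for i in range(p))
        ma_part = sum(thetas[j] * eps[-1 - j] for j in range(q))
        y.append(phi0 + ar_part + e_t - ma_part)
        eps.append(e_t)
    return y[-n:]

series = simulate_arma(phi0=1.0, phis=[0.5], thetas=[0.3], n=5000)
print(len(series), sum(series) / len(series))
```

With one AR coefficient of 0.5 and a constant of 1, the long-run mean of the series is 1 / (1 - 0.5) = 2, so the printed average should be close to 2.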

ARIMA, the autoregressive integrated moving average model, consists of three main components:

Figure 1: Components of the ARIMA model

Autoregressive (AR or p)

The autoregressive part describes the relationship of a variable with its own lagged values. The autoregressive order p is the number of lagged values that are connected to the current value of the variable.

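$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t \quad \text{or} \quad y_t = \phi_0 + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t$$

where $y_t$ is the current value, $y_{t-1}$ is the lagged value of the variable $y$, $\varepsilon_t$ is the error term, $\phi_0$ is the constant or drift, and $p$ is the number of lagged periods.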
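To make the autoregressive relationship concrete, the sketch below estimates the constant and the lag-1 coefficient of an AR(1) model by ordinary least squares. The function name `fit_ar1` and the noise-free toy series are assumptions chosen for illustration; real data would include an error term and would normally be fitted with a statistics package.

```python
def fit_ar1(y):
    """Least-squares estimates (constant, phi1) for y_t = constant + phi1 * y_{t-1} + e_t."""
    x, z = y[:-1], y[1:]  # lagged values and current values
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    phi1 = sxz / sxx
    constant = mean_z - phi1 * mean_x
    return constant, phi1

# Noise-free toy series built with constant = 1 and phi1 = 0.5
y = [0.0]
for _ in range(10):
    y.append(1.0 + 0.5 * y[-1])

constant, phi1 = fit_ar1(y)
print(round(constant, 6), round(phi1, 6))
```

Because the toy series contains no noise, the regression recovers the constant 1 and the coefficient 0.5 that were used to build it.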

Integrated (I or d)

The integration order d states the degree of differencing required to convert a non-stationary time series into a stationary one, for example by removing the effect of a trend or of seasonality.
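Differencing replaces each value with its change from an earlier value. The short sketch below applies it to a toy series with a quadratic trend; the helper name `difference` and its `lag` argument (for seasonal differencing) are assumptions made for illustration.

```python
def difference(series, d=1, lag=1):
    """Apply d rounds of differencing at the given lag (lag=1 is ordinary differencing)."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out

quadratic_trend = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
print(difference(quadratic_trend, d=1))       # [1, 3, 5, 7, 9]
print(difference(quadratic_trend, d=2))       # [2, 2, 2, 2]

quarterly = [1, 2, 3, 4, 2, 3, 4, 5]          # repeats every 4 periods, one unit higher
print(difference(quarterly, d=1, lag=4))      # [1, 1, 1, 1]
```

One round of differencing leaves a linear trend in the quadratic series, while two rounds leave a constant, which is why the required order of differencing depends on the kind of trend in the data.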

Moving average (MA or q)

The moving average part represents the relationship between an observation and the residual errors of earlier observations. The moving average order q is the number of lagged error terms on which the current value depends.

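$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q} \quad \text{or} \quad y_t = \alpha + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$$

where $y_t$ is the current value, $\varepsilon_t$ is the residual term, $q$ is the number of moving average terms, and $\alpha$ is the constant term.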
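A moving average process of order q has autocorrelation only up to lag q. The sketch below illustrates this for a simulated MA(1) series. It assumes the convention in which the MA term is subtracted, as in the combined equation earlier in this section; the value theta = 0.6, the seed, and the helper name `acf` are likewise assumptions for illustration.

```python
import random

def acf(y, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

rng = random.Random(1)
theta = 0.6
e = [rng.gauss(0.0, 1.0) for _ in range(5001)]
# MA(1) series: y_t = e_t - theta * e_{t-1} (assumed minus-sign convention)
ma1 = [e[t] - theta * e[t - 1] for t in range(1, len(e))]

r = acf(ma1, 3)
print([round(v, 2) for v in r])
```

Under this convention the theoretical lag-1 autocorrelation is -theta / (1 + theta^2), roughly -0.44 here, while the autocorrelations at lags 2 and 3 should be close to zero.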

Why use Autoregressive Integrated Moving Average (ARIMA) for forecasting?

Alongside integration, the ARIMA model contains autoregressive and moving average components. 'AR' captures how the current value depends on previous values, 'MA' smooths the series by accounting for past errors, and 'I' removes non-stationarity from the series. Because the optimal model can be derived by changing the order of each component, ARIMA offers a more flexible statistical approach to prediction than methods such as linear regression or exponential smoothing. Time series are dynamic in nature, so the selected model must be flexible enough to adjust to the data. With this flexibility, ARIMA captures the different features of the data in one model.

Assumptions for the applicability of the ARIMA model

In order to fit the ARIMA model for future predictions, a time series should satisfy the below-stated assumptions (Subhasree, 2018):

  1. The time series used for the analysis should be stationary, i.e. the properties of the series should not depend on the time at which it is observed. A white noise series, or a series exhibiting only cyclical behavior without trend or seasonality, can be considered stationary.
  2. The data should be univariate. Because the prediction is based on past values, a single variable should be used for building the model.
  3. Bound of stationarity, i.e. the absolute value of the AR coefficient ϕ should be less than 1 (-1 < ϕ < 1). If the coefficient falls outside this bound, the series is not a stationary autoregressive process and could be either trending or drifting (McCleary & Hay, 1980).
  4. Bound of invertibility, i.e. the absolute value of the MA coefficient ϴ should be less than 1 (-1 < ϴ < 1). If the coefficient falls outside this bound, the model is not invertible, that is, it cannot be rewritten as a convergent autoregression on past values (McCleary & Hay, 1980).
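The reason for the bound on the AR coefficient can be seen by following a single one-unit shock through an AR(1) model. The sketch below is illustrative only; the function name `ar1_shock_response` and the chosen coefficients are assumptions.

```python
def ar1_shock_response(phi, steps):
    """Effect after `steps` periods of a one-unit shock in y_t = phi * y_{t-1} + e_t."""
    effect = 1.0
    for _ in range(steps):
        effect *= phi  # each period carries forward a fraction phi of the shock
    return effect

for phi in (0.5, 0.9, 1.0, 1.1):
    print(phi, ar1_shock_response(phi, 50))
```

When the absolute value of the coefficient is below 1, the effect of the shock dies out. At exactly 1 it persists forever, and above 1 it grows without limit, so the series cannot settle around a stable level.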

Rules for identifying the Autoregressive Integrated Moving Average (ARIMA) model

To build a robust Autoregressive Integrated Moving Average (ARIMA) model, it is essential to identify the optimal number of lags, the order of differencing, and the size of the moving average. The rules below help to identify the optimal order (Nau, 2014).

  1. A model with no differencing normally includes a constant term, a model with one order of differencing includes a constant term if the series has a non-zero average trend, and a model with two orders of differencing normally does not include a constant term.
  2. A model with no differencing assumes that the original time series is stationary, a model with one order of differencing assumes a constant average trend, and a model with two orders of differencing assumes a time-varying trend.
  3. The optimal order of differencing is often the one at which the standard deviation of the series is lowest.
  4. A series with positive autocorrelations out to a high number of lags, i.e. 10 or more, needs a higher order of differencing.
  5. If the lag-1 autocorrelation is zero or negative, no higher order of differencing is required. If the lag-1 autocorrelation is -0.5 or more negative, the series may be over-differenced.
  6. If the partial autocorrelation function (PACF) of the differenced series shows a sharp cut-off or the lag-1 autocorrelation is positive, i.e. the series appears slightly under-differenced, one or more AR terms should be added.
  7. If the autocorrelation function (ACF) of the differenced series shows a sharp cut-off or the lag-1 autocorrelation is negative, i.e. the series appears slightly over-differenced, an MA term should be added to the model.
  8. AR and MA terms can cancel out each other's effects, so if a mixed AR-MA model seems to fit the data, a model with one fewer AR term and one fewer MA term should also be tried.
  9. If the sum of the AR coefficients is almost 1, the number of AR terms should be reduced by 1 and the order of differencing increased by 1.
  10. If the sum of the MA coefficients is almost 1, the number of MA terms should be reduced by 1 and the order of differencing reduced by 1.
  11. If the long-term forecasts are unstable or erratic, there may be a unit root (coefficient sum of 1) in the AR or MA part.
  12. An order of seasonal differencing should be used if the series has a consistent and strong seasonal pattern.
  13. If the autocorrelation of the appropriately differenced series is positive at lag s (the number of periods in a season), a seasonal AR term should be added to the model; if it is negative at lag s, a seasonal MA term should be added.

Note: to avoid overfitting the model, never use more than two orders of total differencing, more than one order of seasonal differencing, or more than one or two seasonal parameters.
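The rules on the standard deviation and the lag-1 autocorrelation can be checked numerically. The sketch below does so for a simulated random walk with drift, a toy series assumed here because one order of differencing is known to be appropriate for it. The helper names and the seed are illustrative.

```python
import random
import statistics

def difference(series):
    """First difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]

def lag1_autocorrelation(y):
    mean = sum(y) / len(y)
    dev = [v - mean for v in y]
    return sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)

# Toy data (assumption): random walk with drift of 0.5 per period
rng = random.Random(42)
level, y = 0.0, []
for _ in range(300):
    level += 0.5 + rng.gauss(0.0, 1.0)
    y.append(level)

# Standard deviation and lag-1 autocorrelation for d = 0, 1, 2
candidates = {}
current = y
for d in range(3):
    candidates[d] = (statistics.pstdev(current), lag1_autocorrelation(current))
    current = difference(current)

for d, (sd, r1) in candidates.items():
    print(f"d={d}: std={sd:.2f}, lag-1 autocorrelation={r1:.2f}")

best_d = min(candidates, key=lambda d: candidates[d][0])
print("lowest standard deviation at d =", best_d)
```

The undifferenced series should show a lag-1 autocorrelation near 1, and the twice-differenced series a clearly negative one near -0.5, which is the sign of over-differencing described in the rules above.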

Steps for building the Autoregressive Integrated Moving Average (ARIMA) model

There are certain basic steps used for fitting the Autoregressive Integrated Moving Average (ARIMA) model to the time series (McCleary & Hay, 1980):

Figure 2: ARIMA model fitting steps

Step 1: Plotting the data

Initially, the data are plotted against time in order to inspect the features of the graph, identify unusual observations, and determine whether the series shows non-stationarity or seasonality.

Step 2: Transforming the data

Once the characteristics of the series are determined, the data are transformed where needed. For example, a natural log transformation is applied in order to reduce and stabilize the standard deviation of the series.
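The effect of a log transformation is easiest to see on a series that grows by a fixed percentage. In the sketch below, the toy price series with 5% growth per period is an assumption made for illustration.

```python
import math

prices = [100.0 * 1.05 ** t for t in range(10)]   # toy series growing 5% per period
log_prices = [math.log(p) for p in prices]

raw_changes = [b - a for a, b in zip(prices, prices[1:])]
log_changes = [b - a for a, b in zip(log_prices, log_prices[1:])]

print([round(c, 2) for c in raw_changes])   # steps keep growing with the level
print([round(c, 4) for c in log_changes])   # steps are constant, equal to log(1.05)
```

On the original scale the period-to-period changes grow with the level of the series, whereas on the log scale they are constant, so the variability no longer depends on the level.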

Step 3: Identifying the orders of the model and estimating model

Once the series has been stabilized by transformation, the orders of AR, I, and MA are identified and the Autoregressive Integrated Moving Average (ARIMA) model is fitted. Akaike's information criterion (AIC) or Schwarz's Bayesian information criterion (SBIC) can be used to determine the optimal number of parameters (i.e. p + q). The differenced series, its correlogram (ACF), and its partial correlogram (PACF) are plotted to determine the order. Based on the number of lags and the order identified, the ARIMA model is estimated.
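As a rough sketch of order identification, the code below computes the partial autocorrelations and an approximate AIC for pure AR models of increasing order using the Levinson-Durbin recursion. To stay short it ignores MA terms and differencing. The simulated AR(2) series, the seed, the function names, and the simplified AIC (n times the log of the estimated error variance, plus twice the number of parameters) are all assumptions for illustration. A statistics package would normally be used for this step.

```python
import math
import random

def autocovariances(y, max_lag):
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]

def ar_order_table(y, max_p):
    """Levinson-Durbin recursion.
    Returns (pacf, aic): pacf[k-1] is the partial autocorrelation at lag k,
    aic[p] is an approximate AIC for an AR(p) model with a constant."""
    n = len(y)
    r = autocovariances(y, max_p)
    phi, variance = [], r[0]
    pacf, aic = [], [n * math.log(variance) + 2 * 1]  # AR(0): constant only
    for k in range(1, max_p + 1):
        kappa = (r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))) / variance
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        variance *= 1.0 - kappa ** 2      # error variance after adding lag k
        pacf.append(kappa)
        aic.append(n * math.log(variance) + 2 * (k + 1))
    return pacf, aic

# Toy data (assumption): AR(2) series with coefficients 0.5 and 0.3
rng = random.Random(7)
y = [0.0, 0.0]
for _ in range(3200):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + rng.gauss(0.0, 1.0))
y = y[200:]  # drop the start-up values

pacf, aic = ar_order_table(y, max_p=5)
best_p = min(range(len(aic)), key=aic.__getitem__)
print("PACF:", [round(v, 2) for v in pacf])
print("AIC by order:", [round(v, 1) for v in aic])
print("order with lowest AIC:", best_p)
```

For this series the partial autocorrelations should be clearly non-zero at lags 1 and 2 and close to zero afterwards, which is the sharp cut-off that points to two AR terms. The AIC should drop sharply up to order 2 and change little beyond it.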

Step 4: Residual Diagnostic

The validity of the model is checked using graphs of the residuals (Q-Q plot or histogram), test statistics, and the ACF or PACF of the residuals. If the model is inadequate, Steps 1 to 3 are repeated; otherwise, predictions are made with the model.
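One commonly used test statistic for this step is the Ljung-Box Q statistic, which summarizes the residual autocorrelations up to a chosen lag. The article does not name a specific test, so its use here is an assumption, as are the toy residual series and the function name `ljung_box_q`.

```python
import random

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [v - mean for v in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

CHI2_95_DF10 = 18.307  # 95% critical value of chi-square with 10 degrees of freedom

rng = random.Random(3)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
# "Residuals" that still contain a trend, i.e. structure the model failed to capture
trending = [0.1 * t + rng.gauss(0.0, 1.0) for t in range(500)]

q_noise = ljung_box_q(white_noise, 10)
q_trend = ljung_box_q(trending, 10)
print(round(q_noise, 1), round(q_trend, 1), "critical value:", CHI2_95_DF10)
```

A Q value above the critical value suggests that the residuals are not white noise. When the residuals come from a fitted ARMA(p, q) model, the degrees of freedom are usually reduced by p + q, so the critical value used here applies only to the raw toy series.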

Characteristics of a good forecasting model

Forecasting depends on the model used for the prediction. In order to predict effectively, it is essential to formulate a good forecasting model. A good model should possess the following characteristics:

  • The model should fit the past data well, and its adjusted R2 value should be high.
  • The Mean Absolute Percentage Error (MAPE) should be low.
  • The Relative Standard Error (RSE) of the selected model should be low compared to other models.
  • The predicted values should track the actual observations closely when the two are plotted together.
  • Forecasts should also be good for withheld data, i.e. observations that were not used to fit the model.
  • No significant patterns should be left in the PACF and ACF of the residuals.
  • The model should be effective but simple, with not too many coefficients (the model should be parsimonious).
  • The estimated coefficients of the model should be statistically significant and not redundant.
  • The residuals should be white noise.
  • The model should be invertible and stationary.
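The check on withheld data can be sketched as follows. The toy history and the naive forecast, which simply repeats the last training value, are stand-ins for a real dataset and a fitted model, and the function name `mape` is illustrative.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

history = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0, 108.0, 112.0]
train, holdout = history[:6], history[6:]     # withhold the last two observations

naive_forecast = [train[-1]] * len(holdout)   # stand-in for a fitted model's forecasts
print(round(mape(holdout, naive_forecast), 2))  # 1.82
```

Comparing this figure across candidate models, each fitted on the training part only, shows which model forecasts unseen observations best.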

