The data used as an example throughout this report and the second report is from Family A in the Northland region. Family A has data recorded from 2013 until 2020, and we wish to compare the predicted number of oranges with the true number of oranges for 2018. The purpose of this is to examine the reliability of the proposed time series models.
Aim: to forecast the number of oranges for 2018.
Modelling Procedure
Step 1. Plot the data and identify any unusual observations.
Comment: The data shows no particular trend, strong cyclic behaviour, and an unusual pattern between 2015 and 2016. In addition, the mean is not constant, i.e. it changes over time, implying non-stationarity. It might therefore be sensible to remove the data from 2013 up to 2015 to avoid invalid statistical results.
Comment: Since the data is recorded at monthly intervals, we would expect to see some seasonal patterns in the time series plot. However, the data was generated randomly, so it behaves much like white noise, although non-stationarity is still strongly implied.
White noise is an important concept in time series analysis and forecasting, for two main reasons. Predictability: if a time series is white noise then, by definition, it is random, and it cannot reasonably be modelled or predicted. Model diagnostics: the series of errors from a time series forecasting model should ideally be white noise.
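White-noise behaviour can be checked numerically as well as visually. A minimal base-R sketch (using a simulated series, not the Family A data) applies the Ljung-Box test, which appears again in the model diagnostics below:

```r
# A pure white-noise series should give a large Ljung-Box p-value,
# i.e. no evidence of autocorrelation at any lag.
set.seed(42)
wn <- ts(rnorm(96), frequency = 12)  # 8 years of simulated monthly "noise"
lb <- Box.test(wn, lag = 12, type = "Ljung-Box")
lb$p.value
```

A small p-value here would be evidence against white noise; for genuinely random data the p-value is usually large.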
The decomposition of the Family A data uses the STL method. STL is an acronym for “Seasonal and Trend decomposition using Loess”, where Loess is a method for estimating non-linear relationships. STL only provides facilities for additive decompositions.
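As an illustration of the additive structure (using the built-in `co2` series rather than the Family A data), base R's `stl()` splits a monthly series into seasonal, trend and remainder components that sum back to the original data:

```r
# STL decomposition: seasonal + trend + remainder = original series.
fit <- stl(co2, s.window = "periodic")
head(fit$time.series)  # columns: seasonal, trend, remainder
```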
Comment: In the above ACF plot, the dashed orange lines indicate whether the autocorrelations are (statistically) significantly different from zero at the 95% confidence level. Here the autocorrelations are significantly different from 0, indicating high autocorrelation.
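The dashed significance limits are approximately \(\pm 1.96/\sqrt{n}\). A small base-R sketch (simulated data) shows where they come from:

```r
# The ACF at lag 0 is always 1; sample autocorrelations outside
# roughly +/- 1.96/sqrt(n) are significantly different from zero.
set.seed(1)
x <- ts(rnorm(48), frequency = 12)
a <- acf(x, plot = FALSE)
threshold <- 1.96 / sqrt(length(x))  # the dashed lines on the plot
a$acf[1]                             # lag-0 autocorrelation: exactly 1
```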
Step 2. Split the data into training and test sets.
The accuracy of forecasts can only be determined by considering how well a model performs on new data that were not used when fitting the model. When choosing models, it is common practice to separate the available data into two portions, training and test sets, where the training data is used to fit a forecasting method and the test data is used to evaluate its accuracy. Because the test data is not used in determining the forecasts, it should provide a reliable indication of how well the model is likely to forecast on new data.
# The data is split into training set (Jan,2016-Dec,2017) and test set (2018).
training <- window(new.tsA, start = c(2016,1), end = c(2017,12))
test <- window(new.tsA, start = c(2018,1), end = c(2018, 12))
Step 3. Transform the data from the training set.
The Box-Cox family of transformations, which includes both logarithms and power transformations, is used to transform the Family A data. This is recommended because the Hyndman-Khandakar algorithm behind the auto.arima() function only takes care of steps 3-5, so steps 1, 2, 6 and 7 must still be done manually to ensure the residuals are roughly homoscedastic. One important feature of the power transformation is the \(\lambda\) parameter, where \(\lambda = 0\) is equivalent to a log-transformation. A good value of \(\lambda\) is one which makes the size of the seasonal variation about the same across the whole series, as that makes the forecasting model simpler. The BoxCox.lambda() function can be used to choose \(\lambda\) automatically instead of doing it manually.
lambda <- BoxCox.lambda(training)
trans.A <- BoxCox(training, lambda)
lambda
## [1] 1
The optimal value of \(\lambda\) is 1, which is equivalent to \(Y^{\lambda} = Y^1 = Y\); hence no transformation is necessary. Note: if a transformation were required, the data must be transformed AFTER being split into training and test sets in order to avoid data leakage; that is, \(\lambda\) is estimated from (and applied to) the training data only, not the test data.
Step 4. Fit the models. In this step, two methods were considered to fit and predict the data: ARIMA (auto.arima()) and ETS (ets()) models.
The auto.arima() function in R uses an algorithm which combines unit root tests, minimisation of the AICc, and MLE to obtain an ARIMA model. The arguments stepwise = FALSE and approximation = FALSE are included in the auto.arima() call below to ensure ALL candidate seasonal ARIMA models are considered. The seasonal ARIMA model has the general form \[ARIMA(p,d,q)(P,D,Q)[m]\], where p is the order of the autoregressive part, d is the order of differencing, q is the order of the moving average part, and (P,D,Q) are the corresponding orders for the seasonal part with period m. The final chosen model, shown below, is an \(ARIMA(0,0,0)\) with a non-zero mean.
auto.arima.model <- auto.arima(training, stepwise = FALSE, approximation = FALSE)
auto.arima.model
## Series: training
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 27.0833
## s.e. 3.2596
##
## sigma^2 estimated as 266.1: log likelihood=-100.55
## AIC=205.1 AICc=205.67 BIC=207.45
Comment: An ARIMA(0,0,0) model is white noise around a constant mean (here a non-zero mean of about 27.08), so the errors are uncorrelated across time.
##
## Box-Ljung test
##
## data: resid(auto.arima.model)
## X-squared = 15.438, df = 12, p-value = 0.2184
The ACF plot of the residuals from the chosen ARIMA model shows that all autocorrelations are within the threshold limits, indicating that the residuals behave like white noise. The histogram suggests that the residuals may not follow a Normal distribution. The Box-Ljung test returns a large p-value (p-value = 0.2184), also suggesting that the residuals resemble white noise.
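The diagnostic above can be reproduced in spirit with base R alone. A hedged sketch (simulated stand-in data, not the actual training set):

```r
# Fit a mean-only ARIMA(0,0,0) and apply the Ljung-Box test to its
# residuals, mirroring the diagnostic reported above.
set.seed(7)
y <- ts(rnorm(24, mean = 27, sd = 16), frequency = 12)  # stand-in for training
fit <- arima(y, order = c(0, 0, 0))                     # estimates the mean only
lb <- Box.test(resid(fit), lag = 12, type = "Ljung-Box")
lb$statistic
lb$p.value
```

A large p-value means we cannot reject the hypothesis that the residuals are uncorrelated.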
In the case of ETS (Error, Trend, Seasonal) models, the ets() function can be used to fit these types of model. The notation for each component is defined as Error = {A,M}, Trend = {N,A,Ad} and Seasonal = {N,A,M}, where A stands for additive, M for multiplicative and N for none.
ets.model <- ets(training)
ets.model
## ETS(A,N,N)
##
## Call:
## ets(y = training)
##
## Smoothing parameters:
## alpha = 1e-04
##
## Initial states:
## l = 27.0827
##
## sigma: 16.6794
##
## AIC AICc BIC
## 215.2654 216.4654 218.7995
The best ETS model selected is the \(ETS(A,N,N)\), as shown in the result above. Formally, this model is also known as simple exponential smoothing with additive errors. In comparison to the \(ARIMA(0,0,0)\) model, the ETS model has a higher AIC value of 215.2654 (the ARIMA model’s AIC is 205.1).
##
## Box-Ljung test
##
## data: resid(ets.model)
## X-squared = 16.789, df = 14, p-value = 0.2676
The ACF plot of the residuals from the chosen ETS model shows that all autocorrelations are within the threshold limits, indicating that the residuals behave like white noise. The histogram suggests that the residuals may not follow a Normal distribution. The Box-Ljung test returns a large p-value (p-value = 0.2676), again suggesting that the residuals resemble white noise.
Step 5. Forecast the data using the fitted models.
Only one thing is true about forecasts - they are always wrong.
The forecast() function can be used to obtain forecasts from an ARIMA or an ETS model. Had a transformation been applied to the data, it would be necessary to reverse the transformation (back-transform) when forecasting, to obtain forecasts on the original scale. This can be done in R by passing the lambda argument (equal to the \(\lambda\) value selected for transforming the data) to the forecast() function. In addition, with the default biasadj = FALSE the back-transformed point forecasts are the medians of the forecast distribution, which are biased as estimates of the mean; setting biasadj = TRUE applies a bias adjustment so that the point forecasts are means.
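The back-transformation itself is simple arithmetic. A minimal base-R sketch of the Box-Cox transform and its inverse (independent of the forecast package's internals; function names here are illustrative, not from any library):

```r
# Box-Cox transform and its inverse (the back-transformation applied
# to point forecasts). lambda = 0 corresponds to the log transform.
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}
y <- c(10, 27, 42)
z <- box_cox(y, lambda = 0.5)
inv_box_cox(z, lambda = 0.5)  # recovers the original values
```

With \(\lambda = 1\), as found for the Family A training data, the transform is just a shift, which is why no transformation was needed.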
While linear exponential smoothing models are all special cases of ARIMA models, the non-linear exponential smoothing models have no equivalent ARIMA counterparts. On the other hand, there are also many ARIMA models that have no exponential smoothing counterparts. In particular, all ETS models are non-stationary, while some ARIMA models are stationary.
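For example, ETS(A,N,N) corresponds to ARIMA(0,1,1), with the MA coefficient \(\theta = \alpha - 1\). A hedged base-R sketch of this correspondence (simulated data; HoltWinters with trend and seasonality disabled is simple exponential smoothing):

```r
# Simple exponential smoothing and ARIMA(0,1,1) are equivalent linear
# models: the ARIMA theta implies a smoothing parameter alpha = 1 + theta.
set.seed(3)
y <- ts(cumsum(rnorm(120)) + rnorm(120), frequency = 12)
ses <- HoltWinters(y, beta = FALSE, gamma = FALSE)  # SES: level only
ma1 <- arima(y, order = c(0, 1, 1))
alpha_implied <- 1 + coef(ma1)["ma1"]               # ARIMA's implied alpha
c(ses$alpha, alpha_implied)
```

The two estimates differ slightly because the fitting criteria (SSE vs MLE) and initialisation differ, but they describe the same class of model.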
a1 <- auto.arima.model %>% forecast(h = 12) %>%
accuracy(test)
a1[,c("RMSE", "MAE", "MAPE", "MASE")]
## RMSE MAE MAPE MASE
## Training set 15.96850 13.66667 390.53310 0.6612903
## Test set 13.80796 12.91667 82.10664 0.6250000
a2 <- ets.model %>% forecast(h = 12) %>%
accuracy(test)
a2[,c("RMSE", "MAE", "MAPE", "MASE")]
## RMSE MAE MAPE MASE
## Training set 15.96930 13.66715 390.56864 0.6613135
## Test set 13.80789 12.91667 82.10466 0.6250000
Comment: The R output shows similar error values regarding the forecasting performance of the two competing models over the test data.
2018 | True Data | Predicted ARIMA | Predicted ETS |
---|---|---|---|
Jan | 10 | 27 | 27 |
Feb | 42 | 27 | 27 |
Mar | 8 | 27 | 27 |
Apr | 37 | 27 | 27 |
May | 31 | 27 | 27 |
Jun | 19 | 27 | 27 |
Jul | 16 | 27 | 27 |
Aug | 45 | 27 | 27 |
Sep | 14 | 27 | 27 |
Oct | 43 | 27 | 27 |
Nov | 33 | 27 | 27 |
Dec | 9 | 27 | 27 |
Comment: Since the data appears to behave like a white-noise series, it was expected that the forecasts would not come out well. Both models simply forecast the training mean for every month, so although the overall test errors are moderate, none of the month-to-month variation in 2018 is captured.
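As a sanity check, the test-set errors reported by accuracy() can be recomputed by hand from the table above, since both models forecast the training mean (27.0833) for all 12 months of 2018:

```r
# Recompute test-set RMSE and MAE from the 2018 table: both models
# forecast the constant training mean 27.0833 every month.
actual <- c(10, 42, 8, 37, 31, 19, 16, 45, 14, 43, 33, 9)
fc <- rep(27.0833, 12)
rmse <- sqrt(mean((actual - fc)^2))
mae <- mean(abs(actual - fc))
round(c(RMSE = rmse, MAE = mae), 5)  # ~13.80796 and ~12.91667, matching accuracy()
```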
Page built on: 📆 2020-07-12 ‒ 🕢 21:19:17