The data used as an example throughout this report and the second report is from Family A in the Northland region. Family A has data recorded from 2013 until 2020, and we wish to compare the predicted number of oranges with the true number of oranges for 2018. The purpose of this is to examine the reliability of the proposed time series models.
Aim: to forecast the number of oranges for 2018.
Modelling Procedure
Step 1. Plot the data and identify any unusual observations.
Comment: The data shows no particular trend, strong cyclic behaviour, and an unusual pattern between 2015 and 2016. In addition, the mean is not constant, i.e. it changes over time, implying non-stationarity. It might therefore be sensible to remove the data from 2013 up to 2015 to avoid invalid statistical results.
Comment: Since the data is recorded at monthly intervals, we would expect to see some seasonal patterns in the time series plot. However, the data was generated randomly, so it behaves much like white noise, although non-stationarity is still strongly implied.
White noise is an important concept in time series analysis and forecasting, for two main reasons. Predictability: if a time series is white noise then, by definition, it is random, and it cannot reasonably be modelled or predicted. Model diagnostics: the series of errors from a time series forecasting model should ideally be white noise.
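White-noise behaviour can be checked numerically as well as visually. A minimal base-R sketch (using a simulated series, not the Family A data) applies the Ljung-Box test, which appears again in the model diagnostics below:

```r
# A pure white-noise series should give a large Ljung-Box p-value,
# i.e. no evidence of autocorrelation at any lag.
set.seed(42)
wn <- ts(rnorm(96), frequency = 12)  # 8 years of simulated monthly "noise"
lb <- Box.test(wn, lag = 12, type = "Ljung-Box")
lb$p.value
```

A small p-value here would be evidence against white noise; for genuinely random data the p-value is usually large.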
The decomposition of the Family A data uses the STL method. STL is an acronym for “Seasonal and Trend decomposition using Loess”, where Loess is a method for estimating non-linear relationships. STL only provides facilities for additive decompositions.
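As an illustration of the additive structure (using the built-in `co2` series rather than the Family A data), base R's `stl()` splits a monthly series into seasonal, trend and remainder components that sum back to the original data:

```r
# STL decomposition: seasonal + trend + remainder = original series.
fit <- stl(co2, s.window = "periodic")
head(fit$time.series)  # columns: seasonal, trend, remainder
```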
Comment: In the above ACF plot, the dashed orange lines indicate whether the autocorrelations are (statistically) significantly different from zero at the 95% confidence level. Here the autocorrelations are significantly different from 0, indicating high autocorrelation.
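The dashed significance limits are approximately \(\pm 1.96/\sqrt{n}\). A small base-R sketch (simulated data) shows where they come from:

```r
# The ACF at lag 0 is always 1; sample autocorrelations outside
# roughly +/- 1.96/sqrt(n) are significantly different from zero.
set.seed(1)
x <- ts(rnorm(48), frequency = 12)
a <- acf(x, plot = FALSE)
threshold <- 1.96 / sqrt(length(x))  # the dashed lines on the plot
a$acf[1]                             # lag-0 autocorrelation: exactly 1
```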
Step 2. Split the data into training and test sets.
The accuracy of forecasts can only be determined by considering how well a model performs on new data that were not used when fitting the model. When choosing models, it is common practice to separate the available data into two portions, training and test sets, where the training data is used to fit a forecasting method and the test data is used to evaluate its accuracy. Because the test data is not used in determining the forecasts, it should provide a reliable indication of how well the model is likely to forecast on new data.
# The data is split into training set (Jan,2016-Dec,2017) and test set (2018).
training <- window(new.tsA, start = c(2016,1), end = c(2017,12))
test <- window(new.tsA, start = c(2018,1), end = c(2018, 12))
Step 3. Transform the data from the training set.
The Box-Cox family of transformations, which includes both logarithms and power transformations, is used to transform the Family A data. This is recommended because the Hyndman-Khandakar algorithm behind the auto.arima() function only takes care of steps 3-5, so steps 1, 2, 6 and 7 must still be done manually to ensure the residuals are roughly homoscedastic. One important feature of the power transformation is the \(\lambda\) parameter, where \(\lambda = 0\) is equivalent to a log-transformation. A good value of \(\lambda\) is one which makes the size of the seasonal variation about the same across the whole series, as that makes the forecasting model simpler. The BoxCox.lambda() function can be used to choose \(\lambda\) automatically instead of doing it manually.
lambda <- BoxCox.lambda(training)
trans.A <- BoxCox(training, lambda)
lambda
## [1] 1
The optimal value of \(\lambda\) is 1, which is equivalent to \(Y^{\lambda} = Y^1 = Y\); hence no transformation is necessary. Note: if a transformation were required, the data must be transformed AFTER being split into training and test sets in order to avoid data leakage; that is, \(\lambda\) is estimated from (and applied to) the training data only, not the test data.
Step 4. Fit the models. In this step, two methods were considered to fit and predict the data: ARIMA (auto.arima()) and ETS (ets()) models.
The auto.arima() function in R uses an algorithm which combines unit root tests, minimisation of the AICc, and MLE to obtain an ARIMA model. The arguments stepwise = FALSE and approximation = FALSE are included in the auto.arima() call below to ensure ALL candidate seasonal ARIMA models are considered. The seasonal ARIMA model has the general form \[ARIMA(p,d,q)(P,D,Q)[m]\], where p is the order of the autoregressive part, d is the order of differencing, q is the order of the moving average part, and (P,D,Q) are the corresponding orders for the seasonal part with period m. The final chosen model, shown below, is an \(ARIMA(0,0,0)\) with a non-zero mean.
auto.arima.model <- auto.arima(training, stepwise = FALSE, approximation = FALSE)
auto.arima.model
## Series: training
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 27.0833
## s.e. 3.2596
##
## sigma^2 estimated as 266.1: log likelihood=-100.55
## AIC=205.1 AICc=205.67 BIC=207.45
Comment: An ARIMA(0,0,0) model is white noise around a constant mean (here a non-zero mean of about 27.08), so the errors are uncorrelated across time.
##
## Box-Ljung test
##
## data: resid(auto.arima.model)
## X-squared = 15.438, df = 12, p-value = 0.2184
The ACF plot of the residuals from the chosen ARIMA model shows that all autocorrelations are within the threshold limits, indicating that the residuals behave like white noise. The histogram suggests that the residuals may not follow a Normal distribution. The Box-Ljung test returns a large p-value (p-value = 0.2184), also suggesting that the residuals resemble white noise.
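The diagnostic above can be reproduced in spirit with base R alone. A hedged sketch (simulated stand-in data, not the actual training set):

```r
# Fit a mean-only ARIMA(0,0,0) and apply the Ljung-Box test to its
# residuals, mirroring the diagnostic reported above.
set.seed(7)
y <- ts(rnorm(24, mean = 27, sd = 16), frequency = 12)  # stand-in for training
fit <- arima(y, order = c(0, 0, 0))                     # estimates the mean only
lb <- Box.test(resid(fit), lag = 12, type = "Ljung-Box")
lb$statistic
lb$p.value
```

A large p-value means we cannot reject the hypothesis that the residuals are uncorrelated.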
In the case of ETS (Error, Trend, Seasonal) models, the ets() function can be used to fit these types of model. The notation for each component is defined as Error = {A,M}, Trend = {N,A,Ad} and Seasonal = {N,A,M}, where A stands for additive, M for multiplicative and N for none.
ets.model <- ets(training)
ets.model
## ETS(A,N,N)
##
## Call:
## ets(y = training)
##
## Smoothing parameters:
## alpha = 1e-04
##
## Initial states:
## l = 27.0827
##
## sigma: 16.6794
##
## AIC AICc BIC
## 215.2654 216.4654 218.7995
The best ETS model selected is the \(ETS(A,N,N)\), as shown in the result above. Formally, this model is also known as simple exponential smoothing with additive errors. In comparison to the \(ARIMA(0,0,0)\) model, the ETS model has a higher AIC value of 215.2654 (the ARIMA model’s AIC is 205.1).
##
## Box-Ljung test
##
## data: resid(ets.model)
## X-squared = 16.789, df = 14, p-value = 0.2676
The ACF plot of the residuals from the chosen ETS model shows that all autocorrelations are within the threshold limits, indicating that the residuals behave like white noise. The histogram suggests that the residuals may not follow a Normal distribution. The Box-Ljung test returns a large p-value (p-value = 0.2676), again suggesting that the residuals resemble white noise.
Step 5. Forecast the data using the fitted models.
Only one thing is true about forecasts - they are always wrong.
The forecast() function can be used to obtain forecasts from an ARIMA or an ETS model. Had a transformation been applied to the data, it would be necessary to reverse the transformation (back-transform) when forecasting, to obtain forecasts on the original scale. This can be done in R by passing the lambda argument (equal to the \(\lambda\) value selected for transforming the data) to the forecast() function. In addition, with the default biasadj = FALSE the back-transformed point forecasts are the medians of the forecast distribution, which are biased as estimates of the mean; setting biasadj = TRUE applies a bias adjustment so that the point forecasts are means.
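The back-transformation itself is simple arithmetic. A minimal base-R sketch of the Box-Cox transform and its inverse (independent of the forecast package's internals; function names here are illustrative, not from any library):

```r
# Box-Cox transform and its inverse (the back-transformation applied
# to point forecasts). lambda = 0 corresponds to the log transform.
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}
y <- c(10, 27, 42)
z <- box_cox(y, lambda = 0.5)
inv_box_cox(z, lambda = 0.5)  # recovers the original values
```

With \(\lambda = 1\), as found for the Family A training data, the transform is just a shift, which is why no transformation was needed.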
While linear exponential smoothing models are all special cases of ARIMA models, the non-linear exponential smoothing models have no equivalent ARIMA counterparts. On the other hand, there are also many ARIMA models that have no exponential smoothing counterparts. In particular, all ETS models are non-stationary, while some ARIMA models are stationary.
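For example, ETS(A,N,N) corresponds to ARIMA(0,1,1), with the MA coefficient \(\theta = \alpha - 1\). A hedged base-R sketch of this correspondence (simulated data; HoltWinters with trend and seasonality disabled is simple exponential smoothing):

```r
# Simple exponential smoothing and ARIMA(0,1,1) are equivalent linear
# models: the ARIMA theta implies a smoothing parameter alpha = 1 + theta.
set.seed(3)
y <- ts(cumsum(rnorm(120)) + rnorm(120), frequency = 12)
ses <- HoltWinters(y, beta = FALSE, gamma = FALSE)  # SES: level only
ma1 <- arima(y, order = c(0, 1, 1))
alpha_implied <- 1 + coef(ma1)["ma1"]               # ARIMA's implied alpha
c(ses$alpha, alpha_implied)
```

The two estimates differ slightly because the fitting criteria (SSE vs MLE) and initialisation differ, but they describe the same class of model.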
a1 <- auto.arima.model %>% forecast(h = 12) %>%
accuracy(test)
a1[,c("RMSE", "MAE", "MAPE", "MASE")]
## RMSE MAE MAPE MASE
## Training set 15.96850 13.66667 390.53310 0.6612903
## Test set 13.80796 12.91667 82.10664 0.6250000
a2 <- ets.model %>% forecast(h = 12) %>%
accuracy(test)
a2[,c("RMSE", "MAE", "MAPE", "MASE")]
## RMSE MAE MAPE MASE
## Training set 15.96930 13.66715 390.56864 0.6613135
## Test set 13.80789 12.91667 82.10466 0.6250000
Comment: The R output shows similar error values regarding the forecasting performance of the two competing models over the test data.
2018 | True Data | Predicted ARIMA | Predicted ETS |
---|---|---|---|
Jan | 10 | 27 | 27 |
Feb | 42 | 27 | 27 |
Mar | 8 | 27 | 27 |
Apr | 37 | 27 | 27 |
May | 31 | 27 | 27 |
Jun | 19 | 27 | 27 |
Jul | 16 | 27 | 27 |
Aug | 45 | 27 | 27 |
Sep | 14 | 27 | 27 |
Oct | 43 | 27 | 27 |
Nov | 33 | 27 | 27 |
Dec | 9 | 27 | 27 |
Comment: Since the data appears to behave like a white-noise series, it was expected that the forecasts would not come out well. Both models simply forecast the training mean for every month, so although the overall test errors are moderate, none of the month-to-month variation in 2018 is captured.
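As a sanity check, the test-set errors reported by accuracy() can be recomputed by hand from the table above, since both models forecast the training mean (27.0833) for all 12 months of 2018:

```r
# Recompute test-set RMSE and MAE from the 2018 table: both models
# forecast the constant training mean 27.0833 every month.
actual <- c(10, 42, 8, 37, 31, 19, 16, 45, 14, 43, 33, 9)
fc <- rep(27.0833, 12)
rmse <- sqrt(mean((actual - fc)^2))
mae <- mean(abs(actual - fc))
round(c(RMSE = rmse, MAE = mae), 5)  # ~13.80796 and ~12.91667, matching accuracy()
```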
Page built on: 📆 2020-07-12 ‒ 🕢 21:19:17