Applied Mathematical Sciences, Vol. 8, 2014, no. 124, 6159 - 6169
HIKARI Ltd, www.m-hikari.com
http://dx.doi.org/10.12988/ams.2014.48650
Volatility Modelling and Forecasting
of Malaysian Crude Palm Oil Prices
Maizah Hura Ahmad1, Pung Yean Ping2 and Norizan Mahamed3
1,2Department of Mathematical Sciences, Faculty of Science
Universiti Teknologi Malaysia,
81310 UTM Skudai, Johor, Malaysia
3Jabatan Matematik, Fakulti Sains & Teknologi, Universiti Malaysia Terengganu,
21030 Kuala Terengganu, Terengganu, Malaysia.
Copyright © 2014 Maizah Hura Ahmad et al. This is an open access article distributed under
the Creative Commons Attribution License, which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.
Abstract
The purpose of the current study is to model and forecast the prices of Malaysian
crude palm oil. The oil palm industry is a contributor to Malaysia’s export
revenue. An Autoregressive Integrated Moving Average (ARIMA) model is first used
to fit the series. To model the noise term of the ARIMA model, a Generalized
Autoregressive Conditional Heteroskedasticity (GARCH) model is used. The model is
assessed using the Akaike Information Criterion (AIC) and the mean absolute percentage
error (MAPE).
Keywords: ARIMA, GARCH, volatility, heteroskedasticity, hybrid model
1 Introduction
Malaysia is one of the biggest producers and exporters of palm oil and palm
oil products. Currently, the industry is flourishing while providing employment to
its people. As a result of continuous R & D efforts, a wide variety of by-products
are produced.
One type of oil produced by oil palm is crude palm oil (CPO) which can be
further refined and fractionated to get a wide range of food and non-food palm
products. The oil palm industry is a contributor to Malaysia’s export revenue.
Thus, modelling and forecasting of CPO prices are important so as to obtain
valuable information pertaining to the future of CPO prices.
The Box-Jenkins approach has been used to forecast the monthly crude palm oil price [1].
In a study to forecast the spot palm oil price, the performances of the Vector Error
Correction Method (VECM), the Multivariate Autoregressive Moving Average
(MARMA) model and the univariate Autoregressive Integrated Moving
Average (ARIMA) model were compared [2].
CPO prices are volatile: the conditional variance of the price series
changes between high and low values. The current study attempts to model CPO
prices using the popular univariate ARIMA model
and to improve the forecasts by hybridizing the ARIMA model with
Generalized Autoregressive Conditional Heteroskedasticity (GARCH). The Akaike
information criterion (AIC) is used to assess the goodness of fit and the mean
absolute percentage error (MAPE) is used to evaluate the forecasting performance.
All analyses are carried out using the EViews software.
The rest of the paper is organized into 3 sections as follows: Section 2
presents the methodology used in capturing both the mean and the variance
behavior of the monthly CPO price. Section 3 presents the data analysis and the
study is concluded in Section 4.
2 Methodology
ARIMA and GARCH Models
ARIMA is a popular time series modelling approach developed by Box and Jenkins.
The model is applied in cases where data show evidence of non-stationarity [3].
Transformations such as differencing are used to remove non-stationarity in the
mean of the series, while a variance-stabilizing transformation introduced
by Box and Cox can be used to remove non-stationarity in the variance of the
series [3]. The model is defined as ARIMA (p, d, q) with the following equation:
$$\phi_p(B)(1 - B)^d y_t = \theta_q(B)\varepsilon_t \qquad (1)$$
where $y_t$ is the monthly CPO price, $\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ is the
autoregressive operator of order p; $\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ is the moving
average operator of order q. The orders of p and q in the ARIMA model are
identified through the autocorrelation function (ACF) and the partial autocorrelation
function (PACF) of the sample data. In eq (1), $(1 - B)^d$ is the dth difference; B is the
backward shift operator; and $\varepsilon_t$ is the error term at time t. The error terms are generally
assumed to be independent and identically distributed (i.i.d.) random variables sampled
from a normal distribution with zero mean, that is $\varepsilon_t \sim N(0, \sigma^2)$, where $\sigma^2$ is the
variance.
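As an illustration, eq (1) can be expanded for the ARIMA(2, 1, 0) case, which is the specification selected later in Section 3. The expansion below is a sketch that assumes the usual Box-Jenkins sign convention, $\phi_2(B) = 1 - \phi_1 B - \phi_2 B^2$; with q = 0 the moving average operator reduces to 1.

```latex
\begin{align*}
(1 - \phi_1 B - \phi_2 B^2)(1 - B)\, y_t &= \varepsilon_t \\
\bigl[\, 1 - (1 + \phi_1) B + (\phi_1 - \phi_2) B^2 + \phi_2 B^3 \,\bigr]\, y_t &= \varepsilon_t \\
y_t &= (1 + \phi_1)\, y_{t-1} + (\phi_2 - \phi_1)\, y_{t-2} - \phi_2\, y_{t-3} + \varepsilon_t
\end{align*}
```

Each fitted value therefore needs three earlier observations, which is consistent with the residual sample in Figure 3 starting at observation 4.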
The variances of the errors of some time series are often time-varying and
conditional. To handle such variances, Engle introduced the autoregressive
conditional heteroskedasticity (ARCH) class of models [4] [5]. These models were
generalized by Bollerslev to develop generalized autoregressive conditional
heteroskedasticity (GARCH) [6]. GARCH models are able to capture volatility clustering,
that is, the periods of fluctuation in a time series [7]. Past variances and past variance
forecasts are used to forecast future variances. The model is defined as
GARCH (p, q) with the following equations:
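$$y_t = \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t u_t, \qquad u_t \sim N(0, 1)$$
$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2$$
where $y_t$ is the monthly CPO price, $\mu$ is the mean, $\omega$ is the constant term, $\alpha_i, \beta_i > 0$ and $\alpha_1 + \beta_1 < 1$ for
stationarity; p is the order of the GARCH terms $\sigma^2$, q is the order of the ARCH
terms $\varepsilon^2$, which is the information about volatility from the previous period
measured as the lag of squared residual from the mean equation.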
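The variance recursion of a GARCH (p, q) model can be sketched in a few lines of Python. This is a minimal illustration, not the EViews estimation routine used in the study: the function name `garch_variance`, the symbol `omega` for the constant term, the choice of the sample variance for the pre-sample values, and the example numbers are all assumptions made here.

```python
import numpy as np

def garch_variance(resid, omega, alpha, beta):
    """Conditional variances sigma2[t] of a GARCH(p, q) model.

    alpha: q ARCH coefficients (applied to lagged squared residuals).
    beta:  p GARCH coefficients (applied to lagged conditional variances).
    """
    resid = np.asarray(resid, dtype=float)
    q, p = len(alpha), len(beta)
    start = max(p, q)
    # Assumption: pre-sample variances are set to the sample variance.
    sigma2 = np.full(len(resid), resid.var())
    for t in range(start, len(resid)):
        arch = sum(alpha[i] * resid[t - 1 - i] ** 2 for i in range(q))
        garch = sum(beta[j] * sigma2[t - 1 - j] for j in range(p))
        sigma2[t] = omega + arch + garch
    return sigma2

# Hypothetical residuals and coefficients, for illustration only.
example_resid = np.array([0.1, -0.2, 0.05, 0.3, -0.1])
example_sigma2 = garch_variance(example_resid, omega=0.01, alpha=[0.1], beta=[0.8])
```

In this sketch a large squared residual at time t - 1 raises the variance at time t, and the GARCH terms carry that increase forward, which is how the model reproduces volatility clustering.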
Testing for Stationarity
A unit root test can be used to determine stationarity. One of the widely used
unit-root tests is the Augmented Dickey-Fuller (ADF) test. The testing procedure is
applied to the model
$$\Delta y_t = \alpha_0 + \alpha_1 t + \gamma y_{t-1} + \sum_{i=1}^{k} \delta_i \Delta y_{t-i} + \varepsilon_t$$
where $y_t$ is the CPO price, $\Delta$ indicates the first difference, and k is the lag order of the
autoregressive process. The null hypothesis states that the series tested is
non-stationary, that is, a unit root is present ($\gamma = 0$).
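The test statistic is the t-ratio on the lagged level in this regression. The sketch below computes it with NumPy for the constant-only case (no time trend), which is the setting reported in Table 1. The function name `adf_tstat` and the synthetic white-noise series are assumptions for illustration; the critical value is the 1% value from Table 1 and strictly applies to that sample size only.

```python
import numpy as np

def adf_tstat(y, k=0):
    """t-statistic on y[t-1] when diff(y) is regressed on a constant,
    the lagged level and k lagged differences (no time trend)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rows = range(k, len(dy))
    # dy[t] = y[t+1] - y[t], so the lagged level paired with dy[t] is y[t].
    X = np.array([[1.0, y[t]] + [dy[t - i] for i in range(1, k + 1)] for t in rows])
    target = dy[k:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    dof = len(target) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
white_noise = rng.normal(size=200)   # synthetic stationary series
tstat = adf_tstat(white_noise, k=0)
# 1% critical value as reported in Table 1 (constant, no trend).
rejects_unit_root = tstat < -3.466176
```

The statistic does not follow a standard t distribution under the null, which is why it is compared with tabulated critical values such as the MacKinnon values cited in Table 1.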
Testing for serial correlation
To test for serial correlation, the Ljung-Box test (Q statistic) can be used. The
null hypothesis states that there is no serial correlation. Thus, rejection of the null
hypothesis implies that serial correlation is present at some order up to the chosen
lag.
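A minimal sketch of the Q statistic follows, using the standard definition Q = n(n + 2) * sum over k of r_k^2 / (n - k), where r_k is the sample autocorrelation at lag k and n is the number of observations. The function name `ljung_box_q` and the toy series are assumptions made for illustration.

```python
import numpy as np

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic accumulated over lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = d @ d
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = (d[k:] @ d[:-k]) / denom   # sample autocorrelation at lag k
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q

# Toy alternating series whose lag-1 autocorrelation is -0.9.
alternating = np.array([1.0, -1.0] * 5)
q1 = ljung_box_q(alternating, max_lag=1)
```

Under the null hypothesis, Q is compared with a chi-square distribution; the p-value calculation is left out here to keep the sketch free of extra dependencies.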
Testing for heteroskedasticity
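The ARCH Lagrange Multiplier (ARCH-LM) test for heteroskedasticity
uses the F statistic from the regression of the squared residuals on their own lags,
$$\varepsilon_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \dots + \alpha_p \varepsilon_{t-p}^2$$
where p is the number of ARCH lags and
$\varepsilon_t$ is the residual of the series. The null hypothesis states that there are no
ARCH effects; rejection of the null hypothesis implies that an
ARCH effect exists.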
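The regression above can be sketched with NumPy as follows. The function names are hypothetical, and the sample size of 166 used in the arithmetic check is inferred from the degrees of freedom F(2,163) shown in Table 4 rather than stated in the text.

```python
import numpy as np

def f_from_lm(lm, nobs, p):
    """Convert Obs*R-squared into the F statistic of the same regression."""
    r2 = lm / nobs
    return (r2 / p) / ((1.0 - r2) / (nobs - p - 1))

def arch_lm(resid, p):
    """Return (Obs*R-squared, F statistic) for an ARCH-LM test with p lags."""
    e2 = np.asarray(resid, dtype=float) ** 2
    target = e2[p:]
    # Column i holds the squared residuals lagged by i periods.
    lags = [e2[p - i:len(e2) - i] for i in range(1, p + 1)]
    X = np.column_stack([np.ones(len(target))] + lags)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ coef) ** 2)
    tss = np.sum((target - target.mean()) ** 2)
    r2 = 1.0 - rss / tss
    nobs = len(target)
    return nobs * r2, f_from_lm(nobs * r2, nobs, p)

# Arithmetic check against Table 4: F(2,163) implies nobs - 2 - 1 = 163,
# so nobs = 166 is an inference, not a figure quoted by the authors.
f_table4 = f_from_lm(6.524354, nobs=166, p=2)
```

With these inputs `f_table4` agrees with the F-statistic of 3.334270 reported in Table 4 to within rounding, which suggests the two reported statistics come from the same auxiliary regression.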
Assessing the Goodness of Fit of Model
The current study uses the Akaike Information Criterion (AIC) to assess the
goodness of fit of a model. It is defined as AIC = 2k - 2 ln(L), where L is the
maximized value of the likelihood function for the estimated model and k is the
number of free and independent parameters in the model.
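The AIC values reported in Section 3 are negative and small in magnitude, which suggests they are on a per-observation scale, that is, the definition above divided by the number of observations. This is an inference from the reported numbers, not something the text states. The sketch below checks it against the log likelihood in Table 5; the parameter count k = 7 (two autoregressive terms plus five variance-equation terms) and the function names are assumptions.

```python
import math

def aic(log_likelihood, k):
    """AIC = 2k - 2 ln(L), as defined in the text."""
    return 2 * k - 2 * log_likelihood

def aic_per_observation(log_likelihood, k, n_obs):
    """AIC divided by the number of observations."""
    return aic(log_likelihood, k) / n_obs

def sic_per_observation(log_likelihood, k, n_obs):
    """Schwarz criterion on the same per-observation scale."""
    return (k * math.log(n_obs) - 2 * log_likelihood) / n_obs

# Log likelihood from Table 5 and sample size from Figure 6;
# k = 7 is an assumption (see the paragraph above).
aic_hybrid = aic_per_observation(196.8859, k=7, n_obs=166)
sic_hybrid = sic_per_observation(196.8859, k=7, n_obs=166)
```

Under these assumptions both values agree with the Akaike and Schwarz criteria in Table 5 to about six decimal places. Because the scaling is the same for every model fitted to the same sample, it does not change which model has the lowest AIC.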
Evaluating Forecast Error
The current study uses the mean absolute percentage error (MAPE) to measure
the accuracy of forecasts in terms of percentage. It is defined as
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\%$$
where $y_t$ is the actual value; $\hat{y}_t$ is the forecast value; n is the number of periods.
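The definition translates directly into code. The function name `mape` and the example prices below are hypothetical and serve only to illustrate the formula.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical prices and forecasts: each forecast is off by 5%.
example_mape = mape([2000.0, 2500.0], [2100.0, 2375.0])
```

Because each error is divided by the actual value, the measure is undefined when an actual value is zero, which is not a concern for a strictly positive price series.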
3 Data Analysis and Results
The data used in the study are monthly prices of Malaysian CPO recorded
from January 1999 until May 2014, as plotted in Figure 1. Observations from
January 1999 until December 2012, which account for about 90% of the data, were
used for modelling. Out-of-sample forecasts were produced for the
period from January 2013 until May 2014.
[Plot of the series PRICE against observation number]
Figure 1: Monthly Malaysian CPO Prices from Jan 1999 to May 2014
There was an upward trend in the CPO price data, indicating the necessity for
differencing. Before taking the first difference, the data were transformed using the
logarithm. Figure 2 plots the first difference of the transformed CPO price.
[Plot of the series DLN against observation number]
Figure 2: Plot of the First Difference of Transformed CPO Price
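The transformation behind Figure 2 can be sketched as follows. The use of the natural logarithm is an assumption based on the series names DLN and LN_P, and the function name and example prices are hypothetical.

```python
import numpy as np

def log_difference(prices):
    """First difference of the natural log of a price series."""
    prices = np.asarray(prices, dtype=float)
    if np.any(prices <= 0):
        raise ValueError("prices must be strictly positive")
    return np.diff(np.log(prices))

# Hypothetical prices growing by 10% per period, for illustration only.
dln = log_difference([1000.0, 1100.0, 1210.0])
```

The result has one observation fewer than the input, and each value approximates the proportional price change over the period.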
The series plotted in Figure 2 appears to be stationary, with most observations lying around
the mean of zero. The ADF unit-root test was performed, with the results tabulated
in Table 1. From the table, the test statistic of -10.26 is below the 1% critical value of
-3.47, so the null hypothesis of non-stationarity in the data is
rejected.
Table 1: Augmented Dickey Fuller Unit Root Test on Transformed CPO Price
Null Hypothesis: DLN has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)
                                          t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic    -10.26078     0.0000
Test critical values:    1% level         -3.466176
                         5% level         -2.877186
                         10% level        -2.575189
*MacKinnon (1996) one-sided p-values.
To develop an ARIMA model, the parameters were estimated using the ordinary
least squares method [8]. For the CPO price, the most appropriate model is
ARIMA(2, 1, 0) with an AIC value of -2.179461. The residuals were tested for
serial correlation. Table 2 presents the results of the Ljung-Box test (Q statistics).
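Fitting the autoregressive part by ordinary least squares amounts to regressing the differenced series on its own lags. The sketch below uses a synthetic AR(2) series with known coefficients in place of the actual log-differenced prices, which are not reproduced here; the function name `fit_ar_ols` and the simulated coefficients are assumptions.

```python
import numpy as np

def fit_ar_ols(x, p, intercept=True):
    """Estimate AR(p) coefficients by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    target = x[p:]
    # Column i holds the series lagged by i periods.
    lags = [x[p - i:len(x) - i] for i in range(1, p + 1)]
    cols = ([np.ones(len(target))] if intercept else []) + lags
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return coef, resid

# Synthetic AR(2) series standing in for the log-differenced prices.
rng = np.random.default_rng(42)
true_phi = (0.5, -0.3)
series = np.zeros(5000)
for t in range(2, len(series)):
    series[t] = true_phi[0] * series[t - 1] + true_phi[1] * series[t - 2] + rng.normal()
coef, resid = fit_ar_ols(series, p=2)   # coef = [constant, phi_1, phi_2]
```

The returned residuals are the series that would then be passed to the serial correlation, normality and ARCH-LM checks described in Section 2.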
Table 2: Correlogram of Residuals
lags AC PAC Q-Stat Prob lags AC PAC Q-Stat Prob
1 0.017 0.017 0.0462 19 0.056 0.018 22.528 0.165
2 0.001 0.001 0.0464 20 -0.076 -0.018 23.625 0.168
3 -0.022 -0.022 0.1304 0.718 21 -0.000 -0.004 23.625 0.211
4 0.161 0.162 4.5959 0.100 22 -0.083 -0.123 24.968 0.203
5 -0.014 -0.020 4.6292 0.201 23 0.085 0.104 26.369 0.193
6 -0.173 -0.178 9.8253 0.043 24 -0.148 -0.189 30.699 0.102
7 0.024 0.041 9.9227 0.077 25 -0.162 -0.143 35.872 0.043
8 -0.056 -0.085 10.483 0.106 26 -0.129 -0.132 39.177 0.026
9 -0.098 -0.107 12.191 0.094 27 0.072 0.041 40.213 0.028
10 0.027 0.101 12.321 0.137 28 0.022 0.075 40.308 0.036
11 0.067 0.051 13.127 0.157 29 -0.063 -0.023 41.117 0.040
12 0.139 0.129 16.642 0.083 30 -0.096 -0.112 43.015 0.035
13 -0.134 -0.102 19.907 0.047 31 0.020 -0.068 43.099 0.045
14 0.034 -0.011 20.121 0.065 32 0.086 0.061 44.645 0.042
15 0.056 0.022 20.703 0.079 33 0.054 0.030 45.264 0.047
16 -0.025 -0.071 20.820 0.106 34 -0.050 -0.036 45.787 0.054
17 -0.002 0.053 20.820 0.143 35 -0.003 -0.039 45.789 0.069
18 -0.077 -0.043 21.939 0.145 36 -0.048 0.011 46.283 0.078
The results in Table 2 indicate that at 5% significance level, there was no
serial correlation at most lags in the model. Figure 3 lists the descriptive statistics
of the residuals. The residuals have a mean which is very close to zero.
Series: Residuals
Sample 4 169
Observations 166
Mean 0.001599
Median 0.002450
Maximum 0.217637
Minimum -0.297021
Std. Dev. 0.080629
Skewness -0.482708
Kurtosis 4.186592
Jarque-Bera 16.18521
Probability 0.000306
Figure 3: Descriptive Statistics of Residuals
From the Jarque-Bera statistic, at 5% significance level, the null hypothesis
of residuals following the normal distribution is rejected. The residuals are plotted
in Figure 4 and are further examined.
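The Jarque-Bera statistic can be recomputed from the skewness and kurtosis listed in Figure 3 using the standard formula JB = n/6 * (S^2 + (K - 3)^2 / 4). The sketch below does this and derives the p-value from the chi-square distribution with two degrees of freedom; the function name is hypothetical.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-square distribution with 2 degrees of freedom,
    # the upper-tail probability is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Inputs taken from the descriptive statistics in Figure 3.
jb_arima, p_arima = jarque_bera(166, -0.482708, 4.186592)
```

Both values agree with the Jarque-Bera statistic and probability in Figure 3 to within rounding. The excess kurtosis (4.19 against 3 for a normal distribution) contributes more to the statistic than the skewness does.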
[Plot of the D(LN_P) residuals against observation number]
Figure 4: Volatility Clustering in Residuals
There is volatility clustering, a condition where large changes tend to be
followed by large changes, of either sign, while small changes tend to be followed
by small changes. Figure 5 plots the squared residuals.
[Plot of the series RESIDSQ against observation number]
Figure 5: Residuals Squared
The plot does not appear to follow a random process. The variance is not
homoskedastic: the variance at a given time depends on the variances in
preceding periods. Table 3 presents the correlogram of the squared residuals, which
indicates serial correlation.
The residuals are also tested for ARCH effects using the ARCH-LM test. The
results are presented in Table 4.
The null hypothesis of no ARCH effects in the model is rejected at lag 2 and above. Thus, to handle
heteroskedasticity, a GARCH model is considered [9]. After several analyses, it is
concluded that the best hybrid model is ARIMA(2, 1, 0)-GARCH(3, 1). Table 5
presents the estimation results for the hybrid model as applied to the Malaysian
CPO price.
Table 3: Correlogram of Residuals Squared
lags AC PAC Q-Stat Prob lags AC PAC Q-Stat Prob
1 0.052 0.052 0.4517 17 -0.078 -0.068 31.606 0.007
2 0.349 0.347 21.185 18 -0.051 -0.092 32.087 0.010
3 0.094 0.073 22.699 0.000 19 -0.097 -0.095 33.860 0.009
4 0.015 -0.126 22.740 0.000 20 -0.115 -0.087 36.392 0.006
5 0.030 -0.030 22.900 0.000 21 -0.083 -0.016 37.710 0.006
6 0.006 0.043 22.906 0.000 22 -0.049 0.043 38.174 0.008
7 0.014 0.026 22.938 0.000 23 0.007 0.081 38.184 0.012
8 0.095 0.093 24.519 0.000 24 -0.015 0.004 38.226 0.017
9 0.079 0.074 25.615 0.001 25 0.051 0.038 38.740 0.021
10 0.077 0.005 26.666 0.001 26 0.054 0.077 39.330 0.025
11 0.124 0.061 29.413 0.001 27 0.001 -0.018 39.331 0.034
12 0.029 -0.007 29.567 0.001 28 -0.060 -0.098 40.061 0.038
13 -0.020 -0.102 29.637 0.002 29 -0.026 0.021 40.199 0.049
14 -0.019 -0.040 29.702 0.003 30 -0.105 -0.011 42.474 0.039
Table 4: Heteroskedasticity Test
F-statistic      3.334270    Prob. F(2,163)         0.0381
Obs*R-squared    6.524354    Prob. Chi-Square(2)    0.0383
Table 5: Estimation Result for Variance Equation
Variance Equation
Variable       Coefficient   Std. Error   z-Statistic   Prob.
C              0.000496      0.000289     1.716242      0.0861
RESID(-1)^2    0.078780      0.016603     4.744893      0.0000
GARCH(-1)      2.000702      0.067788     29.51389      0.0000
GARCH(-2)      -1.951634     0.086854     -22.47029     0.0000
GARCH(-3)      0.791973      0.067006     11.81937      0.0000
R-squared             0.090899    Mean dependent var       0.001714
Adjusted R-squared    0.085356    S.D. dependent var       0.084816
S.E. of regression    0.081116    Akaike info criterion    -2.287781
Sum squared resid     1.079086    Schwarz criterion        -2.156553
Log likelihood        196.8859    Hannan-Quinn criter.     -2.234515
Durbin-Watson stat    1.964733
ARCH and GARCH effects are the internal causes of volatility. At the 5%
significance level, both the ARCH and GARCH effects are significant. The AIC
value of the hybrid model is -2.287781, which is lower than the value of -2.179461
obtained for the ARIMA(2, 1, 0) model. The residuals of the ARIMA-GARCH model are
tested for ARCH effects using the ARCH-LM test. The results are presented in
Table 6.
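As a rough consistency check on the variance equation estimates in Table 5, the ARCH and GARCH coefficients can be summed and the implied long-run variance computed. This is a sketch under stated assumptions: the formula for the long-run variance presumes a covariance-stationary process, and because the coefficient on GARCH(-2) is negative, the term-by-term positivity condition given in Section 2 does not hold here, so the result should be read as indicative only.

```python
# Coefficients of the variance equation as reported in Table 5.
omega = 0.000496
arch_coefs = [0.078780]
garch_coefs = [2.000702, -1.951634, 0.791973]

# Sum of ARCH and GARCH coefficients (often called persistence).
persistence = sum(arch_coefs) + sum(garch_coefs)

# Implied long-run variance omega / (1 - persistence); this presumes a
# covariance-stationary process and is an assumption made for this check.
long_run_variance = omega / (1.0 - persistence)
long_run_std = long_run_variance ** 0.5
```

The coefficients sum to about 0.92, below 1, and the implied long-run standard deviation of roughly 0.079 is of the same order as the residual standard deviation of 0.080629 reported in Figure 3.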
Table 6: Heteroskedasticity Test for Hybrid Model
Heteroskedasticity Test: ARCH
F-statistic      0.364674    Prob. F(1,164)         0.5468
Obs*R-squared    0.368303    Prob. Chi-Square(1)    0.5439
The results in Table 6 indicate that at significance level of 5%, the null
hypothesis of no ARCH effects cannot be rejected. The hybrid model is then
tested for serial correlation as presented in Table 7.
Table 7: Ljung-Box Q-statistics on residuals squared for Hybrid Model
lags AC PAC Q-Stat Prob lags AC PAC Q-Stat Prob
1 -0.050 -0.050 0.4155 19 -0.049 -0.041 10.122 0.898
2 0.097 0.094 2.0021 20 -0.006 -0.017 10.128 0.928
3 -0.035 -0.026 2.2112 0.137 21 -0.054 -0.050 10.682 0.934
4 0.003 -0.009 2.2129 0.331 22 -0.094 -0.085 12.409 0.901
5 0.049 0.055 2.6238 0.453 23 0.079 0.079 13.629 0.885
6 0.002 0.006 2.6242 0.623 24 0.025 0.039 13.756 0.910
7 0.033 0.024 2.8150 0.728 25 0.084 0.101 15.163 0.889
8 0.110 0.117 4.9493 0.550 26 0.023 0.068 15.272 0.913
9 0.004 0.010 4.9527 0.666 27 0.073 0.076 16.329 0.905
10 -0.020 -0.043 5.0256 0.755 28 -0.056 -0.056 16.962 0.910
11 0.058 0.064 5.6352 0.776 29 -0.029 -0.014 17.128 0.928
12 0.046 0.057 6.0251 0.813 30 -0.101 -0.071 19.220 0.891
13 -0.021 -0.044 6.1038 0.866 31 -0.015 -0.064 19.269 0.914
14 -0.050 -0.062 6.5671 0.885 32 -0.102 -0.149 21.443 0.874
15 -0.011 -0.008 6.5902 0.922 33 -0.036 -0.053 21.717 0.892
16 0.087 0.078 8.0125 0.889 34 0.024 0.021 21.838 0.912
17 -0.087 -0.091 9.4157 0.855 35 -0.016 -0.046 21.891 0.930
18 -0.036 -0.059 9.6606 0.884 36 -0.023 -0.054 22.005 0.944
Based on the results in Table 7, the null hypothesis of no serial correlation
cannot be rejected. The descriptive statistics of the residuals from the hybrid
model are presented in Figure 6.
Series: Standardized Residuals
Sample 4 169
Observations 166
Mean 0.039177
Median 0.048757
Maximum 2.356547
Minimum -3.039927
Std. Dev. 1.003361
Skewness -0.277227
Kurtosis 3.217420
Jarque-Bera 2.453273
Probability 0.293277
Figure 6: Descriptive Statistics of the Residuals for Hybrid Model
From the Jarque-Bera statistic in Figure 6, the null hypothesis which states
that the residuals follow the normal distribution is not rejected. At this point, the
hybrid model can be used for forecasting. The MAPE values for in-sample and
out-of-sample forecasting are 0.819246 and 0.428993 respectively.
4 Conclusion
The Box-Jenkins method applies autoregressive moving average (ARMA) or
ARIMA models to find the best fit of a time series to its past values. When the method
was used to model the Malaysian crude palm oil price, the residuals were orthogonal
but not normal. Upon inspection, it was discovered that serial
correlation remained in the squared residuals, so an ARCH effect was present. The time series plot of the
residuals also showed some clusters of volatility. To model volatility, the GARCH
method was used to reflect more recent changes and fluctuations in the series. The
hybrid ARIMA (2, 1, 0)-GARCH (3, 1) model was the most appropriate model for the
Malaysian CPO price. The residuals were independent with zero mean and normally
distributed, while the ACF and PACF of the squared residuals displayed no significant
lags.
Acknowledgement
This work was supported by RUG Vot No: Q.J130000.2526.08H46. The authors would like to thank Universiti Teknologi Malaysia (UTM) for providing the funds and facilities.
References
[1] Fatimah Mohd Arshad and Roslan A. Ghaffar, Crude Palm Oil Price
Forecasting: Box-Jenkins Approach, Pertanika, 9 (3), (1986), 359 – 367.
[2] Khin Aye Aye, Mohamed Zainalabidin, Nambhi Malarvizhi, Chinnasamy
Agamudai and Thambiah Seethaletchumy, Price Forecasting Methodology of
the Malaysian Palm Oil Market, International Journal of Applied Economics
& Finance, 7 (1), (2013), 23.
[3] S.R. Yaziz, N.A. Azizan, R. Zakaria and M.H. Ahmad, The Performance of
Hybrid ARIMA GARCH Modeling, In: 20th International Congress on
Modelling & Simulation 2013 (MODSIM2013), 1-6 December 2013,
Adelaide, Australia.
[4] R. F. Engle, An Introduction to the Use of ARCH/GARCH Models in
Applied Econometrics, Journal of Business, New York (1982).
[5] Pung Yean Ping, Nor Hamizah Miswan and Maizah Hura Ahmad,
Forecasting Malaysian Gold using GARCH Model, Applied Mathematical
Sciences, 7 (58), 2013, 2879-2884.
[6] Maizah Hura Ahmad and Pung Yean Ping, Modelling Malaysian Gold Using
Symmetric and Asymmetric GARCH Models, Applied Mathematical
Sciences, 8 (17), 2014, 817-822.
[7] T. Bollerslev, Generalized Autoregressive Conditional Heteroskedasticity,
Journal of Econometrics, 31 (1986), 307-327.
[8] Nor Hamizah Miswan, Pung Yean Ping and Maizah Hura Ahmad, On
Parameter Estimation for Malaysian Gold Prices Modelling and Forecasting,
International Journal of Mathematical Analysis, 7 (22), 2013, 1059-1068.
[9] Maizah Hura Ahmad, Pung Yean Ping, Siti Roslindar Yazir and Nor
Hamizah Miswan, A Hybrid Model for Improving Malaysian Gold Forecast
Accuracy, International Journal of Mathematical Analysis, 8 (28), 2014,
1377-1387.
Received: August 7, 2014