Applied Mathematical Sciences, Vol. 8, 2014, no. 124, 6159 - 6169

HIKARI Ltd, www.m-hikari.com

http://dx.doi.org/10.12988/ams.2014.48650

Volatility Modelling and Forecasting

of Malaysian Crude Palm Oil Prices

Maizah Hura Ahmad1, Pung Yean Ping2 and Norizan Mahamed3

1,2Department of Mathematical Sciences, Faculty of Science

Universiti Teknologi Malaysia,

81310 UTM Skudai, Johor, Malaysia

3Jabatan Matematik, Fakulti Sains & Teknologi, Universiti Malaysia Terengganu,

21030 Kuala Terengganu, Terengganu, Malaysia.

Copyright © 2014 Maizah Hura Ahmad et al. This is an open access article distributed under

the Creative Commons Attribution License, which permits unrestricted use, distribution, and

reproduction in any medium, provided the original work is properly cited.

Abstract

The purpose of the current study is to model and forecast the prices of Malaysian crude palm oil. The oil palm industry is a contributor to Malaysia’s export revenue. An Autoregressive Integrated Moving Average (ARIMA) model is first fitted to the series. To model the noise term of the ARIMA model, a Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model is used. The model is assessed using the Akaike Information Criterion (AIC) and the mean absolute percentage error (MAPE).

Keywords: ARIMA, GARCH, volatility, heteroskedasticity, hybrid model

1 Introduction

Malaysia is one of the biggest producers and exporters of palm oil and palm

oil products. Currently, the industry is flourishing while providing employment to

its people. As a result of continuous R & D efforts, a wide variety of by-products

are produced.

One type of oil produced by oil palm is crude palm oil (CPO) which can be

further refined and fractionated to get a wide range of food and non-food palm

products. The oil palm industry is a contributor to Malaysia’s export revenue.

Thus, modelling and forecasting of CPO prices are important so as to obtain

valuable information pertaining to the future of CPO prices.

The Box-Jenkins approach has been used to forecast the monthly crude palm oil price [1]. In a study to forecast the spot palm oil price, the performances of the Vector Error Correction Method (VECM), the Multivariate Autoregressive Moving Average (MARMA) model and the univariate Autoregressive Integrated Moving Average (ARIMA) model were examined [2].

CPO prices are volatile: the conditional variance of the price series changes between high and low values. The current study attempts to model CPO prices using the popular univariate Autoregressive Integrated Moving Average (ARIMA) model and to improve the forecasts by hybridizing the ARIMA model with a Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model. The Akaike information criterion (AIC) is used to assess the goodness of fit and the mean absolute percentage error (MAPE) is used to evaluate the forecasting performance. All analyses are carried out using the EViews software.

The rest of the paper is organized into 3 sections as follows: Section 2

presents the methodology used in capturing both the mean and the variance

behavior of the monthly CPO price. Section 3 presents the data analysis and the

study is concluded in Section 4.

2 Methodology

ARIMA and GARCH Models

ARIMA is a popular time series modelling approach developed by Box and Jenkins. The model is applied in cases where the data show evidence of non-stationarity [3]. Transformations such as differencing are used to remove non-stationarity in the mean of the series, while a suitable variance-stabilizing transformation introduced by Box and Cox can be used to remove non-stationarity in the variance of the series [3]. The model is defined as ARIMA (p, d, q) with the following equation:

$$\phi_p(B)(1 - B)^d y_t = \theta_q(B)\varepsilon_t \qquad (1)$$

where $y_t$ is the monthly CPO price, $\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ is the autoregressive operator of order p; $\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ is the moving average operator of order q. The orders of p and q in the ARIMA model are identified through the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the sample data. In eq (1), $(1 - B)^d$ is the dth difference; $B$ is the backward shift operator; and $\varepsilon_t$ is the error term at time t. The error terms are generally assumed to be independent and identically distributed (i.i.d.) random variables sampled from a normal distribution with zero mean, that is $\varepsilon_t \sim N(0, \sigma^2)$ where $\sigma^2$ is the variance.
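As an illustration of how equation (1) operates for an ARIMA(2, 1, 0) specification, the following minimal sketch differences a log-price series once and produces a one-step-ahead forecast. The function names `difference` and `forecast_next_log_price`, the example prices and the coefficients `phi1` and `phi2` are hypothetical; they are not the estimates obtained in this study, and no constant term is assumed.

```python
import math

def difference(series, d=1):
    """Apply the operator (1 - B)^d by differencing the series d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

def forecast_next_log_price(log_prices, phi1, phi2):
    """One-step forecast for ARIMA(2, 1, 0) without a constant:
    w_t = phi1 * w_{t-1} + phi2 * w_{t-2} + e_t, with w_t = (1 - B) ln y_t."""
    w = difference(log_prices, d=1)
    w_next = phi1 * w[-1] + phi2 * w[-2]  # the expected value of e_{T+1} is zero
    return log_prices[-1] + w_next        # undo the first difference

# Illustrative prices and hypothetical coefficients (not the paper's estimates).
prices = [2400.0, 2450.0, 2380.0, 2500.0]
log_prices = [math.log(p) for p in prices]
next_price = math.exp(forecast_next_log_price(log_prices, phi1=0.3, phi2=-0.1))
print(round(next_price, 2))
```

The sign convention follows the autoregressive operator in eq (1), so a positive `phi1` carries part of the most recent change forward into the forecast.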

The variances of the errors of some time series are often time-varying and conditional. To handle such variances, Engle introduced the autoregressive conditional heteroskedasticity (ARCH) class of models [4] [5]. These models were generalized by Bollerslev into the generalized autoregressive conditional heteroskedasticity (GARCH) class [6]. GARCH models are able to capture volatility clustering, that is, the periods of fluctuations in a time series [7]. Past variances and past variance forecasts are used to forecast future variances. The model is defined as

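GARCH (p, q) with the following equations:

$$y_t = \mu + \varepsilon_t, \qquad \varepsilon_t = u_t\sqrt{\sigma_t^2}, \qquad u_t \sim N(0, 1)$$

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2$$

where $y_t$ is the monthly CPO price; [...], $\alpha_i, \beta_i > 0$, and $\alpha_1 + \beta_1 < 1$ for stationarity; p is the order of the GARCH terms $\sigma^2$; q is the order of the ARCH terms $\varepsilon^2$, which is the information about volatility from the previous period measured as the lag of the squared residual from the mean equation.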
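The variance recursion can be made concrete with a minimal sketch for the simplest case, GARCH(1, 1). The function name `garch11_variance`, the symbol `omega` for the constant, the starting variance and all numerical values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def garch11_variance(residuals, omega, alpha, beta, initial_variance):
    """Conditional variances from
    sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1}.
    Returns one variance per residual; the first one is the supplied start value."""
    sigma2 = [initial_variance]
    for e_prev in residuals[:-1]:
        sigma2.append(omega + alpha * e_prev ** 2 + beta * sigma2[-1])
    return sigma2

# Hypothetical residuals and parameters (alpha + beta = 0.9 < 1).
residuals = [0.0, 0.2, 0.0]
variances = garch11_variance(residuals, omega=0.01, alpha=0.1, beta=0.8,
                             initial_variance=0.05)
print(variances)
```

In this sketch the shock of 0.2 in the second period raises the conditional variance of the third period, which is the mechanism behind volatility clustering.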

Testing for Stationarity

A unit root test can be used to determine stationarity. One of the most widely used unit-root tests is the Augmented Dickey-Fuller (ADF) test. The testing procedure is applied to the model

$$\Delta y_t = c_0 + c_1 t + \gamma y_{t-1} + \sum_{i=1}^{k} \delta_i \Delta y_{t-i} + \varepsilon_t$$

where $y_t$ is the CPO price, $\Delta$ indicates the first difference, and k is the lag order of the autoregressive process. The null hypothesis states that the series tested is non-stationary, that is, a unit root is present.
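A minimal sketch of the simplest version of this test is given below: a Dickey-Fuller regression with a constant, no trend and no lagged differences (k = 0), which matches the "Exogenous: Constant" and "Lag Length: 0" settings shown in Table 1. The function name `dickey_fuller_tstat` and the simulated series are hypothetical. The 5% critical value is the one reported in Table 1 and is used purely for illustration, since critical values depend on the sample size and on the deterministic terms included.

```python
import math
import random

def dickey_fuller_tstat(y):
    """t-statistic of gamma in  dy_t = c + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                        # lagged level y_{t-1}
    dy = [y[i] - y[i - 1] for i in range(1, len(y))]  # first difference
    n = len(dy)
    x_mean = sum(x) / n
    dy_mean = sum(dy) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (di - dy_mean) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    c = dy_mean - gamma * x_mean
    sse = sum((di - c - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(sse / (n - 2) / sxx)         # OLS standard error
    return gamma / se_gamma

# Simulated stationary series (white noise), for illustration only.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
t_stat = dickey_fuller_tstat(noise)
critical_5pct = -2.877186                             # from Table 1
print("reject unit root:", t_stat < critical_5pct)
```

The statistic is compared with Dickey-Fuller critical values rather than with the usual Student t table, which is why the critical value is taken from Table 1.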

Testing for serial correlation

To test for serial correlation, the Ljung-Box test (Q statistic) can be used. The null hypothesis states that there is no serial correlation up to a given lag. Thus, rejection of the null hypothesis implies that serial correlation is present at some order up to that lag.
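The Q statistic at lag m is computed from the sample autocorrelations $r_k$ as $Q(m) = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$. The sketch below applies this formula to the first three autocorrelations of the squared residuals reported in Table 3. The function name `ljung_box_q` is hypothetical, and the use of n = 166 (the number of observations shown in Figure 3) is an assumption.

```python
def ljung_box_q(autocorrelations, n_obs):
    """Cumulative Ljung-Box statistics Q(1), Q(2), ... from r_1, r_2, ..."""
    q_values = []
    total = 0.0
    for k, r_k in enumerate(autocorrelations, start=1):
        total += r_k ** 2 / (n_obs - k)
        q_values.append(n_obs * (n_obs + 2) * total)
    return q_values

# AC values for lags 1-3 as printed in Table 3 (rounded to three decimals).
q = ljung_box_q([0.052, 0.349, 0.094], n_obs=166)
print([round(value, 2) for value in q])
```

The results are close to the Q-Stat values printed in Table 3 (0.4517, 21.185 and 22.699); the small differences are consistent with the autocorrelations being rounded to three decimals in the table.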

Testing for heteroskedasticity

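The ARCH Lagrange Multiplier (ARCH-LM) test for heteroskedasticity uses the F statistic from the regression of the squared residuals on their own lags,

$$\varepsilon_t^2 = b_0 + b_1 \varepsilon_{t-1}^2 + b_2 \varepsilon_{t-2}^2 + \dots + b_p \varepsilon_{t-p}^2 + v_t,$$

where p is the number of ARCH lags and $\varepsilon_t$ is the residual of the series. The null hypothesis is that no ARCH effect exists; rejection of the null hypothesis implies that an ARCH effect exists.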
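For a single lag (p = 1) the test reduces to a simple regression, which the following minimal sketch carries out by hand. The function name `arch_lm_one_lag` and the artificial residual series are hypothetical. The sketch returns both the Obs*R-squared form of the statistic and the F statistic.

```python
def arch_lm_one_lag(residuals):
    """ARCH-LM test with one lag: regress e_t^2 on a constant and e_{t-1}^2.
    Returns (Obs*R-squared, F-statistic)."""
    sq = [e ** 2 for e in residuals]
    x, y = sq[:-1], sq[1:]
    n = len(y)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)            # R-squared of the simple regression
    lm = n * r2                            # Obs*R-squared, chi-square(1) under H0
    f_stat = r2 / ((1.0 - r2) / (n - 2))   # F(1, n - 2) under H0
    return lm, f_stat

# Artificial residuals with calm and turbulent blocks (volatility clustering).
clustered = [0.1] * 20 + [1.0] * 20 + [0.1] * 20 + [1.0] * 20
lm, f_stat = arch_lm_one_lag(clustered)
print("ARCH effect at 5%:", lm > 3.841)    # 3.841 is the chi-square(1) 5% point
```

Because the squared residuals in this artificial series stay high or low for long stretches, the lagged squared residual explains most of the variation and the null hypothesis of no ARCH effect is rejected.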

Assessing the Goodness of Fit of Model

The current study uses the Akaike Information Criterion (AIC) to assess the goodness of fit of a model. It is defined as $\mathrm{AIC} = 2k - 2\ln(L)$, where $L$ is the maximized value of the likelihood function for the estimated model and k is the number of free and independent parameters in the model.
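The AIC values quoted in this paper (for example -2.287781 in Table 5) are on a per-observation scale, because EViews reports information criteria divided by the number of observations. The minimal sketch below reproduces the Table 5 figures from the reported log likelihood. The function names are hypothetical, and the parameter count k = 7 is an assumption inferred because it reproduces the reported values; it is not stated in the paper.

```python
import math

def aic_per_observation(log_likelihood, k, n_obs):
    """(2k - 2 ln L) / T, the per-observation form of the AIC."""
    return (2 * k - 2 * log_likelihood) / n_obs

def schwarz_per_observation(log_likelihood, k, n_obs):
    """(k ln T - 2 ln L) / T, the per-observation Schwarz criterion."""
    return (k * math.log(n_obs) - 2 * log_likelihood) / n_obs

# Log likelihood from Table 5 and 166 observations; k = 7 is an assumption.
aic = aic_per_observation(196.8859, k=7, n_obs=166)
sic = schwarz_per_observation(196.8859, k=7, n_obs=166)
print(round(aic, 5), round(sic, 5))
```

Dividing by the number of observations does not change the ranking of models fitted to the same sample, so the model with the smaller value is still preferred.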

Evaluating Forecast Error

The current study uses the mean absolute percentage error (MAPE) to measure the accuracy of forecasts in percentage terms. It is defined as

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%$$

where $y_t$ is the actual value; $\hat{y}_t$ is the forecast value; n is the number of periods.
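A minimal sketch of the MAPE calculation follows. The function name `mape` and the two-period example are hypothetical and serve only to show the arithmetic.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

# Each forecast misses its actual value by 10%, so the MAPE is about 10%.
error = mape([100.0, 200.0], [110.0, 180.0])
print(round(error, 6))
```

Because each error is divided by the actual value, the measure is undefined when an actual value is zero; this is not an issue for price series such as CPO.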

3 Data Analysis and Results

The data used in the study are monthly prices of Malaysian CPO recorded from January 1999 until May 2014, as plotted in Figure 1. Observations from January 1999 until December 2012, which account for about 90% of the data, were used for modelling. Out-of-sample forecasts were produced for the period from January 2013 until May 2014.

[Time series plot of PRICE against observation number]

Figure 1: Monthly Malaysian CPO Prices from Jan 1999 to May 2014

There was an upward trend in the CPO price data, indicating the necessity for differencing. Before taking the first difference, the data were transformed using the logarithm. Figure 2 plots the first difference of the transformed CPO price.


[Time series plot of DLN against observation number]

Figure 2: Plot of the First Difference of Transformed CPO Price

The series in Figure 2 appears to be stationary, with most observations lying around a mean of zero. The ADF unit-root test was performed, with the results tabulated in Table 1. From the table, the null hypothesis of non-stationarity in the data is rejected.

Table 1: Augmented Dickey Fuller Unit Root Test on Transformed CPO Price

Null Hypothesis: DLN has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=13)

                                           t-Statistic    Prob.*
Augmented Dickey-Fuller test statistic     -10.26078      0.0000
Test critical values:    1% level          -3.466176
                         5% level          -2.877186
                         10% level         -2.575189

*MacKinnon (1996) one-sided p-values.

To develop an ARIMA model, the parameters were estimated using the ordinary least squares method [8]. For the CPO price, the most appropriate model is ARIMA(2, 1, 0) with an AIC value of -2.179461. The residuals were tested for serial correlation. Table 2 presents the results of the Ljung-Box test (Q statistics).


Table 2: Correlogram of Residuals

Lag      AC     PAC   Q-Stat    Prob     Lag      AC     PAC   Q-Stat    Prob
  1   0.017   0.017   0.0462              19   0.056   0.018   22.528   0.165
  2   0.001   0.001   0.0464              20  -0.076  -0.018   23.625   0.168
  3  -0.022  -0.022   0.1304   0.718      21  -0.000  -0.004   23.625   0.211
  4   0.161   0.162   4.5959   0.100      22  -0.083  -0.123   24.968   0.203
  5  -0.014  -0.020   4.6292   0.201      23   0.085   0.104   26.369   0.193
  6  -0.173  -0.178   9.8253   0.043      24  -0.148  -0.189   30.699   0.102
  7   0.024   0.041   9.9227   0.077      25  -0.162  -0.143   35.872   0.043
  8  -0.056  -0.085   10.483   0.106      26  -0.129  -0.132   39.177   0.026
  9  -0.098  -0.107   12.191   0.094      27   0.072   0.041   40.213   0.028
 10   0.027   0.101   12.321   0.137      28   0.022   0.075   40.308   0.036
 11   0.067   0.051   13.127   0.157      29  -0.063  -0.023   41.117   0.040
 12   0.139   0.129   16.642   0.083      30  -0.096  -0.112   43.015   0.035
 13  -0.134  -0.102   19.907   0.047      31   0.020  -0.068   43.099   0.045
 14   0.034  -0.011   20.121   0.065      32   0.086   0.061   44.645   0.042
 15   0.056   0.022   20.703   0.079      33   0.054   0.030   45.264   0.047
 16  -0.025  -0.071   20.820   0.106      34  -0.050  -0.036   45.787   0.054
 17  -0.002   0.053   20.820   0.143      35  -0.003  -0.039   45.789   0.069
 18  -0.077  -0.043   21.939   0.145      36  -0.048   0.011   46.283   0.078

The results in Table 2 indicate that at 5% significance level, there was no

serial correlation at most lags in the model. Figure 3 lists the descriptive statistics

of the residuals. The residuals have a mean which is very close to zero.

[Histogram of the residuals]

Series: Residuals
Sample 4 169
Observations 166

Mean          0.001599
Median        0.002450
Maximum       0.217637
Minimum      -0.297021
Std. Dev.     0.080629
Skewness     -0.482708
Kurtosis      4.186592

Jarque-Bera   16.18521
Probability   0.000306

Figure 3: Descriptive Statistics of Residuals

From the Jarque-Bera statistic, at 5% significance level, the null hypothesis

of residuals following the normal distribution is rejected. The residuals are plotted

in Figure 4 and are further examined.
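The Jarque-Bera statistic and its probability in Figure 3 can be reproduced from the reported skewness S and kurtosis K using $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, which is compared with a chi-square distribution with two degrees of freedom. The following minimal sketch does this; the function name `jarque_bera` is hypothetical.

```python
import math

def jarque_bera(n_obs, skewness, kurtosis):
    """Jarque-Bera statistic and its chi-square(2) upper-tail probability."""
    jb = n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # exact survival function of chi-square(2)
    return jb, p_value

# Values for the ARIMA residuals as printed in Figure 3.
jb, p_value = jarque_bera(166, skewness=-0.482708, kurtosis=4.186592)
print(round(jb, 4), round(p_value, 6))
```

The negative skewness and the kurtosis above 3 both contribute to the statistic, which is why normality is rejected for the ARIMA residuals.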


[Time series plot of D(LN_P) Residuals against observation number]

Figure 4: Volatility Clustering in Residuals

There is volatility clustering, a condition where large changes tend to be followed by large changes, of either sign, while small changes tend to be followed by small changes. Figure 5 plots the squared residuals.

[Time series plot of RESIDSQ against observation number]

Figure 5: Residuals Squared

The plot does not appear to follow a random process. The variance is not constant (it is heteroskedastic): the variance at a given time depends on the variances in preceding periods. Table 3 presents the correlogram of the squared residuals, which indicates serial correlation.

The residuals are also tested for ARCH effects using the ARCH-LM test. The results are presented in Table 4.

At the 5% significance level, the null hypothesis of no ARCH effects in the model is rejected at lag 2 and above. Thus, to handle heteroskedasticity, a GARCH model is considered [9]. After several analyses, it is concluded that the best hybrid model is ARIMA(2, 1, 0)-GARCH(3, 1). Table 5 presents the estimation results for the hybrid model as applied to the Malaysian CPO price.


Table 3: Correlogram of Residuals Squared

Lag      AC     PAC   Q-Stat    Prob     Lag      AC     PAC   Q-Stat    Prob
  1   0.052   0.052   0.4517              17  -0.078  -0.068   31.606   0.007
  2   0.349   0.347   21.185              18  -0.051  -0.092   32.087   0.010
  3   0.094   0.073   22.699   0.000      19  -0.097  -0.095   33.860   0.009
  4   0.015  -0.126   22.740   0.000      20  -0.115  -0.087   36.392   0.006
  5   0.030  -0.030   22.900   0.000      21  -0.083  -0.016   37.710   0.006
  6   0.006   0.043   22.906   0.000      22  -0.049   0.043   38.174   0.008
  7   0.014   0.026   22.938   0.000      23   0.007   0.081   38.184   0.012
  8   0.095   0.093   24.519   0.000      24  -0.015   0.004   38.226   0.017
  9   0.079   0.074   25.615   0.001      25   0.051   0.038   38.740   0.021
 10   0.077   0.005   26.666   0.001      26   0.054   0.077   39.330   0.025
 11   0.124   0.061   29.413   0.001      27   0.001  -0.018   39.331   0.034
 12   0.029  -0.007   29.567   0.001      28  -0.060  -0.098   40.061   0.038
 13  -0.020  -0.102   29.637   0.002      29  -0.026   0.021   40.199   0.049
 14  -0.019  -0.040   29.702   0.003      30  -0.105  -0.011   42.474   0.039

Table 4: Heteroskedasticity Test

F-statistic       3.334270     Prob. F(2,163)          0.0381
Obs*R-squared     6.524354     Prob. Chi-Square(2)     0.0383

Table 5: Estimation Result for Variance Equation

Variance Equation
Variable        Coefficient    Std. Error    z-Statistic    Prob.
C                  0.000496      0.000289       1.716242    0.0861
RESID(-1)^2        0.078780      0.016603       4.744893    0.0000
GARCH(-1)          2.000702      0.067788       29.51389    0.0000
GARCH(-2)         -1.951634      0.086854      -22.47029    0.0000
GARCH(-3)          0.791973      0.067006       11.81937    0.0000

R-squared             0.090899     Mean dependent var       0.001714
Adjusted R-squared    0.085356     S.D. dependent var       0.084816
S.E. of regression    0.081116     Akaike info criterion   -2.287781
Sum squared resid     1.079086     Schwarz criterion       -2.156553
Log likelihood        196.8859     Hannan-Quinn criter.    -2.234515
Durbin-Watson stat    1.964733

ARCH and GARCH effects are the internal causes of volatility. At the 5% significance level, both the ARCH and GARCH effects are significant. The AIC value of the hybrid model is -2.287781, which is lower than the value of -2.179461 obtained for the ARIMA(2, 1, 0) model alone. The residuals of the ARIMA-GARCH model are tested for ARCH effects using the ARCH-LM test. The results are presented in Table 6.
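The variance-equation coefficients in Table 5 can be checked against the stationarity requirement that the ARCH and GARCH coefficients sum to less than one. The minimal sketch below computes this sum (often called the persistence) and the implied long-run variance, assuming the usual covariance-stationarity formula of the constant divided by one minus the persistence. The function names are hypothetical, and applying this formula to the reported estimates is an illustration rather than part of the original analysis.

```python
import math

def garch_persistence(arch_coeffs, garch_coeffs):
    """Sum of all ARCH and GARCH coefficients."""
    return sum(arch_coeffs) + sum(garch_coeffs)

def unconditional_variance(constant, arch_coeffs, garch_coeffs):
    """Long-run variance constant / (1 - persistence), if persistence < 1."""
    persistence = garch_persistence(arch_coeffs, garch_coeffs)
    if persistence >= 1.0:
        raise ValueError("persistence must be below 1 for a finite variance")
    return constant / (1.0 - persistence)

# Coefficients as printed in Table 5.
arch = [0.078780]
garch = [2.000702, -1.951634, 0.791973]
persistence = garch_persistence(arch, garch)
long_run_var = unconditional_variance(0.000496, arch, garch)
print(round(persistence, 6), round(math.sqrt(long_run_var), 4))
```

The persistence is about 0.92, below one, and the implied long-run standard deviation of roughly 0.079 is of the same order as the standard deviation of the ARIMA residuals reported in Figure 3 (0.080629).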


Table 6: Heteroskedasticity Test for Hybrid Model

Heteroskedasticity Test: ARCH
F-statistic       0.364674     Prob. F(1,164)          0.5468
Obs*R-squared     0.368303     Prob. Chi-Square(1)     0.5439

The results in Table 6 indicate that at significance level of 5%, the null

hypothesis of no ARCH effects cannot be rejected. The hybrid model is then

tested for serial correlation as presented in Table 7.

Table 7: Ljung-Box Q-statistics on residuals squared for Hybrid Model

Lag      AC     PAC   Q-Stat    Prob     Lag      AC     PAC   Q-Stat    Prob
  1  -0.050  -0.050   0.4155              19  -0.049  -0.041   10.122   0.898
  2   0.097   0.094   2.0021              20  -0.006  -0.017   10.128   0.928
  3  -0.035  -0.026   2.2112   0.137      21  -0.054  -0.050   10.682   0.934
  4   0.003  -0.009   2.2129   0.331      22  -0.094  -0.085   12.409   0.901
  5   0.049   0.055   2.6238   0.453      23   0.079   0.079   13.629   0.885
  6   0.002   0.006   2.6242   0.623      24   0.025   0.039   13.756   0.910
  7   0.033   0.024   2.8150   0.728      25   0.084   0.101   15.163   0.889
  8   0.110   0.117   4.9493   0.550      26   0.023   0.068   15.272   0.913
  9   0.004   0.010   4.9527   0.666      27   0.073   0.076   16.329   0.905
 10  -0.020  -0.043   5.0256   0.755      28  -0.056  -0.056   16.962   0.910
 11   0.058   0.064   5.6352   0.776      29  -0.029  -0.014   17.128   0.928
 12   0.046   0.057   6.0251   0.813      30  -0.101  -0.071   19.220   0.891
 13  -0.021  -0.044   6.1038   0.866      31  -0.015  -0.064   19.269   0.914
 14  -0.050  -0.062   6.5671   0.885      32  -0.102  -0.149   21.443   0.874
 15  -0.011  -0.008   6.5902   0.922      33  -0.036  -0.053   21.717   0.892
 16   0.087   0.078   8.0125   0.889      34   0.024   0.021   21.838   0.912
 17  -0.087  -0.091   9.4157   0.855      35  -0.016  -0.046   21.891   0.930
 18  -0.036  -0.059   9.6606   0.884      36  -0.023  -0.054   22.005   0.944

Based on the results in Table 7, the null hypothesis of no serial correlation

cannot be rejected. The descriptive statistics of the residuals from the hybrid

model are presented in Figure 6.

[Histogram of the standardized residuals]

Series: Standardized Residuals
Sample 4 169
Observations 166

Mean          0.039177
Median        0.048757
Maximum       2.356547
Minimum      -3.039927
Std. Dev.     1.003361
Skewness     -0.277227
Kurtosis      3.217420

Jarque-Bera   2.453273
Probability   0.293277

Figure 6: Descriptive Statistics of the Residuals for Hybrid Model


From the Jarque-Bera statistic in Figure 6, the null hypothesis which states that the residuals follow the normal distribution is not rejected. At this point, the hybrid model can be used for forecasting. The MAPE values for in-sample and out-of-sample forecasting are 0.819246 and 0.428993 respectively.

4 Conclusion

The Box-Jenkins method applies autoregressive moving average (ARMA) or ARIMA models to find the best fit of a time series to its past values. When the method was used to model the Malaysian crude palm oil price, the residuals were orthogonal but not normal. Upon inspection, it was discovered that serial correlation still remained in the squared residuals and that an ARCH effect was present. The time series plot of the residuals also showed volatility clustering. To model the volatility, a GARCH model was used to reflect more recent changes and fluctuations in the series. The hybrid ARIMA (2, 1, 0)-GARCH (3, 1) model was the most appropriate model for the Malaysian CPO price. Its residuals were independent with a mean close to zero and normally distributed, while the ACF and PACF of the squared residuals displayed no significant lags.

Acknowledgement

This work was supported by RUG Vot No: Q.J130000.2526.08H46. The authors would like to thank Universiti Teknologi Malaysia (UTM) for providing the funds and facilities.

References

[1] Fatimah Mohd Arshad and Roslan A. Ghaffar, Crude Palm Oil Price

Forecasting: Box-Jenkins Approach, Pertanika, 9 (3), (1986), 359 – 367.

[2] Khin Aye Aye, Mohamed Zainalabidin, Nambhi Malarvizhi, Chinnasamy

Agamudai and Thambiah Seethaletchumy, Price Forecasting Methodology of

the Malaysian Palm Oil Market, International Journal of Applied Economics

& Finance, 7 (1), (2013), 23.

[3] S.R. Yaziz, N.A. Azizan, R. Zakaria and M.H. Ahmad, The Performance of

Hybrid ARIMA GARCH Modeling, In: 20th International Congress on

Modelling & Simulation 2013 (MODSIM2013), 1-6 December 2013,

Adelaide, Australia.

[4] R. F. Engle, An Introduction to the Use of ARCH/GARCH Models in

Applied Econometrics, Journal of Business, New York (1982).


[5] Pung Yean Ping, Nor Hamizah Miswan and Maizah Hura Ahmad,

Forecasting Malaysian Gold using GARCH Model, Applied Mathematical

Sciences, 7 (58), 2013, 2879-2884.

[6] Maizah Hura Ahmad and Pung Yean Ping, Modelling Malaysian Gold Using

Symmetric and Asymmetric GARCH Models, Applied Mathematical

Sciences, 8 (17), 2014, 817-822.

[7] T. Bollerslev, Generalized Autoregressive Conditional Heteroskedasticity,

Journal of Econometrics, 31 (1986), 307-327.

[8] Nor Hamizah Miswan, Pung Yean Ping and Maizah Hura Ahmad, On

Parameter Estimation for Malaysian Gold Prices Modelling and Forecasting,

International Journal of Mathematical Analysis, 7 (22), 2013, 1059-1068.

[9] Maizah Hura Ahmad, Pung Yean Ping, Siti Roslindar Yazir and Nor

Hamizah Miswan, A Hybrid Model for Improving Malaysian Gold Forecast

Accuracy, International Journal of Mathematical Analysis, 8 (28), 2014,

1377-1387.

Received: August 7, 2014
