
ARIMA Models

Dan Saunders

I will discuss models with a dependent variable $y_t$, a potentially endogenous error term $\varepsilon_t$, and an exogenous error term $\eta_t$, each with a subscript $t$ denoting time. With just these three objects, we may consider a rich class of models called:

$$\underbrace{\text{Autoregressive}}_{AR} \quad \underbrace{\text{Integrated}}_{I} \quad \underbrace{\text{Moving Averages}}_{MA}$$

Autoregression

To start, consider an AR(1) model:

$y_t = \phi y_{t-1} + \varepsilon_t$

Right away you notice this is no different from any standard regression $y_i = \beta x_i + \varepsilon_i$. We have simply relabeled the coefficient, $\beta \to \phi$, and the right-hand-side variable is a lag of the dependent variable, $x \to y_{t-1}$. Since it is no different from any other regression, an exogenous error term is enough for OLS to be consistent:

If $E(\varepsilon_t) = 0$ and $E(\varepsilon_t y_{t-1}) = 0$ and $E(y_t^2) < \infty$, then

$$\hat{\phi}_{OLS} = \frac{\frac{1}{T}\sum_{t=1}^{T} y_t y_{t-1}}{\frac{1}{T}\sum_{t=1}^{T} y_{t-1}^2} = \frac{\frac{1}{T}\sum_{t=1}^{T} (\phi y_{t-1} + \varepsilon_t)\, y_{t-1}}{\frac{1}{T}\sum_{t=1}^{T} y_{t-1}^2} \;\xrightarrow{\ plim\ }\; \phi + \frac{E(\varepsilon_t y_{t-1})}{E(y_{t-1}^2)} = \phi$$

In words, as long as assumptions 1-3 hold, OLS is consistent. Originally, assumption 4 (spherical errors, i.e., homoskedasticity and no serial correlation) did not affect the unbiasedness or consistency of OLS. With lagged dependent variables, this is no longer true. Serial correlation in the error term generates endogeneity bias, a violation of assumption 2, through omitted variables bias. This is an implicit violation of assumption 1, i.e., we have misspecified the model. Let's start with some basic intuition. Suppose the serial correlation of the error term is itself AR(1):

$\varepsilon_t = \rho \varepsilon_{t-1} + \eta_t$

Then it is clear that the error term is correlated with the right-hand-side variable:

$$y_t = \phi \underbrace{y_{t-1}}_{\text{depends on } \varepsilon_{t-1}} + \underbrace{\varepsilon_t}_{\text{depends on } \varepsilon_{t-1}}$$
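To make this bias concrete, here is a minimal simulation sketch in Python (my own illustration, not part of the original notes; parameter values are arbitrary). With $\phi = \rho = 0.5$, OLS on one lag converges to the first autocorrelation of the process, about 0.8, rather than to $\phi$:

import numpy as np

rng = np.random.default_rng(0)
T, phi, rho = 100_000, 0.5, 0.5

# Simulate y_t = phi*y_{t-1} + eps_t, where eps_t = rho*eps_{t-1} + eta_t
eta = rng.standard_normal(T)
eps = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + eta[t]
    y[t] = phi * y[t - 1] + eps[t]

# OLS slope of y_t on y_{t-1}: sum(y_t * y_{t-1}) / sum(y_{t-1}^2)
phi_ols = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])
print(f"true phi = {phi}, OLS estimate = {phi_ols:.3f}")  # roughly 0.8, not 0.5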

So, what should we do? If we had a way for Eviews to control for the serial correlation of the error, then the remaining error would be exogenous, and we could perform OLS. Well, that's exactly what the AR() function in Eviews does. To clarify, this is a function in Eviews that controls for AR() serial correlation in the error term, not in y, regardless of whether that autocorrelation generates bias. Thus, we could use it in any regression, using Cochrane-Orcutt:

If $y_i = \beta x_i + \varepsilon_i$, and $\varepsilon_i = \rho \varepsilon_{i-1} + \eta_i$, then ls y x AR(1)

If we have a lagged dependent variable, then this solution works as well, using iterated Cochrane-Orcutt:

If $y_t = \phi y_{t-1} + \varepsilon_t$, and $\varepsilon_t = \rho \varepsilon_{t-1} + \eta_t$, then ls y y(-1) AR(1)

The coefficient on y(-1) will be $\phi$ and the coefficient on AR(1) will be $\rho$. (Or will they? More later...) However, as I said earlier, this is really a violation of assumption 1. To see this, first substitute the AR(1) error equation into the AR(1) main equation:

$y_t = \phi y_{t-1} + \rho \varepsilon_{t-1} + \eta_t$

Now, substitute the lagged main equation in for $\varepsilon_{t-1}$:

$y_t = \phi y_{t-1} + \rho (y_{t-1} - \phi y_{t-2}) + \eta_t$

Collecting terms, we can see that the true model (the one we implicitly wrote down) is:

$y_t = (\phi + \rho) y_{t-1} - (\rho \phi) y_{t-2} + \eta_t$

where the error is now exogenous. We may choose to re-write the equation, using different letters to differentiate the true coefficients from the misspecified model:

$y_t = \lambda_1 y_{t-1} + \lambda_2 y_{t-2} + \eta_t$

If we try to solve for $\phi$ and $\rho$ as functions of $\lambda_1$ and $\lambda_2$, we find:

$$\phi = \frac{\lambda_1 + \sqrt{\lambda_1^2 + 4\lambda_2}}{2}, \quad \rho = \frac{\lambda_1 - \sqrt{\lambda_1^2 + 4\lambda_2}}{2} \qquad \text{OR} \qquad \phi = \frac{\lambda_1 - \sqrt{\lambda_1^2 + 4\lambda_2}}{2}, \quad \rho = \frac{\lambda_1 + \sqrt{\lambda_1^2 + 4\lambda_2}}{2}$$

We have no way of knowing which, and for the purposes of forecasting, it doesn't matter. If Eviews swaps the two, reporting $\hat{\phi} = \rho$ and $\hat{\rho} = \phi$, we will have the exact same forecast. Whether Eviews converges to the correct solution, or the reverse, depends upon the initial condition used for iterated Cochrane-Orcutt. However, why not simply run OLS on the correctly specified equation, and generate identical forecasts:

ls y y(-1) y(-2)
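As a sanity check on the algebra, here is a small Python sketch (my own illustration; the parameter values are arbitrary) recovering $\phi$ and $\rho$ as the two roots of $x^2 - \lambda_1 x - \lambda_2 = 0$:

import numpy as np

phi, rho = 0.5, 0.3
lam1, lam2 = phi + rho, -phi * rho       # the coefficients of the true AR(2)

# phi and rho are the two roots of x^2 - lam1*x - lam2 = 0
disc = np.sqrt(lam1**2 + 4 * lam2)
roots = ((lam1 + disc) / 2, (lam1 - disc) / 2)
print(roots)  # (0.5, 0.3) -- nothing tells us which root is phi and which is rho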

The lesson embedded in this problem is important. In general, any AR(p) model for y with an AR(h) error term is, in fact, a misspecified AR(p+h) model of y. Thus, finding the correct specification for any autoregressive process will resolve the autocorrelation in the error term and, hence, remove the bias. This is why we should reject any AR model with serially correlated residuals and try higher-order AR models, rather than try to control for the serial correlation of the error directly.

Okay. Lesson learned, let's go run a bunch of AR models. But wait, if the AR() command in Eviews refers to the error term, not the dependent variable, then what's the command we want? Well, if we believe y is AR(1) and the error is exogenous, then we run OLS:

$y_t = \phi y_{t-1} + \varepsilon_t$

$\varepsilon_t = \eta_t$

ls y y(-1)

Alternatively, we could regress y on nothing, but assume the error is serially correlated:

$y_t = \varepsilon_t$

$\varepsilon_t = \phi \varepsilon_{t-1} + \eta_t$

Why is this an equivalent model? Repeat the steps from above. First, substitute the serial correlation equation into the main equation:

$y_t = \phi \varepsilon_{t-1} + \eta_t$

Second, use the main equation to replace $\varepsilon_{t-1}$:

$y_t = \phi y_{t-1} + \eta_t$

Therefore, in Eviews we run the command:

ls y AR(1)

which literally says "regress y on nothing, but control for an AR(1) serially correlated error term." Yet the result is an estimate of the exact same model. This result is also important because it extends to all cases. Suppose we believe that y is an AR(p) process when the model is correctly specified:

$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$

$\varepsilon_t = \eta_t$

Then we may estimate this equation in Eviews as:

$y_t = \varepsilon_t$

$\varepsilon_t = \phi_1 \varepsilon_{t-1} + \phi_2 \varepsilon_{t-2} + \cdots + \phi_p \varepsilon_{t-p} + \eta_t$

ls y AR(1) AR(2) ... AR(p)

The main difference will now be that Eviews understands you are performing time series analysis and stores the autocorrelation functions for the model, so you should always do it this way.
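For readers working outside Eviews, here is a sketch of the same AR(p) estimation in Python with statsmodels (my own illustration; the simulated series and parameter values are arbitrary):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
T = 1_000
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.2 * y[t - 2] + rng.standard_normal()

# order=(p, 0, 0) with no constant is the analogue of: ls y AR(1) ... AR(p)
res = ARIMA(y, order=(2, 0, 0), trend="n").fit()
print(res.params)  # ar.L1 near 0.5, ar.L2 near 0.2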

Okay, so now you understand. Lagged dependent variables with serial correlation in the residuals mean you should try a different AR(p) specification using the AR(1)...AR(p) commands. Likewise, for a regression without lagged dependent variables, but with serially correlated errors, you may add the AR(1)...AR(p) commands to remove the serial correlation. Both methods work with the same simple, flexible commands (or at least that's the idea).


Moving Average

So what’s a moving average? It is most simple to understand with real data and evenweighting. Suppose we have any random data over time. We may ask, what’s the three dayrunning average? Of course, we need the first three numbers in order to calculate the firstterm, so we will only have n− 2 averages when we’re done:

More generally, we may construct a moving average of order q for any data:

$$x_t(q) = \frac{1}{q} \sum_{i=1}^{q} x_{t-i}$$

We don’t even require equal weights:

$$x_t(q) = \sum_{i=1}^{q} \alpha_i \, x_{t-i} \quad \text{where} \quad \sum_{i=1}^{q} \alpha_i = 1$$
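For instance, both versions are one-liners in Python (a sketch of my own, on made-up data; note these are transformations of observed data, unlike the MA() error model discussed next):

import numpy as np

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

q = 3
equal = np.convolve(x, np.ones(q) / q, mode="valid")   # n - q + 1 = n - 2 averages

weights = np.array([0.5, 0.3, 0.2])                    # unequal weights summing to one
# np.convolve flips the kernel, so weights[0] multiplies the newest point in each window
weighted = np.convolve(x, weights, mode="valid")
print(equal, weighted)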

From this perspective, moving averages seem quite simple. What makes our moving averages difficult is that they are defined for $\varepsilon$, the unobservable error term. Moreover, it is assumed that the error term is exogenous, i.e., the AR process is correctly specified so that $\varepsilon_t = \eta_t$. Finally, our weights don't sum to one. Instead, they satisfy the unit root (invertibility) restriction. Again, in Eviews the MA() function is an assumption about the error, so we could estimate an MA(1) as follows:

$y_t = \varepsilon_t$

$\varepsilon_t = \theta \eta_{t-1} + \eta_t$

ls y MA(1)


Again, this is more familiar if we substitute the error equation into the main equation:

$y_t = \theta \eta_{t-1} + \eta_t$

Likewise, we may imagine any MA(q) model:

$y_t = \theta_1 \eta_{t-1} + \theta_2 \eta_{t-2} + \cdots + \theta_q \eta_{t-q} + \eta_t$

And we could estimate any such model in Eviews as:

ls y MA(1) MA(2) ... MA(q)
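As a quick illustration (a Python sketch of my own, with an arbitrary $\theta$), simulating an MA(1) and estimating it is the analogue of ls y MA(1):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
T, theta = 1_000, 0.6
eta = rng.standard_normal(T + 1)
y = eta[1:] + theta * eta[:-1]           # y_t = eta_t + theta * eta_{t-1}

res = ARIMA(y, order=(0, 0, 1), trend="n").fit()
print(res.params)  # ma.L1 near 0.6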

It is important to note that the moving average is with respect to the exogenous error term. Thus, in order to have any chance at accurately estimating the moving average coefficients, we must first believe that the residuals we observe are not serially correlated. This takes us back to the principal question. How are we to select an ARMA model? The answer:

1. We must select an AR(p) process that is a plausibly correct specification. One necessary (but not sufficient) condition is that the residuals not be serially correlated. We should add as many terms as necessary but no more.

2. Once we can obtain unbiased residuals, we may use them to estimate a moving average on the exogenous error. We should add as many terms as necessary but no more.

We do all of this simultaneously by running many ARMA(p,q) models. We must throw out any models with serially correlated residuals. Among the remaining models, we must balance out our desire for correct specification with parsimony (simplicity). One method is to select the model with the minimum Akaike Information Criterion or (often preferred) the minimum Schwarz Criterion (minimum means most negative). However, these are by no means the only methods for selecting a model. We may also appeal to graphical arguments (correlograms), test statistics, or forecasting performance when selecting a model.
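Here is a sketch of that search in Python (my own illustration; y is assumed to be a stationary series already in memory, and the grid bounds are arbitrary):

import itertools
from statsmodels.tsa.arima.model import ARIMA

results = {}
for p, q in itertools.product(range(4), range(4)):
    try:
        fit = ARIMA(y, order=(p, 0, q), trend="n").fit()
        results[(p, q)] = (fit.aic, fit.bic)         # bic is the Schwarz Criterion
    except Exception:
        continue                                     # skip orders that fail to estimate

# Pick the minimum Schwarz Criterion; still check the winner's residuals
# for serial correlation (e.g., with a Ljung-Box test) before trusting it
best = min(results, key=lambda k: results[k][1])
print(best, results[best])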

A Technical Note

I have omitted a constant for the usual reason: easier math. However, you may notice that:

ls y c AR(1) ... AR(p) MA(1) ... MA(q)

AND

ls y c y(-1) ... y(-p) MA(1) ... MA(q)

produce different estimates of the constant coefficient c. The short answer is, "Who cares about the constant anyway? It has no economic significance." I don't mean to imply that you should drop the constant, as that could cause omitted variable bias (the bias we just worked so hard to resolve). Rather, subtract off the mean of the dependent variable from each observation ($y_t^* = y_t - \bar{y}$). Then you can drop the constant from the regression, since the process will be mean zero by construction (assuming stationarity).
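A sketch of that demeaning trick in Python (my own illustration on an arbitrary simulated series):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
T, mu, phi = 1_000, 5.0, 0.5
y = np.full(T, mu)
for t in range(1, T):
    y[t] = mu + phi * (y[t - 1] - mu) + rng.standard_normal()

y_star = y - y.mean()                                   # mean zero by construction
res = ARIMA(y_star, order=(1, 0, 0), trend="n").fit()   # constant safely dropped
print(res.params)  # ar.L1 near 0.5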

(From here on out we shall assume that the ARMA model is well specified. So $\varepsilon$ is purely exogenous, $\varepsilon = \eta$. This is called "White Noise" in time series econometrics.)


Integrated ARMA

In order for ARMA estimation to work at all, we must believe that the dependent variable is stationary. There are two definitions:

1. Weakly Stationary: the mean $E(y_t)$ and the covariances $\mathrm{Cov}(y_t, y_{t-j}) = \sigma_j$ do not change over time

2. Strictly Stationary: the distribution of $y_t$ does not change over time

The weak definition is sufficient for ARMA models, although it's easier to imagine the strict definition. In order to transform non-stationary data into something stationary, we will consider taking first and second order differences, a process known as "integration". Consider time-series data with a time trend. One option is to de-trend the data:

$y_t = \alpha + \mu t + \varepsilon_t$

In this case, $y_t$ is called "trend-stationary", and adding @trend in Eviews is sufficient to restore stationarity. On the other hand, we may have a random walk with drift:

$y_t = \mu + y_{t-1} + \varepsilon_t$

In this case, the model is called "difference-stationary" because de-trending solves the non-stationarity of the drift, not the random walk, while first-differencing solves both. We could easily run this model in Eviews using the d() function, which tells the software to calculate the first difference:

ls d(y) c

Because Eviews understands d(y) to mean "the dependent variable is the first difference of y", this syntax is carried through to the AR() commands. We may also want to calculate the second order difference, i.e., the difference of the difference:

$$\left[(y_t - y_{t-1}) - (y_{t-1} - y_{t-2})\right] = \phi \left[(y_{t-1} - y_{t-2}) - (y_{t-2} - y_{t-3})\right] + \varepsilon_t$$

To run this in Eviews we would iterate the differences:

ls d(d(y)) AR(1)

While it is mathematically straightforward to extend this concept indefinitely, we typically do not go beyond first or second differencing, as it is hard to imagine the applicability. More generally, an ARIMA(p,1,q), a first order integrated ARMA(p,q) model, looks like:

$$(y_t - y_{t-1}) = \phi_1 (y_{t-1} - y_{t-2}) + \cdots + \phi_p (y_{t-p} - y_{t-p-1}) + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t$$

This could be run in Eviews as:

ls d(y) AR(1) ... AR(p) MA(1) ... MA(q)
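Finally, a Python sketch of the same ARIMA(p,1,q) logic two equivalent ways, differencing by hand versus letting the software integrate (my own illustration with an arbitrary ARIMA(1,1,1) series):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
T, phi, theta = 1_000, 0.5, 0.3
eta = rng.standard_normal(T)
dy = np.zeros(T)
for t in range(1, T):
    dy[t] = phi * dy[t - 1] + eta[t] + theta * eta[t - 1]
y = np.cumsum(dy)                         # integrate once: y is ARIMA(1,1,1)

# Option 1: difference by hand and fit an ARMA(1,1) -- the d() route
m1 = ARIMA(np.diff(y), order=(1, 0, 1), trend="n").fit()
# Option 2: let the I in ARIMA do the differencing
m2 = ARIMA(y, order=(1, 1, 1), trend="n").fit()
print(m1.params, m2.params)               # the coefficient sets agree closely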
