Unit Roots and Cointegration -...

25
Unit Roots and Cointegration Massimo Guidolin March 2018 1 Defining Unit Root Processes Deterministic trends: linear or nonlinear functions of time t, y t = f (t)+ t . Example: y t = Q X j =0 δ j t j + t where { t } is a white noise process. The trending effect caused by functions of t are permanent and hence impress a trend, as time is irreversible, for instance when Q =1 Δy t = y t - y t-1 = δ 1 +( t - t-1 ). They are trend stationary because y t+1 can differ from its trend value by the amount t+1 and because this deviation is stationary, the series will exhibit only temporary departures from the trend the long-term forecast of y t+1 will converge to the trend line Q j =0 δ j t j . Stochastic trend: all series that can be written as y t = y 0 + μt + t X τ =1 τ V ar[ t ]= σ 2 and y t+1 = y 0 + μt + t τ =1 τ + μ + t+1 = μ + y t + t+1 The series follows a random walk with drift process. or more generally, y t contains a stochastic trend ⇐⇒ it can be decomposed as y t = y 0 + μt + t X τ =1 η τ 1

Transcript of Unit Roots and Cointegration -...

Page 1: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

Unit Roots and Cointegration

Massimo Guidolin

March 2018

1 Defining Unit Root Processes

Deterministic trends: linear or nonlinear functions of time t, yt = f(t) + εt. Example:

yt =

Q∑j=0

δjtj + εt

where {εt} is a white noise process.

• The trending effect caused by functions of t are permanent and hence impress a trend,as time is irreversible, for instance when Q = 1 ⇒ ∆yt = yt − yt−1 = δ1 + (εt − εt−1).

• They are trend stationary because yt+1 can differ from its trend value by the amountεt+1 and because this deviation is stationary, the series will exhibit only temporarydepartures from the trend ⇒ the long-term forecast of yt+1 will converge to the trendline

∑Qj=0 δjt

j.

Stochastic trend: all series that can be written as

yt = y0 + µt+t∑

τ=1

ετ

• V ar[εt] = σ2 and yt+1 = y0 + µt+∑t

τ=1 ετ + µ+ εt+1 = µ+ yt + εt+1

⇒ The series follows a random walk with drift process.

or more generally, yt contains a stochastic trend ⇐⇒ it can be decomposed as

yt = y0 + µt+t∑

τ=1

ητ

1

Page 2: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

where ητ follows any ARMA(p, q) stationary process.This expression is centered on the first-differences of yt.

• All deterministic trends can be converted into stochastic trends, while the opposite isnot true.

• A series {yt} can be decomposed as yt=trend + stationary component+noise=(deterministictrend+stochastic trend)+stationary component+noise.

• In large samples, the deterministic time trend induced by the drift component dom-inates the time series, while in small samples it is not always easy to discern thedifference between a driftless RW and a model with drift.

Figure 1: Comparing Trend-Stationary Series and Stochastic Trends

Figure 2: Estimated Deterministic Trends and Estimated Random Walk

Figure 1 and 2 compare simulated deterministic (linear, quadratic, and cubic) trends to ran-dom walk (RW) stochastic trend series (with no drift and with positive/negative drift): thedeterministic time trend induced by the drift component, dominates the time series.

2

Page 3: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

Expectation of a Random Walk:

1. If µ = 0

E[yt] = E[yt+s] = y0 + E[∑

τ

]ετ = y0 +

∑τ

E[ετ ] = y0

that is constant equal to the initial value. However, all stochastic shocks have non-decaying effects on the series.If µ 6= 0

E[yt] = Et[y0 + µt+t∑

τ=1

ετ ] = y0 + µt

2. If µ = 0

Et[yt+s] = Et[yt+s−1 + εt+s] = Et[yt+s−2 + εt+s + εt+s−1] = ... = Et[yt] +Et[s∑i=1

εt+i] = yt

that is for any s ≥ 2 the conditional means for all values of yt+s are equivalent and thecurrent yt represents the minimum mean-squared loss function forecast of all futurevalues of the series.If µ 6= 0

Et[yt+s] = Et[y0 + µ(t+ s) +t∑

τ=1

εt] = y0 + µs

that is the forecast function changes deterministically with time.

Expectation of a driftless Random Walk (µ = 0):

(a)

V ar[yt] = V ar[y0 +t∑

τ=1

ετ ] =t∑

τ=1

V ar[ετ ] = tσ2

the variance is time-dependent and therefore a RW is not covariance stationary

(b)

V ar[yt+s] = V ar[y0 +t+s∑τ=1

ετ ] =t+s∑τ=1

V ar[ετ ] = (t+ s)σ2

as s→∞ the variance approaches infinity.

(c) Given that the mean is constant

E[(yt − y0)(yt+s − y0)] = E[t∑

τ=1

ετ ·t+s∑τ=1

ετ ] = E[t∑

τ=1

ε2τ ] = tσ2

3

Page 4: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

(d)

Corr[yt, yt+s] =E[(yt − y0)(yt+s − y0)]√

V ar[yt]V ar[yt+s]=

tσ2√tσ2(t+ s)σ2

=

√t

t+ s

• A driftless RW meanders without exhibiting any tendency to increase or decrease.In fact, for any fixed real number M , the random walk and its absolute will exceedM with probability 1 as the series lengthens.

• For sufficiently large samples, Corr[yt, yt+s] ' 1, but as s grows Corr[yt, yt+s]declines below 1 ⇒ We cannot distinguish between a unit root process and astationary process with an autoregressive coefficient that is close to unity usingthe ACF.

Figure 3: Sample ACF and Estimated AR(p) Models on RW Series

Figure 3 shows that, even though the sample autocorrelations are close to 1, they visi-bly decay possibly instilling the doubt that we may be facing a highly persistent AR(1)process.

Deterministic De-Trending: De-trending entails regressing a variable on a deter-ministic (polynomial) function of time and saving the residuals,{εt} that come then torepresent the new, de-trended series

yt+1 =

Q∑j=0

δjtj + εt+1

where the coefficients can be simply estimated by OLS.

• The appropriate degree of the polynomial can be set by standard t-tests, F-tests,and/or information criteria.

• The regression is usually estimated using the largest value of Q considered rea-sonable given the type of data or their frequency.

4

Page 5: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• The de-trended process can then be modeled using traditional methods

Unit Root Process and dth Order Integration: When a time series process {yt}needs to be differenced d times before being reduced to the sum of constant terms plusa white noise process, {yt} is said to contain d unit roots or to be integrated of orderd; we also write that yt ∼ I(d).

Example: RW with drift process.Take its first-difference:

∆yt+1 ≡ yt+1 − yt = (µ+ yt + εt+1)− yt = µ+ εt+1

⇒ white noise series plus a constant intercept. It is a I(1) and it contains one unit root.

• If {yt} contains d unit roots, then {yt} is non-stationary, while the opposite doesnot hold (e.g. explosive autoregressive process: yt+1 = µ+ φyt + εt+1).

• Typically φ = 1 is used to characterize non-stationarity because it describes ac-curately many time series.

• All I(d) processes (with d > 0) are non-stationary, but there are non-stationaryprocesses that do not fit the definition of unit root processes.

Figure 4: De-trended and First-Differenced Series

Figure 5: Sample ACF of De-trended and First-Differenced Series

5

Page 6: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

1.1 What Happens if One Incorrectly De-Trends a Unit RootSeries?

De-trending a Unit Root Process: When a time series process {yt} is I(d) but anattempt is made to remove its stochastic trend by fitting deterministic (often polyno-mial, spline-type) time trend functions, the resulting OLS residuals will still containone or more unit roots. Equivalently, deterministic de-trending does not remove thestochastic trends.

Example: RW with drift processDe-trend using a simple linear time trend, i.e. subtracting δ0 + δ1t

yt − yde−trendt = y0 + µt+t∑

τ=1

ετ − δ0 − δ1t = (y0 − δ0) + (µ− δ1)t+t∑

τ=1

ετ

that is the resulting expression for yt− yde−trendt still equals∑t

τ=1 ετ even when δ0 = y0and δ1 = µ. Therefore, the source of unit roots and, consequently, the stochastic trendhave not been removed.

Figure 6: Incorrect Polynomial De-trending of a Random Walk

Figure 6 plots the series yt− yde−trendt on the left, and the ACF of yt− yde−trendt on theright.

1.2 What Happens if One Incorrectly Applies Differencing to(Deterministic) Trend-Stationary Series?

Differentiating a Trend-Stationary Processes: When a time series process {yt}contains a deterministic time trend but it is otherwise I(0), (i.e., it is trend-stationary)

6

Page 7: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

and an attempt is made to remove the deterministic trend by differentiating the series dtimes, the resulting differentiated series will contain d unit roots in its moving-averagecomponents and will therefore be not invertible. Equivalently, differentiating a trend-stationary series, creates new stochastic trends that are shifted inside the shocks of theseries.

• The more you differentiate a trend-stationary process, the higher the number ofunit roots in the resulting MA process.

Example:Starting from a deterministic trend

yt =

Q∑j=0

δjtj + εt

Take its first difference

∆yt ≡ yt−yt−1 = (

Q∑j=0

δjtj+εt+1)−(

Q∑j=0

δj (t−1)j +εt-1) =

Q∑j=0

δj [tj −(t−1)j ]+(εt −εt-1)

that is a MA(1) with a complex time trend characterized by a unit root in the MAcomponent.When Q = 1 this becomes an integrated moving-average of order 1

∆yt = δ1 + (εt − εt-1)

If we differentiate a second time we obtain an integrated MA(2) process with two visibleroots in excess of 1, and an increasingly complex time trend function

∆2yt ≡ ∆yt−∆yt−1 = (

Q∑j=0

δj[tj−(t−1)j]+εt+1−εt)−(

Q∑j=0

δj [(t−1)j−(t−2)j]+εt-1−εt−2) =

=

Q∑j=0

δj {[tj − (t − 1)j ] − [(t − 1)j − (t − 2)j ]} + εt − 2εt-1 + εt−2

7

Page 8: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

Figure 7: Incorrect First-Differencing of a Trend-Stationary Series

Figure 7 plots the series ∆yt on the left, and the ACF of ∆yt on the right.

1.3 What Happens if One Incorrectly Applies Differencing toa Stationary Series?

• Even when the trend-stationary component is absent, if the time series is I(0)but it is differenced d times, the resulting differentiated series will contain d unitroots in its moving-average components and will therefore be not invertible.

1.4 What Happens if One Incorrectly Applies Differencingd + r Times to an I (d) Series?

• It is a generalization of the previous cases, that is differencing d = r times an I(0)series that contained a deterministic trend and differencing d = r times an I(0)series

Two cases when differencing yt ∼ I(d) d+ r times:

(a) If r > 0, when a time series is I(d) but it is differenced more than d times,the resulting differentiated series will contain r unit roots in its moving-averagecomponents and will therefore be not invertible (Over-differentiation).

(b) If r < 0, if the original time series is I(d) but it is differenced less than d times,the resulting differentiated series will still contain d + r < d unit roots and willtherefore remain non-stationary (Under-differentiation).

8

Page 9: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

Figure 8: Excessive Differencing of a I(1) Series

Figure 8 shows that the over-differenced integrated MA series tends to be spikier andmore volatile than the correctly differenced white noise series should be.

2 The Spurious Regression Problem

Sums of Stationary and Non-Stationary Series: Consider N time series, y1,t ∼I(d1), y2,t ∼ I(d2), ..., yN,t ∼ I(dN). Then, unless special conditions occur, theirweighted sum will be integrated with an order that is the maximum across all in-tegration orders:

N∑i=1

wiyi,t ∼ I(max(d1, d2, ..., dN))

(Heuristic) proof in case of three series:Let yt ∼ I(1), xt ∼ I(1) and ηt be three series such that yt and xt are independent.The regression of yt on xt

yt = a+ bxt + ηt

9

Page 10: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

can be rewritten asηt = yt − a− bxt

Substituting yt with the stochastic trend equation we obtain

ηt = (y0 + µyt+t∑

τ=1

εyτ )− a− b(x0 + µxt+t∑

τ=1

εxτ ) =

(y0 − a− bx0) + (µy − bµx)t+t∑

τ=1

(εyτ − bεxτ )

⇒ ηt ia a unit root process, given that εxτ and εyτ are independent white noise errors sothat their sum is also a white noise.

ηt = (y0 + µyt+t∑

τ=1

εyτ )− a− b(x0 + µxt+t∑

τ=1

εxτ ) =

= [(µy − bµx) + εyt − bεxt ] + (y0 − a− bx0) + (µy − bµx)(t− 1) +t−1∑τ=1

(εyτ − bεxτ )

= (µy − bµx) + ηt−1 + (εty − bεxt )

that is, another random walk with drift.This follows from the result on Sums of stationary and Non-stationary series statedabove, in the case of a sum of two I(1) variables.

• ηt is the weighted sum minus a constant of the two I(1) variables. Thereforeηt ∼ I(1), which is the highest integration order of the variables that we arecombining.

• Starting from yt = a+ bxt + ηt, when the regressand and regressor are both I(1),it turns out that the regression errors must then also contain, unless special con-ditions occur, a unit root.⇒ When ηt is a random walk with drift, the assumptions of the classical regres-sion model (yt and xt are stationary and the errors have a zero mean and a finitevariance) are not respected.

Spurious regression: a regression where

(a) the residuals are I(1) and as such any shock has a permanent effect on the regres-sion, being equivalent to a permanent change of the intercept of the model

(b) R-square is high and t-statistics appear to be significant, but the results are voidof any economic meaning

(c) Standard OLS estimators are inconsistent and the associated inferential proce-dures are invalid and statistically meaningless.

10

Page 11: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• Estimating spurious regressions and reporting and discussing their results is use-less.

• Problems will also arise in regression analysis when the regressand and the regres-sors are integrated of different orders. Regression equations using such variablesare meaningless.

Example: if in

ηt = (y0 − a− bx0) + (µy − bµx)t+t∑

τ=1

(εyτ − bεxτ )

only xt is I(1), the resulting regression errors would be I(1)

η′t = (µy + εyt )− a− b(x0 + µxt+t∑

τ=1

εxτ ) = (µy − a− bx0)− bµxt− bt∑

τ=1

εxτ + εyt

3 Unit Root Tests

• Because the estimated AR(1) coefficient equals Corr[yt, yt−1], this means that anyattempt to recover a unit root from generated RW data will fail because the OLSestimate of the unit root coefficient will contain a downward bias.⇒ Dickey and Fuller offer a procedure to formally test for the presence of a unitroot that takes this bias into account.

3.1 Classical Dickey-Fuller Tests

Dickey and Fuller procedure: find critical values using a Monte Carlo design

(a) Generate a large number of random walk sequences and for each of them calculatethe estimated value of the AR(1) coefficient

(b) For the law of large numbers, the larger the number of simulation trials, the moreprecise the resulting distribution of the statistic

(c) Obtain critical values, which are generated by assuming a random walk.

Dickey-Fuller test: test for the presence of a unit root. From the AR(1) modelequation specified as

yt+1 − yt ≡ ∆yt+1 = (φ− 1)yt + εt+1

11

Page 12: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

where (φ− 1) = α, or as

∆yt+1 = αyt + εt+1 no constant

∆yt+1 = µ+ δt+ αyt + εt+1 with a deterministic time trend

From one of these, obtain the OLS estimate of α, the corresponding t-statistic andcompare it with the critical value found by Dickey and Fuller, in a one-sided t-test.

• It would be incorrect to simply use the standard t-student because standard in-ference is invalid when the assumed stochastic process is non-stationary

• The t-test used takes into account that under the null of a random walk, itsdistribution is nonstandard and cannot be analytically evaluated. A standardinference would be invalid since the assumed stochastic process is non-stationary.

• The limiting distribution α is not affected by the removal of any deterministic Sseasonal components, through dummy variables inserted in yt+1 − yt ≡ ∆yt+1 =(φ− 1)yt + εt+1 and ∆yt+1 = µ+ δt+ αyt + εt+1, like in

∆yt+1 = µ+ δt+S−1∑s=1

λsDs + αyt + εt+1

where Ds are standard seasonal dummies in a number equal to S − 1.

• α has a non-standard (finite sample) distribution.

• The critical values need to be approximated by simulation, using null models thatare adjusted to either exclude the drift or include a time trend. The decision toinclude a time trend depends also on economic theory or financial models.

• The ADF critical values depend on which deterministic regressors are included,if any: the confidence intervals around α = 0 dramatically expand if a drift anda time trend are included in the model.

• When the series is stationary the distribution of the t-statistic does not dependon the presence of the other regressors.

3.2 The Augmented Dickey-Fuller Test

• The DF test is limited by the fact that, given the null hypothesis of a randomwalk, the alternative hypothesis is specified as an AR(1) stationary process. How-ever, not all stationary time-series variables can be well represented by an AR(1)process.

12

Page 13: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

Augmented Dickey-Fuller (ADF) test: for general AR(p) processesConsider the AR(p) process

yt+1 = φ0 + φ1yt + φ2yt−1 + ...+ φp−1yt−p+2 + φpyt−p+1 + εt+1

Add and subtract φpyt−p+2

yt+1 = φ0 + φ1yt + φ2yt−1 + ...+ (φp−1 + φp)yt−p+2 − φp∆yt−p+1 + εt+1

Add and subtract (φp−1 + φp)yt−p+3

yt+1 = φ0 +φ1yt+ ...+(φp−2 +φp−1 +φp)yt−p+3− (φp−1 +φp)∆yt−p+2−φp∆yt−p+1 + εt+1

Repeat this step p times and obtain

∆yt+1 = φ0 + αyt +

p∑i=1

γi∆yt−i+1 + εt+1

where α ≡ −(1−∑p

i=1 φi) and γi ≡ −∑p

j=1 φj.Alternatively, the regression can be specified as

∆yt+1 = αyt +

p∑i=1

γi∆t−i+1 + εt+1

∆yt+1 = φ0 +

Q∑j=1

δjtj + αyt +

p∑i=1

γi∆yt−i+1 + εt+1

If α = 0 the equation is entirely in first differences and so it has a unit root, whileα < 1 is indication of stationarity.

• The ADF test implements a parametric correction for higher-order correlation.

• When p = 1, the ADF test boils down to a classical DF test because∑1

i=2 γi∆yt−i+1 =0.

• The appropriate statistic to use depends on the deterministic components includedin the regression equation.

• α and its standard error cannot be properly estimated unless all of the autoregres-sive terms are included in the estimating equation, through correct specificationof p.

• If p is not correctly specified:

– Few lags: the regression residuals do not behave like a white noise process⇒the model does not appropriately capture the actual error process ⇒ α andits standard error are not correctly estimated.

– Too many lags: the power of the test to reject the null of a unit root is reducedsince the increased number of lags necessitates the estimation of additionalparameters and a loss of degrees of freedom.

13

Page 14: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• Select p:

– Start with a relatively long lag length and pare down the model by the usualt- and/or F -tests.

– In regressions containing a mixture of I(1) and I(0) variables, t- and F -testsapplied to the coefficients of the stationary variables will be (asymptotically)valid only when the residuals are white noise ⇒ Information criteria.

• The most general of the alternative regressions is not necessary the best choice,since the presence of the additional estimated parameters may reduce the degreesof freedom and the power of the test.

• The first regression assumes an AR(p) process, but in reality the DGP may con-tain both autoregressive and moving average components. However, because aninvertible MA model can be transformed into an AR model, the procedure canbe generalized to allow for moving average components.

• Low power issues because ADF tests cannot distinguish between a series with acharacteristic root that is close to unity and a true unit root process, or betweena trend stationary and a stochastic trend processes. This is because in finite sam-ples trend stationary process can be arbitrarily well approximated by a unit rootprocess, and a unit root process can be arbitrarily well approximated by a trendstationary process.

Figure 9: Four Simulated Series Based on Identical Shocks

The plot on the left compares 180 generated values from the two models:

yt+1 = 1.1yt − 0.1yt−1 + εt+1 (Nonstationary AR(2))

zt+1 = 1.1zt − 0.15zt−1 + εt+1 (Stationary AR(2))

14

Page 15: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

both initialized at y0 = y1 = 0, in which the errors are Gaussian IID and kept identicalacross simulated values.The rightmost plot of Figure 9 shows that it can be difficult to distinguish between atrend stationary and a random walk with drift process.

Quasi-differenced series: a series that depends on a parametric value choice a rep-resenting the specific point alternative against which we wish to test H0 : ∆ayt+1 =yt+1 − ayt for t = 2, ..., T (∆ay1 = y1).

Alternative ADF test (ERS): Estimate an OLS regression of the quasi-differenceddata (i.e. de-trended data) on the quasi-differenced deterministic regressors

∆ayt+1 = ∆ax′t+1δ(a) + νt+1

where ∆at = t− at, ∆at2 = t2 − at2 and so on.

Apply standard ADF test to the estimated quasi differenced, (GLS) de-trended data,yt+1(a) = yt −∆ax

′t+1δ(a)

∆yt+1(a) = αyt(a) +

p∑i=1

γi∆yt+1−i(a) + εt+1

where the recommended a is equal to (1− (7/T )), when there is only an intercept, and1(13.5/T ) when there is also a linear time trend.

• Instead of modelling a constant and/or time trends in the test regression, the dataare de-trended so that the deterministic explanatory variables are “taken out” ofthe data prior to estimating the test regression.

3.3 Other Unit Root Tests

(a) Phillips and Perron (PP) non-parametric method of controlling for serial correla-tion when testing for a unit root: estimates the classical DF test equations andmodifies the t-ratio of the coefficient α so that serial correlation in the residualsdoes not affect the asymptotic distribution of the test statistic.

tPPα = tDFα ς − ψ =α

se(α)

√√√√ T−mT

1T

∑Tt=1 ε

2t∑[T 2/9]

i=[−T 2/9]1

T−i∑T

t=i+1 εtεt−i+

−se(α)T (

∑[T 2/9]

i=[−T 2/9]1

T−i∑T

t=i+1 εtεt−i −T−mT

∑Tt=1 ε

2t )

2√

(∑[T 2/9]

i=[−T 2/9]1

T−i∑T

t=i+1 εtεt−i)(1T

∑Tt=1 ε

2t )

where m is the number of deterministic regressors included and se(α) is the OLSstandard error of the estimated coefficient.

15

Page 16: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• Because ς, ψ > 0, even when ς > 1, it is possible for tPPα < tDFα , so that whilea DF test would not reject the null of a unit root, the PP test will.

• Instead of “whitening” the test regression residuals by fitting some AR(p),PP propose to directly adjust the test statistic in “HAC-way”.

• The asymptotic distribution of the PP modified t-ratio is the same as thatof the ADF statistic and the test is also performed under the null of a unitroot.

(b) Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test: based on the residualsfrom the OLS regression of the series on the exogenous, deterministic factors

yt+1 = x′tδ + ut+1

where xt may contain the intercept and a polynomial time trend and the series isassumed to be (trend-)stationary under the null.

KPSST ≡∑T

t=1(∑t

i=1 ui)2

T 2∑T 2/9

i=(−T 2/9)1

T−i∑T

t=i+1 εtεt−i

where the denominator is an estimator of the residual spectrum at frequency zero.

3.4 Testing for Unit Roots in Moving Average Processes

• Unit roots may characterize the MA(q) components of a time series process, whichmade it non-invertible.

• Testing for a unit root in the MA polynomial is equivalent to testing that a timeseries has been over-differenced.

• Testing for a unit root in the MA polynomial gives a chance to distinguish be-tween trend-stationary and stochastically trending processes, because we knowthat while the dth difference of an I(d) process is I(0), the dth difference of atrend stationary process will contain d unit roots in the MA polynomial.

• Let {εt} be observations from εt+1 = θηt + ηt+1 with ηt+1IID(0, σ2η). Under

H0 : θ = −1 the statistic T (θ+1) based on the MLE θ converges to a distributioncharacterized by Davis and Dunsmuir.A one-sided test with H0 : θ = −1 vs. H1 : θ > −1 can be shaped on thislimiting result by rejecting the null when θ > −1 + csize/T where csize is the(1size)-quantile of the limit distribution of T (θ + 1).

16

u

t

u

Page 17: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

4 Cointegration and Error Correction Models

4.1 The Relationship Between Cointegration and EconomicTheory

• There are situations in which the choice to simultaneously difference all non-stationary variables to transform them into a set of stationary variables mayimply a major loss of information, possibly invalid inference, and sub-optimalpredictive performance.

Cointegrated variables: N variables whose linear combination is stationary.

• If a linear relationship is already stationary, differencing it entails a misspecifica-tion error, thus transforming cointegrated variables to be I(0) is a mistake.

Standard discounted dividend/earnings growth model:

(r − g) Pt+1 = Ft+1 + εt+1

where εt+1IID(0, σ2), so that

Pt+1 − κFt+1 = εt+1

withκ ≡ (1 + g)/(r − g) > 0Pt+1 = (real) stock priceFt+1 = some fundamental real quantity that investors discount when pricing stocks,e.g. cash dividends or earningsr = fixed real discount rate that reflects the riskiness of stocksg = constant real rate of growth of fundamentalsεt+1 = news vs. fair value of the stock index, that is temporary deviations of market prices from what is justified by fundamentals.

• It is an example of equilibrium frameworks in which deviations from equilibriummust be temporary.

• Because the time series of mispricings, {εt+1}, represents temporary deviations,we expect them to be stationary, or even to be white noise.

• If {εt+1} were a I(1) series, this would mean that pricing errors would never becorrected and on the contrary could diverge forever.

17

Guidolin
Line
Page 18: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• If the time series {Pt+1} and {Ft+1} are I(1), a coefficient κ exists such that aweighted sum of real prices and fundamentals yields a stationary variable:

I(1)− κI(1) ∼ I(0)

that violates the result on Sums of Stationary and Non-Stationary Series.

• It is necessary that the time paths of the two non-stationary variables {Pt+1} and{Ft+1} are closely linked.

4.2 Definition of Cointegration

Cointegrated System: The components of the N-dimensional random vectoryt ≡

[y1t y2t ... yNT

]′are said to be cointegrated of order d, b, denoted by yt ∼

CI(d, b), if all components of yt are integrated of order d, and there exists a vector κ ≡[κ1 κ2 ... κN

]′such that the linear combination κ′yt is integrated of order (d− b)

where b > 0. The vector κ is called the cointegrating vector.

Example: Consider a set of economic variables in a long-run equilibrium when

κ1y1t + κ2y2t + ...+ κNyNt = 0

or compactly, κ′yt = 0. The deviation from the long-run equilibrium (equilibriumerror) is a scalar random variable εt = κ′yt. If the equilibrium is meaningful, it mustbe the case that the equilibrium error process is stationary.

• In general, given a set of N integrated variables, these will not be cointegrated,implying that there is no long-run equilibrium among the variables, so that theycan wander arbitrarily far from each other.

• It is quite possible that nonlinear long-run relationships exist among a set ofintegrated variables.

• The cointegrating vector is not unique: if[κ1 κ2 ... κN

]′is a cointegrating

vector, then for any nonzero value of ξ,[ξκ1 ξκ2 ... ξκN

]′is also a cointegrating vector, which is therefore not unique.

• Typically, one of the variables is used to normalize the cointegrating vector byfixing its coefficient at unity, e.g.

[1 (κ2/κ1) (κ3/κ1) ... (κN/κ1) ]′.

• By definition, if two or more variables are integrated of different orders, they can-not be cointegrated.

18

Page 19: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

Multiplicity of Cointegrating Vectors: If yt has N non-stationary components,there may be as many as (N–1) linearly independent cointegrating vectors. The num-ber of cointegrating vectors is called the cointegrating rank of yt.

• If yt contains only two variables, there can be at most one independent cointe-grating vector.

• If

κ1y1t + κ2y2t + ...+ κNyNT = εt ∼ I(0)⇒ κ2y2t + ...+ κNyNT = εt− κ1y1t ∼ I(1)

⇒[y1t y2t ... yNT

]′ ∼ CI(d, b)

this does not imply that subset of the same N variables needs to be also CI(d, b).

• Cointegrated variables are such because they share common stochastic trends.Example: dividend growth model where {Pt} and {Ft} are constructed as tworandom walks plus a stationary noise process

Pt = RW Pt + εPt and Ft = RW F

t + εFt

If Pt and Ft are CI(1, 1)⇒ there must be nonzero values of some coefficients κ1and κ2 such that κ1Pt + κ2Ft is stationary. But

κ1Pt+κ2Ft = κ1(RWPt +εPt )+κ2(RW

Ft +εFt ) = (κ1RW

Pt +κ2RW

Ft )+(κ1ε

Pt +κ2ε

Ft )

therefore for κ1Pt + κ2Ft to be stationary κ1RWPt + κ2RW

Ft should be zero.

However RW Pt and RW F

t are random variables whose realized values will becontinually changing over time that cannot be simply set to zero. It follows that

κ1Pt + κ2Ft is stationary ⇐⇒ RW Pt = −κ2

κ1RW F

t

In the N-variables case:yt = RWt + εt

where RWt is a N×1 vector of random walk components and εt has an equivalentdefinition for white noise processes.If one trend can be expressed as a linear combination of the other trends in thesystem, it means that

∃κ : κ′RWt = 0⇒ κ′yt = κ′RWt + κ′εt = κ′εt ∼ I(0)

Because the series in yt share one or more common stochastic trends, they cannotdrift “too far apart”.

19

Page 20: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

4.3 Error Correction Models

• The dynamics of the cointegrated variables are influenced by the magnitude ofany departures from the long-run equilibrium: if the system is to return to theequilibrium, the movements of at least some of the variables must respond to themagnitude of the recorded disequilibrium.

Vector Error Correction Representation:If the N I(1) variables in yt have avector error-correction (VECM) representation,

∆yt+1 = π0 + Πyt +

p∑i=1

Πi∆yt+1−i + ut+1

(where the error vectors ut+1 can be serially correlated) then, because the vector equa-tion will need to balanced, Πyt ∼ I(0) will imply that the variables in yt are CI(1, 1).Because the N ×N matrix Π 6= 0 contains only constants, each row of this matrix isa cointegrating vector of yt and rank(Π) is the cointegrating rank of yt.

Example: dynamic error-correction model

∆Pt+1 = λP (Pt − κFt) + uPt λP ≤ 0

∆Ft+1 = −λF (Pt − κFt) + uFt λF ≤ 0

where uPt and uFt are white-noise disturbance terms which may be correlated, and λPand λF are estimable parameters.The unique long-run equilibrium is attained when Pt = κFt.because we have assumed that prices and fundamentals are all I(1), ∆Pt+1 and ∆Ft+1

are I(0), uPt and uFt are stationary. Furthermore, for the dynamic error-correctionmodel to be sensible the right-hand side must be I(0).⇒ Pt − κFt must be stationary, so that prices and fundamentals must be cointegratedwith the cointegrating vector [1 − κ]′. This doesn’t change if lagged changes of eachvariable are introduced in the model

∆Pt+1 = λP (Pt − κFt) +

p∑i=1

φPPi ∆Pt+1−i +

p∑i=1

φPFi ∆Ft+1−i + uPt λP ≤ 0

∆Ft+1 = −λF (Pt − κFt) +

p∑i=1

φFPi ∆Pt+1−i +

p∑i=1

φFFi ∆Ft+1−i + uFt λF ≤ 0

This is a bi-variate VAR(p) in first differences that contains error-correction terms,and in which λP and λF can be interpreted as speed of adjustment parameters. Atleast one of the speed of adjustment terms must be non-zero, otherwise the long-runequilibrium relation-ship would not appear and the model would not be one of errorcorrection or cointegration.

20

Page 21: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• The only case in which a VECM fails to imply a cointegrating relationship, iswhen Π = 0 ⇒ the VECM boils down to a traditional reduced-form VAR(p) infirst differences and yt+1 will not respond to previous period’s deviation from thelong-run equilibrium.

• When Π 6= 0 ⇒ yt+1 will respond to the previous period’s deviations from thelong-run equilibrium ⇒ estimating the VECM as a VAR(p) in first differences isinappropriate ⇒ the omission of Πyt leads to misspecification.

• In a model in which the error correction terms have been unduly dropped, theVECM has no long-run solution, thus it has nothing to say about whether thevariables have an equilibrium relationship.

• It is preferable to use the first differences if the N I(1) variables are not cointe-grated because

– If cointegration is incorrectly assumed, tests lose power because of the esti-mation of N2 additional parameters.

– For a VAR in levels, tests for Granger causality conducted on the I(1)variables do not have a standard F-distribution.

– The impulse responses at long horizons would also lead to inconsistent esti-mates of the true responses.

Granger’s Representation Theorem: For any set of N I(d) variables with identicalintegration order, error correction and cointegration are equivalent representations.

Example: VAR(1) model when financial markets are efficient

Pt+1 = a11Pt + a12Ft + εPt+1 and Ft+1 = a20 + a22Ft + εFt+1

where a12 6= 0, |a11| < 1, and εPt+1 and εFt+1 are possibly correlated white noise processes.We drop an intercept from the price equation to prevent the presence of a drift thatwould cause arbitrage opportunities.If a22 = 1 the equation for fundamentals is a random walk with drift and as such takingthe first difference of both equations gives

∆Pt+1 = (a11−1)Pt+a12Ft+εPt+1 = −a12

[1− a11a12

Pt−Ft]+εPt+1 and ∆Ft+1 = a20+ε

Ft+1

which shows that the first equation is balanced, in the sense that both its left- andright-hand sides are I(0) ⇐⇒ [((1 − a11)/a12)Pt − Ft] ∼ I(0) independently of thevalue taken by a12. Impose two additional restrictions on the VAR to imply that pricesand fundamentals are CI(1, 1):

21

Page 22: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

(a) a12 > 0⇒ when ((1− a11)/a12)Pt < Ft and price is too low then− a12[((1− a11)/a12)Pt − Ft] > 0⇒ valid error correction dynamic

(b) a11 < 1 ⇒ [((1 − a11)/a12) 1]' ⇒ the model reaches its long-run equilibrium whenFt = ((1− a11)/a12)Pt or Ft/Pt = (1− a11)/a12.

Therefore, all VECMs can be interpreted as special, restricted VAR models.Given that

Pt+1 = a11Pt + a12Ft + εPt+1 and Ft+1 = a20 + a22Ft + εFt+1

when a11 = a22 = 1

Pt+1 = (P0 + a12F0) + a12a20t+ a12

t∑τ=1

εFτ +t+1∑τ=1

εPτ

Ft+1 = F0 + a20(t+ 1) +t+1∑τ=1

εFτ

⇒ prices are I(2) and fundamentals are I(1), then they cannot be cointegrated.When |a22| > 1 and/or |a11| > 1 ⇒ both variables are non-stationary and explosiveand therefore no cointegration exists.

4.4 Testing for Cointegration

Tests for cointegration:

(a) Univariate, regression-based tests: Engle and Granger’s univariate methodology:determine whether the residuals of an estimated equilibrium relationship are sta-tionary.

Example: dividend/earnings growth model.

• Assume that {Pt+1} and {Ft+1} are both I(1)

• Estimate the long-run equilibrium relationship

Pt = κ0 + κ1Ft + et

• Cointegrated variables⇒ OLS regression yields a super-consistent estimatorof the cointegrating parameters κ0 and κ1⇒ OLS estimator converges fasterthan in OLS models using stationary variables.

22

Page 23: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

• Determine whether prices and fundamentals are actually cointegrated ⇒time series of estimated deviations from the long-run relationship, et =Pt − κ0 − κ1Ft, are stationary ⇒ unit root test where the null expressed asthe impossibility to reject the null of a unit root in the residuals of the Engle-Granger’s regression which implies that we cannot reject the null hypothesisthat prices and fundamentals are not cointegrated.

• Standard ADF critical values in Engle-Granger tests will contain a biastoward finding a stationary error process⇒ adjusted critical values⇒ tests on the normalized autocorrelation coefficient zα ≡ T α

1−∑p

i=2 γ2i

⇒ Durbin-Watson (DW) test statistic (Cointegrating Regression DurbinWatson (CRDW) test): under H0 of a unit root in the errors, CRDW willbe close to zero, so H0 is rejected if the CRDW statistic is larger than therelevant critical value⇒ PP test (Phillips-Ouliaris’ test): {εt} under p = 0 are used to computeestimates of the long-run variance (V0) to perform the adjustment to theestimated autocorrelation coefficient given by αPP = α− T V0(

∑Tt=2 e

2t )−1/2

so that zPPα = T αPP .

• When the null of no cointegration is rejected, estimate (usually by OLS)the VECM. In this case, it is a VAR(p)) in which the error correction termsdirectly use the stationary estimated residuals from

∆Pt+1 = λP et +

p∑i=1

aP1i∆Pt+1−i +

p∑t+1−i

aP2i∆Ft+1−i + εPt+1

∆Ft+1 = λF et + aF0 +

p∑i=1

aF1i∆Pt+1−i +

p∑i=1

aF2i∆Ft+1−i + εFt+1

• The estimation of the long-run equilibrium regression requires that the re-searcher places one variable on the left-hand side and uses the others asregressors.

• Relies on a two-step estimator, in which the first step regression residualsare used in the second step to estimate an ADF (or PP)-type regression,which causes errors and contamination deriving from a generated regressorsproblem.

• No systematic procedure to perform the separate estimation of multiplecointegrating vectors.

(b) Multivariate, VECM-based tests: Take the potential existence of multiple coin-tegrating vectors into account by using single-step full information maximum

23

Page 24: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

likelihood estimation (Multivariate generalization of the Dickey-Fuller tests).

Example: reduced form VAR(p) model

yt+1 =

p∑i=1

Aiyt+1−i + εt+1 εt+1IID(0,Σε)

• Add and subtract Apyt−p+2 to the right-hand side

yt+1 =

p−2∑i=1

Aiyt+1−i + (Ap−1 + Ap)yt−p+2 + Ap∆yt−p+1 + εt+1

• Add and subtract (Ap−1 + Ap)yt−p+3 and so on, obtaining

∆yt+1 = Πyt+

p−1∑i=1

Γi∆yt+1−i+εt+1 Π ≡ −(IN−p∑i=1

Ai) Γi ≡ −p∑

j=i+1

Aj

• For a set of N variables yt+1 that can be represented as in the equation∆yt+1 above, the rank of Π equals the number of cointegrating vectors, r.If Π consists of all zeroes, so that rank(Π) = 0, then all the variables in thevector yt+1 contain a unit root and there is no cointegrating relationship. Ifrank(Π) = N , ∆yt+1 represents a convergent system of difference equationsso that all variables are stationary. If N > rank(Π) > 0, then Πyt is theerror correction term such that

0 = E[∆yt+1] = Πyt +

p−1∑i=1

ΓiE[∆yt+1−i] + E[εt+1]⇒ Πyt = 0

and Π = ΛK′, where K is the N×r matrix of cointegrating vectors and Λ isthe N × r matrix of weights with which each cointegrating vector enters theN equations of the VAR. Λ can also be interpreted as containing r differentN × 1 vectors of adjustment coefficients.

(c) Estimate the matrix Π from the unrestricted VAR for N non-stationary series

(d) Test whether we can reject the restrictions implied by the reduced rank of Π:test for the number of eigenvalues that are insignificantly different from unityusing

H0 : number of distinct cointegrating vectors ≤ r

H1 : number of distinct cointegrating vectors > r

24

Page 25: Unit Roots and Cointegration - unibocconi.itdidattica.unibocconi.it/mypage/dwload.php?nomefile=... · Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit Root

λtrace(r) ≡ −TN∑

i=r+1

ln(1− λi)

H0 : the number of cointegrating vectors = r H1 : the number of cointegrating vectors = r+1

λmax(r, r + 1) ≡ −T ln(1− λr+1)

where λ1 > λ2 > ... > λN are the estimated values of the eigen-values obtainedfrom Π. The critical values of the statistics are obtained using a Monte Carloapproach, while the distribution is non-standard. The critical values depend onthe value of Nr, the number of non-stationary components, and whether con-stants are included in each of the equations.

• The statistics have heterogeneous functional form, thus may give different results.λmax is preferred when trying to pin down the number of cointegrating vectors.

• The VECM cannot be estimated by OLS because it is necessary to impose cross-equation restrictions on the Π matrix ⇒ ML estimation methods.

25