Unit Roots and Cointegration - Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning...

download Unit Roots and Cointegration - Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning Unit

of 25

  • date post

  • Category


  • view

  • download


Embed Size (px)

Transcript of Unit Roots and Cointegration - Unit Roots and Cointegration Massimo Guidolin March 2018 1 De ning...

  • Unit Roots and Cointegration

    Massimo Guidolin

    March 2018

    1 Defining Unit Root Processes

    Deterministic trends: linear or nonlinear functions of time t, yt = f(t) + �t. Example:

    yt =

    Q∑ j=0

    δjt j + �t

    where {�t} is a white noise process.

    • The trending effect caused by functions of t are permanent and hence impress a trend, as time is irreversible, for instance when Q = 1 ⇒ ∆yt = yt − yt−1 = δ1 + (�t − �t−1).

    • They are trend stationary because yt+1 can differ from its trend value by the amount �t+1 and because this deviation is stationary, the series will exhibit only temporary departures from the trend ⇒ the long-term forecast of yt+1 will converge to the trend line

    ∑Q j=0 δjt


    Stochastic trend: all series that can be written as

    yt = y0 + µt+ t∑



    • V ar[�t] = σ2 and yt+1 = y0 + µt+ ∑t

    τ=1 �τ + µ+ �t+1 = µ+ yt + �t+1 ⇒ The series follows a random walk with drift process.

    or more generally, yt contains a stochastic trend ⇐⇒ it can be decomposed as

    yt = y0 + µt+ t∑




  • where ητ follows any ARMA(p, q) stationary process. This expression is centered on the first-differences of yt.

    • All deterministic trends can be converted into stochastic trends, while the opposite is not true.

    • A series {yt} can be decomposed as yt=trend + stationary component+noise=(deterministic trend+stochastic trend)+stationary component+noise.

    • In large samples, the deterministic time trend induced by the drift component dom- inates the time series, while in small samples it is not always easy to discern the difference between a driftless RW and a model with drift.

    Figure 1: Comparing Trend-Stationary Series and Stochastic Trends

    Figure 2: Estimated Deterministic Trends and Estimated Random Walk

    Figure 1 and 2 compare simulated deterministic (linear, quadratic, and cubic) trends to ran- dom walk (RW) stochastic trend series (with no drift and with positive/negative drift): the deterministic time trend induced by the drift component, dominates the time series.


  • Expectation of a Random Walk:

    1. If µ = 0

    E[yt] = E[yt+s] = y0 + E [∑


    ] �τ = y0 +

    ∑ τ

    E[�τ ] = y0

    that is constant equal to the initial value. However, all stochastic shocks have non- decaying effects on the series. If µ 6= 0

    E[yt] = Et[y0 + µt+ t∑


    �τ ] = y0 + µt

    2. If µ = 0

    Et[yt+s] = Et[yt+s−1 + �t+s] = Et[yt+s−2 + �t+s + �t+s−1] = ... = Et[yt] +Et[ s∑ i=1

    �t+i] = yt

    that is for any s ≥ 2 the conditional means for all values of yt+s are equivalent and the current yt represents the minimum mean-squared loss function forecast of all future values of the series. If µ 6= 0

    Et[yt+s] = Et[y0 + µ(t+ s) + t∑


    �t] = y0 + µs

    that is the forecast function changes deterministically with time.

    Expectation of a driftless Random Walk (µ = 0):


    V ar[yt] = V ar[y0 + t∑


    �τ ] = t∑


    V ar[�τ ] = tσ 2

    the variance is time-dependent and therefore a RW is not covariance stationary


    V ar[yt+s] = V ar[y0 + t+s∑ τ=1

    �τ ] = t+s∑ τ=1

    V ar[�τ ] = (t+ s)σ 2

    as s→∞ the variance approaches infinity.

    (c) Given that the mean is constant

    E[(yt − y0)(yt+s − y0)] = E[ t∑


    �τ · t+s∑ τ=1

    �τ ] = E[ t∑


    �2τ ] = tσ 2


  • (d)

    Corr[yt, yt+s] = E[(yt − y0)(yt+s − y0)]√

    V ar[yt]V ar[yt+s] =

    tσ2√ tσ2(t+ s)σ2


    √ t

    t+ s

    • A driftless RW meanders without exhibiting any tendency to increase or decrease. In fact, for any fixed real number M , the random walk and its absolute will exceed M with probability 1 as the series lengthens.

    • For sufficiently large samples, Corr[yt, yt+s] ' 1, but as s grows Corr[yt, yt+s] declines below 1 ⇒ We cannot distinguish between a unit root process and a stationary process with an autoregressive coefficient that is close to unity using the ACF.

    Figure 3: Sample ACF and Estimated AR(p) Models on RW Series

    Figure 3 shows that, even though the sample autocorrelations are close to 1, they visi- bly decay possibly instilling the doubt that we may be facing a highly persistent AR(1) process.

    Deterministic De-Trending: De-trending entails regressing a variable on a deter- ministic (polynomial) function of time and saving the residuals,{�̂t} that come then to represent the new, de-trended series

    ŷt+1 =

    Q∑ j=0

    δ̂jt j + �̂t+1

    where the coefficients can be simply estimated by OLS.

    • The appropriate degree of the polynomial can be set by standard t-tests, F-tests, and/or information criteria.

    • The regression is usually estimated using the largest value of Q considered rea- sonable given the type of data or their frequency.


  • • The de-trended process can then be modeled using traditional methods

    Unit Root Process and dth Order Integration: When a time series process {yt} needs to be differenced d times before being reduced to the sum of constant terms plus a white noise process, {yt} is said to contain d unit roots or to be integrated of order d; we also write that yt ∼ I(d).

    Example: RW with drift process. Take its first-difference:

    ∆yt+1 ≡ yt+1 − yt = (µ+ yt + �t+1)− yt = µ+ �t+1

    ⇒ white noise series plus a constant intercept. It is a I(1) and it contains one unit root.

    • If {yt} contains d unit roots, then {yt} is non-stationary, while the opposite does not hold (e.g. explosive autoregressive process: yt+1 = µ+ φyt + �t+1).

    • Typically φ = 1 is used to characterize non-stationarity because it describes ac- curately many time series.

    • All I(d) processes (with d > 0) are non-stationary, but there are non-stationary processes that do not fit the definition of unit root processes.

    Figure 4: De-trended and First-Differenced Series

    Figure 5: Sample ACF of De-trended and First-Differenced Series


  • 1.1 What Happens if One Incorrectly De-Trends a Unit Root Series?

    De-trending a Unit Root Process: When a time series process {yt} is I(d) but an attempt is made to remove its stochastic trend by fitting deterministic (often polyno- mial, spline-type) time trend functions, the resulting OLS residuals will still contain one or more unit roots. Equivalently, deterministic de-trending does not remove the stochastic trends.

    Example: RW with drift process De-trend using a simple linear time trend, i.e. subtracting δ̂0 + δ̂1t

    yt − ŷde−trendt = y0 + µt+ t∑


    �τ − δ̂0 − δ̂1t = (y0 − δ̂0) + (µ− δ̂1)t+ t∑



    that is the resulting expression for yt− ŷde−trendt still equals ∑t

    τ=1 �τ even when δ̂0 = y0 and δ̂1 = µ. Therefore, the source of unit roots and, consequently, the stochastic trend have not been removed.

    Figure 6: Incorrect Polynomial De-trending of a Random Walk

    Figure 6 plots the series yt− ŷde−trendt on the left, and the ACF of yt− ŷde−trendt on the right.

    1.2 What Happens if One Incorrectly Applies Differencing to (Deterministic) Trend-Stationary Series?

    Differentiating a Trend-Stationary Processes: When a time series process {yt} contains a deterministic time trend but it is otherwise I(0), (i.e., it is trend-stationary)


  • and an attempt is made to remove the deterministic trend by differentiating the series d times, the resulting differentiated series will contain d unit roots in its moving-average components and will therefore be not invertible. Equivalently, differentiating a trend- stationary series, creates new stochastic trends that are shifted inside the shocks of the series.

    • The more you differentiate a trend-stationary process, the higher the number of unit roots in the resulting MA process.

    Example: Starting from a deterministic trend

    yt =

    Q∑ j=0

    δjt j + �t

    Take its first difference

    ∆yt ≡ yt−yt−1 = ( Q∑ j=0

    δjt j+�t+1)−(

    Q∑ j=0

    δj (t−1)j +�t-1) = Q∑ j=0

    δj [t j −(t−1)j ]+(�t −�t-1)

    that is a MA(1) with a complex time trend characterized by a unit root in the MA component. When Q = 1 this becomes an integrated moving-average of order 1

    ∆yt = δ1 + (�t − �t-1)

    If we differentiate a second time we obtain an integrated MA(2) process with two visible roots in excess of 1, and an increasingly complex time trend function

    ∆2yt ≡ ∆yt−∆yt−1 = ( Q∑ j=0

    δj[t j−(t−1)j]+�t+1−�t)−(

    Q∑ j=0

    δj [(t−1)j−(t−2)j]+�t-1−�t−2) =


    Q∑ j=0

    δj {[tj − (t − 1)j ] − [(t − 1)j − (t − 2)j ]} + �t − 2�t-1 + �t−2


  • Figure 7: Incorrect First-Differencing of a Trend-Stationary Series

    Figure 7 plots the series ∆yt on the left, and the ACF of ∆yt on the right.

    1.3 What Happens if One Incorrectly Applies Differencing to a Stationary Series?

    • Even when the trend-stationary component is absent, if the ti