y φy ε ,ε ,σ2 t - UW Faculty Web Server · 2006-05-01 · Since the above result holds for any...

Introduction

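Consider the simple AR(1) model for $t = 1, \ldots, T$

$$y_t = \phi y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

If $|\phi| < 1$, then $y_t \sim I(0)$ and

$$y_t = \psi(L)\varepsilon_t, \quad \psi(L) = \sum_{k=0}^{\infty} \psi_k L^k, \quad \psi_k = \phi^k$$

such that

$$\sum_{k=0}^{\infty} k|\psi_k| < \infty$$

$$LRV = \sigma^2 \psi(1)^2 = \sigma^2 (1-\phi)^{-2} < \infty$$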

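Furthermore, by the LLN and the CLT

$$T^{-1}\sum_{t=1}^{T} y_t \xrightarrow{p} E[y_t] = 0$$

$$T^{-1/2}\sum_{t=1}^{T} y_t \xrightarrow{d} N(0, \sigma^2\psi(1)^2) = N(0, \sigma^2(1-\phi)^{-2})$$

$$T^{1/2}(\hat{\phi} - \phi) \xrightarrow{d} N(0, 1-\phi^2)$$

where $\hat{\phi} = \left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1}\sum_{t=1}^{T} y_{t-1}y_t$ is the least squares estimate of $\phi$.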

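If $\phi = 1$, then $y_t \sim I(1)$ and

$$\psi_k = 1, \qquad \sum_{k=0}^{\infty} k|\psi_k| = \infty, \qquad \sigma^2\psi(1)^2 = \infty$$

Furthermore,

$$T^{-1}\sum_{t=1}^{T} y_t \to \infty \text{ as } T \to \infty$$

$$T^{-1/2}\sum_{t=1}^{T} y_t \to \infty \text{ as } T \to \infty$$

$$T^{1/2}(\hat{\phi} - 1) \xrightarrow{p} 0$$

Clearly, the asymptotic results for I(0) processes are not applicable.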

Sample Moments of I(1) Processes

When $\phi = 1$

$$\begin{aligned}
y_t &= y_{t-1} + \varepsilon_t \\
&= y_0 + \sum_{j=1}^{t} \varepsilon_j \\
&= \sum_{j=1}^{t} \varepsilon_j \quad \text{if } y_0 = 0
\end{aligned}$$

Now, consider the sample mean of $y_t$ when $y_0 = 0$:

$$\bar{y} = T^{-1}\sum_{t=1}^{T} y_t = T^{-1}\sum_{t=1}^{T}\left(\sum_{j=1}^{t} \varepsilon_j\right)$$

Notice that the sample mean is a normalized sum of partial sums of the white noise error term $\varepsilon_t$. As such, it exhibits very different probabilistic behavior from a sum of stationary and ergodic errors. It turns out that the limit behavior of $\bar{y}$ when $\phi = 1$ is described by simple functionals of Brownian motion.
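To make the contrast with the stationary case concrete, the double sum can be rewritten as a weighted sum of the errors. The following is a short worked calculation, assuming $y_0 = 0$ and uncorrelated $\varepsilon_t$ with variance $\sigma^2$ (as white noise implies); the index $k = T - j + 1$ is introduced only for this illustration.

$$\sum_{t=1}^{T} y_t = \sum_{t=1}^{T}\sum_{j=1}^{t} \varepsilon_j = \sum_{j=1}^{T} (T - j + 1)\,\varepsilon_j$$

$$\mathrm{Var}(\bar{y}) = \frac{\sigma^2}{T^2}\sum_{j=1}^{T}(T-j+1)^2 = \frac{\sigma^2}{T^2}\sum_{k=1}^{T} k^2 = \frac{\sigma^2 (T+1)(2T+1)}{6T}$$

So the variance of $\bar{y}$ grows roughly like $\sigma^2 T/3$ instead of shrinking like $1/T$, while $T^{-1/2}\bar{y} = T^{-3/2}\sum_{t=1}^{T} y_t$ has variance approaching $\sigma^2/3$. This suggests that sums of an I(1) process need the scaling $T^{-3/2}$ rather than $T^{-1}$ or $T^{-1/2}$.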

Brownian Motion

Standard Brownian motion (Wiener process) is a continuous-time process $W(\cdot)$ associating each date $r \in [0, 1]$ with the scalar random variable $W(r)$ such that

1. $W(0) = 0$

2. For any dates $0 \le r_1 < r_2 < \cdots < r_k \le 1$, the random increments

$$W(r_2) - W(r_1),\ W(r_3) - W(r_2),\ \ldots,\ W(r_k) - W(r_{k-1})$$

are independent Gaussian random variables with $W(t) - W(s) \sim N(0, t - s)$ for $s < t$.

3. For any given realization, $W(r)$ is continuous in $r$ with probability 1. That is, $W(\cdot) \in C[0, 1]$, the space of continuous real-valued functions on $[0, 1]$.

The standard Brownian motion, or Wiener process, may be intuitively thought of as the continuous-time limit of a random walk process in which the integer time index $t = 1, 2, \ldots$ has been rescaled to the continuous time index $r \in [0, 1]$. The Wiener process may be shown to have the following properties:

1. $W(r) \sim N(0, r)$

2. $\sigma W(r) = B(r) \sim N(0, \sigma^2 r)$

3. $W(r)^2 \sim r \cdot \chi^2(1)$

4. $W(r)$ is nowhere differentiable (with probability 1) and exhibits unbounded variation.

Partial Sum Processes and the Functional Central Limit Theorem

Let $\varepsilon_t \sim WN(0, \sigma^2)$. For $r \in [0, 1]$, define the partial sum process

$$X_T(r) = T^{-1}\sum_{t=1}^{[Tr]} \varepsilon_t$$

where $[Tr]$ denotes the integer part of $T \cdot r$.

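For example, let $T = 10$ and consider $X_T(r)$ for $r = 0, 0.01, 0.1, 0.2$:

$$\begin{aligned}
r = 0,\ [10 \cdot 0] = 0 &: \quad X_{10}(0) = \frac{1}{10}\sum_{t=1}^{[10 \cdot 0]} \varepsilon_t = 0 \\
r = 0.01,\ [10 \cdot 0.01] = 0 &: \quad X_{10}(0.01) = \frac{1}{10}\sum_{t=1}^{[10 \cdot 0.01]} \varepsilon_t = 0 \\
r = 0.1,\ [10 \cdot 0.1] = 1 &: \quad X_{10}(0.1) = \frac{1}{10}\sum_{t=1}^{[10 \cdot 0.1]} \varepsilon_t = \frac{\varepsilon_1}{10} \\
r = 0.2,\ [10 \cdot 0.2] = 2 &: \quad X_{10}(0.2) = \frac{1}{10}\sum_{t=1}^{[10 \cdot 0.2]} \varepsilon_t = \frac{\varepsilon_1 + \varepsilon_2}{10}
\end{aligned}$$

In general,

$$X_{10}(r) = \frac{\varepsilon_1 + \cdots + \varepsilon_j}{10}, \qquad \frac{j}{10} \le r < \frac{j+1}{10}$$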
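The step-function construction can also be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name `partial_sum_process`, the sample values in `eps`, and the small floating-point tolerance are all assumptions made for illustration.

```python
import math


def partial_sum_process(eps, r):
    """Return X_T(r) = (1/T) * sum_{t=1}^{[T r]} eps_t for r in [0, 1]."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    T = len(eps)
    # [T r] is the integer part of T * r. The tiny tolerance guards against
    # floating-point products such as 100 * 0.29 = 28.999999999999996; it is
    # an implementation choice, not part of the definition.
    k = min(T, math.floor(T * r + 1e-9))
    return sum(eps[:k]) / T


# Made-up error draws for T = 10 (illustrative values only).
eps = [0.5, -1.0, 2.0, 0.3, -0.7, 1.1, 0.0, -0.4, 0.9, -1.5]

for r in (0.0, 0.01, 0.1, 0.2):
    print(r, partial_sum_process(eps, r))
```

With these values, $X_{10}(0)$ and $X_{10}(0.01)$ are both zero because the sum is empty, while $X_{10}(0.1) = \varepsilon_1/10$ and $X_{10}(0.2) = (\varepsilon_1 + \varepsilon_2)/10$, matching the pattern above.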

For a sequence of errors $\varepsilon_1, \ldots, \varepsilon_T$:

1. The function $X_T(r)$ is a random step function defined on $[0, 1]$.

2. As $T$ gets bigger, the spaces between the steps get smaller and the (suitably scaled) random step function begins to look more and more like a Wiener process.

The Functional Central Limit Theorem

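For any fixed $r \in [0, 1]$, consider

$$\begin{aligned}
\sqrt{T}X_T(r) &= \sqrt{T}\left(T^{-1}\sum_{t=1}^{[Tr]} \varepsilon_t\right) = \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} \varepsilon_t \\
&= \left(\frac{\sqrt{[Tr]}}{\sqrt{T}}\right)\left(\frac{1}{\sqrt{[Tr]}}\sum_{t=1}^{[Tr]} \varepsilon_t\right)
\end{aligned}$$

Now, as $T \to \infty$

$$\frac{\sqrt{[Tr]}}{\sqrt{T}} \to \sqrt{r}$$

$$\frac{1}{\sqrt{[Tr]}}\sum_{t=1}^{[Tr]} \varepsilon_t \xrightarrow{d} N(0, \sigma^2)$$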

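It follows from Slutsky's theorem that

$$\sqrt{T}X_T(r) \xrightarrow{d} N(0, r \cdot \sigma^2) \equiv \sigma \cdot W(r)$$

or

$$\sqrt{T}X_T(r)/\sigma \xrightarrow{d} N(0, r) \equiv W(r)$$

Notice that when $r = 1$, we have the usual result

$$\sqrt{T}X_T(1)/\sigma = \frac{1}{\sigma\sqrt{T}}\sum_{t=1}^{T} \varepsilon_t \xrightarrow{d} N(0, 1) \equiv W(1)$$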
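The pointwise result can be explored with a small Monte Carlo experiment. The sketch below is illustrative only: it assumes i.i.d. Gaussian errors (a special case of white noise), and the function names, seed, sample size and replication count are arbitrary choices rather than anything from the notes. It estimates the variance of $\sqrt{T}X_T(r)/\sigma$, which should be close to $r$.

```python
import math
import random


def scaled_partial_sum(eps, r, sigma):
    """sqrt(T) * X_T(r) / sigma = sum_{t <= [T r]} eps_t / (sigma * sqrt(T))."""
    T = len(eps)
    k = min(T, math.floor(T * r + 1e-9))  # integer part of T * r
    return sum(eps[:k]) / (sigma * math.sqrt(T))


def simulate_variance(r, T=100, reps=2000, sigma=2.0, seed=0):
    """Sample variance of sqrt(T) * X_T(r) / sigma over `reps` replications."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        # Assumption: Gaussian i.i.d. errors with standard deviation sigma.
        eps = [rng.gauss(0.0, sigma) for _ in range(T)]
        draws.append(scaled_partial_sum(eps, r, sigma))
    mean = sum(draws) / reps
    return sum((d - mean) ** 2 for d in draws) / (reps - 1)


for r in (0.25, 0.5, 1.0):
    print(r, simulate_variance(r))
```

The simulated variances should track $r$ up to Monte Carlo error, regardless of the value chosen for `sigma`, since the statistic is standardized by $\sigma$.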

Since the above result holds for any $r \in [0, 1]$, one might expect that the result holds uniformly for $r \in [0, 1]$. In fact, the probability distribution of the sequence of stochastic step functions

$$\{\sqrt{T}X_T(\cdot)/\sigma\}_{T=1}^{\infty}$$

defined on $[0, 1]$ converges asymptotically to that of standard Brownian motion $W(\cdot)$.

This convergence result, known as Donsker's Theorem for Partial Sums or the Functional Central Limit Theorem (FCLT), is often represented as

$$\sqrt{T}X_T(\cdot)/\sigma \Rightarrow W(\cdot)$$

The symbol "$\Rightarrow$" denotes convergence in distribution for random functions.

The Continuous Mapping Theorem

Recall, if $X_T$ is a sequence of random variables such that $X_T \xrightarrow{d} X$ and $g(\cdot)$ is a continuous function, then $g(X_T) \xrightarrow{d} g(X)$. A similar result holds for random functions and is called the Continuous Mapping Theorem (CMT).

Let $\{S_T(\cdot)\}_{T=1}^{\infty}$ be a sequence of random functions such that

$$S_T(\cdot) \Rightarrow S(\cdot)$$

and let $g(\cdot)$ be a continuous functional. Then the CMT states that

$$g(S_T(\cdot)) \Rightarrow g(S(\cdot))$$

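Example 1

Suppose $S_T(\cdot) = \sqrt{T}X_T(\cdot)/\sigma$ so that $S(\cdot) = W(\cdot)$ by the FCLT. Let $g(S_T(\cdot)) = \sigma \cdot S_T(\cdot)$. Then

$$g(S_T(\cdot)) \Rightarrow g(W(\cdot)) = \sigma W(\cdot)$$

Example 2

Let $g(S_T(\cdot)) = \int_0^1 S_T(r)\,dr$. Then

$$g(S_T(\cdot)) \Rightarrow g(W(\cdot)) = \int_0^1 W(r)\,dr$$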

Convergence of Sample Moments of I(1) Processes

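Let $y_t$ be the I(1) process

$$y_t = y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

For $r \in [0, 1]$, define the partial sum process

$$X_T(r) = T^{-1}\sum_{t=1}^{[Tr]} \varepsilon_t$$

such that $\sqrt{T}X_T(\cdot) \Rightarrow \sigma W(\cdot)$. The FCLT and the CMT may be used to deduce the following results:

$$T^{-3/2}\sum_{t=1}^{T} y_{t-1} \Rightarrow \sigma\int_0^1 W(r)\,dr$$

$$T^{-2}\sum_{t=1}^{T} y_{t-1}^2 \Rightarrow \sigma^2\int_0^1 W(r)^2\,dr$$

$$T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t \Rightarrow \sigma^2\int_0^1 W(r)\,dW(r) = \sigma^2\left(W(1)^2 - 1\right)/2$$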
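The closed form for the stochastic integral has a simple discrete-time analogue. The following is a sketch of the argument, assuming $y_0 = 0$ and that a law of large numbers applies to $\varepsilon_t^2$ (for example, i.i.d. errors). Squaring $y_t = y_{t-1} + \varepsilon_t$ and rearranging gives

$$y_{t-1}\varepsilon_t = \tfrac{1}{2}\left(y_t^2 - y_{t-1}^2 - \varepsilon_t^2\right)$$

Summing over $t = 1, \ldots, T$, the squared levels telescope:

$$T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t = \frac{1}{2}\left[\left(T^{-1/2}y_T\right)^2 - T^{-1}\sum_{t=1}^{T}\varepsilon_t^2\right]$$

Since $T^{-1/2}y_T = \sqrt{T}X_T(1) \xrightarrow{d} \sigma W(1)$ and $T^{-1}\sum_{t=1}^{T}\varepsilon_t^2 \xrightarrow{p} \sigma^2$, the right-hand side converges to $\sigma^2\left(W(1)^2 - 1\right)/2$.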

For example, it can be shown that

$$T^{-3/2}\sum_{t=1}^{T} y_{t-1} = \int_0^1 \sqrt{T}X_T(r)\,dr \Rightarrow \sigma\int_0^1 W(r)\,dr$$

using the FCLT and the CMT. The details are given in chapter 17 of Hamilton.
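The equality in the display above is an exact identity and not a limit statement. A brief sketch, assuming $y_0 = 0$: for $(t-1)/T \le r < t/T$ we have $[Tr] = t - 1$, so the step function takes the constant value

$$X_T(r) = T^{-1}\sum_{s=1}^{t-1}\varepsilon_s = T^{-1}y_{t-1}$$

on an interval of width $1/T$. Integrating the step function piece by piece gives

$$\int_0^1 \sqrt{T}X_T(r)\,dr = \sqrt{T}\sum_{t=1}^{T}\left(T^{-1}y_{t-1}\right)\cdot\frac{1}{T} = T^{-3/2}\sum_{t=1}^{T} y_{t-1}$$

The convergence then follows by applying the CMT to the continuous functional $g(S) = \int_0^1 S(r)\,dr$.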

Application: Unit Root Tests

To illustrate the convergence of sample moments of I(1) processes, consider the AR(1) regression

$$y_t = \phi y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

If $\phi = 1$ then $y_t \sim I(1)$; if $|\phi| < 1$ then $y_t \sim I(0)$. A test of $y_t \sim I(1)$ against the alternative that $y_t \sim I(0)$ may therefore be formulated as

$$H_0: \phi = 1 \quad \text{vs.} \quad H_1: |\phi| < 1$$

A natural test statistic is the t-statistic

$$t_{\phi=1} = \frac{\hat{\phi} - 1}{SE(\hat{\phi})}$$

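where

$$\hat{\phi} = \left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1}\sum_{t=1}^{T} y_{t-1}y_t$$

$$SE(\hat{\phi}) = \left(\hat{\sigma}^2\left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1}\right)^{1/2}$$

$$\hat{\sigma}^2 = T^{-1}\sum_{t=1}^{T}\left(y_t - \hat{\phi}y_{t-1}\right)^2$$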
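The three formulas translate directly into code. The following Python sketch is an illustration under stated assumptions: the function name `df_t_statistic` and the toy series are hypothetical, and the input list is assumed to hold $y_0, y_1, \ldots, y_T$ so that there are $T$ usable observations.

```python
import math


def df_t_statistic(y):
    """Return (phi_hat, se, t_stat) for H0: phi = 1, given y = [y_0, ..., y_T]."""
    y_lag, y_cur = y[:-1], y[1:]
    T = len(y_cur)
    s_yy = sum(v * v for v in y_lag)  # sum of y_{t-1}^2
    phi_hat = sum(a * b for a, b in zip(y_lag, y_cur)) / s_yy
    # Residual variance with divisor T, as in the formula above.
    sigma2_hat = sum((b - phi_hat * a) ** 2 for a, b in zip(y_lag, y_cur)) / T
    se = math.sqrt(sigma2_hat / s_yy)
    return phi_hat, se, (phi_hat - 1.0) / se


# Toy series chosen so the arithmetic can be checked by hand.
y = [0.0, 1.0, 2.0, 3.0]
phi_hat, se, t_stat = df_t_statistic(y)
print(phi_hat, se, t_stat)
```

For the toy series, $\sum y_{t-1}^2 = 5$ and $\sum y_{t-1}y_t = 8$, so $\hat{\phi} = 1.6$ and $\hat{\sigma}^2 = 0.4$. In practice the statistic is compared with Dickey-Fuller critical values rather than standard normal ones, because its null distribution is nonstandard.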

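Consistency of $\hat{\phi}$ under $H_0: \phi = 1$

Under $H_0: \phi = 1$

$$\hat{\phi} - 1 = \left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t = \left(T^{-2}\sum_{t=1}^{T} y_{t-1}^2\right)^{-1} T^{-2}\sum_{t=1}^{T} y_{t-1}\varepsilon_t$$

Using the results

$$T^{-2}\sum_{t=1}^{T} y_{t-1}^2 \Rightarrow \sigma^2\int_0^1 W(r)^2\,dr$$

$$T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t \Rightarrow \sigma^2\int_0^1 W(r)\,dW(r)$$

and the CMT, and noting that $T^{-2}\sum_{t=1}^{T} y_{t-1}\varepsilon_t = T^{-1}\left(T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t\right) \xrightarrow{p} 0$, it follows that

$$\hat{\phi} - 1 \xrightarrow{p} \left(\sigma^2\int_0^1 W(r)^2\,dr\right)^{-1} \times 0 = 0$$

so that $\hat{\phi} \xrightarrow{p} 1$.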

DF Test with Intercept

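$$y_t = c + \phi y_{t-1} + \varepsilon_t = x_t'\beta + \varepsilon_t$$

$$x_t = (1, y_{t-1})', \quad \beta = (c, \phi)'$$

OLS gives

$$\hat{\beta} = \left(\sum_{t=1}^{T} x_t x_t'\right)^{-1}\sum_{t=1}^{T} x_t y_t$$

$$\sum_{t=1}^{T} x_t x_t' = \begin{pmatrix} T & \sum_{t=1}^{T} y_{t-1} \\ \sum_{t=1}^{T} y_{t-1} & \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}$$

$$\sum_{t=1}^{T} x_t y_t = \begin{pmatrix} \sum_{t=1}^{T} y_t \\ \sum_{t=1}^{T} y_{t-1}y_t \end{pmatrix}$$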

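Now, under $H_0: \phi = 1$ and $c = 0$

$$\hat{\beta} - \beta = \begin{pmatrix} \hat{c} - 0 \\ \hat{\phi} - 1 \end{pmatrix} = \left(\sum_{t=1}^{T} x_t x_t'\right)^{-1}\sum_{t=1}^{T} x_t\varepsilon_t = \begin{pmatrix} T & \sum_{t=1}^{T} y_{t-1} \\ \sum_{t=1}^{T} y_{t-1} & \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}^{-1}\begin{pmatrix} \sum_{t=1}^{T} \varepsilon_t \\ \sum_{t=1}^{T} y_{t-1}\varepsilon_t \end{pmatrix}$$

Problem: Elements of $\sum_{t=1}^{T} x_t x_t'$ and $\sum_{t=1}^{T} x_t\varepsilon_t$ converge at different rates!

$$\begin{pmatrix} T & \sum_{t=1}^{T} y_{t-1} \\ \sum_{t=1}^{T} y_{t-1} & \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix} = \begin{pmatrix} O(T) & O_p(T^{3/2}) \\ O_p(T^{3/2}) & O_p(T^2) \end{pmatrix}$$

$$\begin{pmatrix} \sum_{t=1}^{T} \varepsilon_t \\ \sum_{t=1}^{T} y_{t-1}\varepsilon_t \end{pmatrix} = \begin{pmatrix} O_p(T^{1/2}) \\ O_p(T) \end{pmatrix}$$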

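Implication: Cannot get sensible convergence results using traditional scaling

$$\begin{aligned}
T(\hat{\beta} - \beta) &= \left(T^{-2}\sum_{t=1}^{T} x_t x_t'\right)^{-1} T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t \\
&= \begin{pmatrix} T^{-1} & T^{-2}\sum_{t=1}^{T} y_{t-1} \\ T^{-2}\sum_{t=1}^{T} y_{t-1} & T^{-2}\sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}^{-1} \times \begin{pmatrix} T^{-1}\sum_{t=1}^{T} \varepsilon_t \\ T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t \end{pmatrix} \\
&\Rightarrow \begin{pmatrix} 0 & 0 \\ 0 & \sigma^2\int_0^1 W(r)^2\,dr \end{pmatrix}^{-1} \times \begin{pmatrix} 0 \\ \sigma^2\int_0^1 W(r)\,dW(r) \end{pmatrix}
\end{aligned}$$

which is not well defined, since the limiting matrix is singular.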

Sims-Stock-Watson Trick

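Define the diagonal and invertible scaling matrix

$$D_T = \begin{pmatrix} T^{1/2} & 0 \\ 0 & T \end{pmatrix}$$

Then write

$$\begin{aligned}
D_T(\hat{\beta} - \beta) &= D_T\left(\sum_{t=1}^{T} x_t x_t'\right)^{-1} D_T D_T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t \\
&= \left(D_T^{-1}\sum_{t=1}^{T} x_t x_t' D_T^{-1}\right)^{-1} D_T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t
\end{aligned}$$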

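where

$$D_T(\hat{\beta} - \beta) = \begin{pmatrix} T^{1/2}\hat{c} \\ T(\hat{\phi} - 1) \end{pmatrix}$$

$$D_T^{-1}\sum_{t=1}^{T} x_t x_t' D_T^{-1} = \begin{pmatrix} 1 & T^{-3/2}\sum_{t=1}^{T} y_{t-1} \\ T^{-3/2}\sum_{t=1}^{T} y_{t-1} & T^{-2}\sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}$$

$$D_T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t = \begin{pmatrix} T^{-1/2}\sum_{t=1}^{T} \varepsilon_t \\ T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t \end{pmatrix}$$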

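Therefore,

$$D_T(\hat{\beta} - \beta) \Rightarrow \begin{pmatrix} 1 & \sigma\int_0^1 W(r)\,dr \\ \sigma\int_0^1 W(r)\,dr & \sigma^2\int_0^1 W(r)^2\,dr \end{pmatrix}^{-1} \times \begin{pmatrix} \sigma W(1) \\ \sigma^2\int_0^1 W(r)\,dW(r) \end{pmatrix}$$

where $\sigma W(1) \sim N(0, \sigma^2)$. Straightforward algebra shows that

$$T^{1/2}\hat{c} \overset{d}{\nrightarrow} N(0, \sigma^2)$$

$$T(\hat{\phi} - 1) \Rightarrow \left(\int_0^1 W^{\mu}(r)^2\,dr\right)^{-1}\int_0^1 W^{\mu}(r)\,dW(r)$$

$$W^{\mu}(r) = W(r) - \int_0^1 W(s)\,ds$$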
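The algebra behind the demeaned Brownian motion can be sketched as follows. The shorthand $A = \int_0^1 W(r)\,dr$, $B = \int_0^1 W(r)^2\,dr$ and $C = \int_0^1 W(r)\,dW(r)$ is introduced here only for illustration, and $\sigma W(1)$ denotes the joint limit of $T^{-1/2}\sum_{t=1}^{T}\varepsilon_t$. Inverting the $2 \times 2$ limit matrix gives

$$\begin{pmatrix} 1 & \sigma A \\ \sigma A & \sigma^2 B \end{pmatrix}^{-1}\begin{pmatrix} \sigma W(1) \\ \sigma^2 C \end{pmatrix} = \frac{1}{\sigma^2(B - A^2)}\begin{pmatrix} \sigma^2 B & -\sigma A \\ -\sigma A & 1 \end{pmatrix}\begin{pmatrix} \sigma W(1) \\ \sigma^2 C \end{pmatrix} = \begin{pmatrix} \dfrac{\sigma\left(B\,W(1) - A\,C\right)}{B - A^2} \\[2ex] \dfrac{C - A\,W(1)}{B - A^2} \end{pmatrix}$$

The second element matches the stated limit because

$$\int_0^1 W^{\mu}(r)^2\,dr = B - 2A^2 + A^2 = B - A^2, \qquad \int_0^1 W^{\mu}(r)\,dW(r) = C - A\,W(1)$$

The first element is a ratio of functionals of $W$, which is why $T^{1/2}\hat{c}$ is not asymptotically normal. Note also that $\sigma$ cancels from the limit of $T(\hat{\phi} - 1)$.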

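Convergence of Sample Moments with General Serial Correlation

$$y_t = y_{t-1} + \psi^*(L)\varepsilon_t = y_{t-1} + u_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

$\psi^*(L)$ is 1-summable

$$LRV = \sigma^2\psi^*(1)^2 = \gamma_0 + 2\sum_{j=1}^{\infty}\gamma_j$$

$$\gamma_j = \mathrm{cov}(u_t, u_{t-j})$$

FCLT

$$\sqrt{T}X_T(\cdot) = \frac{1}{\sqrt{T}}\sum_{t=1}^{[T\cdot]} u_t \Rightarrow \sqrt{LRV} \times W(\cdot)$$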

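1. $T^{-3/2}\sum_{t=1}^{T} y_{t-1} \Rightarrow \sqrt{LRV}\int_0^1 W(r)\,dr$

2. $T^{-2}\sum_{t=1}^{T} y_{t-1}^2 \Rightarrow LRV\int_0^1 W(r)^2\,dr$

3. $T^{-1}\sum_{t=1}^{T} y_{t-1}u_t \Rightarrow LRV\int_0^1 W(r)\,dW(r) + \omega$, where $\omega = \frac{1}{2}(LRV - \gamma_0)$

4. $T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t \Rightarrow \sqrt{\sigma^2 LRV}\int_0^1 W(r)\,dW(r)$

5. $T^{-1}\sum_{t=1}^{T} y_{t-1}u_{t-1} \Rightarrow LRV\int_0^1 W(r)\,dW(r) + \omega + \gamma_0$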

Application: Asymptotic Distribution of ADF test

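Assume $y_t$ is I(1) and that $\Delta y_t \sim AR(1)$

$$\Delta y_t = \xi\Delta y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2), \quad |\xi| < 1$$

Therefore, $\Delta y_t$ has Wold representation

$$\Delta y_t = \psi^*(L)\varepsilon_t = u_t$$

$$\psi^*(L) = (1 - \xi L)^{-1} = \sum_{j=0}^{\infty}\psi_j^* L^j, \quad \psi_j^* = \xi^j$$

$$LRV = \sigma^2\psi^*(1)^2 = \sigma^2(1 - \xi)^{-2}$$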
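The long-run variance of AR(1) errors can be checked against its autocovariance definition, $LRV = \gamma_0 + 2\sum_{j \ge 1}\gamma_j$. The Python sketch below is illustrative: it uses the standard AR(1) autocovariances $\gamma_j = \sigma^2\xi^{j}/(1 - \xi^2)$, and the function names, the truncation point and the parameter values are assumptions made for the example.

```python
def ar1_autocovariance(j, xi, sigma2):
    """gamma_j = sigma^2 * xi^|j| / (1 - xi^2) for a stationary AR(1)."""
    return sigma2 * xi ** abs(j) / (1.0 - xi ** 2)


def long_run_variance_sum(xi, sigma2, n_terms=500):
    """Truncated version of gamma_0 + 2 * sum_{j >= 1} gamma_j."""
    gamma0 = ar1_autocovariance(0, xi, sigma2)
    tail = sum(ar1_autocovariance(j, xi, sigma2) for j in range(1, n_terms + 1))
    return gamma0 + 2.0 * tail


xi, sigma2 = 0.5, 1.0                      # illustrative parameter values
lrv_sum = long_run_variance_sum(xi, sigma2)
lrv_closed = sigma2 / (1.0 - xi) ** 2      # sigma^2 * psi*(1)^2
gamma0 = ar1_autocovariance(0, xi, sigma2)
omega = 0.5 * (lrv_closed - gamma0)        # omega = (LRV - gamma_0) / 2
print(lrv_sum, lrv_closed, omega)
```

With $\xi = 0.5$ and $\sigma^2 = 1$, both routes give a long-run variance of 4, compared with $\gamma_0 = 4/3$, so the serial correlation term $\omega$ is far from negligible here.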

The ADF test regression is

$$y_t = \phi y_{t-1} + \xi\Delta y_{t-1} + \varepsilon_t = x_t'\beta + \varepsilon_t$$

$$x_t = (y_{t-1}, \Delta y_{t-1})', \quad \beta = (\phi, \xi)'$$

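Notice that

$$x_t = \begin{pmatrix} y_{t-1} \\ \Delta y_{t-1} \end{pmatrix} \sim \begin{pmatrix} I(1) \\ I(0) \end{pmatrix}$$

OLS on the ADF test regression gives

$$\hat{\beta} - \beta = \left(\sum_{t=1}^{T} x_t x_t'\right)^{-1}\sum_{t=1}^{T} x_t\varepsilon_t$$

where

$$\sum_{t=1}^{T} x_t x_t' = \begin{pmatrix} \sum_{t=1}^{T} y_{t-1}^2 & \sum_{t=1}^{T} y_{t-1}\Delta y_{t-1} \\ \sum_{t=1}^{T} \Delta y_{t-1}y_{t-1} & \sum_{t=1}^{T} \Delta y_{t-1}^2 \end{pmatrix} = \begin{pmatrix} O_p(T^2) & O_p(T) \\ O_p(T) & O_p(T) \end{pmatrix}$$

$$\sum_{t=1}^{T} x_t\varepsilon_t = \begin{pmatrix} \sum_{t=1}^{T} y_{t-1}\varepsilon_t \\ \sum_{t=1}^{T} \Delta y_{t-1}\varepsilon_t \end{pmatrix} = \begin{pmatrix} O_p(T) \\ O_p(T^{1/2}) \end{pmatrix}$$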

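Use the Sims-Stock-Watson trick and define the scaling matrix

$$D_T = \begin{pmatrix} T & 0 \\ 0 & T^{1/2} \end{pmatrix}$$

Then write

$$\begin{aligned}
D_T(\hat{\beta} - \beta) &= D_T\left(\sum_{t=1}^{T} x_t x_t'\right)^{-1} D_T D_T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t \\
&= \left(D_T^{-1}\sum_{t=1}^{T} x_t x_t' D_T^{-1}\right)^{-1} D_T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t
\end{aligned}$$

where

$$D_T(\hat{\beta} - \beta) = \begin{pmatrix} T(\hat{\phi} - 1) \\ T^{1/2}(\hat{\xi} - \xi) \end{pmatrix}$$

$$D_T^{-1}\sum_{t=1}^{T} x_t x_t' D_T^{-1} = \begin{pmatrix} T^{-2}\sum_{t=1}^{T} y_{t-1}^2 & T^{-3/2}\sum_{t=1}^{T} y_{t-1}\Delta y_{t-1} \\ T^{-3/2}\sum_{t=1}^{T} \Delta y_{t-1}y_{t-1} & T^{-1}\sum_{t=1}^{T} \Delta y_{t-1}^2 \end{pmatrix}$$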

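and

$$D_T^{-1}\sum_{t=1}^{T} x_t\varepsilon_t = \begin{pmatrix} T^{-1}\sum_{t=1}^{T} y_{t-1}\varepsilon_t \\ T^{-1/2}\sum_{t=1}^{T} \Delta y_{t-1}\varepsilon_t \end{pmatrix}$$

Note: $\Delta y_{t-1}\varepsilon_t = u_{t-1}\varepsilon_t$ is a stationary and ergodic MDS with

$$E[(u_{t-1}\varepsilon_t)^2] = E\left[E[(u_{t-1}\varepsilon_t)^2 \mid I_{t-1}]\right] = E\left[u_{t-1}^2 E[\varepsilon_t^2 \mid I_{t-1}]\right] = \sigma^2\gamma_0$$

Therefore, by the appropriate CLT

$$T^{-1/2}\sum_{t=1}^{T} \Delta y_{t-1}\varepsilon_t \xrightarrow{d} N(0, \sigma^2\gamma_0)$$

Using the convergence results for the sample moments of serially correlated I(1) processes, the above result, and the CMT, and noting that $T^{-3/2}\sum_{t=1}^{T} y_{t-1}\Delta y_{t-1} \xrightarrow{p} 0$ while $T^{-1}\sum_{t=1}^{T} \Delta y_{t-1}^2 \xrightarrow{p} \gamma_0$, gives

$$T(\hat{\phi} - 1) \Rightarrow \frac{\sigma}{\sqrt{LRV}} \cdot \frac{\int_0^1 W(r)\,dW(r)}{\int_0^1 W(r)^2\,dr}$$

$$T^{1/2}(\hat{\xi} - \xi) \xrightarrow{d} \gamma_0^{-1}N(0, \sigma^2\gamma_0) = N(0, \sigma^2/\gamma_0)$$

Here $\sigma/\sqrt{LRV} = 1/\psi^*(1) = 1 - \xi$, so $T(\hat{\phi} - 1)/(1 - \xi)$ has the Dickey-Fuller limit $\int_0^1 W(r)\,dW(r) \big/ \int_0^1 W(r)^2\,dr$. Furthermore, $\hat{\phi}$ and $\hat{\xi}$ are asymptotically independent.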
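The ADF regression itself is a two-regressor least squares problem, so it can be solved with the explicit $2 \times 2$ normal equations. The Python sketch below is a hedged illustration rather than a production routine: the helper names `make_series` and `adf_ols`, the starting values and the simulated Gaussian shocks are all assumptions, and no intercept or trend is included, matching the regression above.

```python
import random


def make_series(phi, xi, eps, y0=0.0, y1=1.0):
    """Build y from y_t = phi * y_{t-1} + xi * (y_{t-1} - y_{t-2}) + eps_t."""
    y = [y0, y1]
    for e in eps:
        y.append(phi * y[-1] + xi * (y[-1] - y[-2]) + e)
    return y


def adf_ols(y):
    """OLS of y_t on (y_{t-1}, dy_{t-1}) without intercept: (phi_hat, xi_hat)."""
    rows = [(y[t - 1], y[t - 1] - y[t - 2], y[t]) for t in range(2, len(y))]
    a11 = sum(x1 * x1 for x1, _, _ in rows)    # sum of y_{t-1}^2
    a12 = sum(x1 * x2 for x1, x2, _ in rows)   # sum of y_{t-1} * dy_{t-1}
    a22 = sum(x2 * x2 for _, x2, _ in rows)    # sum of dy_{t-1}^2
    b1 = sum(x1 * yt for x1, _, yt in rows)
    b2 = sum(x2 * yt for _, x2, yt in rows)
    det = a11 * a22 - a12 * a12
    # Explicit inverse of the symmetric 2 x 2 moment matrix.
    phi_hat = (a22 * b1 - a12 * b2) / det
    xi_hat = (a11 * b2 - a12 * b1) / det
    return phi_hat, xi_hat


rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]  # assumed Gaussian errors
print(adf_ols(make_series(1.0, 0.5, shocks)))
```

As a sanity check on the algebra, feeding the routine a noise-free series generated with $\phi = 1$ and $\xi = 0.5$ recovers both coefficients up to floating-point error; with random shocks, $\hat{\phi}$ is typically much closer to its true value than $\hat{\xi}$, reflecting the faster $T$ rate.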