
Introduction

Consider the simple AR(1) model for t = 1, ..., T:

$$y_t = \phi y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

If $|\phi| < 1$, then $y_t \sim I(0)$ and

$$y_t = \psi(L)\varepsilon_t, \quad \psi(L) = \sum_{k=0}^{\infty} \psi_k L^k, \quad \psi_k = \phi^k$$

such that

$$\sum_{k=0}^{\infty} k|\psi_k| < \infty, \qquad \mathrm{LRV} = \sigma^2 \psi(1)^2 = \sigma^2 (1-\phi)^{-2} < \infty$$

Furthermore, by the LLN and the CLT,

$$T^{-1} \sum_{t=1}^{T} y_t \xrightarrow{p} E[y_t] = 0$$

$$T^{-1/2} \sum_{t=1}^{T} y_t \xrightarrow{d} N(0, \sigma^2 \psi(1)^2) = N(0, \sigma^2 (1-\phi)^{-2})$$

$$T^{1/2}(\hat{\phi} - \phi) \xrightarrow{d} N(0, 1 - \phi^2)$$

where $\hat{\phi} = \left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1} \sum_{t=1}^{T} y_{t-1} y_t$ is the least squares estimate of $\phi$.

If $\phi = 1$, then $y_t \sim I(1)$ and

$$\psi_k = 1 \text{ for all } k, \qquad \sum_{k=0}^{\infty} k|\psi_k| = \infty, \qquad \sigma^2 \psi(1)^2 = \infty$$

Furthermore, the scaled sums that behave so well in the stationary case now diverge as $T \to \infty$:

$$T^{-1} \sum_{t=1}^{T} y_t = O_p(T^{1/2}), \qquad T^{-1/2} \sum_{t=1}^{T} y_t = O_p(T)$$

while

$$T^{1/2}(\hat{\phi} - 1) \xrightarrow{p} 0$$

Clearly, the asymptotic results for I(0) processes are not applicable.
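To see the contrast concretely, here is a minimal numpy sketch that simulates the AR(1) model and computes the least squares estimate $\hat{\phi}$; the sample sizes, the seed, and the stationary value $\phi = 0.5$ are arbitrary illustrative choices. In the stationary case $\sqrt{T}(\hat{\phi} - \phi)$ fluctuates like an $N(0, 1 - \phi^2)$ draw, while in the unit root case $T(\hat{\phi} - 1)$ stays bounded.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(phi, T, sigma=1.0):
    """Simulate y_t = phi*y_{t-1} + eps_t, t = 1..T, with y_0 = 0."""
    eps = rng.normal(0.0, sigma, T)
    y = np.empty(T + 1)
    y[0] = 0.0
    for t in range(1, T + 1):
        y[t] = phi * y[t - 1] + eps[t - 1]
    return y

def ols_phi(y):
    """phi_hat = (sum y_{t-1}^2)^{-1} sum y_{t-1} y_t."""
    return np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)

for T in (100, 1000, 10000):
    phi_stat = ols_phi(simulate_ar1(0.5, T))   # stationary case, phi = 0.5
    phi_unit = ols_phi(simulate_ar1(1.0, T))   # unit root case, phi = 1
    print(f"T={T:6d}  sqrt(T)*(phi_hat-0.5) = {np.sqrt(T)*(phi_stat-0.5):+7.3f}   "
          f"T*(phi_hat-1) = {T*(phi_unit-1.0):+7.3f}")
```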

Sample Moments of I(1) Processes

When $\phi = 1$,

$$y_t = y_{t-1} + \varepsilon_t = y_0 + \sum_{j=1}^{t} \varepsilon_j = \sum_{j=1}^{t} \varepsilon_j \quad \text{if } y_0 = 0$$

Now consider the sample mean of $y_t$ when $y_0 = 0$:

$$\bar{y} = T^{-1} \sum_{t=1}^{T} y_t = T^{-1} \sum_{t=1}^{T} \left( \sum_{j=1}^{t} \varepsilon_j \right)$$

Notice that the sample mean is a normalized sum of partial sums of the white noise error term $\varepsilon_t$. As such, it exhibits very different probabilistic behavior than the sum of stationary and ergodic errors. It turns out that the limit behavior of $\bar{y}$ when $\phi = 1$ is described by simple functionals of Brownian motion.
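The following Monte Carlo sketch ($\sigma = 1$; the number of replications, the sample sizes, and the seed are arbitrary choices) illustrates the point: the sample mean of a driftless random walk does not settle down as $T$ grows, whereas the rescaled sum $T^{-3/2}\sum_t y_t$ has a stable, nondegenerate distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
nrep = 2000
for T in (100, 400, 1600):
    eps = rng.normal(size=(nrep, T))
    y = np.cumsum(eps, axis=1)               # y_t = sum_{j<=t} eps_j  (y_0 = 0)
    ybar = y.mean(axis=1)                    # T^{-1} sum_t y_t
    rescaled = y.sum(axis=1) / T**1.5        # T^{-3/2} sum_t y_t
    print(f"T={T:5d}  sd(ybar) = {ybar.std():6.2f}   "
          f"sd(T^-3/2 * sum y_t) = {rescaled.std():5.3f}")
```

The first column grows roughly like $\sqrt{T/3}$, while the second stabilizes near $1/\sqrt{3} \approx 0.58$, the standard deviation of $\int_0^1 W(r)\,dr$.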

Brownian Motion

Standard Brownian motion (the Wiener process) is a continuous-time process $W(\cdot)$ associating with each date $r \in [0, 1]$ a scalar random variable $W(r)$ such that:

1. $W(0) = 0$.

2. For any dates $0 \le r_1 < r_2 < \cdots < r_k \le 1$, the random increments

$$W(r_2) - W(r_1),\; W(r_3) - W(r_2),\; \ldots,\; W(r_k) - W(r_{k-1})$$

are independent Gaussian random variables with $W(t) - W(s) \sim N(0, t - s)$.

3. For any given realization, $W(r)$ is continuous in $r$ with probability 1. That is, $W(r) \in C[0, 1]$, the space of continuous real-valued functions on $[0, 1]$.

The standard Brownian motion, or Wiener process, may be intuitively thought of as the continuous-time limit of a random walk in which the integer time index $t = 1, 2, \ldots, T$ has been rescaled to the continuous time index $r \in [0, 1]$. The Wiener process may be shown to have the following properties (a simulation of the construction is sketched after the list):

1. $W(r) \sim N(0, r)$

2. $\sigma W(r) = B(r) \sim N(0, \sigma^2 r)$

3. $W(r)^2 \sim r \cdot \chi^2(1)$

4. $W(r)$ is not differentiable and exhibits unbounded variation.
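A minimal sketch of the construction: approximate $W$ on a grid of $N$ points by the scaled cumulative sums of standard normal steps, and check the property $W(r) \sim N(0, r)$ by Monte Carlo. The grid size $N$, the number of replications, and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N, nrep = 1000, 5000
steps = rng.normal(size=(nrep, N))
W = np.cumsum(steps, axis=1) / np.sqrt(N)    # W[:, k-1] approximates W(k/N)
for r in (0.25, 0.5, 1.0):
    k = int(N * r)
    print(f"r = {r:4.2f}   sample var of W(r) = {W[:, k-1].var():5.3f}   (theory: {r})")
```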

Partial Sum Processes and the Functional Central Limit Theorem

Let $\varepsilon_t \sim WN(0, \sigma^2)$. For $r \in [0, 1]$, define the partial sum process

$$X_T(r) = T^{-1} \sum_{t=1}^{[Tr]} \varepsilon_t, \qquad [Tr] = \text{integer part of } T \cdot r$$

For example, let $T = 10$ and consider $X_T(r)$ for $r = 0, 0.01, 0.1, 0.2$:

$$r = 0, \; [10 \cdot 0] = 0: \quad X_{10}(0) = \frac{1}{10} \sum_{t=1}^{0} \varepsilon_t = 0$$

$$r = 0.01, \; [10 \cdot 0.01] = 0: \quad X_{10}(0.01) = \frac{1}{10} \sum_{t=1}^{0} \varepsilon_t = 0$$

$$r = 0.1, \; [10 \cdot 0.1] = 1: \quad X_{10}(0.1) = \frac{1}{10} \sum_{t=1}^{1} \varepsilon_t = \frac{\varepsilon_1}{10}$$

$$r = 0.2, \; [10 \cdot 0.2] = 2: \quad X_{10}(0.2) = \frac{1}{10} \sum_{t=1}^{2} \varepsilon_t = \frac{\varepsilon_1 + \varepsilon_2}{10}$$

In general,

$$X_{10}(r) = \frac{\varepsilon_1 + \cdots + \varepsilon_j}{10}, \qquad \frac{j}{10} \le r < \frac{j+1}{10}$$

For a sequence of errors $\varepsilon_1, \ldots, \varepsilon_T$:

1. the function $X_T(r)$ is a random step function defined on $[0, 1]$;

2. as $T$ gets bigger, the spaces between the steps get smaller and the random step function begins to look more and more like a Wiener process, as the sketch after this list illustrates.
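A direct implementation of the partial-sum step function; the $T = 10$ cases mirror the example worked out above, and the $\varepsilon$ values are arbitrary standard normal draws.

```python
import numpy as np

def X_T(r, eps):
    """Partial-sum process: X_T(r) = T^{-1} * sum_{t=1}^{[T*r]} eps_t."""
    T = len(eps)
    k = int(np.floor(T * r))                 # [T*r] = integer part of T*r
    return eps[:k].sum() / T

rng = np.random.default_rng(3)
eps = rng.normal(size=10)
for r in (0.0, 0.01, 0.1, 0.2):
    print(f"X_10({r}) = {X_T(r, eps):+.4f}")
# X_10(0) = X_10(0.01) = 0, X_10(0.1) = eps_1/10, X_10(0.2) = (eps_1 + eps_2)/10
```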

The Functional Central Limit Theorem

For any fixed $r \in [0, 1]$, consider

$$\sqrt{T} X_T(r) = \sqrt{T} \left( T^{-1} \sum_{t=1}^{[Tr]} \varepsilon_t \right) = \frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} \varepsilon_t = \left( \frac{\sqrt{[Tr]}}{\sqrt{T}} \right) \left( \frac{1}{\sqrt{[Tr]}} \sum_{t=1}^{[Tr]} \varepsilon_t \right)$$

Now, as $T \to \infty$,

$$\frac{\sqrt{[Tr]}}{\sqrt{T}} \to \sqrt{r}, \qquad \frac{1}{\sqrt{[Tr]}} \sum_{t=1}^{[Tr]} \varepsilon_t \xrightarrow{d} N(0, \sigma^2)$$

It follows from Slutsky's theorem that

$$\sqrt{T} X_T(r) \xrightarrow{d} N(0, r \cdot \sigma^2) \equiv \sigma \cdot W(r)$$

or

$$\sqrt{T} X_T(r)/\sigma \xrightarrow{d} N(0, r) \equiv W(r)$$

Notice that when $r = 1$, we have the usual result

$$\sqrt{T} X_T(1)/\sigma = \frac{1}{\sigma \sqrt{T}} \sum_{t=1}^{T} \varepsilon_t \xrightarrow{d} N(0, 1) \equiv W(1)$$

Since the above result holds for any $r \in [0, 1]$, one might expect that the result holds uniformly in $r \in [0, 1]$. In fact, the probability distribution of the sequence of stochastic step functions

$$\{\sqrt{T} X_T(\cdot)/\sigma\}_{T=1}^{\infty}$$

defined on $[0, 1]$ converges to that of standard Brownian motion $W(\cdot)$.

This convergence result, known as Donsker's Theorem for Partial Sums or the Functional Central Limit Theorem (FCLT), is often represented as

$$\sqrt{T} X_T(\cdot)/\sigma \Rightarrow W(\cdot)$$

The symbol "$\Rightarrow$" denotes convergence in distribution for random functions.
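A hedged illustration of the FCLT: with white noise errors that are deliberately non-Gaussian (standardized uniforms here; $T$, the number of replications, and the seed are arbitrary choices), $\sqrt{T} X_T(r)/\sigma$ is still approximately $N(0, r)$ at each fixed $r$ once $T$ is large.

```python
import numpy as np

rng = np.random.default_rng(4)
T, nrep, sigma = 2000, 5000, 1.0
# Uniform(-sqrt(3), sqrt(3)) errors have mean 0 and variance 1 but are not Gaussian.
eps = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(nrep, T))
partial = np.cumsum(eps, axis=1)
for r in (0.3, 0.7, 1.0):
    k = int(T * r)
    z = partial[:, k - 1] / (sigma * np.sqrt(T))   # sqrt(T) * X_T(r) / sigma
    print(f"r = {r:3.1f}   mean = {z.mean():+.3f}   var = {z.var():.3f}   (theory: 0, {r})")
```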

The Continuous Mapping Theorem

Recall that if $X_T$ is a sequence of random variables such that $X_T \xrightarrow{d} X$ and $g(\cdot)$ is a continuous function, then $g(X_T) \xrightarrow{d} g(X)$. A similar result holds for random functions and is called the Continuous Mapping Theorem (CMT).

Let $\{S_T(\cdot)\}_{T=1}^{\infty}$ be a sequence of random functions such that

$$S_T(\cdot) \Rightarrow S(\cdot), \qquad g(\cdot) = \text{continuous functional}$$

Then the CMT states that

$$g(S_T(\cdot)) \Rightarrow g(S(\cdot))$$

Example 1

Suppose $S_T(\cdot) = \sqrt{T} X_T(\cdot)/\sigma$, so that $S(\cdot) = W(\cdot)$ by the FCLT. Let $g(S_T(\cdot)) = \sigma \cdot S_T(\cdot)$. Then

$$g(S_T(\cdot)) \Rightarrow g(W(\cdot)) = \sigma W(\cdot)$$

Example 2

Let $g(S_T(\cdot)) = \int_0^1 S_T(r)\, dr$. Then

$$g(S_T(\cdot)) \Rightarrow g(W(\cdot)) = \int_0^1 W(r)\, dr$$
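As a concrete check of Example 2 (a sketch; $T$, the replication count, and the seed are arbitrary), the integral of the step function $S_T(r) = \sqrt{T} X_T(r)/\sigma$ can be computed as an average of the scaled partial sums, and its Monte Carlo distribution should be close to that of $\int_0^1 W(r)\,dr$, which is $N(0, 1/3)$.

```python
import numpy as np

rng = np.random.default_rng(5)
T, nrep, sigma = 2000, 5000, 1.0
eps = rng.normal(0.0, sigma, size=(nrep, T))
S = np.cumsum(eps, axis=1) / (sigma * np.sqrt(T))   # S_T(t/T) for t = 1..T
integral = S.mean(axis=1)                           # Riemann sum for int_0^1 S_T(r) dr
print(f"mean = {integral.mean():+.3f}   var = {integral.var():.3f}   "
      f"(theory: 0, 1/3 = 0.333)")
```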

Convergence of Sample Moments of I(1) Processes

Let $y_t$ be the I(1) process

$$y_t = y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

For $r \in [0, 1]$, define the partial sum process

$$X_T(r) = T^{-1} \sum_{t=1}^{[Tr]} \varepsilon_t$$

such that $\sqrt{T} X_T(\cdot) \Rightarrow \sigma W(\cdot)$. The FCLT and the CMT may be used to deduce the following results:

$$T^{-3/2} \sum_{t=1}^{T} y_{t-1} \Rightarrow \sigma \int_0^1 W(r)\, dr$$

$$T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \Rightarrow \sigma^2 \int_0^1 W(r)^2\, dr$$

$$T^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \Rightarrow \sigma^2 \int_0^1 W(r)\, dW(r) = \sigma^2 \left( W(1)^2 - 1 \right)/2$$

For example, it can be shown that

$$T^{-3/2} \sum_{t=1}^{T} y_{t-1} = \int_0^1 \sqrt{T} X_T(r)\, dr \Rightarrow \sigma \int_0^1 W(r)\, dr$$

using the FCLT and the CMT. The details are given in Chapter 17 of Hamilton.
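A Monte Carlo sketch of these three limits with $\sigma = 1$ (so the $\sigma$ factors drop out), checked against the known moments $E[\int_0^1 W\,dr] = 0$ and $\mathrm{Var}[\int_0^1 W\,dr] = 1/3$, $E[\int_0^1 W^2\,dr] = 1/2$, and $E[\int_0^1 W\,dW] = 0$ with $\mathrm{Var}[\int_0^1 W\,dW] = 1/2$. Sample size, replication count, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(6)
T, nrep = 2000, 5000
eps = rng.normal(size=(nrep, T))
y = np.cumsum(eps, axis=1)                            # y_1, ..., y_T with y_0 = 0
ylag = np.hstack([np.zeros((nrep, 1)), y[:, :-1]])    # y_0, ..., y_{T-1}

m1 = ylag.sum(axis=1) / T**1.5                        # T^{-3/2} sum y_{t-1}
m2 = (ylag**2).sum(axis=1) / T**2                     # T^{-2}   sum y_{t-1}^2
m3 = (ylag * eps).sum(axis=1) / T                     # T^{-1}   sum y_{t-1} eps_t
print(f"m1: mean = {m1.mean():+.3f}  var = {m1.var():.3f}   (theory: 0, 1/3)")
print(f"m2: mean = {m2.mean():.3f}                   (theory: 1/2)")
print(f"m3: mean = {m3.mean():+.3f}  var = {m3.var():.3f}   (theory: 0, 1/2)")
```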

Application: Unit Root Tests

To illustrate the convergence of sample moments of I(1) processes, consider the AR(1) regression

$$y_t = \phi y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

If $\phi = 1$ then $y_t \sim I(1)$; if $|\phi| < 1$ then $y_t \sim I(0)$. A test of $y_t \sim I(1)$ against the alternative that $y_t \sim I(0)$ may therefore be formulated as

$$H_0: \phi = 1 \quad \text{vs.} \quad H_1: |\phi| < 1$$

A natural test statistic is the t-statistic

$$t_{\phi=1} = \frac{\hat{\phi} - 1}{SE(\hat{\phi})}$$

where

$$\hat{\phi} = \left( \sum_{t=1}^{T} y_{t-1}^2 \right)^{-1} \sum_{t=1}^{T} y_{t-1} y_t$$

$$SE(\hat{\phi}) = \left( \hat{\sigma}^2 \left( \sum_{t=1}^{T} y_{t-1}^2 \right)^{-1} \right)^{1/2}$$

$$\hat{\sigma}^2 = T^{-1} \sum_{t=1}^{T} (y_t - \hat{\phi} y_{t-1})^2$$
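A minimal sketch of the computation of $t_{\phi=1}$ for a single simulated random walk ($\sigma = 1$, $T = 500$, and the seed are arbitrary choices). Under $H_0$ this statistic follows the Dickey-Fuller distribution rather than $N(0, 1)$, so it should not be compared with normal critical values.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 500
y = np.cumsum(rng.normal(size=T))                 # random walk under H0, y_0 = 0
ylag = np.concatenate(([0.0], y[:-1]))            # y_{t-1}

phi_hat = np.sum(ylag * y) / np.sum(ylag**2)
resid = y - phi_hat * ylag
sigma2_hat = np.mean(resid**2)                    # T^{-1} sum (y_t - phi_hat*y_{t-1})^2
se_phi = np.sqrt(sigma2_hat / np.sum(ylag**2))
t_stat = (phi_hat - 1.0) / se_phi
print(f"phi_hat = {phi_hat:.4f}   t_(phi=1) = {t_stat:.3f}")
```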

Consistency of $\hat{\phi}$ under $H_0: \phi = 1$

Under $H_0: \phi = 1$,

$$\hat{\phi} - 1 = \left( \sum_{t=1}^{T} y_{t-1}^2 \right)^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t = \left( T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \right)^{-1} T^{-2} \sum_{t=1}^{T} y_{t-1} \varepsilon_t$$

Using the results

$$T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \Rightarrow \sigma^2 \int_0^1 W(r)^2\, dr, \qquad T^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \Rightarrow \sigma^2 \int_0^1 W(r)\, dW(r)$$

so that $T^{-2} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \xrightarrow{p} 0$, and the CMT, it follows that

$$\hat{\phi} - 1 \xrightarrow{p} \left( \sigma^2 \int_0^1 W(r)^2\, dr \right)^{-1} \times 0 = 0$$

so that $\hat{\phi} \xrightarrow{p} 1$.
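A Monte Carlo sketch of this superconsistency: $\hat{\phi}$ collapses to 1 quickly, while $T(\hat{\phi} - 1)$ settles into the nonstandard, skewed Dickey-Fuller coefficient distribution. The replication count, sample sizes, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(8)
nrep = 5000
for T in (100, 400, 1600):
    eps = rng.normal(size=(nrep, T))
    y = np.cumsum(eps, axis=1)
    ylag = np.hstack([np.zeros((nrep, 1)), y[:, :-1]])
    phi_hat = np.sum(ylag * y, axis=1) / np.sum(ylag**2, axis=1)
    stat = T * (phi_hat - 1.0)
    print(f"T={T:5d}  mean(phi_hat) = {phi_hat.mean():.4f}   "
          f"mean T(phi_hat-1) = {stat.mean():+.2f}   "
          f"5% quantile = {np.quantile(stat, 0.05):+.2f}")
```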

DF Test with Intercept

$$y_t = c + \phi y_{t-1} + \varepsilon_t = x_t' \beta + \varepsilon_t$$

$$x_t = (1, y_{t-1})', \qquad \beta = (c, \phi)'$$

OLS gives

$$\hat{\beta} = \left( \sum_{t=1}^{T} x_t x_t' \right)^{-1} \sum_{t=1}^{T} x_t y_t$$

$$\sum_{t=1}^{T} x_t x_t' = \begin{pmatrix} T & \sum_{t=1}^{T} y_{t-1} \\ \sum_{t=1}^{T} y_{t-1} & \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}, \qquad \sum_{t=1}^{T} x_t y_t = \begin{pmatrix} \sum_{t=1}^{T} y_t \\ \sum_{t=1}^{T} y_{t-1} y_t \end{pmatrix}$$

Now, under $H_0: \phi = 1$ and $c = 0$,

$$\hat{\beta} - \beta = \begin{pmatrix} \hat{c} - 0 \\ \hat{\phi} - 1 \end{pmatrix} = \left( \sum_{t=1}^{T} x_t x_t' \right)^{-1} \sum_{t=1}^{T} x_t \varepsilon_t = \begin{pmatrix} T & \sum_{t=1}^{T} y_{t-1} \\ \sum_{t=1}^{T} y_{t-1} & \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum_{t=1}^{T} \varepsilon_t \\ \sum_{t=1}^{T} y_{t-1} \varepsilon_t \end{pmatrix}$$

Problem: the elements of $\sum_{t=1}^{T} x_t x_t'$ and $\sum_{t=1}^{T} x_t \varepsilon_t$ converge at different rates!

$$\begin{pmatrix} T & \sum_{t=1}^{T} y_{t-1} \\ \sum_{t=1}^{T} y_{t-1} & \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix} = \begin{pmatrix} O(T) & O_p(T^{3/2}) \\ O_p(T^{3/2}) & O_p(T^2) \end{pmatrix}, \qquad \begin{pmatrix} \sum_{t=1}^{T} \varepsilon_t \\ \sum_{t=1}^{T} y_{t-1} \varepsilon_t \end{pmatrix} = \begin{pmatrix} O_p(T^{1/2}) \\ O_p(T) \end{pmatrix}$$

Implication: sensible convergence results cannot be obtained with a single traditional scaling. Scaling $\hat{\beta} - \beta$ by $T$ gives

$$T\left( \hat{\beta} - \beta \right) = \left( T^{-1} \sum_{t=1}^{T} x_t x_t' \, T^{-1} \right)^{-1} T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t$$

$$= \begin{pmatrix} T^{-1} & T^{-2} \sum_{t=1}^{T} y_{t-1} \\ T^{-2} \sum_{t=1}^{T} y_{t-1} & T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}^{-1} \begin{pmatrix} T^{-1} \sum_{t=1}^{T} \varepsilon_t \\ T^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \end{pmatrix}$$

$$\Rightarrow \begin{pmatrix} 0 & 0 \\ 0 & \sigma^2 \int_0^1 W(r)^2\, dr \end{pmatrix}^{-1} \times \begin{pmatrix} 0 \\ \sigma^2 \int_0^1 W(r)\, dW(r) \end{pmatrix}$$

which is not well defined, since the limiting moment matrix is singular.

Sims-Stock-Watson Trick

Define the diagonal and invertible scaling matrix

$$D_T = \begin{pmatrix} T^{1/2} & 0 \\ 0 & T \end{pmatrix}$$

Then write

$$D_T\left( \hat{\beta} - \beta \right) = D_T \left( \sum_{t=1}^{T} x_t x_t' \right)^{-1} D_T D_T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t = \left( D_T^{-1} \sum_{t=1}^{T} x_t x_t' D_T^{-1} \right)^{-1} D_T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t$$

where

$$D_T\left( \hat{\beta} - \beta \right) = \begin{pmatrix} T^{1/2} \hat{c} \\ T(\hat{\phi} - 1) \end{pmatrix}$$

$$D_T^{-1} \sum_{t=1}^{T} x_t x_t' D_T^{-1} = \begin{pmatrix} 1 & T^{-3/2} \sum_{t=1}^{T} y_{t-1} \\ T^{-3/2} \sum_{t=1}^{T} y_{t-1} & T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \end{pmatrix}$$

$$D_T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t = \begin{pmatrix} T^{-1/2} \sum_{t=1}^{T} \varepsilon_t \\ T^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \end{pmatrix}$$

Therefore,

$$D_T\left( \hat{\beta} - \beta \right) \Rightarrow \begin{pmatrix} 1 & \sigma \int_0^1 W(r)\, dr \\ \sigma \int_0^1 W(r)\, dr & \sigma^2 \int_0^1 W(r)^2\, dr \end{pmatrix}^{-1} \times \begin{pmatrix} N(0, \sigma^2) \\ \sigma^2 \int_0^1 W(r)\, dW(r) \end{pmatrix}$$

Straightforward algebra shows that $T^{1/2}\hat{c}$ does not converge in distribution to $N(0, \sigma^2)$, while

$$T(\hat{\phi} - 1) \Rightarrow \left( \int_0^1 W_\mu(r)^2\, dr \right)^{-1} \int_0^1 W_\mu(r)\, dW(r), \qquad W_\mu(r) = W(r) - \int_0^1 W(s)\, ds$$

where $W_\mu(\cdot)$ is demeaned Brownian motion.
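A Monte Carlo sketch of the DF test with intercept: generate random walks under $H_0$ ($c = 0$, $\phi = 1$), regress $y_t$ on $(1, y_{t-1})$ by OLS, and record $T(\hat{\phi} - 1)$. Because the limit involves demeaned Brownian motion, the distribution lies to the left of the no-intercept case above. $T$, the replication count, and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(9)
nrep, T = 5000, 500
stats = np.empty(nrep)
for i in range(nrep):
    y = np.cumsum(rng.normal(size=T))                 # random walk under H0
    ylag = np.concatenate(([0.0], y[:-1]))
    X = np.column_stack([np.ones(T), ylag])           # x_t = (1, y_{t-1})'
    c_hat, phi_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    stats[i] = T * (phi_hat - 1.0)
print(f"mean = {stats.mean():+.2f}   5% quantile = {np.quantile(stats, 0.05):+.2f}")
```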

Convergence of Sample Moments with General Serial Correlation

$$y_t = y_{t-1} + \psi^*(L)\varepsilon_t = y_{t-1} + u_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

where $\psi^*(L)$ is 1-summable and

$$\mathrm{LRV} = \sigma^2 \psi^*(1)^2 = \gamma_0 + 2\sum_{j=1}^{\infty} \gamma_j, \qquad \gamma_j = \mathrm{cov}(u_t, u_{t-j})$$

FCLT:

$$\sqrt{T} X_T(\cdot) = \frac{1}{\sqrt{T}} \sum_{t=1}^{[T\cdot]} u_t \Rightarrow \sqrt{\mathrm{LRV}} \times W(\cdot)$$

1. $T^{-3/2} \sum_{t=1}^{T} y_{t-1} \Rightarrow \sqrt{\mathrm{LRV}} \int_0^1 W(r)\, dr$

2. $T^{-2} \sum_{t=1}^{T} y_{t-1}^2 \Rightarrow \mathrm{LRV} \int_0^1 W(r)^2\, dr$

3. $T^{-1} \sum_{t=1}^{T} y_{t-1} u_t \Rightarrow \mathrm{LRV} \int_0^1 W(r)\, dW(r) + \omega$, where $\omega = \frac{1}{2}(\mathrm{LRV} - \gamma_0)$ (a Monte Carlo check of this correction follows the list)

4. $T^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \Rightarrow \sqrt{\sigma^2 \, \mathrm{LRV}} \int_0^1 W(r)\, dW(r)$

5. $T^{-1} \sum_{t=1}^{T} y_{t-1} u_{t-1} \Rightarrow \mathrm{LRV} \int_0^1 W(r)\, dW(r) + \omega + \gamma_0$
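A Monte Carlo sketch of the correction term in result 3, using AR(1) errors $u_t = \xi u_{t-1} + \varepsilon_t$ with $\xi = 0.5$ and $\sigma = 1$ (arbitrary illustrative choices). Then $\gamma_0 = 1/(1 - \xi^2) = 4/3$, $\mathrm{LRV} = (1 - \xi)^{-2} = 4$, and $\omega = (\mathrm{LRV} - \gamma_0)/2 = 4/3$; since $E[\int_0^1 W\,dW] = 0$, the mean of $T^{-1}\sum y_{t-1}u_t$ should be close to $\omega$.

```python
import numpy as np

rng = np.random.default_rng(10)
nrep, T, xi = 2000, 1000, 0.5
eps = rng.normal(size=(nrep, T))
u = np.empty((nrep, T))
u[:, 0] = eps[:, 0]
for t in range(1, T):
    u[:, t] = xi * u[:, t - 1] + eps[:, t]            # u_t = xi*u_{t-1} + eps_t
y = np.cumsum(u, axis=1)                              # y_t = y_{t-1} + u_t, y_0 = 0
ylag = np.hstack([np.zeros((nrep, 1)), y[:, :-1]])
vals = np.sum(ylag * u, axis=1) / T                   # T^{-1} sum y_{t-1} u_t
print(f"mean = {vals.mean():.3f}   (theory: omega = 4/3 = 1.333)")
```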

Application: Asymptotic Distribution of ADF test

Assume $y_t$ is I(1) and that $\Delta y_t \sim$ AR(1):

$$\Delta y_t = \xi \Delta y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2), \quad |\xi| < 1$$

Therefore, $\Delta y_t$ has the Wold representation

$$\Delta y_t = \psi^*(L)\varepsilon_t = u_t, \qquad \psi^*(L) = (1 - \xi L)^{-1} = \sum_{j=0}^{\infty} \psi_j^* L^j, \quad \psi_j^* = \xi^j$$

$$\mathrm{LRV} = \sigma^2 \psi^*(1)^2 = \sigma^2 (1 - \xi)^{-2}$$

The ADF test regression is

$$y_t = \phi y_{t-1} + \xi \Delta y_{t-1} + \varepsilon_t = x_t' \beta + \varepsilon_t$$

$$x_t = (y_{t-1}, \Delta y_{t-1})', \qquad \beta = (\phi, \xi)'$$

Notice that

$$x_t = \begin{pmatrix} y_{t-1} \\ \Delta y_{t-1} \end{pmatrix} \quad \begin{matrix} \sim I(1) \\ \sim I(0) \end{matrix}$$

OLS on the ADF test regression gives

$$\hat{\beta} - \beta = \left( \sum_{t=1}^{T} x_t x_t' \right)^{-1} \sum_{t=1}^{T} x_t \varepsilon_t$$

where

$$\sum_{t=1}^{T} x_t x_t' = \begin{pmatrix} \sum_{t=1}^{T} y_{t-1}^2 & \sum_{t=1}^{T} y_{t-1} \Delta y_{t-1} \\ \sum_{t=1}^{T} \Delta y_{t-1} y_{t-1} & \sum_{t=1}^{T} \Delta y_{t-1}^2 \end{pmatrix} = \begin{pmatrix} O_p(T^2) & O_p(T) \\ O_p(T) & O_p(T) \end{pmatrix}$$

$$\sum_{t=1}^{T} x_t \varepsilon_t = \begin{pmatrix} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \\ \sum_{t=1}^{T} \Delta y_{t-1} \varepsilon_t \end{pmatrix} = \begin{pmatrix} O_p(T) \\ O_p(T^{1/2}) \end{pmatrix}$$

Use the Sims-Stock-Watson trick and define the scaling matrix

$$D_T = \begin{pmatrix} T & 0 \\ 0 & T^{1/2} \end{pmatrix}$$

Then write

$$D_T\left( \hat{\beta} - \beta \right) = D_T \left( \sum_{t=1}^{T} x_t x_t' \right)^{-1} D_T D_T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t = \left( D_T^{-1} \sum_{t=1}^{T} x_t x_t' D_T^{-1} \right)^{-1} D_T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t$$

where

$$D_T\left( \hat{\beta} - \beta \right) = \begin{pmatrix} T(\hat{\phi} - 1) \\ T^{1/2}(\hat{\xi} - \xi) \end{pmatrix}$$

$$D_T^{-1} \sum_{t=1}^{T} x_t x_t' D_T^{-1} = \begin{pmatrix} T^{-2} \sum_{t=1}^{T} y_{t-1}^2 & T^{-3/2} \sum_{t=1}^{T} y_{t-1} \Delta y_{t-1} \\ T^{-3/2} \sum_{t=1}^{T} \Delta y_{t-1} y_{t-1} & T^{-1} \sum_{t=1}^{T} \Delta y_{t-1}^2 \end{pmatrix}$$

and

$$D_T^{-1} \sum_{t=1}^{T} x_t \varepsilon_t = \begin{pmatrix} T^{-1} \sum_{t=1}^{T} y_{t-1} \varepsilon_t \\ T^{-1/2} \sum_{t=1}^{T} \Delta y_{t-1} \varepsilon_t \end{pmatrix}$$

Note: $\Delta y_{t-1} \varepsilon_t = u_{t-1} \varepsilon_t$ is a stationary and ergodic MDS with

$$E[(u_{t-1}\varepsilon_t)^2] = E\!\left[ E\!\left[ (u_{t-1}\varepsilon_t)^2 \mid I_{t-1} \right] \right] = E\!\left[ u_{t-1}^2 E[\varepsilon_t^2] \right] = \sigma^2 \gamma_0$$

Therefore, by the appropriate CLT,

$$T^{-1/2} \sum_{t=1}^{T} \Delta y_{t-1} \varepsilon_t \xrightarrow{d} N(0, \sigma^2 \gamma_0)$$

Using the convergence results for the sample moments of serially correlated I(1) processes, the above result, and the CMT gives

$$T(\hat{\phi} - 1) \Rightarrow \frac{\sigma}{\sqrt{\mathrm{LRV}}} \cdot \frac{\int_0^1 W(r)\, dW(r)}{\int_0^1 W(r)^2\, dr} = (1 - \xi)\, \frac{\int_0^1 W(r)\, dW(r)}{\int_0^1 W(r)^2\, dr}$$

$$T^{1/2}\left( \hat{\xi} - \xi \right) \xrightarrow{d} N(0, \sigma^2 \gamma_0^{-1}) = N(0, 1 - \xi^2)$$

Furthermore, $\hat{\phi}$ and $\hat{\xi}$ are asymptotically independent, since the off-diagonal elements of $D_T^{-1} \sum_{t=1}^{T} x_t x_t' D_T^{-1}$ converge to zero.
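A closing Monte Carlo sketch of the ADF regression under the null with AR(1) differences ($\xi = 0.5$, $\sigma = 1$, and the simulation settings are arbitrary choices). The coefficient statistic is reported in the corrected form $T(\hat{\phi} - 1)/(1 - \hat{\xi})$, which removes the $(1 - \xi)$ factor and is comparable to the no-serial-correlation Dickey-Fuller coefficient distribution, and the variance of $\sqrt{T}(\hat{\xi} - \xi)$ is compared with $1 - \xi^2 = 0.75$.

```python
import numpy as np

rng = np.random.default_rng(11)
nrep, T, xi = 2000, 1000, 0.5
coef_stats = np.empty(nrep)
xi_stats = np.empty(nrep)
for i in range(nrep):
    eps = rng.normal(size=T + 1)
    u = np.empty(T + 1)
    u[0] = eps[0]
    for t in range(1, T + 1):
        u[t] = xi * u[t - 1] + eps[t]                 # Delta y_t = xi*Delta y_{t-1} + eps_t
    y = np.concatenate(([0.0], np.cumsum(u)))         # y_0 = 0, y_1, ..., y_{T+1}
    dep = y[2:]                                       # y_t for t = 2, ..., T+1
    ylag = y[1:-1]                                    # y_{t-1}
    dylag = u[:-1]                                    # Delta y_{t-1}
    X = np.column_stack([ylag, dylag])
    phi_hat, xi_hat = np.linalg.lstsq(X, dep, rcond=None)[0]
    coef_stats[i] = T * (phi_hat - 1.0) / (1.0 - xi_hat)
    xi_stats[i] = np.sqrt(T) * (xi_hat - xi)
print(f"corrected coefficient statistic: 5% quantile = {np.quantile(coef_stats, 0.05):+.2f}")
print(f"sqrt(T)*(xi_hat - xi): var = {xi_stats.var():.3f}   (roughly 1 - xi^2 = 0.75)")
```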