
Introduction to Time Series Analysis. Lecture 3.
Peter Bartlett

1. Review: Autocovariance, linear processes

2. Sample autocorrelation function

3. ACF and prediction

4. Properties of the ACF


Mean, Autocovariance, Stationarity

A time series {X_t} has mean function

µ_t = E[X_t]

and autocovariance function

γ_X(t + h, t) = Cov(X_{t+h}, X_t) = E[(X_{t+h} − µ_{t+h})(X_t − µ_t)].

It is stationary if both are independent of t. Then we write γ_X(h) = γ_X(h, 0).

The autocorrelation function (ACF) is

ρ_X(h) = γ_X(h) / γ_X(0) = Corr(X_{t+h}, X_t).

Linear Processes

An important class of stationary time series:

X_t = µ + ∑_{j=−∞}^{∞} ψ_j W_{t−j},

where {W_t} ∼ WN(0, σ_w²), and µ, ψ_j are parameters satisfying

∑_{j=−∞}^{∞} |ψ_j| < ∞.

Linear Processes

X_t = µ + ∑_{j=−∞}^{∞} ψ_j W_{t−j}

Examples:

• White noise: ψ_0 = 1, and ψ_j = 0 otherwise.

• MA(1): ψ_0 = 1, ψ_1 = θ.

• AR(1): ψ_0 = 1, ψ_1 = φ, ψ_2 = φ², . . . (that is, ψ_j = φ^j for j ≥ 0).
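These examples can be simulated directly from the definition by truncating the sum. A minimal sketch, assuming numpy (linear_process is a hypothetical helper; the AR(1) weights ψ_j = φ^j are truncated at 50 terms, a negligible error when |φ| < 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_process(psi, n, mu=0.0, sigma_w=1.0):
    """Simulate X_t = mu + sum_{j>=0} psi[j] * W_{t-j} for one-sided weights psi."""
    m = len(psi)
    w = rng.normal(0.0, sigma_w, size=n + m - 1)   # the white noise W_t
    # np.convolve with mode='valid' aligns psi[j] with w[t - j]
    return mu + np.convolve(w, psi, mode="valid")

phi, theta = 0.6, 0.8
white = linear_process([1.0], 200)                        # psi_0 = 1
ma1   = linear_process([1.0, theta], 200)                 # psi_0 = 1, psi_1 = theta
ar1   = linear_process([phi**j for j in range(50)], 200)  # psi_j = phi^j (truncated)
```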


Estimating the ACF: Sample ACF

Recall: Suppose that {X_t} is a stationary time series. Its mean is

µ = E[X_t].

Its autocovariance function is

γ(h) = Cov(X_{t+h}, X_t) = E[(X_{t+h} − µ)(X_t − µ)].

Its autocorrelation function is

ρ(h) = γ(h) / γ(0).

Estimating the ACF: Sample ACF

For observations x_1, . . . , x_n of a time series, the sample mean is

x̄ = (1/n) ∑_{t=1}^{n} x_t.

The sample autocovariance function is

γ̂(h) = (1/n) ∑_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),   for −n < h < n.

The sample autocorrelation function is

ρ̂(h) = γ̂(h) / γ̂(0).

Estimating the ACF: Sample ACF

Sample autocovariance function:

γ̂(h) = (1/n) ∑_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄).

This is ≈ the sample covariance of (x_1, x_{h+1}), . . . , (x_{n−h}, x_n), except that

• we normalize by n instead of n − h, and

• we subtract the full sample mean.
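These estimators translate directly into code. A minimal sketch, assuming numpy (sample_acvf and sample_acf are hypothetical helper names):

```python
import numpy as np

def sample_acvf(x, h):
    """gamma_hat(h) = (1/n) sum_{t=1}^{n-|h|} (x_{t+|h|} - xbar)(x_t - xbar)."""
    x = np.asarray(x, dtype=float)
    n, h = len(x), abs(h)
    xbar = x.mean()
    return np.sum((x[h:] - xbar) * (x[: n - h] - xbar)) / n

def sample_acf(x, h):
    """rho_hat(h) = gamma_hat(h) / gamma_hat(0)."""
    return sample_acvf(x, h) / sample_acvf(x, 0)

# For white noise, the sample ACF should be near 0 at every lag h != 0:
rng = np.random.default_rng(1)
x = rng.normal(size=500)
print([round(sample_acf(x, h), 3) for h in range(4)])  # first entry is 1.0
```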


Sample ACF for white Gaussian (hence i.i.d.) noise

[Figure: sample ACF of white Gaussian noise, lags −20 to 20: ρ̂(0) = 1 and all other lags are near zero. Red lines = confidence interval; for i.i.d. noise, ρ̂(h) at each lag h ≠ 0 is approximately N(0, 1/n), so the bands lie at ±1.96/√n.]

Sample ACF

We can recognize the sample autocorrelation functions of many non-white (even non-stationary) time series.

Time series:   Sample ACF:
White          Zero
Trend          Slow decay
Periodic       Periodic
MA(q)          Zero for |h| > q
AR(p)          Decays to zero exponentially

Sample ACF: Trend

[Figure: a time series with trend, t = 0, . . . , 100.]

Sample ACF: Trend

[Figure: sample ACF of the trend series, lags −60 to 60: slow decay from 1.]

(why?)

Sample ACF

Time series:   Sample ACF:
White          Zero
Trend          Slow decay
Periodic       Periodic
MA(q)          Zero for |h| > q
AR(p)          Decays to zero exponentially

Sample ACF: Periodic

[Figure: a periodic time series, t = 0, . . . , 100.]

Sample ACF: Periodic

[Figure: the periodic signal and the signal plus noise, t = 0, . . . , 100.]

Sample ACF: Periodic

[Figure: sample ACF of the periodic series, lags −100 to 100: the sample ACF is itself periodic.]

(why?)

Sample ACF

Time series:   Sample ACF:
White          Zero
Trend          Slow decay
Periodic       Periodic
MA(q)          Zero for |h| > q
AR(p)          Decays to zero exponentially

ACF: MA(1)

[Figure: ACF of the MA(1) process X_t = Z_t + θ Z_{t−1}, lags −10 to 10: ρ(0) = 1, ρ(±1) = θ/(1 + θ²), and ρ(h) = 0 for |h| > 1.]
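To see where θ/(1 + θ²) comes from, write σ² for Var(Z_t). Since the Z_t are uncorrelated,

γ(0) = Var(Z_t + θZ_{t−1}) = (1 + θ²)σ²,
γ(1) = Cov(Z_{t+1} + θZ_t, Z_t + θZ_{t−1}) = θσ²,

so ρ(1) = γ(1)/γ(0) = θ/(1 + θ²), and γ(h) = 0 for |h| > 1 because X_{t+h} and X_t then share no common Z terms.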

Sample ACF: MA(1)

[Figure: ACF and sample ACF of an MA(1) process, lags −10 to 10.]
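A quick numerical version of this comparison, as a minimal sketch assuming numpy: simulate an MA(1) series and compare the sample ACF at lags 1 and 2 with the theoretical values.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n = 0.8, 5000
z = rng.normal(size=n + 1)            # the noise Z_t
x = z[1:] + theta * z[:-1]            # X_t = Z_t + theta * Z_{t-1}

xbar = x.mean()
def rho_hat(h):
    # sample ACF as defined earlier: normalize by n, use the full mean xbar
    g = lambda k: np.sum((x[k:] - xbar) * (x[: n - k] - xbar)) / n
    return g(h) / g(0)

print(rho_hat(1), theta / (1 + theta**2))  # both near 0.8/1.64 = 0.488
print(rho_hat(2))                          # near 0: the ACF vanishes for |h| > 1
```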

Sample ACF

Time series:   Sample ACF:
White          Zero
Trend          Slow decay
Periodic       Periodic
MA(q)          Zero for |h| > q
AR(p)          Decays to zero exponentially

ACF: AR(1)

[Figure: ACF of the AR(1) process X_t = φ X_{t−1} + Z_t, lags −10 to 10: ρ(h) = φ^|h|.]
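To see where φ^|h| comes from (for |φ| < 1, with Z_t uncorrelated with X_s for s < t): multiply X_t = φX_{t−1} + Z_t by X_{t−h} and take expectations to get γ(h) = φ γ(h − 1) for h ≥ 1. Iterating gives γ(h) = φ^h γ(0), so ρ(h) = φ^|h| by the symmetry γ(h) = γ(−h).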

Sample ACF: AR(1)

[Figure: ACF and sample ACF of an AR(1) process, lags −10 to 10.]

Introduction to Time Series Analysis. Lecture 3.

1. Sample autocorrelation function

2. ACF and prediction

3. Properties of the ACF


ACF and prediction

[Figure: realizations of white noise and of an MA(1) process (top); ACF and sample ACF of the MA(1) process (bottom).]

ACF of a MA(1) process

[Figure: scatterplots of X_{t+h} against X_t for an MA(1) process, at lags h = 0, 1, 2, 3.]

ACF and least squares prediction

The best least squares estimate of Y is EY:

min_c E(Y − c)² = E(Y − EY)².

The best least squares estimate of Y given X is E[Y | X]:

min_f E(Y − f(X))² = min_f E[ E[(Y − f(X))² | X] ]
                   = E[ E[(Y − E[Y | X])² | X] ]
                   = E[ Var(Y | X) ].

Similarly, the best least squares estimate of X_{n+h} given X_n is

f(X_n) = E[X_{n+h} | X_n].

ACF and least squares prediction

Suppose that X = (X_1, . . . , X_{n+h})′ is jointly Gaussian:

f_X(x) = 1 / ( (2π)^{n/2} |Σ|^{1/2} ) exp( −(1/2)(x − µ)′ Σ^{−1} (x − µ) ).

Then the joint distribution of (X_n, X_{n+h}) is

N( (µ_n, µ_{n+h})′ ,  [ σ_n², ρ σ_n σ_{n+h} ; ρ σ_n σ_{n+h}, σ_{n+h}² ] ),

where ρ = Corr(X_n, X_{n+h}), and the conditional distribution of X_{n+h} given X_n = x_n is

N( µ_{n+h} + ρ (σ_{n+h}/σ_n)(x_n − µ_n) ,  σ_{n+h}² (1 − ρ²) ).

ACF and least squares prediction

So for Gaussian and stationary {X_t}, the best estimate of X_{n+h} given X_n = x_n is

f(x_n) = µ + ρ(h)(x_n − µ),

and the mean squared error is

E(X_{n+h} − f(X_n))² = σ²(1 − ρ(h)²).

Notice:

• Prediction accuracy improves as |ρ(h)| → 1.

• The predictor is linear: f(x) = µ(1 − ρ(h)) + ρ(h)x.
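A simulation sanity check of these two formulas, as a minimal sketch assuming numpy. It uses a stationary Gaussian AR(1), where ρ(h) = φ^|h| and σ² = σ_z²/(1 − φ²) are known, and compares the empirical mean squared error of f(X_n) = µ + ρ(h)(X_n − µ) with σ²(1 − ρ(h)²):

```python
import numpy as np

rng = np.random.default_rng(2)
phi, sigma_z, n, h = 0.8, 1.0, 10, 2
reps = 200_000

sigma2 = sigma_z**2 / (1 - phi**2)   # stationary variance of the AR(1)
rho_h = phi**h                       # ACF of the AR(1) at lag h

# Simulate many independent stationary AR(1) paths of length n + h (mu = 0).
x = np.empty((reps, n + h))
x[:, 0] = rng.normal(0.0, np.sqrt(sigma2), size=reps)  # stationary start
for t in range(1, n + h):
    x[:, t] = phi * x[:, t - 1] + rng.normal(0.0, sigma_z, size=reps)

pred = rho_h * x[:, n - 1]           # f(X_n) = mu + rho(h)(X_n - mu), mu = 0
mse = np.mean((x[:, n - 1 + h] - pred) ** 2)
print(mse, sigma2 * (1 - rho_h**2))  # the two values should be close
```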

ACF and least squares linear prediction

Consider a linear predictor of X_{n+h} given X_n = x_n. Assume first that {X_t} is stationary with EX_n = 0, and predict X_{n+h} with f(x_n) = a x_n. The best linear predictor minimizes

E(X_{n+h} − aX_n)² = E(X²_{n+h}) − 2a E(X_{n+h} X_n) + a² E(X²_n)
                   = σ² − 2aγ(h) + a²σ².

Setting the derivative with respect to a to zero gives 2aσ² − 2γ(h) = 0, so this is minimized when a = γ(h)/σ² = ρ(h), that is,

f(x_n) = ρ(h) x_n.

For this optimal linear predictor, the mean squared error is

E(X_{n+h} − f(X_n))² = σ² − 2ρ(h)γ(h) + ρ(h)²σ²
                     = σ²(1 − ρ(h)²).

ACF and least squares linear prediction

Now consider the following linear predictor of X_{n+h} given X_n = x_n, when {X_t} is stationary with EX_n = µ:

f(x_n) = a(x_n − µ) + b.

The linear predictor that minimizes

E(X_{n+h} − (a(X_n − µ) + b))²

has a = ρ(h) and b = µ, that is,

f(x_n) = ρ(h)(x_n − µ) + µ.

For this optimal linear predictor, the mean squared error is again

E(X_{n+h} − f(X_n))² = σ²(1 − ρ(h)²).

Least squares prediction of Xn+h given Xn

f(X_n) = µ + ρ(h)(X_n − µ).

E(f(X_n) − X_{n+h})² = σ²(1 − ρ(h)²).

• If {X_t} is stationary, f is the optimal linear predictor.

• If {X_t} is also Gaussian, f is the optimal predictor.

• Linear prediction is optimal for Gaussian time series.

• Over all stationary processes with that value of ρ(h) and σ², the optimal mean squared error is maximized by the Gaussian process.

• Linear prediction needs only second-order statistics.

• Extends to longer histories, (X_n, X_{n−1}, . . .).

Introduction to Time Series Analysis. Lecture 3.

1. Sample autocorrelation function

2. ACF and prediction

3. Properties of the ACF


Properties of the autocovariance function

For the autocovariance function γ of a stationary time series {X_t},

1. γ(0) ≥ 0 (variance is non-negative),

2. |γ(h)| ≤ γ(0) (from the Cauchy-Schwarz inequality),

3. γ(h) = γ(−h) (from stationarity),

4. γ is positive semidefinite.

Furthermore, any function γ : Z → R that satisfies (3) and (4) is the autocovariance of some stationary time series.

Properties of the autocovariance function

A function f : Z → R is positive semidefinite if for all n, the matrix F_n with entries (F_n)_{i,j} = f(i − j) is positive semidefinite.

A matrix F_n ∈ R^{n×n} is positive semidefinite if, for all vectors a ∈ R^n,

a′ F_n a ≥ 0.

To see that γ is psd, consider the variance of a′(X_1, . . . , X_n)′:

0 ≤ Var( ∑_i a_i X_i ) = ∑_{i,j} a_i a_j Cov(X_i, X_j) = ∑_{i,j} a_i a_j γ(i − j) = a′ Γ_n a.
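Property 4 can also be checked numerically. A minimal sketch, assuming numpy, for the AR(1)-type autocovariance γ(h) = σ²φ^|h|: build Γ_n with entries γ(i − j) and confirm its eigenvalues are non-negative.

```python
import numpy as np

phi, sigma2, n = 0.8, 1.0, 50
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))  # |i - j|
gamma_n = sigma2 * phi**lags             # (Gamma_n)_{ij} = gamma(i - j)

eigs = np.linalg.eigvalsh(gamma_n)       # eigenvalues of the symmetric matrix
print(eigs.min() >= -1e-10)              # True: numerically positive semidefinite
```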

Properties of the autocovariance function

For the autocovariance function γ of a stationary time series {X_t},

1. γ(0) ≥ 0,

2. |γ(h)| ≤ γ(0),

3. γ(h) = γ(−h),

4. γ is positive semidefinite.

Furthermore, any function γ : Z → R that satisfies (3) and (4) is the autocovariance of some stationary time series (in particular, a Gaussian process).

e.g.: (1) and (2) follow from (4).

Introduction to Time Series Analysis. Lecture 3.

1. Sample autocorrelation function

2. ACF and prediction

3. Properties of the ACF
