Chapter 4 Discrete Time Stationary Processes (McGill, peterc/04lectures510/04four.pdf)

Chapter 4

Discrete Time Stationary Processes

Stationary Discrete Time LS$^3$ Processes

Consider the discrete time system

$$LS^3:\quad x_{k+1} = A x_k + C w_k, \qquad k \in \mathbb{Z}_+, \qquad (1)$$

where $x_0 \sim N(0,\Sigma)$, $w$ i.i.d., $w_k \sim N(0,W)$, $x_0 \amalg w$, and let

$$R_{k,j} \triangleq E\, x_k x_j^T, \qquad k, j \in \mathbb{Z}_+.$$

Then $R_{k,k}$ satisfies the discrete time time-varying Lyapunov equation:

$$R_{k+1,k+1} = E(Ax_k + Cw_k)(Ax_k + Cw_k)^T = A R_{k,k} A^T + C W C^T. \qquad (2)$$

Let $A$ be asymptotically stable, i.e. let $|\lambda_i(A)| \le \rho < 1$, $1 \le i \le n$. Since (2) gives

$$R_{k,k} = \sum_{i=0}^{k-1} A^{k-i-1} C W C^T (A^{k-i-1})^T + A^k \Sigma (A^k)^T, \qquad k \in \mathbb{Z}_1, \qquad (3)$$

we may show that Rk,k, k ∈ Z+, converges to an easily computed limit in the following way.

We recall that the spectral norm of a matrix $M$ is defined by $\|M\|_s \triangleq \sup_{\|x\|=1} \|Mx\|$ and that the spectral and Euclidean norms of matrices of the same dimensions are compatible. For the matrix $A$ this implies $\|A^k x\| \le \gamma \rho^k \|x\|$, $k \in \mathbb{Z}_1$, for some $\gamma > 0$. Hence we have

$$\|A^k \Sigma (A^k)^T\|_s \le \|A^k\|_s \|(A^k)^T\|_s \|\Sigma\|_s \le \gamma_A \cdot \gamma_{A^T} \cdot \rho^{2k} \|\Sigma\|_s \to 0,$$

for any $\Sigma$ as $k \to \infty$, and so the initial condition effect given by the second term in (3) decays to zero as $k \to \infty$. (Note that $\gamma_A = \gamma_{A^T}$, since $\|A\|_s = \|A^T\|_s$ (reader check).) The

geometric decay of the terms in (3) shows that {Rk,k, k ∈ Z+} forms a Cauchy sequence and

hence converges. Further, we may in fact deduce that the following monotonic convergence

of positive covariance matrices takes place:

$$0 \le \sum_{i=0}^{k-1} A^{k-i-1} C W C^T (A^{k-i-1})^T \triangleq R_0^{k-1} \uparrow \sum_{k=0}^{\infty} A^k C W C^T (A^k)^T < \infty, \qquad (4)$$

as $k \to \infty$. This is because, first,

$$0 \le R_0^{k-1} \le R_0^{k-1} + A^k C W C^T (A^k)^T = R_0^k, \qquad \forall k \in \mathbb{Z}_1,$$

shows that for any $x \in \mathbb{R}^n$ the terms $x^T R_0^k x$ constitute a sequence of positive numbers increasing with respect to $k$, which, furthermore, is bounded since

$$\Big\|\sum_{i=0}^{k-1} A^{k-i-1} C W C^T (A^{k-i-1})^T\Big\|_s \le \frac{\|C W C^T\|_s\, \gamma_A^2}{1 - \rho^2} < \infty.$$

And, second,

$$\lim_{k\to\infty} R_{k,k} = \sum_{k=0}^{\infty} A^k C W C^T (A^k)^T \triangleq R_\infty < \infty \qquad (5)$$

is the limiting matrix since, for all $x, y$ (and hence for all $e_i, e_j$, $1 \le i, j \le n$),

$$2\, x^T R_{k,k}\, y = (x+y)^T R_{k,k} (x+y) - x^T R_{k,k}\, x - y^T R_{k,k}\, y,$$

and each of the three terms on the right hand side converges since each is a bounded increasing sequence of real numbers. From (5) and the completeness of the space $H_w$, it follows that

$$x^\infty_{k+1} \triangleq \sum_{\tau=0}^{\infty} A^\tau C w_{k-\tau}, \qquad k \in \mathbb{Z},$$

is a well defined zero mean finite variance random variable with covariance $R_\infty$. Further, for each $k \in \mathbb{Z}$, let a linear stochastic system $LS^3$ with $x_{-N,-N} \sim N(0,\Sigma_0)$ and $x_{-N,-N} \amalg w^\infty_{-N}$ for each $N \in \mathbb{Z}_+$ generate the sequence

$$x_{k+1,-N} = A^{N+k+1} x_{-N,-N} + \sum_{\tau=0}^{N+k} A^\tau C w_{k-\tau}, \qquad N \in \mathbb{Z}_+.$$


Then it may be verified via the Cauchy sequence criterion that for each $k \in \mathbb{Z}$ the mean square limit $x^\infty_k$ of the sequence $\{x_{k,-N};\ N \in \mathbb{Z}_+\}$ as $N \to \infty$ exists as a well defined element of $H_w$.

Next, if we initialize an LS3 process x in the state distribution x0 ∼ N(0, R∞), and

assume the standard LS3 conditions hold for x0, w, we obtain

$$\begin{aligned}
E\, x_1 x_1^T &= E(A x_0 + C w_0)(A x_0 + C w_0)^T \\
&= A R_\infty A^T + C W C^T \\
&= A\Big(\sum_{i=0}^{\infty} A^i C W C^T (A^i)^T\Big) A^T + C W C^T \\
&= \sum_{i=1}^{\infty} A^i C W C^T (A^i)^T + C W C^T \\
&= \sum_{i=0}^{\infty} A^i C W C^T (A^i)^T = R_\infty.
\end{aligned}$$

So we see that the covariance $R_{k,k}$ is shift invariant since $E\, x_k x_k^T = R_\infty$ for each $k \in \mathbb{Z}_+$. Now $R_\infty$ is seen above to satisfy $R_\infty = A R_\infty A^T + C W C^T$. Further,

$$R_{k+\tau,k} = E\Big(A^\tau x_k + \sum_{i=0}^{\tau-1} A^{\tau-i-1} C w_{i+k}\Big) x_k^T = A^\tau R_\infty, \qquad \tau \in \mathbb{Z}_1,\ k, k+\tau \in \mathbb{Z}_+,$$

by $x_k \amalg w^\infty_k$, and so also

$$R_{k-\tau,k} = (E\, x_k x_{k-\tau}^T)^T = R_\infty (A^\tau)^T, \qquad \tau \in \mathbb{Z}_1,\ k-\tau \in \mathbb{Z}_+.$$


Finally, assume $\min(k+\rho,\ \ell+\rho) \ge 0$ with $k \ge \ell$. Then

$$\begin{aligned}
R_{k+\rho,\ell+\rho} &= E\, x_{k+\rho} x_{\ell+\rho}^T \\
&= E\Big(A^{k-\ell} x_{\ell+\rho} + \sum_{i=0}^{k-\ell-1} A^{k-\ell-1-i} C w_{i+\ell+\rho}\Big) x_{\ell+\rho}^T, \qquad k+\rho \ge \ell+\rho,\ k, \ell, \rho \in \mathbb{Z}_+ \\
&= A^{k-\ell}\, E\, x_{\ell+\rho} x_{\ell+\rho}^T \\
&= A^{k-\ell}\, E\, x_\ell x_\ell^T \qquad (\text{by } R_{k,k} \text{ shift invariant}) \\
&= R_{k,\ell} = A^{k-\ell} R_\infty,
\end{aligned}$$

and analogously for $\ell > k$. Hence the state covariance $R_{k,\ell}$, $k, \ell \in \mathbb{Z}_+$, is shift invariant. Since the process $x$ has zero mean and is Gaussian it follows that $x$ is strictly stationary. We summarize the facts above as follows.

summarize the facts above as follows.

Theorem 4.1

If the $LS^3$ system (1) is such that $\max_{1\le i\le n} |\lambda_i| < 1$ (i.e. $A$ is asymptotically stable) then:

(i) There exists an invariant distribution $N(0, R_\infty)$ for the system state satisfying

$$R_\infty = A R_\infty A^T + C W C^T. \qquad (6)$$

(ii) If the system is given the random initial condition $x^\infty_k \sim N(0, R_\infty)$, at any $k \in \mathbb{Z}$, then the generated process $\{x^\infty_j;\ j \ge k\}$ is strictly stationary Gaussian with covariance function $R^\infty_{k+\tau,k} = A^\tau R_\infty$, or $R^\infty_{k,k+\tau} = R_\infty (A^\tau)^T$, for $\tau \ge 0$.

(iii) $x^\infty$ in (ii) can be generated as the mean square limit, as $N \to \infty$, of $\{x_{k,-N};\ k \ge -N\}$ generated by the $LS^3$ (1) with initial state distribution $N(0,\Sigma)$, for all $N$, and for which the standard Gaussian assumptions for an $LS^3$ are satisfied with $x_{-N,-N} \amalg w^\infty_{-N}$ for all $N$.

(iv) The Gaussian finite dimensional distributions with zero mean and covariance functions $R^\infty_{k,j}$, for all $j, k \in \mathbb{Z}$, form a compatible family to which there corresponds a strictly stationary process $x^\infty$ on $(-\infty,\infty)$ which is also the mean square limit of $\{x_{k,-N};\ k \ge -N\}$, as $N \to \infty$.
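As a numerical sanity check on Theorem 4.1(i), the sketch below iterates the Lyapunov recursion (2) and compares the limit with the solution of the fixed-point equation (6); the matrices $A$, $C$, $W$ here are illustrative choices, not taken from the text.

```python
import numpy as np

# Hypothetical 2x2 asymptotically stable system (spectral radius 0.5).
A = np.array([[0.5, 0.2], [0.0, 0.3]])
C = np.array([[1.0], [0.5]])
W = np.array([[2.0]])
Q = C @ W @ C.T

# Iterate the Lyapunov recursion (2) from an arbitrary initial covariance Sigma.
R = np.array([[5.0, 1.0], [1.0, 4.0]])
for _ in range(200):
    R = A @ R @ A.T + Q

# R_infinity solves (6): R = A R A^T + C W C^T.  With row-major vec,
# vec(A R A^T) = (A kron A) vec(R), so vec(R_inf) = (I - A kron A)^{-1} vec(Q).
n = A.shape[0]
R_inf = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.flatten()).reshape(n, n)

assert np.allclose(R, R_inf)                     # iteration converges to R_infinity
assert np.allclose(R_inf, A @ R_inf @ A.T + Q)   # the fixed-point equation (6) holds
```

The closed form via the Kronecker product is just one way to solve (6); any discrete Lyapunov solver gives the same matrix.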


Wide Sense Stationary Stochastic Processes

Definition 4.1. A wide sense stationary (wss) discrete time stochastic process $x$ is a process such that $E\|x_0\|^2 < \infty$ and

(i) $\mu \triangleq E\, x_k = E\, x_0$, $\forall k \in \mathbb{Z}$,

(ii) $E\, x_{k+\tau} x_k^T = E\, x_\tau x_0^T$, $\forall k, \forall \tau \in \mathbb{Z}$.

Note that $E\|x_{k+\tau} x_k^T\| \le (E\|x_{k+\tau}\|^2)^{1/2} (E\|x_k\|^2)^{1/2}$ by the Cauchy-Schwarz inequality. So $E\|x_0\|^2 < \infty$ and (ii) taken at any $k$, with $\tau = 0$, gives (via trace) $E\|x_k\|^2 = E\|x_0\|^2 < \infty$. Hence $E\|x_{k+\tau} x_k^T\| < \infty$, $k, \tau \in \mathbb{Z}$, and so (ii) is meaningful for all $k, \tau$.

Henceforth, unless otherwise stated, all wide sense stationary stochastic processes will be

taken to have zero mean.

Given $R_\tau \triangleq E\, x_\tau x_0^T$, $\tau \in \mathbb{Z}$, for a wide sense stationary stochastic process $x$, define the Fourier transform matrix $\Phi_x(e^{i\theta})$ of the sequence $\{R_\tau\}_{-\infty}^{\infty}$ by

$$\Phi_x(e^{i\theta}) = \sum_{\tau=-\infty}^{\infty} R_\tau e^{i\tau\theta}, \qquad \theta \in [0,2\pi],$$

whenever $\sum_{\tau=-\infty}^{\infty} \|R_\tau\| < \infty$.

From the definition,

$$\|\Phi_x(e^{i\theta})\| \le \sum_{\tau=-\infty}^{\infty} \|R_\tau\| < \infty, \qquad \theta \in [0,2\pi],$$

and so the Fourier transform exists; the Fourier inversion formula then gives

$$\frac{1}{2\pi} \int_0^{2\pi} e^{-ij\theta} \Phi_x(e^{i\theta})\, d\theta = \sum_{\tau=-\infty}^{\infty} R_\tau\, \frac{1}{2\pi} \int_0^{2\pi} e^{-ij\theta} e^{i\tau\theta}\, d\theta = R_j, \qquad j \in \mathbb{Z}.$$
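The inversion formula can be checked numerically for a simple scalar covariance sequence; the choice $R_\tau = (1/2)^{|\tau|}$ below is an illustrative assumption, not from the text.

```python
import numpy as np

# Illustrative summable scalar covariance sequence: R_tau = 0.5**|tau|.
R = lambda tau: 0.5 ** abs(tau)

# Spectral density Phi(e^{i theta}) = sum_tau R_tau e^{i tau theta} (truncated).
def phi(theta, T=60):
    taus = np.arange(-T, T + 1)
    return np.sum(R(taus) * np.exp(1j * taus * theta))

# Fourier inversion: (1/2pi) * integral of e^{-ij theta} Phi d(theta) recovers R_j.
# On a uniform grid the normalized integral is just the grid mean.
thetas = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
phis = np.array([phi(t) for t in thetas])
for j in (0, 1, 3):
    Rj = np.mean(np.exp(-1j * j * thetas) * phis).real
    assert abs(Rj - R(j)) < 1e-6
```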


Definition 4.2.

(a) The function Φx defined in terms of the covariance sequence {Rτ ; τ ∈ Z} of a wide

sense stationary stochastic process x is called the spectral density (matrix) of x.

(b) A (real coefficient) spectral density matrix $\Phi$ is a complex matrix function such that

(i) $\Phi^T(e^{-i\theta}) = \Phi(e^{i\theta})$,

(ii) $\overline{\Phi(e^{i\theta})} = \Phi(e^{-i\theta})$, $\forall \theta \in [0,2\pi]$,

(iii) $\Phi(e^{i\theta}) \ge 0$,

(iv) $\Phi \in L_1[0,2\pi]$.

We observe that in the scalar case, the defining properties (i) and (ii) of a spectral

density imply that Φ(.) is real and, furthermore, that (i) and (ii) evidently hold for the

spectral density matrix Φx of a wide sense stationary process x.

Subject to $\sum_{k=-\infty}^{\infty} |k|\, \|R_k\| < \infty$, we may prove that (iii) holds for $\Phi_x$ as follows:

$$\begin{aligned}
\lambda^* \Phi_x(e^{i\theta}) \lambda &= \sum_{\tau=-\infty}^{\infty} \lambda^* R_\tau \lambda\, e^{i\tau\theta}, \qquad \forall \lambda \in \mathbb{C}^n,\ \forall \theta \in [0,2\pi] \\
&= \lim_{N\to\infty} \sum_{\tau=-2N}^{2N} E\, \lambda^* x_\tau x_0^T \lambda\, e^{i\tau\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\, E \sum_{\tau=-2N}^{2N} \big((2N+1) - |\tau|\big)\, \lambda^* x_\tau x_0^T \lambda\, e^{i\tau\theta} \qquad \Big(\text{by } \sum_{k=-\infty}^{\infty} |k|\, \|R_k\| < \infty\Big) \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\, E \Bigg(\sum_{\tau=1}^{2N} \sum_{k=-N}^{N-\tau} + \sum_{k=-N}^{N}\Big|_{\tau=0} + \sum_{\tau=-2N}^{-1} \sum_{k=-N-\tau}^{N}\Bigg) \lambda^* x_{\tau+k}\, e^{i(\tau+k)\theta}\, x_k^T \lambda\, e^{-ik\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\, \lambda^* \Bigg\{ E \Big(\sum_{s=-N}^{N} x_s e^{is\theta}\Big) \Big(\sum_{s=-N}^{N} x_s^T e^{-is\theta}\Big) \Bigg\} \lambda \\
&\ge 0,
\end{aligned}$$

where $\lambda^*$ denotes the conjugate transpose of $\lambda$.


Example 4.1

The following are examples of spectral density matrices:

(a) $\Phi_1(e^{i\theta}) = \begin{pmatrix} 5 + e^{i\theta} + e^{-i\theta} & 1 + 3e^{-i\theta} \\ 1 + 3e^{i\theta} & 10 + 3e^{i\theta} + 3e^{-i\theta} \end{pmatrix}$,

(b) $\Phi_2(e^{i\theta}) = \begin{pmatrix} (2 + e^{i\theta} + e^{-i\theta}) + \dfrac{1}{\frac{5}{4} - \frac{1}{2}e^{i\theta} - \frac{1}{2}e^{-i\theta}} & \dfrac{1 + 3e^{-i\theta}}{1 - \frac{1}{2}e^{i\theta}} \\[2mm] \dfrac{1 + 3e^{i\theta}}{1 - \frac{1}{2}e^{-i\theta}} & 10 + 3e^{i\theta} + 3e^{-i\theta} \end{pmatrix}$.
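A quick numerical check (a sketch, sampling frequencies on a grid) that $\Phi_1$ in (a) satisfies properties (i)-(iii) of Definition 4.2:

```python
import numpy as np

# Phi_1 from Example 4.1(a).
def phi1(theta):
    e, ec = np.exp(1j * theta), np.exp(-1j * theta)
    return np.array([[5 + e + ec, 1 + 3 * ec],
                     [1 + 3 * e, 10 + 3 * e + 3 * ec]])

for theta in np.linspace(0, 2 * np.pi, 100):
    P = phi1(theta)
    assert np.allclose(P, phi1(-theta).T)            # (i)  Phi^T(e^{-it}) = Phi(e^{it})
    assert np.allclose(np.conj(P), phi1(-theta))     # (ii) conj Phi(e^{it}) = Phi(e^{-it})
    assert np.min(np.linalg.eigvalsh(P)) >= -1e-12   # (iii) Phi is Hermitian psd
```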

Weak Spectral Representation Theory for Strictly Stationary Processes

We first construct for any given process x an analogue of the Fourier transform of the

process.

Consider a wss stochastic process $x$ taking values in $\mathbb{R}^1$; define

$$x_N(\theta) = \frac{1}{(2N+1)^{1/2}} \sum_{n=-N}^{N} x_n e^{in\theta}, \qquad \theta \in [0,2\pi], \qquad (7)$$

and assume there exists a random variable $x(\theta)$ whose distribution is the limit of the distributions of $x_N(\theta)$ as $N \to \infty$, i.e. at all points of continuity of the distribution $F_{x(\theta)}(\cdot)$, the distributions $F_{x_N(\theta)}(\cdot)$, $N \in \mathbb{Z}_+$, converge to the value of $F_{x(\theta)}(\cdot)$.

Let $x$ be a strictly stationary stochastic process which has a summable covariance sequence and which, further, is $\phi$-mixing such that the positive square roots of the $\phi$-mixing coefficients are also summable. (These conditions will hold in the wide sense stationary Gaussian $LS^3$ case; reader check.) Then it is a standard result [Billingsley, 1968] (see e.g. [Caines, 1988, p. 805, Appendix I]) that the sequence $\{x_N(\theta);\ N \in \mathbb{Z}_+\}$ converges in distribution to a normally distributed limiting random variable. Moreover, it follows that the asymptotic normality property holds for vectors $[\ldots, x_N^T(\theta_{i-1}), x_N^T(\theta_i), \ldots]$ of any given length evaluated at any finite set of frequencies (with possibly complex conjugated components $x_N^T(\theta_j)$).

The result of this construction is that at any specified finite set of frequencies we have a compatible family of finite dimensional distributions; hence, by the Daniell-Kolmogorov Theorem, we obtain the spectral representation stochastic process $x^s \triangleq \{x(\theta),\ \theta \in [0,2\pi]\}$.

We note that via this construction the processes x and xs are not defined on the same

probability space and the expectation operations E which appear below must be interpreted

accordingly.

Variance of $x^s$ at $\theta$

Assume that $\sum_{k=-\infty}^{\infty} |k|\, \|R_k\| < \infty$; then at each $\theta \in [0,2\pi]$ we have the following calculation:

$$\begin{aligned}
E\, x(\theta) x^*(\theta) &= \lim_{N\to\infty} E\, x_N(\theta) x_N^*(\theta) \qquad (\text{convergence in distribution and existence of second moments}) \\
&= \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} \sum_{m=-N}^{N} E\, x_n e^{in\theta}\, x_m^T e^{-im\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} \sum_{m=-N}^{N} R_{n-m}\, e^{i(n-m)\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1} \Bigg(\sum_{\tau=1}^{2N} \sum_{k=-N}^{N-\tau} + \sum_{k=-N}^{N}\Big|_{\tau=0} + \sum_{\tau=-2N}^{-1} \sum_{k=-N-\tau}^{N}\Bigg) R_\tau e^{i\tau\theta} \\
&= \lim_{N\to\infty} \sum_{\tau=-2N}^{2N} \frac{(2N+1) - |\tau|}{2N+1}\, R_\tau e^{i\tau\theta} \\
&= \sum_{k=-\infty}^{\infty} R_k e^{ik\theta} \qquad \Big(\text{by } \sum_{k=-\infty}^{\infty} |k|\, \|R_k\| < \infty \text{ and Kronecker's Lemma}\Big) \\
&= \Phi_x(e^{i\theta}), \qquad (8)
\end{aligned}$$

where $x^*$ denotes the conjugate transpose of $x$.

So the spectral density at θ is equal to the variance of the spectral representation process

xs at θ; clearly it is a positive matrix and we note that the proof above essentially reverses

the sequence of steps in the original proof following Definition 4.2 of the positivity of Φx(eiθ).

Values of the process $x^s \triangleq \{x(\theta);\ \theta \in [0,2\pi]\}$, at distinct frequencies in $[0,2\pi]$, are orthogonal on $[0,2\pi]$. This is shown by the following calculation involving the second moments of $x^s$, which are computed via a limiting operation on the second moments of the (converging) distribution of the product $x_N(\theta) x_N(\psi)$ as $N \to \infty$.

$$\begin{aligned}
E\, x(\theta) x^*(\psi) &= \lim_{N\to\infty} E\, x_N(\theta) x_N^*(\psi) \qquad (\text{convergence in distribution and existence of second moments}) \\
&= \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} \sum_{m=-N}^{N} R_{n-m}\, e^{in\theta} e^{-im\psi}, \qquad \theta, \psi \in [0,2\pi] \\
&= \lim_{N\to\infty} \sum_{\tau=-2N}^{2N} \Bigg[\frac{1}{2N+1} \sum_{n=(-N+\tau)\vee(-N)}^{(N+\tau)\wedge N} e^{in(\theta-\psi)}\Bigg] R_\tau e^{i\tau\psi}, \qquad (\tau \triangleq n-m), \qquad (9)
\end{aligned}$$

where the expression in square brackets above is interpreted in the distributional sense, that is to say,

$$E\, x(\theta) x^*(\psi) = \sum_{\tau=-\infty}^{\infty} \delta(\theta-\psi)\, R_\tau e^{i\tau\psi} = \begin{cases} 0, & \theta \ne \psi, \\ \Phi_x(e^{i\theta}), & \theta = \psi, \end{cases} \qquad (10)$$

where the integration against any L2 function of θ (which necessarily has an L2 convergent

Fourier series) of the expressions on the left or right hand side of the equality gives integrals

of the same value.

Consequently, in this sense, the values $x(\theta)$, $x(\psi)$ of the spectral representation process $x^s$ for $\theta \ne \psi$ are orthogonal, and at $\theta = \psi$ the covariance of $x(\theta)$, $\theta \in [0,2\pi]$, gives a spectral density matrix.

Example 4.3 White Noise

Let $w$ be a scalar orthogonal white noise process, i.e. a process such that $E\, w_k = 0$, $E\, w_k w_\ell^T = \Sigma\, \delta_{k-\ell}$, $k, \ell \in \mathbb{Z}$. Then $R_\tau = \Sigma\, \delta_\tau$, $\tau \in \mathbb{Z}$. We see that

$$\Phi_w(e^{i\theta}) = \sum_{\tau=-\infty}^{\infty} R_\tau e^{i\tau\theta} = \Sigma = \Sigma^T > 0.$$

By definition the approximating spectral process $w_N(\theta)$ is given by

$$w_N(\theta) = \frac{1}{\sqrt{2N+1}} \sum_{n=-N}^{N} w_n e^{in\theta}, \qquad N \in \mathbb{Z}_+,\ \theta \in [0,2\pi],$$

and this satisfies the exact relationship

$$\Phi_{w_N}(\theta) \triangleq E\, w_N(\theta)\, w_N^*(\theta) = \sum_{\tau=-2N}^{2N} \frac{(2N+1) - |\tau|}{2N+1}\, R_\tau e^{i\tau\theta} = \Sigma.$$

w is called white noise because it has a flat spectrum resembling that of ideal white light.

Example 4.4

Let $w$ be as above and $x_n \triangleq w_n + \frac{1}{2} w_{n-1}$, $n \in \mathbb{Z}$. Then

$$R_\tau = \begin{cases} \frac{5}{4}\Sigma, & \tau = 0, \\ \frac{1}{2}\Sigma, & \tau = \pm 1, \\ 0, & \text{otherwise}. \end{cases}$$

Hence

$$\Phi_x(\theta) = \frac{\Sigma}{4}\big(2e^{i\theta} + 5 + 2e^{-i\theta}\big) = \frac{\Sigma}{4}\big(5 + 4\cos\theta\big).$$
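The covariances and the spectral density above can be verified by simulation; the sketch below takes $\Sigma = 1$ and estimates $R_0$, $R_1$ from a sample path (sample size and tolerances are illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = 1.0

# Simulate x_n = w_n + (1/2) w_{n-1} from unit-variance white noise w.
w = rng.standard_normal(200_000)
x = w[1:] + 0.5 * w[:-1]

# Sample covariances should approach R_0 = (5/4) Sigma and R_1 = (1/2) Sigma.
R0 = np.mean(x * x)
R1 = np.mean(x[1:] * x[:-1])
assert abs(R0 - 1.25) < 0.02
assert abs(R1 - 0.5) < 0.02

# Check the closed form Phi_x(theta) = (Sigma/4)(5 + 4 cos theta) against the
# finite Fourier sum of the exact covariances at one frequency.
theta = np.pi / 3
phi_formula = (Sigma / 4) * (5 + 4 * np.cos(theta))
phi_series = sum(R * np.exp(1j * tau * theta)
                 for tau, R in [(-1, 0.5), (0, 1.25), (1, 0.5)]).real
assert abs(phi_formula - phi_series) < 1e-12
```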

The Action of Linear Systems on Strictly Stationary Stochastic Processes

Let $x$ be an $\mathbb{R}^n$ valued wss stochastic process with spectral density matrix $\Phi_x$. Assume there exists $k' < \infty$ such that

$$\Phi_x(e^{i\theta}) < k' I \quad \text{for all } \theta \in [0,2\pi].$$

Further let $\{A_k;\ k \ge 0\}$ be a sequence of real $m \times n$ matrices such that $A(e^{i\theta}) \triangleq \sum_{k=0}^{\infty} A_k e^{ik\theta}$ exists as an almost everywhere limit on $[0,2\pi]$ satisfying $\|A(e^{i\theta})\| < k''$ for almost all $\theta \in [0,2\pi]$ for some $k'' < \infty$. This implies the summability of the norms of the sequence $\{A_k A_k^T;\ k \ge 0\}$ and hence the square summability of the norms of both of the sequences $\{A_k;\ k \ge 0\}$ and $\{A_k^T;\ k \ge 0\}$.


Let

$$y_n = \operatorname*{m.s.lim}_{N\to\infty} \sum_{k=0}^{N} A_k x_{n-k} = \sum_{k=0}^{\infty} A_k x_{n-k},$$

where the limit necessarily exists since the uniform bound on the spectral density of $x$ and the summability of the norms of the sequence $\{A_k A_k^T;\ k \ge 0\}$ imply that the partial sums indexed by $N$ form a Cauchy sequence in $H_x$. (This is most easily seen by turning the double sum that arises in the calculation of $E(y_n^N - y_n^M)(y_n^N - y_n^M)^T$ into an integral of the corresponding Fourier transforms $A^{NM}(e^{i\theta})$, $\Phi_x(e^{i\theta})$, $(A^{NM})^T(e^{-i\theta})$.) Then a calculation yields the following distributions for the random process $y(\theta)$ constructed via the Daniell-Kolmogorov Theorem (where $\equiv$ denotes the equality of distribution of the random variables on either side of the symbol).

$$\begin{aligned}
y(\theta) &\equiv \operatorname*{lim.dist.}_{N\to\infty} \frac{1}{(2N+1)^{1/2}} \sum_{k=-N}^{N} y_k e^{ik\theta} \\
&= \operatorname*{lim.dist.}_{N,M\to\infty} \frac{1}{(2N+1)^{1/2}} \sum_{k=-N}^{N} \Bigg\{\sum_{j=0}^{M} A_j x_{k-j}\, e^{i(k-j)\theta} e^{ij\theta}\Bigg\} \\
&= \operatorname*{lim.dist.}_{M\to\infty} \sum_{j=0}^{M} A_j e^{ij\theta} \Bigg\{\operatorname*{lim.dist.}_{N\to\infty} \Bigg(\frac{1}{(2N+1)^{1/2}} \sum_{\tau=-N-j}^{N-j} x_\tau e^{i\tau\theta}\Bigg)\Bigg\} \equiv A(e^{i\theta})\, x(\theta).
\end{aligned}$$

Hence, using (8),

$$\begin{aligned}
\Phi_y(e^{i\theta}) &= E\, y(\theta)\, y^*(\theta) \\
&= A(e^{i\theta})\, E\, x(\theta)\, x^*(\theta)\, A^T(e^{-i\theta}) \\
&= A(e^{i\theta})\, \Phi_x(e^{i\theta})\, A^T(e^{-i\theta}), \qquad (11)
\end{aligned}$$

where the Euclidean norm of this matrix spectral density is bounded for almost all $\theta$ due to the boundedness a.e. of the spectral density of $x$ and the boundedness a.e. of $\|A(e^{i\theta})\|$. (11) is called the Wiener-Khinchin formula. It may be established independently of the use of the spectral representation processes by first adopting slightly strengthened hypotheses and then using the time domain calculation in the subsection below.
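A minimal numerical illustration of (11): for a scalar FIR filter driven by unit-variance white noise (so $\Phi_x = 1$), the spectral density estimated from sample covariances of $y$ should match $|A(e^{i\theta})|^2$. The impulse response, sample size, and tolerance below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# y_n = sum_k a_k x_{n-k} with x unit-variance white noise, so Phi_x = 1
# and (11) predicts Phi_y(e^{i theta}) = |A(e^{i theta})|^2.
a = np.array([1.0, -0.6, 0.25])           # illustrative impulse response
x = rng.standard_normal(500_000)
y = np.convolve(x, a, mode="valid")

theta = 1.0
A = np.sum(a * np.exp(1j * np.arange(len(a)) * theta))   # A(e^{i theta})
phi_pred = np.abs(A) ** 2

# Estimate Phi_y(e^{i theta}) = sum_tau R^y_tau e^{i tau theta} from sample covariances.
phi_est = np.mean(y * y)
for tau in (1, 2):
    r = np.mean(y[tau:] * y[:-tau])
    phi_est += 2 * r * np.cos(tau * theta)   # R_tau = R_{-tau} for real scalar y

assert abs(phi_est - phi_pred) < 0.05
```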

The definition of $x^s$ may be taken as a way to introduce a completely heuristic analogue to the Wiener process associated with $x$ on any interval in the frequency domain. We shall denote this totally fictional process by $\int_{\theta_1}^{\theta_2} x(\theta)\sqrt{d\theta}$, $0 \le \theta_1 < \theta_2 \le 2\pi$, and observe that it has zero mean.

In terms of this notation, the spectral density formula (8) and the orthogonality formula (10) yield

$$E\Big(\int_{\theta_1}^{\theta_2} x(\theta)\sqrt{d\theta}\Big)\Big(\int_{\theta_1}^{\theta_2} x^T(\psi)\sqrt{d\psi}\Big) = \int_{\theta_1}^{\theta_2}\!\!\int_{\theta_1}^{\theta_2} E\, x(\theta) x^T(\psi)\, \sqrt{d\theta}\sqrt{d\psi} = \int_{\theta_1}^{\theta_2}\!\!\int_{\theta_1}^{\theta_2} \Phi_x(\theta)\, \delta(\theta-\psi)\, \sqrt{d\theta}\sqrt{d\psi} = \int_{\theta_1}^{\theta_2} \Phi_x(\theta)\, d\theta.$$

In particular this gives

$$E\Big(\int_0^{2\pi} x(\theta)\sqrt{d\theta}\Big)\Big(\int_0^{2\pi} x^T(\theta)\sqrt{d\theta}\Big) = \int_0^{2\pi} \Phi(\theta)\, d\theta = R_0 = E\, x_0 x_0^T.$$

(8) and (10) also yield the orthogonality property for the fictional process in the frequency domain expressed by

$$E\Big(\int_{\theta_1}^{\theta_2+\theta'} x(\theta)\sqrt{d\theta}\Big)\Big(\int_{\theta_2}^{\theta_3} x^T(\psi)\sqrt{d\psi}\Big) = \int_{\theta_2}^{\theta_2+\theta'} \Phi(\theta)\, d\theta, \qquad \theta_1 < \theta_2 < \theta_2+\theta' < \theta_3.$$

The heuristic calculations above show that the fictional measure $x(\theta)\sqrt{d\theta}$ may be interpreted as corresponding to the stochastic measure $d\zeta$ whose existence and properties are described in Theorem 4.2 below.


The Action of Linear Systems on Wide Sense Stochastic Processes

Assume $\sum_{k=0}^{\infty} \|A_k\| < \infty$, which, we note, implies the (everywhere) boundedness of the corresponding matrix transfer function. Assume further that the covariance sequence of $x$ satisfies $\sum_{k=-\infty}^{\infty} \|R_k\| < \infty$; then the spectral density matrix $\Phi_x$ is defined by the Fourier transform of the covariance sequence for all $\theta$, and there exists $k' < \infty$ which bounds (everywhere) the norm of the spectral density matrix $\Phi_x(\cdot)$ of $x$.

Then the covariance function $\{R_\tau^y;\ \tau \in \mathbb{Z}\}$ of the process $y$ is given by

$$\begin{aligned}
R_\tau^y &= E\, y_{k+\tau} y_k^T \\
&= E\Bigg(\operatorname*{m.s.lim}_{N\to\infty} \sum_{j=0}^{N} A_j x_{k+\tau-j}\Bigg)\Bigg(\operatorname*{m.s.lim}_{N\to\infty} \sum_{\ell=0}^{N} A_\ell x_{k-\ell}\Bigg)^T \\
&= \lim_{N\to\infty} E\Bigg(\sum_{j=0}^{N} A_j x_{k+\tau-j}\Bigg)\Bigg(\sum_{\ell=0}^{N} A_\ell x_{k-\ell}\Bigg)^T \\
&= \sum_{\ell=0}^{\infty} \sum_{j=0}^{\infty} A_j R^x_{\tau-j+\ell} A_\ell^T, \qquad (12)
\end{aligned}$$

where the double sum converges because (i) the partial sums satisfy the Cauchy condition due to the summability of the $A$ sequence and (ii) the (spectral) norm of any covariance matrix is bounded by the (spectral) norm of the zero shift covariance matrix.

Then we again obtain the Wiener-Khinchin formula via

$$\begin{aligned}
\Phi_y(e^{i\theta}) &= \sum_{\tau=-\infty}^{\infty} R_\tau^y e^{i\tau\theta}, \qquad \theta \in [0,2\pi] \\
&= \sum_{j=0}^{\infty} \sum_{\ell=0}^{\infty} \sum_{\tau=-\infty}^{\infty} A_j e^{ij\theta}\, R^x_{\tau-j+\ell}\, e^{i(\tau-j+\ell)\theta}\, A_\ell^T e^{-i\ell\theta} \\
&= A(e^{i\theta})\, \Phi_x(e^{i\theta})\, A^T(e^{-i\theta}),
\end{aligned}$$

where the triple sum above converges by the absolute summability of the individual series.

In the most general case, the summability of the impulse responses is not assumed (see LSS, Chapter 2), but only the uniform boundedness of the transforms of the impulse response and of the spectral density. Then the existence of the mean square limit defining the $y$ process and of the double sums above giving the covariance of the $y$ process follows from the summability of the norms of the sequence $\{A_k A_k^T;\ k \ge 0\}$ and the assumed uniform bound on $\Phi_x(e^{i\theta})$. This is because (in the case $\tau = 0$, for example) the partial sums indexed by $N$ form a Cauchy sequence in $H_x$. This in turn is most easily seen by turning the double sum that arises in the calculation of $E(y_n^N - y_n^M)(y_n^N - y_n^M)^T$ into an integral of the corresponding Fourier transforms $A^{NM}(e^{i\theta})$, $\Phi_x(e^{i\theta})$, $(A^{NM})^T(e^{-i\theta})$.

Definition 4.3 Complex Hermitian Matrices

A matrix function $Z : [0,2\pi] \to \mathbb{C}^{n^2}$ is a (complex) Hermitian matrix (with real coefficients) if

$$Z(e^{i\theta}) = \overline{Z^T}(e^{i\theta}) = Z^T(e^{-i\theta}), \qquad \theta \in [0,2\pi].$$

$Z$ is a positive (respectively strictly positive) (complex) Hermitian matrix if

$$\lambda^* Z(e^{i\theta}) \lambda \ge 0\ (> 0 \text{ resp.}), \qquad \forall \theta \in [0,2\pi],\ \forall \lambda \in \mathbb{C}^n\ (\forall \lambda \ne 0 \text{ resp.}).$$

This is denoted by $Z \equiv Z(e^{i\theta}) \ge 0$ (resp. $> 0$).

Note that spectral density matrices $\Phi$ are positive Hermitian matrices and so are the spectral distribution matrices, associated with a spectral density, given by

$$F(\theta) \equiv F(e^{i\theta}) \triangleq \int_0^\theta \Phi(e^{i\lambda})\, d\lambda, \qquad \theta \in [0,2\pi].$$

Such matrices satisfy the following definition.


Definition 4.4

A matrix distribution function $F$ is a bounded right continuous matrix valued function on $[0,2\pi]$ such that $F(0) = 0$ and such that, for all $\lambda_2 \ge \lambda_1$, $\lambda_1, \lambda_2 \in [0,2\pi]$, $F(e^{i\lambda_2}) - F(e^{i\lambda_1})$ is a positive complex Hermitian matrix.

Theorem 4.2 Existence of a Spectral Representation Process (Ref: LSS 1988)

Let $x$ be an $\mathbb{R}^n$ valued wss stochastic process with zero mean and covariance matrix sequence $\{R_\tau;\ \tau \in \mathbb{Z}\}$. Then there exists a $\mathbb{C}^n$ valued orthogonal increment process $\zeta$ defined on $[0,2\pi]$ with associated stochastic measure $d\zeta$, such that

$$x_n = \int_0^{2\pi} e^{-in\theta}\, d\zeta(\theta), \qquad n \in \mathbb{Z}, \quad \text{a.s.},$$

and there exists a matrix distribution function $F$ such that

$$R_\tau = E\, x_\tau x_0^T = E\Big(\int_0^{2\pi} e^{-i\tau\theta}\, d\zeta(\theta)\Big)\Big(\int_0^{2\pi} d\zeta^*(\theta)\Big) = \frac{1}{2\pi} \int_0^{2\pi} e^{-i\tau\theta}\, dF(\theta), \qquad \tau \in \mathbb{Z},$$

where

$$E(\zeta(\lambda) - \zeta(0))(\zeta(\lambda) - \zeta(0))^* = \frac{1}{2\pi} \int_0^{\lambda} dF(\theta) = \frac{1}{2\pi}\big(F(\lambda) - F(0)\big), \qquad \lambda \in [0,2\pi].$$

Whenever a density $F'$ exists for the distribution $F$, it follows that

$$R_\tau = \frac{1}{2\pi} \int_0^{2\pi} e^{-i\tau\theta} F'(\theta)\, d\theta, \qquad \tau \in \mathbb{Z},$$

and hence the Fourier series representation of $F'$ shows the a.e. equality of $F'$ and $\Phi_x(e^{i\cdot})$:

$$F'(\theta) = \sum_{\tau=-\infty}^{\infty} R_\tau e^{i\tau\theta} = \Phi_x(e^{i\theta}), \qquad \theta \in [0,2\pi].$$

Hence we see that intuitively $d\zeta_\theta$ “$=$” $x(\theta)(d\theta)^{1/2}$, so $d\zeta_\theta$ is the “stochastic density” defined before Theorem 4.2.


Using the spectral representation process $\zeta$ defined above, we may give a rigorous mathematical expression to the sequence of equalities leading to (11) which correspond to the transformation in Figure 1.

Under the given a.e. boundedness conditions on $A(e^{i\theta})$ and $\Phi_x(e^{i\theta})$,

$$y_n \triangleq \operatorname*{m.s.lim}_{N\to\infty} \sum_{k=0}^{N} A_k x_{n-k}, \qquad n \in \mathbb{Z},$$

is defined by the sequence of partial sums which may be verified to form a Cauchy sequence in the Hilbert space spanned by the process $x$. By the calculation in (12), $y$ is a wide sense stationary stochastic process. Substituting the spectral representation of $x$ in the sum above yields the orthogonal increment spectral representation process $\zeta_y(\theta)$ satisfying

$$y_n = \int_0^{2\pi} e^{-in\theta}\, d\zeta_y(\theta), \qquad n \in \mathbb{Z},$$

where

$$d\zeta_y(\theta) = A(e^{i\theta})\, d\zeta_x(\theta), \qquad \theta \in [0,2\pi],$$

i.e.

$$y_n = \int_0^{2\pi} e^{-in\theta} A(e^{i\theta})\, d\zeta_x(\theta), \qquad n \in \mathbb{Z}.$$

The spectral distribution process of $y$ hence necessarily satisfies

$$E\Big(\int_{\lambda_1}^{\lambda_2} d\zeta_y(\theta)\Big)\Big(\int_{\lambda_1}^{\lambda_2} d\zeta_y^*(\theta)\Big) = \frac{1}{2\pi} \int_{\lambda_1}^{\lambda_2} A(e^{i\theta})\, \Phi_x(e^{i\theta})\, A^*(e^{i\theta})\, d\theta, \qquad \lambda_1, \lambda_2 \in [0,2\pi],$$

that is,

$$F_y(\lambda_2) - F_y(\lambda_1) = \int_{\lambda_1}^{\lambda_2} A(e^{i\theta})\, \Phi_x(e^{i\theta})\, A^*(e^{i\theta})\, d\theta,$$

and hence the Wiener-Khinchin formula

$$\Phi_y(e^{i\theta}) = A(e^{i\theta})\, \Phi_x(e^{i\theta})\, A^*(e^{i\theta})$$

is satisfied. Consequently,

$$R_\tau^y = \frac{1}{2\pi} \int_0^{2\pi} e^{-i\tau\theta}\, A(e^{i\theta})\, \Phi_x(e^{i\theta})\, A^*(e^{i\theta})\, d\theta, \qquad \tau \in \mathbb{Z}.$$


Furthermore, given analogous hypotheses on a sequence of systems, the following result

holds.

Theorem 4.3 Concatenation Theorem [Caines, 1988]

Let the a.e. boundedness conditions hold on the system transfer functions of a sequence of conformable systems $A_1, A_2, \cdots, A_n$ which act (in that order) on a process $x$ with a.e. bounded spectral density matrix so as to generate a process $y^n$; then

$$d\zeta_{y^n}(\theta) = \prod_{j=0}^{n-1} A_{n-j}(e^{i\theta})\, d\zeta_x(\theta), \qquad \theta \in [0,2\pi], \quad \text{and} \quad \Phi_{y^n} = \Bigg(\prod_{j=0}^{n-1} A_{n-j}\Bigg) \Phi_x \Bigg(\prod_{j=0}^{n-1} A_{n-j}\Bigg)^*.$$

Let the formal transforms of the Fourier coefficients of the transfer functions $A_1(e^{i\theta}), A_2(e^{i\theta}), \cdots, A_n(e^{i\theta})$ be denoted $A_1(z), A_2(z), \cdots, A_n(z)$; then the formal transform of the output process $y^n$ is given by the formal power-series equation

$$y^n(z) = \sum_{k=-\infty}^{\infty} y^n_k z^k = \prod_{j=0}^{n-1} A_{n-j}(z)\, x(z).$$
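The concatenation formula can be illustrated for two scalar FIR systems acting on white noise: composing the systems convolves their impulse responses, so the product of transfer functions must give the same spectrum as the composite system (coefficients below are illustrative choices).

```python
import numpy as np

# Two illustrative scalar FIR systems acting in sequence on white noise
# (Phi_x = 1), so Phi_y(e^{i t}) = |A2(e^{i t}) A1(e^{i t})|^2.
a1 = np.array([1.0, 0.5])
a2 = np.array([1.0, -0.3, 0.1])

def transfer(a, theta):
    # A(e^{i theta}) = sum_k a_k e^{i k theta}
    return np.sum(a * np.exp(1j * np.arange(len(a)) * theta))

theta = 0.7
phi_concat = abs(transfer(a2, theta) * transfer(a1, theta)) ** 2

# Equivalently, the composite system has impulse response a2 * a1 (convolution).
a12 = np.convolve(a2, a1)
phi_direct = abs(transfer(a12, theta)) ** 2

assert abs(phi_concat - phi_direct) < 1e-12
```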

Example 4.3

Let

$$Z(e^{i\theta}) = \frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}, \qquad |\beta| < 1,$$

and let $x$ be a wide sense stationary stochastic process with spectral density $\Big|\frac{1 + e^{i\theta}}{1 - \frac{1}{4} e^{i\theta}}\Big|^2$. Then the action of $Z$ on $x$ yields a wide sense stationary stochastic process $y$, where

$$d\zeta_y(\theta) = Z(e^{i\theta})\, d\zeta_x(\theta) = \frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}\, d\zeta_x(\theta),$$

and hence

$$y_n = \int_0^{2\pi} e^{-in\theta}\, \frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}\, d\zeta_x, \qquad n \in \mathbb{Z};$$

with the formal $z$-transform representation of the action of $Z$ on $x$ being given by

$$y(z) = Z(z)\, x(z) = \frac{1 + \alpha z}{1 + \beta z}\, x(z).$$

Further,

$$\Phi_y(e^{i\theta}) = \Bigg|\frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}\Bigg|^2\, \Bigg|\frac{1 + e^{i\theta}}{1 - \frac{1}{4} e^{i\theta}}\Bigg|^2$$

and

$$R_n^y = \frac{1}{2\pi} \int_0^{2\pi} e^{-in\theta}\, \Phi_y\, d\theta, \qquad n \in \mathbb{Z}.$$

The Wiener-Khinchin Theorem and its recursive formulation in terms of the Concatenation Theorem provide a calculus for the second order properties of the operation of linear systems on second order processes. This is in direct analogy with the calculus which is obtained for deterministic discrete (respectively, continuous) time functions and systems (or, signals and systems) and their discrete (respectively, Laplace) transforms. We recall that in the latter case the relatively complex operations of iterated convolutions are replaced by pointwise operations with complex functions (or formal algebraic series, see below and [Caines, 1988]); clearly this is also the operational result obtained from the transform theory for second order processes described by the Concatenation Theorem.


ARMA Processes and General Formal Transforms

Autoregressive Moving Average (ARMA) processes are $\mathbb{R}^n$ valued (output) stochastic processes $y$ which are related to an $\mathbb{R}^m$ valued (input) orthogonal white noise process $w$ by a recursive scheme of the form

$$y_n + A_1 y_{n-1} + \cdots + A_p y_{n-p} = B_0 w_n + B_1 w_{n-1} + \cdots + B_q w_{n-q}, \qquad (13)$$

for some $p, q, n \in \mathbb{Z}_+$ and with initial conditions given at $n = 0$.

In case the input process is another second order stochastic process x, we say that x, y

are related by an ARMA system.
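A minimal simulation sketch of the recursion (13) in the scalar ARMA(1,1) case, with illustrative coefficients and zero initial conditions at $n = 0$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Scalar instance of (13): y_n + a1*y_{n-1} = b0*w_n + b1*w_{n-1}.
a1, b0, b1 = -0.5, 1.0, 0.4
T = 10
w = rng.standard_normal(T)

y = np.zeros(T)
for n in range(T):
    ar = -a1 * y[n - 1] if n >= 1 else 0.0                # autoregressive part
    ma = b0 * w[n] + (b1 * w[n - 1] if n >= 1 else 0.0)   # moving-average part
    y[n] = ar + ma

# Each output value satisfies the recursion (13) exactly.
for n in range(1, T):
    assert np.isclose(y[n] + a1 * y[n - 1], b0 * w[n] + b1 * w[n - 1])
```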

Let us take the formal $z$ transform of the equation above on a positive semi-infinite interval on which the values of the input and output processes (deterministic or stochastic) are defined, that is to say, on time intervals of the form $\{k;\ k+M \in \mathbb{Z}_+\}$ for some $M \in \mathbb{Z}_+$, where, unless otherwise stated, we take $M = 0$, i.e. the interval $\mathbb{Z}_+$. Then we obtain the following equation in a possibly infinite series in positive powers of the algebraic indeterminate (i.e. symbol) $z$ and a finite series (i.e. a polynomial) in negative powers of $z$:

$$A(z)\, y(z) = B(z)\, w(z) + IC(z); \qquad (14)$$

here $A(z) = \sum_{k=0}^{p} A_k z^k$, and similarly for $B(z)$; $y(z) = \sum_{k=0}^{\infty} y_k z^k$, and similarly for $w(z)$; and $IC(z)$ codes the initial conditions in a polynomial in $z$ and $z^{-1}$ so that equality of the coefficients of all powers of $z$ holds in the equation.

We next generalize this construction by including the class of positive power series operators (p.p.s.o.); this is defined to be the set of infinite positive power series of the indeterminates $z$ and $z^{-1}$ of the form $A(z) = \sum_{k=-m}^{\infty} A_k z^k$, and similarly for $B(z)$. The formal positive power series (p.p.s.) inverse of $A(z) = \sum_{k=-m}^{\infty} A_k z^k$ is clearly uniquely defined (and recursively computable) in case $A_{-m}$ is non-singular. Consequently, only the positive power series inverses of positive power series are considered, and they exist when the non-singular $A_{-m}$ condition holds. (Often $m = 0$.) All inverse formal $z$-transforms will be taken as p.p.s.

expansions since then (i) the coefficients of the (finite negative, infinite positive) power series in

$$y(z) = A^{-1}(z)\, B(z)\, w(z) + A^{-1}(z)\, IC(z) \qquad (15)$$

are defined by finite operations on the coefficients of the constituent power series, and (ii) the resulting equations relating the members of the $y$, $w$ and $IC(z)$ sequences are exactly those given by (14).

The inverses of operators $M(z)$ appearing in such equations are obtained by computing the solution to the set of recursive equations given via the coefficients of the powers of $z$ in $M(z)\, N(z) = I$. When such a solution exists, and when, in addition, $M(z)$ is specified as a meromorphic function of $z$ (taken as a complex variable), then the p.p.s. expansion $N(z) = M(z)^{-1}$ obtained via the recursive equations is also given by the coefficients of the Laurent series (when it exists) converging in a sufficiently small annulus surrounding the origin in the complex plane. (Note that $z = 0$ is excluded from the domain of convergence when non-zero terms in powers of $z^{-1}$ are present.)

Example 4.5

A simple illustration of the notions introduced above is given by the positive power series equation (p.p.s.e., i.e. an equation in p.p.s. series and operators)

$$(2z^{-1} + \exp(z))\, y(z) = (z^{-2} + z^2)\, w(z) + IC(z), \qquad (16)$$

where we take the irrational function $\exp(z)$ to denote the standard p.p.s. expansion of the exponential function (given by the analytic power series expansion around $0$). The initial condition is chosen as $IC(z) = -w_0 z^{-2}$ in order that a solution with zero values $y_k = 0$, for $k = -1, -2, \ldots$, is defined for the input $w$, which is chosen such that $w_k = 0$ for $k = -1, -2, \ldots$

From (16), the output sequence $\{y_j;\ j \in \mathbb{Z}_+\}$ may be determined from the input sequence $\{w_j;\ j \in \mathbb{Z}_+\}$ via the infinite recursive set of equations indexed by $\{z^{-1}, z^0, z^1, \ldots\}$. This set of equations is found by evaluating the formal positive power series expansion of each side of (16) and equating the coefficients of terms with equal powers. In this example, the set of equations has the form:

$$\begin{aligned}
z^{-1}&: \quad 2y_0 = w_1, \\
z^{0}&: \quad 2y_1 + y_0 = w_2, \qquad (17) \\
&\ \;\ \vdots
\end{aligned}$$

By virtue of the definition of the inverse of a p.p.s. operator, the solution for the output power series $y(z)$ in terms of (i) the input power series $w(z)$, (ii) the initial conditions $IC(z)$, and (iii) the formal power series operators, is given by

$$\begin{aligned}
y(z) &= (2z^{-1} + \exp(z))^{-1}\big[(z^{-2} + z^2)\, w(z) + IC(z)\big] \\
&= z^{-1}\Big(\tfrac{1}{2} - \tfrac{z}{4} - \tfrac{z^2}{8} - \cdots\Big)(1 + z^4)\, w(z) + z\Big(\tfrac{1}{2} - \tfrac{z}{4} - \tfrac{z^2}{8} - \cdots\Big) IC(z), \qquad (18)
\end{aligned}$$

where $IC(z) = -w_0 z^{-2}$. Each of the equivalent schemes (17) and (18) above gives rise to the solution sequence beginning $y(z) = \tfrac{1}{2} w_1 + \big(\tfrac{1}{2} w_2 - \tfrac{1}{4} w_1\big) z + \cdots.$
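The recursive computation of a p.p.s. inverse can be sketched in exact arithmetic; the code below inverts $2 + z e^z$ (the regular factor of $2z^{-1} + \exp(z)$ in (16), since $2z^{-1} + e^z = z^{-1}(2 + z e^z)$) coefficient by coefficient and recovers the expansion $\tfrac{1}{2} - \tfrac{z}{4} - \tfrac{z^2}{8} - \cdots$ used in (18).

```python
from fractions import Fraction as F

def factorial(k):
    out = 1
    for i in range(2, k + 1):
        out *= i
    return out

K = 5
# Coefficients of m(z) = 2 + z*exp(z) = 2 + z + z^2 + z^3/2 + ...:
# the z^k coefficient for k >= 1 is 1/(k-1)!.
m = [F(2)] + [F(1, factorial(k - 1)) for k in range(1, K)]

# Solve m(z) * n(z) = 1 coefficient by coefficient; the lowest-order
# coefficient m[0] = 2 is non-singular, so the inverse exists and is unique.
n = [F(1) / m[0]]
for k in range(1, K):
    n.append(-sum(m[j] * n[k - j] for j in range(1, k + 1)) / m[0])

# First coefficients of (2 + z e^z)^{-1}: 1/2, -1/4, -1/8, ..., matching (18).
assert n[0] == F(1, 2) and n[1] == F(-1, 4) and n[2] == F(-1, 8)
```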

In the case of ARMA systems, just as for the general case, the positive power series expansions of the inverse of the denominators (scalar case), or of the left or right matrix inverse operators (matrix case), exist when the matrix (respectively, scalar coefficient) corresponding to the lowest power of z is invertible (respectively, non-zero) (see [Caines, 1988], Appendix 2). A special feature of the ARMA case is the existence of (matrix) partial fraction expressions which facilitate the evaluation of the inverse of matrix polynomials. The simplicity of


the p.p.s. expansions of terms of the form (I + zA)^{-1}, and the simplicity of the computation of the annuli of convergence (when z is taken as a complex variable), are useful in solving rational transform system equations to obtain the generated processes.

So far, no issues of convergence (deterministic or stochastic) have arisen in this development, since we have considered behaviour on intervals bounded below. However, for deterministic or stochastic behaviour on infinite intervals with no finite initial instant, the impulse response (i.e. infinite MA representation) of the system must be given stability properties which fit the class of inputs.

Throughout we shall assume:

(i) For deterministic systems, impulse responses will be taken to be in the class of sequences whose normed elements are summable (i.e. l^1 sequences), and inputs will be taken to have elements whose norms are uniformly bounded, that is to say, lie in l^∞.

(ii) Stochastic systems shall be subject to hypotheses which make the impulse responses square summable or the transfer functions a.e. bounded, and the input processes shall have a.e. bounded spectral densities.

In the ARMA case, when the zeroes of the equation det(A(z)) = 0 lie strictly outside the closed unit disc in the complex plane, the p.p.s. expansion of the inverse operator A(z)^{-1} has geometrically decaying coefficients. Hence, the formal positive power series equation (p.p.s.e.)

Σ_{k=−∞}^{∞} y_k z^k = y(z) = A^{-1}(z)C(z)x(z) = A^{-1}(z)C(z) Σ_{k=−∞}^{∞} x_k z^k,    (19)

describes the action of the asymptotically stable ARMA system A^{-1}(z)C(z) on the doubly infinite series in z, z^{-1} whose coefficients are the values of the process x. Evidently, whenever x is a wide sense stationary process, the scheme above corresponds to an infinite set of mean square convergent sums defining the value of the process y at each instant. At this point the reader is recommended to consider a redescription of the operations in Example 4.3 in terms of formal power series in z.
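The geometric decay of the p.p.s. inverse coefficients noted above can be illustrated numerically. The sketch below (not from the notes; the scalar denominator is an invented example) inverts A(z) = 1 − z/2, whose zero z = 2 lies outside the closed unit disc, and recovers coefficients (1/2)^k.

```python
# Truncated p.p.s. inverse of the scalar denominator A(z) = 1 - z/2.
N = 10
a = [1.0, -0.5] + [0.0] * (N - 2)
inv = [1.0 / a[0]]
for k in range(1, N):
    # c_k = -(sum_{i=1}^{k} a_i c_{k-i}) / a_0
    inv.append(-sum(a[i] * inv[k - i] for i in range(1, k + 1)) / a[0])
# inv[k] equals (1/2)^k: geometric decay at the rate of the reciprocal zero.
```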


Theorem 4.4 The Wold Decomposition

Let x be a wide sense stationary R^p valued stochastic process with a full rank innovations process {e_n = x_n − (x_n | H^x_{n−1}); n ∈ Z}. Further assume that x is purely linearly non-deterministic, i.e. ∩_{n∈Z} H_n = {0}. Then

x_n = Σ_{k=0}^{∞} A_k e_{n−k},  n ∈ Z;

the innovations covariance matrix Σ = E e_n e_n^T is invertible, and the unique impulse response coefficients {A_k; k ∈ Z_+} are given by

E x_0 e_{−k}^T = A_k Σ,  k ∈ Z_+,  A_0 = I.

For a proof of the Wold Decomposition Theorem see pages 23 through 33 of LSS.
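As a numerical complement (an illustration, not from LSS): for a scalar process with covariance r_k = (9/8)(1/3)^{|k|} (a first-order autoregressive example, chosen for this sketch), the Wold coefficients A_k and the innovations variance can be approximated from a finite Toeplitz section of the covariance via its Cholesky factor, since writing x = Lε expresses each x_n in terms of orthonormal innovations.

```python
import numpy as np

# Toeplitz covariance section for r_k = (9/8)(1/3)^{|k|} (illustrative AR(1) case).
n = 60
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
R = (9.0 / 8.0) * (1.0 / 3.0) ** lags

# Lower-triangular Cholesky factor: x = L eps with eps orthonormal, so the
# last row of L approximates A_k * sqrt(innovations variance).
L = np.linalg.cholesky(R)
innov_std = L[-1, -1]        # approximates sqrt(E e_n^2) = 1
A1 = L[-1, -2] / innov_std   # approximates A_1 = 1/3
```

For this Markov example the finite-section approximation is already exact after one lag, which makes it a convenient check of the Wold identification E x_0 e_{−k}^T = A_k Σ.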

Theorem 4.5 Spectral Factorization Theorem (LSS, pp 204-206)

Given a (p × p) spectral density matrix Φ(e^{i·}) such that Φ(e^{iθ}) < kI < ∞ and Φ^{-1}(e^{iθ}) < kI < ∞ for almost all θ ∈ [0, 2π], there exists a (p × p) matrix function Z(z), z ∈ C, such that, for some ℓ < ∞,

(i) ‖Z(e^{iθ})‖ < ℓ, ‖Z^{-1}(e^{iθ})‖ < ℓ, for almost all θ ∈ [0, 2π],

(ii) Z(z), Z^{-1}(z) are analytic in |z| < 1,

(iii) Z(e^{iθ})Z^T(e^{-iθ}) = Φ(e^{iθ}) a.e. θ ∈ [0, 2π].

Such a matrix Z is called a strong spectral factor; it is unique up to right multiplication by constant orthogonal matrices when Φ and Φ^{-1} possess analytic extensions in a neighbourhood of |z| = 1.

If a matrix Z satisfies the asymptotic stability conditions of (i)-(iii) but lacks the inverse asymptotic stability conditions, it is simply called a spectral factor.


Example 4.6 A strong spectral factorization of the spectral density matrices in Example 4.1 is given by

Φ_1(e^{iθ}) =
[ (2+√3)^{1/2} + (2+√3)^{-1/2} e^{iθ}    1          ]
[ 0                                      3 + e^{iθ} ]
×
[ (2+√3)^{1/2} + (2+√3)^{-1/2} e^{-iθ}   0           ]
[ 1                                      3 + e^{-iθ} ]
≜ Z_1(e^{iθ}) Z_1^T(e^{-iθ}),

where we note that Z_1(z) is asymptotically stable since it is analytic in |z| < 1, and is asymptotically inverse stable since Z_1^{-1}(z) has poles at −(2+√3) and −3.
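The factor exhibited above can be probed numerically: at every frequency the product Z_1(e^{iθ})Z_1^T(e^{-iθ}) must be a Hermitian matrix with positive eigenvalues, as any spectral density matrix is. A brief sketch (numpy; the frequency grid is an arbitrary choice):

```python
import numpy as np

c = (2 + np.sqrt(3)) ** 0.5  # (2 + sqrt(3))^{1/2}
ok = True
for theta in np.linspace(0.0, 2 * np.pi, 9):
    z = np.exp(1j * theta)
    Z1 = np.array([[c + z / c, 1.0], [0.0, 3.0 + z]])
    # Z1^T evaluated at e^{-i theta}:
    Z1T_conj = np.array([[c + z.conjugate() / c, 0.0], [1.0, 3.0 + z.conjugate()]])
    Phi1 = Z1 @ Z1T_conj
    ok = ok and np.allclose(Phi1, Phi1.conj().T) \
            and bool(np.linalg.eigvalsh(Phi1).min() > 0)
```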

A spectral factorization of Φ_2(e^{iθ}) in that example is given by

Φ_2(e^{iθ}) =
[ 1 + e^{iθ}   1/(1 − (1/2)e^{iθ}) ]
[ 0            3 + e^{iθ}          ]
×
[ 1 + e^{-iθ}            0           ]
[ 1/(1 − (1/2)e^{-iθ})   3 + e^{-iθ} ]
.

In this case a strong factorization cannot exist because Φ_2(z) has zeros at −1.


Theorem 4.6 Strong Spectral Factors give Wold Decompositions

Let Φ(e^{i·}) be the spectral density of a full rank wide sense stationary stochastic process x, and let Φ(e^{i·}) satisfy the conditions of Theorem 4.5. Let Z(z) be the formal z-transform of the (Fourier series) matrix coefficient sequence of a strong spectral factor of Φ(e^{i·}) (equivalently, of a Laurent series of a strong spectral factor Z(z)), and let W(z) ≜ Z(z)Z^{-1}(0). Then W(z) is the formal z-transform of the matrix coefficient sequence of the Wold decomposition of x, and Z(0)Z^T(0) is the innovations covariance of x.

In terms of formal z-transforms and spectral representation processes, the relation between e and x is given by x(z) = W(z)e(z).

Conversely, subject to the hypotheses of the theorem, the formal transform of the coefficient sequence of the Wold decomposition, multiplied on the right by a matrix square root of the innovations covariance matrix, is a strong spectral factor.

This theorem is proved by a direct application of the results above, together with the identification of the space spanned by the process x up to any instant with the space spanned by the orthogonal process e, which is then shown to be the innovations process of x.

We note that the system x_k = w_k − w_{k−1}, k ∈ Z, where w is a wide sense stationary orthogonal process, generates the wide sense stationary process x; this system is evidently the Wold decomposition of x, but the spectral density φ_x(z) of x does not possess a strong spectral factor.

Hence the significance of the strong spectral factor of the spectral density Φ(z) of a process y is that, when y is passed as input through the linear system corresponding to the inverse of this factor, the output is the innovations process ν of y (up to normalization by the coefficient matrix of the zero order term of the factor). That is to say, ν is an orthogonal white noise process which generates the same Hilbert space as y over (−∞, k] for all k ∈ Z. This process may be verified to be the prediction error process for the linear least squares one step ahead prediction of y. Finally, when ν is passed as input through the linear system


corresponding to the strong spectral factor, it not only generates a process with the spectral density Φ(z), but the resulting output process is a.s. equal to y at any instant. [See Assignment 7.]

Example 4.7 For the process x generated by

LS : x_{k+1} = (1/3)x_k + w_k,  k ∈ Z_+,

where x_0 is distributed N(0, σ_x^2), w is an i.i.d. process distributed N(0, 1), and x_0 ∐ w, we seek:

(i) the Lyapunov equation for v_k ≜ E x_k^2, k ∈ Z_+,

(ii) the steady state solution v_∞ to the Lyapunov equation.

(iii) Let {x_k^∞; k ∈ Z_+} be the process generated by the LS with initial condition x_0^∞ distributed N(0, v_∞). Then we wish to verify that x^∞ is a strictly stationary process on Z_+.

Let {τ_k ≜ E x_k^∞ x_0^∞; k ∈ Z} be the covariance sequence for the corresponding process on Z = {···, −1, 0, 1, ···}, and let e^{iθ} correspond to the unit backward shift. We also wish to use the Wiener-Khinchin theorem (or the Concatenation Theorem) to find:

(iv) the spectral density function Φ_x(e^{iθ}), 0 ≤ θ ≤ 2π, of x^∞,

(v) the spectral densities of the processes y, z generated by

y_k = x_k^∞ − (1/3)x_{k−1}^∞,  k ∈ Z,

and

z_k = (1/3)x_k^∞ − x_{k−1}^∞,  k ∈ Z.

Finally, we wish to know what the process y is called in terms of x.

The answers to these questions are as follows:


(i) The Lyapunov equation for v_k = E x_k^2, k ∈ Z_+, is given by:

v_{k+1} = E x_{k+1}^2 = E((1/3)x_k + w_k)^2
        = (1/9)E x_k^2 + (2/3)E x_k w_k + E w_k^2    (w_l ∐ x_l, ∀l)
        = (1/9)v_k + 0 + 1
        = (1/9)v_k + 1,   v_0 = E x_0^2 = σ_x^2.

(ii) The steady state solution is given by:

v_∞ = (1/9)v_∞ + 1  ⇒  v_∞ = 9/8.
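The fixed point may also be reached by iterating the Lyapunov recursion from an arbitrary initial variance; a minimal sketch (the starting value 4.0 is an arbitrary illustrative choice):

```python
# Iterate v_{k+1} = v_k/9 + 1; the contraction factor 1/9 gives geometric
# convergence to v_inf = 9/8 from any initial variance.
v = 4.0  # stands in for an arbitrary sigma_x^2
for _ in range(60):
    v = v / 9.0 + 1.0
```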

(iii) Let {x_k^∞; k ∈ Z_+} be generated by LS with initial condition x_0^∞ ∼ N(0, v_∞).

First, E x_{k+1}^∞ = (1/3)E x_k^∞ + E w_k = (1/3)E x_k^∞. Then E x_0^∞ = 0 gives E x_k^∞ = 0, ∀k ∈ Z_+.

Second, for all k ∈ Z_+ and all l ∈ Z such that k + l ∈ Z_+,

E x_{k+l}^∞ x_l^∞ = E[(1/3)^k x_l^∞ + Σ_{i=0}^{k−1} (1/3)^{k−1−i} w_{l+i}] x_l^∞
                  = (1/3)^k E(x_l^∞)^2,   since w_l ∐ x_l^∞, ∀l,
                  = (1/3)^k v_∞,   since E(x_l^∞)^2 = E(x_0^∞)^2 = v_∞.

Hence r_{k+l,l} ≜ E x_{k+l}^∞ x_l^∞ = E x_l^∞ x_{k+l}^∞ = r_{l,k+l} = (1/3)^k v_∞ ≜ r_k, k, k + l ∈ Z_+, which is shift invariant w.r.t. l ∈ Z_+.

Hence the zero mean Gaussian process x∞ is strictly stationary on Z+. This is

because the strict stationarity of a Gaussian process is established by the shift

invariance of its first two moments (i.e. its wide sense stationarity).

(iv) Let {r_k}_{−∞}^{∞} denote the covariance sequence of the corresponding process on Z. Then

Φ_x(e^{iθ}) = Z(e^{iθ}) · 1 · Z(e^{-iθ}) = (e^{-iθ} − 1/3)^{-1} e^{iθ} e^{-iθ} (e^{iθ} − 1/3)^{-1}
            = 1 / ((1 − e^{iθ}/3)(1 − e^{-iθ}/3)).
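As a numerical sanity check (not part of the notes), the Fourier series of the covariance sequence r_k = (9/8)(1/3)^{|k|} from (iii) can be summed at a sample frequency and compared against the closed form above; the truncation length is an illustrative choice.

```python
import numpy as np

theta = 1.2  # any sample frequency
K = 60       # truncation; the tail decays like (1/3)^K
k = np.arange(-K, K + 1)
fourier_sum = np.sum((9.0 / 8.0) * (1.0 / 3.0) ** np.abs(k) * np.exp(1j * k * theta))
phi = 1.0 / ((1 - np.exp(1j * theta) / 3) * (1 - np.exp(-1j * theta) / 3))
```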


This may be deduced in two ways. First, we may simply take the Fourier transform of the covariance sequence and verify that it is given by the spectral factorization formula above. Second, we may treat x^∞ as the (instant by instant) mean square limit of the process generated by

LS_∞ : x_{k+1}^∞ = (1/3)x_k^∞ + w_k,  k ≥ −N,

with the initial condition x_{−N}^∞ ∼ N(0, v_∞), where N → ∞. The limiting process is then generated by the ARMA scheme

LS_∞ : x_{k+1}^∞ = (1/3)x_k^∞ + w_k,  k ∈ Z_+.

Then the MA(∞) formal power series system equation relating w(z) and x(z) is solvable (term by term) by the Concatenation Theorem (which includes the relevant special case of the Wiener-Khinchin Theorem) to give the wide sense stationary process x with the spectral density as given above, i.e.

(1 − z/3)x(z) = z w(z)  ⇒  x(z) = Z(z)w(z),  where Z(z) ≜ (e^{-iθ} − 1/3)^{-1} e^{iθ}.

(v) (a) If y_k = x_k^∞ − (1/3)x_{k−1}^∞, then

y(z) = (1 − z/3)x^∞(z) = (1 − z/3)Z(z)w(z).

And so

Φ_y(e^{iθ}) = (1 − e^{iθ}/3) Φ_x(e^{iθ}) (1 − e^{-iθ}/3)
            = (1 − e^{iθ}/3)(1 − e^{iθ}/3)^{-1} e^{iθ} e^{-iθ} (1 − e^{-iθ}/3)^{-1}(1 − e^{-iθ}/3)
            = 1.

(b) Similarly,

z_k = (1/3)x_k^∞ − x_{k−1}^∞  ⇒  z(z) = (1/3 − z)x(z).


And so

Φ_z(e^{iθ}) = (1/3 − e^{iθ})(e^{-iθ} − 1/3)^{-1}(e^{iθ} − 1/3)^{-1}(1/3 − e^{-iθ}) = 1.

Here y is the innovations process of x, but z is not, since in the first case the span of the space generated by x is equal to that generated by the orthogonal process y, but the spaces are not equal in the case of x and z. (Notice that (Z(z), w(z)) is not the Wold decomposition of x, but (z^{-1}Z(z), zw(z)) is the Wold decomposition.) The first assertion is true since, by the asymptotic stability of the system in (v)(a) (obvious) and of its inverse, H_n^y ⊂ H_n^x and H_n^x ⊂ H_n^y, n ∈ Z_+; while in the latter case the system in (v)(b) is not non-anticipatively invertible.


Second Order Processes

Just as for a discrete time process, a continuous time wide sense stationary R^n valued process is defined as a second order process for which

E x_t = µ,  ∀t ∈ R^1,
R(t, s) = E(x_t − µ)(x_s − µ)^T = E(x_{t−s} − µ)(x_0 − µ)^T ≜ R(t − s),  ∀t, s ∈ R^1.

Example 5.6

As in Theorem 5.2, the scalar LS_3 process x generated by

dx_t = αx_t dt + q dw_t,  α < 0,  t ∈ R_+,

with x_0 ∼ N(0, π), x_0 ∐ w, and π satisfying 0 = απ + πα + q^2, is a wss process on R_+. From the explicit calculation for the system (20), E x_t x_s = R(t − s) = −(q^2/2α) e^{α|t−s|}, and for any distribution N(0, σ_0^2) for the initial condition x_0,

E x_{T+t} x_{T+s} → R(t − s),  as T → ∞.

Definition 5.6

An R^n valued stochastic process {x_t; t ∈ R} is quadratic mean (q.m.) continuous if E‖x_{t+h} − x_t‖^2 → 0 as h → 0, for all t ∈ R.

When it exists, the q.m. limit

(d/dt)x_t ≜ q.m. lim_{h→0} (1/h)(x_{t+h} − x_t)    (20)

of a second order process x is called the quadratic mean (q.m.) derivative of x at t ∈ R. If (d/dt)x_t exists for all t ∈ R, then x is called a q.m. differentiable process.


It may be verified that a zero mean second order process for which the covariance R(t, s) is jointly continuous in (t, s) at (u, u) for every u ∈ R is q.m. continuous for all t ∈ R. Conversely, a second order process x which is q.m. continuous for all t ∈ R has a covariance function which is continuous in (t, s).

Theorem 5.6

(a) A second order process x is q.m. differentiable if ∂²R(t, s)/∂t∂s exists and is continuous at (t, t) for all t ∈ R.

(b) If a second order process x is q.m. differentiable, then the second partial differentials taken in either order exist.

Proof

For notational convenience we shall only consider scalar processes in this proof.

(a) To begin, observe that the limit (20) exists if and only if the Cauchy condition

lim_{h,h′→0} E((1/h)(x_{t+h} − x_t) − (1/h′)(x_{t+h′} − x_t))^2 = 0,    (21)

holds.

Now the existence and continuity at (t, t) of the mixed partial derivative ∂²R(t, s)/∂t∂s of R(t, s) implies that its value is independent of the order of evaluation of the partial derivatives. This implies that a unique limit

lim_{(h,h′)→(0,0)} h′^{-1}{h^{-1}[R(t + h′, t + h) − R(t + h′, t)] − h^{-1}[R(t, t + h) − R(t, t)]}
 = lim_{(h,h′)→(0,0)} (hh′)^{-1}[(E x_{t+h′}x_{t+h} − E x_{t+h′}x_t) − (E x_t x_{t+h} − E x_t x_t)] = lim_{(h,h′)→(0,0)} E∆(h, h′)

exists for all sequences {(h, h′)} converging to (0, 0), where

∆(h, h′) ≜ (hh′)^{-1}(x_{t+h} − x_t)(x_{t+h′} − x_t).

Hence

lim_{(h,h′)→(0,0)} E[∆(h, h) + ∆(h′, h′) − 2∆(h, h′)]
 = lim_{h→0} E∆(h, h) + lim_{h′→0} E∆(h′, h′) − lim_{(h,h′)→(0,0)} 2E∆(h, h′) = 0.

However, the left-most expression above is equal to E((1/h)(x_{t+h} − x_t) − (1/h′)(x_{t+h′} − x_t))^2 (reader check), and hence the proven convergence to zero establishes that the differences {h^{-1}(x_{t+h} − x_t); h ∈ R} form an L^2 Cauchy family as h → 0. This establishes the existence of the q.m. derivative of the process x for t ∈ R.

(b) If the process x is q.m. differentiable, the approximating differences form an L^2 Cauchy family as h → 0. Hence the approximating differences for the second partial differentials in either order exist, as required.

For processes which are q.m. differentiable we evidently have

E((d/dt)x_t)^2 = ∂²R(t, s)/∂s∂t |_{t=s},  ∀t ∈ R_+.

Example 5.6

A Wiener process is q.m. continuous at any t ∈ R_+; this fact is implied by the joint continuity of the covariance function min(t, s)I, t, s ∈ R_+. However, the second partial derivatives of the covariance function of a Wiener process do not exist, and this implies that it is not q.m. differentiable. This, of course, is also evident from a direct calculation involving approximating differences of the Wiener process.
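The direct calculation is immediate from the covariance alone: E(x_{t+h} − x_t)^2 = min(t+h, t+h) − 2 min(t+h, t) + min(t, t) = h, so the variance of the difference quotient is 1/h, which diverges as h → 0. A minimal sketch (function name illustrative):

```python
def quotient_variance(t, h):
    # E[((x_{t+h} - x_t)/h)^2] expanded via R(t, s) = min(t, s)
    R = min
    return (R(t + h, t + h) - 2 * R(t + h, t) + R(t, t)) / h ** 2

# The variance grows like 1/h, confirming non-differentiability in q.m.
vals = [quotient_variance(1.0, h) for h in (0.1, 0.01, 0.001)]
```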


Spectral Theory for WSS Continuous Time Stochastic Processes

Let x be an R^n valued zero mean q.m. continuous wss stochastic process with matrix covariance function R(τ), τ ∈ R^1. Assume ∫_{−∞}^{∞} ‖R(τ)‖ dτ < ∞.

Since x is q.m. continuous, R(·) is continuous and so R(·) ∈ L^1 ∩ C^0; hence we may define the spectral density matrix of x via

Φ(ω) ≜ ∫_{−∞}^{∞} e^{−2πiωt} R(t) dt,  ω ∈ R.

When iω replaces e^{iθ}, it can be seen that the three conjugate transpose, conjugate, and positivity properties of a spectral density matrix hold for {Φ(ω); ω ∈ R} as given in Definition 5.3 for (discrete time) spectral density matrices; namely, Φ(·) is a positive complex Hermitian matrix.

Since R(·) ∈ L^1 ∩ C^0, it may be verified that Φ(·) ∈ L^1 ∩ C^0, and so the inversion formula

E x_{t+τ} x_t = R(τ) = ∫_{−∞}^{∞} e^{2πiωτ} Φ(ω) dω,  τ ∈ R,

holds. Next, in analogy with the discrete time case, we may consider the action of a (not necessarily non-anticipative) linear system with L^2 matrix impulse response {H(τ); τ ∈ R} on a wss stochastic process satisfying the hypotheses of this section.

In the time domain we may define the output process y by

y_t = ∫_{−∞}^{∞} H(τ) x(t − τ) dτ,  t ∈ R.

Since H(·) ∈ L^2, the transfer function

H(ω) ≜ ∫_{−∞}^{∞} e^{−2πiωt} H(t) dt,  ω ∈ R,

also lies in L^2. Finally this gives the fundamental relation


R_y(τ) = E y_{t+τ} y_t
 = E(∫_{−∞}^{∞} H(p) x(t + τ − p) dp)(∫_{−∞}^{∞} H(q) x(t − q) dq)
 = ∫_{−∞}^{∞} e^{2πiωτ} H(ω) Φ_x(ω) H(ω)^* dω,  τ ∈ R.

The formula above immediately yields the continuous time version of the Wiener-Khinchin formula for the class of processes under consideration; namely

Φ_y(ω) = H(ω) Φ_x(ω) H(ω)^*,  ω ∈ R,

where ^* denotes the conjugate transpose.

Orthogonal Representations: The Karhunen-Loeve Expansion

Again in this section we assume all processes to have zero mean. Let x be a q.m. continuous second order (not necessarily stationary) stochastic process taking values in the Hilbert space H. For convenience we take x to be a scalar process. We seek a set of random second order time independent basis functions for H, say {ψ_n(ω); n ∈ Z_+}, such that a representation of the form

x_t(ω) = Σ_{n=0}^{∞} α_{n,t} ψ_n(ω),  t ∈ R_+,    (22)

holds for some set of time dependent coefficients {α_{n,t}; n ∈ Z_+} which are non-random. Moreover, we seek an orthonormal (o.n.) complex valued family {Z_n(ω); n ∈ Z_+} such that

E|Z_n(ω)|^2 = 1,  n ∈ Z_+,

and

⟨Z_n, Z_m⟩ ≜ E Z_n(ω) Z_m(ω) = 0,  n ≠ m,  n, m ∈ Z_+.

Consider the non-random set of time functions

σ_n(t) ≜ ⟨x_t, Z_n⟩ = E x_t(ω) Z_n(ω),  n ∈ Z_+,


where it is assumed that {Z_n; n ∈ Z_+} is an o.n. family. Then

0 ≤ E|x_t − Σ_{n=0}^{∞} ⟨x_t, Z_n⟩ Z_n|^2
  = E|x_t|^2 − 2 Σ_{n=0}^{∞} |⟨x_t, Z_n⟩|^2 + Σ_{n=0}^{∞} |⟨x_t, Z_n⟩|^2
  = E|x_t|^2 − Σ_{n=0}^{∞} |⟨x_t, Z_n⟩|^2.

Consider the Hilbert space H and assume that for all x_t ∈ H

‖x_t‖^2 = ⟨x_t, x_t⟩ = Σ_{n=0}^{∞} |⟨x_t, Z_n⟩|^2;

then we say {Z_n; n ∈ Z_+} is a complete o.n. (c.o.n.) family for the space H.

In this case we necessarily have

x_t(ω) = q.m. lim_{N→∞} Σ_{n=0}^{N} σ_n(t) Z_n(ω) ≜ Σ_{n=0}^{∞} σ_n(t) Z_n(ω).

The functions {σ_n(t); t ∈ R} may be verified to be continuous because x is q.m. continuous. (Reader check.)

Assume x_t is not contained in any proper subspace of the span of {Z_n}_{n=0}^{∞}. Then the functions σ_n(·) are linearly independent; otherwise

Σ_{n=0}^{N} a_n σ_n(t) = 0, for some {a_n}_0^N,

and hence

0 = Σ_{n=0}^{N} a_n ⟨x_t, Z_n⟩ = ⟨x_t, Σ_{n=0}^{N} a_n Z_n⟩,

which implies

x_t ⊥ Σ_{n=0}^{N} a_n Z_n  while  x_t = Σ_{n=0}^{∞} σ_n(t) Z_n.

But this contradicts the proper subspace condition.

Separable Covariance Function

Suppose x_t = Σ_{n=0}^{∞} σ_n(t) Z_n, t ∈ R, where {Z_n; n ∈ Z_+} is a c.o.n. family. Then

R(t, s) = E x_t x_s = E[Σ_{n=0}^{∞} σ_n(t) Z_n][Σ_{m=0}^{∞} σ_m(s) Z_m]
        = Σ_{n=0}^{∞} σ_n(t) σ_n(s),  ∀t, s ∈ R.    (23)

A covariance function of the form (23) is called separable.

Mercer's Theorem [ ] states that a continuous positive kernel R(s, t), s, t ∈ R, on L^2 × L^2 possesses a complete orthonormal family of eigenfunctions with respect to which it has an expansion of the form (23) which converges uniformly over compact sets of the form {−M ≤ s, t ≤ M; s, t ∈ R}. Hence, in particular, continuous covariance functions are separable.

Clearly, if

R(s, t) = Σ_{n=−∞}^{∞} σ_n(s) σ_n(t),  s, t ∈ R,    (24)

with ∫_{−∞}^{∞} σ_n(s) σ_m(s) ds = λ_n δ_{nm}, n, m ∈ Z, then

∫_{−∞}^{∞} R(s, t) σ_n(t) dt = ∫_{−∞}^{∞} (Σ_k σ_k(s) σ_k(t)) σ_n(t) dt
                             = Σ_k σ_k(s) λ_n δ_{k,n}
                             = λ_n σ_n(s),  ∀s ∈ R.

Theorem 5.7 Karhunen-Loeve

Let x be a q.m. continuous second order process with covariance function R(t, s), t, s ∈ R.

(a) Let {ψ_n; n ∈ Z_+} be the set of orthonormal eigenfunctions of R(·, ·) such that

∫_{−∞}^{∞} R(t, s) ψ_n(s) ds = λ_n ψ_n(t),  ∀t ∈ R,


i.e. for which {λ_n; n ∈ Z_+} is the set of eigenvalues. Further let

b_n(ω) = λ_n^{−1/2} ∫_{−∞}^{∞} x(ω, t) ψ_n(t) dt,

for which, necessarily,

E b_n b_m = δ_{m,n},  m, n ∈ Z_+.

Then

x(ω, t) = q.m. lim_{N→∞} Σ_{n=0}^{N} √λ_n ψ_n(t) b_n(ω)    (25)

uniformly on compact intervals.

(b) Conversely, if x(ω, t) has an expansion of the form (25) with

∫_{−∞}^{∞} ψ_m(t) ψ_n(t) dt = δ_{mn} = E b_m b_n,  m, n ∈ Z_+,

then {ψ_n; n ∈ Z_+} and the associated {λ_n; n ∈ Z_+} are the eigenfunctions and eigenvalues, respectively, of R(·, ·).

Note that for real processes x, ψ and λ are real.

Proof

(a) From the hypotheses of the theorem and the definition of the terms, we see that {b_n; n ∈ Z_+} is a c.o.n. family for the Hilbert space H spanned by x. Consequently, the result follows directly from

E|x_t − Σ_{n=0}^{N} √λ_n ψ_n(t) b_n|^2 = R(t, t) − Σ_{n=0}^{N} λ_n |ψ_n(t)|^2 → 0

as N → ∞, where the latter convergence follows from Mercer's Theorem.


(b) If x_t = Σ_{n=0}^{∞} √λ_n ψ_n(t) b_n(ω), then

R(t, s) = E x_t x_s = Σ_{n=0}^{∞} λ_n ψ_n(t) ψ_n(s).

Hence

∫_{−∞}^{∞} R(t, s) ψ_m(s) ds = ∫_{−∞}^{∞} Σ_{n=0}^{∞} λ_n ψ_n(t) ψ_n(s) ψ_m(s) ds
                             = λ_m ψ_m(t),  ∀m ∈ Z_+, ∀t ∈ R.

Example 5.7 (Wong [1971, p.87])

Let R(t, s) = min(t, s), t, s ∈ R; in other words, R(·, ·) is the covariance of the Wiener process. Consider

∫_0^T min(t, s) ψ(s) ds = λ ψ(t),  0 ≤ t ≤ T,

or, equivalently,

∫_0^t s ψ(s) ds + t ∫_t^T ψ(s) ds = λ ψ(t),  0 ≤ t ≤ T.

Differentiating with respect to t gives

t ψ(t) − t ψ(t) + ∫_t^T ψ(s) ds = λ (d/dt)ψ(t),

and so, differentiating once more,

ψ̈(t) = −(1/λ) ψ(t),  λ ≠ 0,

with ψ(0) = 0, (d/dt)ψ(T) = 0. This gives

ψ(t) = A sin(t/√λ),  with cos(T/√λ) = 0,  implying √λ = 2T/((2n + 1)π),  n ∈ Z_+,

where, since sin(−x) = −sin x, we do not need to consider the negative integers n ∈ Z in order to find additional eigenfunctions.
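The eigenvalue formula √λ = 2T/((2n+1)π) can be checked by discretizing the integral operator with kernel min(t, s); the following Nyström-type sketch (midpoint quadrature and the grid size are assumptions of the sketch, not from the notes) recovers the largest eigenvalue λ_0 = 4T²/π² for T = 1.

```python
import numpy as np

T, n = 1.0, 400
t = (np.arange(n) + 0.5) * (T / n)       # midpoint grid on [0, T]
# Discretized operator: kernel values times the quadrature weight T/n.
K = np.minimum.outer(t, t) * (T / n)     # symmetric, so eigvalsh applies
lam0 = np.linalg.eigvalsh(K).max()       # should approximate 4 T^2 / pi^2
```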

So, on Z_+, the set of normalized eigenfunctions is

ψ_n(t) = (2/T)^{1/2} sin(((2n + 1)/2)(πt/T)),  n ∈ Z_+,

and hence the process x with the specified covariance R(t, s) satisfies

x_t = q.m. lim_{N→∞} Σ_{n=0}^{N} (2T/((2n + 1)π)) (2/T)^{1/2} sin(((2n + 1)/2)(πt/T)) b_n(ω),

where

b_n(ω) = (2T/((2n + 1)π))^{-1} ∫_0^T (2/T)^{1/2} sin(((2n + 1)/2)(πt/T)) x(ω, t) dt

in q.m. Incidentally, as observed by Wong, Mercer's Theorem in this case gives the expansion

min(t, s) = (2/T) Σ_{n=0}^{∞} (T^2/(π^2 (n + 1/2)^2)) sin((n + 1/2)(πt/T)) sin((n + 1/2)(πs/T)),

where the convergence above is uniform on [0, T]^2.
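The convergence of the Mercer expansion can be probed pointwise; the following sketch (the evaluation point and truncation length are arbitrary illustrative choices) sums the series at one (t, s) pair and compares it with min(t, s).

```python
import numpy as np

T, t, s = 1.0, 0.3, 0.7
n = np.arange(5000)  # truncation; the tail is O(1/N)
coeff = (2.0 / T) * T ** 2 / (np.pi ** 2 * (n + 0.5) ** 2)
approx = np.sum(coeff
                * np.sin((n + 0.5) * np.pi * t / T)
                * np.sin((n + 0.5) * np.pi * s / T))
# approx should be close to min(t, s) = 0.3
```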

Wiener-Kolmogorov Filtering

Figure 1: x → A(e^{iθ}) → y