Chapter 4 Discrete Time Stationary Processes (McGill CIM, peterc/04lectures510/04four.pdf)
Chapter 4
Discrete Time Stationary Processes
Stationary Discrete Time LS3 Processes
Consider the discrete time system
$$\text{LS3}: \quad x_{k+1} = Ax_k + Cw_k, \quad k \in \mathbb{Z}_+, \qquad (1)$$
where $x_0 \sim N(0,\Sigma)$, $w$ i.i.d., $w_k \sim N(0,W)$, $x_0 \amalg w$, and let
$$R_{k,j} \triangleq E\,x_k x_j^T, \quad k, j \in \mathbb{Z}_+.$$
Then $R_{k,k}$ satisfies the discrete time time-varying Lyapunov equation:
$$R_{k+1,k+1} = E(Ax_k + Cw_k)(Ax_k + Cw_k)^T = A R_{k,k} A^T + CWC^T. \qquad (2)$$
Let $A$ be asymptotically stable, i.e. let $|\lambda_i(A)| \le \rho < 1$, $1 \le i \le n$. Since (2) gives
$$R_{k,k} = \sum_{i=0}^{k-1} A^{k-i-1} CWC^T (A^{k-i-1})^T + A^k \Sigma (A^k)^T, \quad k \in \mathbb{Z}_1, \qquad (3)$$
we may show that $R_{k,k}$, $k \in \mathbb{Z}_+$, converges to an easily computed limit in the following way.
We recall that the spectral norm of a matrix $M$ is defined by $\|M\|_s \triangleq \sup_{\|x\|=1}\|Mx\|$ and that the spectral and Euclidean norms of matrices of the same dimensions are compatible. For the matrix $A$ this implies $\sup_{\|x\|=1}\|A^k x\| \le \gamma\rho^k$, $k \in \mathbb{Z}_1$, for some $\gamma > 0$. Hence we have
$$\|A^k \Sigma (A^k)^T\|_s \le \|A^k\|_s\,\|(A^k)^T\|_s\,\|\Sigma\|_s \le \gamma_A \cdot \gamma_{A^T} \cdot \rho^{2k}\,\|\Sigma\|_s \to 0,$$
for any $\Sigma$ as $k \to \infty$, and so the initial condition effect given by the second term in (3) decays to zero as $k \to \infty$. (Note that $\gamma_A = \gamma_{A^T}$, since $\|A\|_s = \|A^T\|_s$ (reader check).) The
geometric decay of the terms in (3) shows that {Rk,k, k ∈ Z+} forms a Cauchy sequence and
hence converges. Further, we may in fact deduce that the following monotonic convergence
of positive covariance matrices takes place:
$$0 \le \sum_{i=0}^{k-1} A^{k-i-1} CWC^T (A^{k-i-1})^T \triangleq R_0^{k-1} \uparrow \sum_{k=0}^{\infty} A^k CWC^T (A^k)^T < \infty, \qquad (4)$$
as $k \to \infty$. This is because, first,
$$0 \le R_0^{k-1} \le R_0^{k-1} + A^k CWC^T (A^k)^T = R_0^k, \quad \forall k \in \mathbb{Z}_1,$$
shows that for any $x \in \mathbb{R}^n$ the terms $x^T R_0^k x$ constitute a sequence of positive numbers increasing with respect to $k$, which, furthermore, is bounded since
$$\Big\|\sum_{i=0}^{k-1} A^{k-i-1} CWC^T (A^{k-i-1})^T\Big\|_s \le \frac{\|CWC^T\|_s\,\gamma_A^2}{1-\rho^2} < \infty.$$
And, second,
$$\lim_{k\to\infty} R_{k,k} = \sum_{k=0}^{\infty} A^k CWC^T (A^k)^T \triangleq R_\infty < \infty \qquad (5)$$
is the limiting matrix since, for all $x, y$ (and hence for all $e_i, e_j$, $1 \le i, j \le n$),
$$2x^T R_{k,k}\, y = (x+y)^T R_{k,k}(x+y) - x^T R_{k,k}\, x - y^T R_{k,k}\, y,$$
and each of the three terms on the right hand side converges since each is a bounded increasing sequence of real numbers. From (5) and the completeness of the space $H^w$, it follows
that $x^\infty_{k+1} \triangleq \sum_{\tau=0}^{\infty} A^\tau C w_{k-\tau}$, $k \in \mathbb{Z}$, is a well defined zero mean finite variance random variable with covariance $R_\infty$. Further, for each $k \in \mathbb{Z}$, let a linear stochastic system LS3 with $x_{-N,-N} \sim N(0,\Sigma_0)$ and $x_{-N,-N} \amalg w^\infty_{-N}$ for each $N \in \mathbb{Z}_+$ generate the sequence
$$x_{k+1,-N} = A^{N+k+1} x_{-N,-N} + \sum_{\tau=0}^{N+k} A^\tau C w_{k-\tau}, \quad N \in \mathbb{Z}_+.$$
Then it may be verified via the Cauchy sequence criterion that for each k ∈ Z the mean
square limit x∞k of the sequence {xk,−N ;N ∈ Z+} as N →∞ exists as a well defined element
of Hw.
Next, if we initialize an LS3 process x in the state distribution x0 ∼ N(0, R∞), and
assume the standard LS3 conditions hold for x0, w, we obtain
$$\begin{aligned}
E\,x_1 x_1^T &= E(Ax_0 + Cw_0)(Ax_0 + Cw_0)^T \\
&= A R_\infty A^T + CWC^T \\
&= A\Big(\sum_{i=0}^{\infty} A^i CWC^T (A^i)^T\Big)A^T + CWC^T \\
&= \sum_{i=1}^{\infty} A^i CWC^T (A^i)^T + CWC^T \\
&= \sum_{i=0}^{\infty} A^i CWC^T (A^i)^T = R_\infty.
\end{aligned}$$
So we see that the covariance Rk,k is shift invariant since ExkxTk = R∞ for each k ∈ Z+.
Now $R_\infty$ is seen above to satisfy $R_\infty = A R_\infty A^T + CWC^T$. Further,
$$R_{k+\tau,k} = E\Big(A^\tau x_k + \sum_{i=0}^{\tau-1} A^{\tau-i-1} C w_{i+k}\Big)x_k^T = A^\tau R_\infty, \quad \tau \in \mathbb{Z}_1,\ k, k+\tau \in \mathbb{Z}_+,$$
by $x_k \amalg w^\infty_k$, and so also
$$R_{k-\tau,k} = \big(E\,x_k x_{k-\tau}^T\big)^T = R_\infty (A^\tau)^T, \quad \tau \in \mathbb{Z}_1,\ k-\tau \in \mathbb{Z}_+.$$
Finally, assume $\min(k+\rho,\ \ell+\rho) \ge 0$ with $k \ge \ell$. Then
$$\begin{aligned}
R_{k+\rho,\ell+\rho} &= E\,x_{k+\rho} x_{\ell+\rho}^T \\
&= E\Big(A^{k-\ell} x_{\ell+\rho} + \sum_{i=0}^{k-\ell-1} A^{k-\ell-1-i} C w_{i+\ell+\rho}\Big)x_{\ell+\rho}^T, \quad k+\rho \ge \ell+\rho,\ k,\ell,\rho \in \mathbb{Z}_+ \\
&= A^{k-\ell}\, E\,x_{\ell+\rho} x_{\ell+\rho}^T \\
&= A^{k-\ell}\, E\,x_\ell x_\ell^T \quad \text{(by $R_{k,k}$ shift invariance)} \\
&= R_{k,\ell} = A^{k-\ell} R_\infty,
\end{aligned}$$
and analogously for ` > k. Hence the state covariance Rk,`, k, ` ∈ Z+ is shift invariant. Since
the process x has zero mean and is Gaussian it follows that x is strictly stationary. We
summarize the facts above as follows.
Theorem 4.1
If the LS3 system (1) is such that $\max_{1\le i\le n} |\lambda_i(A)| < 1$ (i.e. $A$ is asymptotically stable) then:
(i) There exists an invariant distribution N(0, R∞) for the system state satisfying
R∞ = AR∞AT + CWCT . (6)
(ii) If the system is given the random initial condition $x^\infty_k \sim N(0, R_\infty)$, at any $k \in \mathbb{Z}$, then the generated process $\{x^\infty_j;\ j \ge k\}$ is strictly stationary Gaussian with covariance function $R^\infty_{k+\tau,k} = A^\tau R_\infty$, or $R^\infty_{k,k+\tau} = R_\infty (A^\tau)^T$, for $\tau \ge 0$.
(iii) x∞ in (ii) can be generated as the mean square limit, as N →∞, of {xk,−N ; k ≥ −N}
generated by the LS3 (1) with initial state distribution N(0,Σ), for all N , and for
which the standard Gaussian assumptions for an LS3 are satisfied with x−N,−N∐w∞−N
for all N .
(iv) The Gaussian finite dimensional distributions with zero mean and covariance functions
R∞k,j, for all j, k ∈ Z, form a compatible family to which there corresponds a strictly
stationary process x∞ on (−∞,∞) which is also the mean square limit of {xk,−N ; k ≥
−N}, as N →∞.
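As a concrete check of Theorem 4.1, the following sketch (with a hypothetical $2\times 2$ stable $A$ and scalar noise, not taken from the notes) iterates the Lyapunov recursion (2) from an arbitrary $\Sigma$ and compares the limit with a partial sum of the series (4) and with the fixed point equation (6):

```python
# Numerical sketch of Theorem 4.1(i): a hypothetical 2x2 asymptotically
# stable A; R_infinity solves R = A R A^T + C W C^T and is the limit of
# the Lyapunov recursion R_{k+1,k+1} = A R_{k,k} A^T + C W C^T.
import numpy as np

A = np.array([[0.5, 0.2], [0.0, 0.3]])   # |lambda_i(A)| < 1
C = np.array([[1.0], [0.5]])
W = np.array([[2.0]])                     # w_k ~ N(0, W)
Q = C @ W @ C.T

# Iterate the recursion (2) from an arbitrary initial covariance Sigma;
# the initial-condition effect must decay, as shown after (3).
R = np.diag([5.0, 5.0])
for _ in range(200):
    R = A @ R @ A.T + Q

# Independent computation: partial sums of sum_k A^k Q (A^k)^T as in (4).
R_series = np.zeros((2, 2))
Ak = np.eye(2)
for _ in range(200):
    R_series += Ak @ Q @ Ak.T
    Ak = Ak @ A

assert np.allclose(R, R_series)                 # same limit R_infinity
assert np.allclose(R, A @ R @ A.T + Q)          # fixed point, as in (6)
```

Both computations agree because the geometric decay of $\|A^k\|_s$ makes the two tails negligible after a moderate number of terms.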
Wide Sense Stationary Stochastic Processes
Definition 4.1. A wide sense stationary (wss) discrete time stochastic process $x$ is a process such that $E\|x_0\|^2 < \infty$ and

(i) $\mu \triangleq E\,x_k = E\,x_0$, $\forall k \in \mathbb{Z}$,

(ii) $E\,x_{k+\tau} x_k^T = E\,x_\tau x_0^T$, $\forall k, \forall\tau \in \mathbb{Z}$.

Note that $E\|x_{k+\tau}x_k^T\| \le (E\|x_{k+\tau}\|^2)^{1/2}(E\|x_k\|^2)^{1/2}$ by the Cauchy-Schwarz inequality. So $E\|x_0\|^2 < \infty$ and (ii) taken at any $k$, with $\tau = 0$, gives (via trace) $E\|x_k\|^2 = E\|x_0\|^2 < \infty$. Hence $E\|x_{k+\tau}x_k^T\| < \infty$, $k,\tau \in \mathbb{Z}$, and so (ii) is meaningful for all $k, \tau$.
Henceforth, unless otherwise stated, all wide sense stationary stochastic processes will be
taken to have zero mean.
Given $R_\tau \triangleq E\,x_\tau x_0^T$, $\tau \in \mathbb{Z}$, for a wide sense stationary stochastic process $x$, define the Fourier transform matrix $\Phi_x(e^{i\theta})$ of the sequence $\{R_\tau\}_{-\infty}^{\infty}$ by
$$\Phi_x(e^{i\theta}) = \sum_{\tau=-\infty}^{\infty} R_\tau e^{i\tau\theta}, \quad \theta \in [0, 2\pi],$$
whenever $\sum_{\tau=-\infty}^{\infty} \|R_\tau\| < \infty$.
From the definition,
$$\|\Phi_x(e^{i\theta})\| \le \sum_{\tau=-\infty}^{\infty} \|R_\tau\| < \infty, \quad \theta \in [0, 2\pi],$$
and so the Fourier transform exists; the Fourier inversion formula then gives
$$\frac{1}{2\pi}\int_0^{2\pi} e^{-ij\theta}\,\Phi_x(e^{i\theta})\,d\theta = \sum_{\tau=-\infty}^{\infty} \frac{R_\tau}{2\pi}\int_0^{2\pi} e^{-ij\theta} e^{i\tau\theta}\,d\theta = R_j, \quad j \in \mathbb{Z}.$$
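The inversion formula can be checked numerically. The sketch below uses a hypothetical scalar covariance sequence $R_\tau = (1/2)^{|\tau|}$ (not taken from the notes) and recovers $R_j$ from $\Phi_x$ by a Riemann sum over a frequency grid:

```python
import numpy as np

# Scalar wss example with summable covariances: hypothetical
# R_tau = (1/2)^|tau|.  Build Phi_x on a frequency grid and recover
# R_j via (1/2pi) int_0^{2pi} e^{-ij theta} Phi_x(e^{i theta}) d theta.
T, M = 30, 128
taus = np.arange(-T, T + 1)
R = 0.5 ** np.abs(taus)

thetas = 2 * np.pi * np.arange(M) / M
Phi = np.array([np.sum(R * np.exp(1j * taus * th)) for th in thetas])

for j in range(-3, 4):
    # Riemann sum approximating the inversion integral; exact here up
    # to float error, since the grid exponentials are orthogonal.
    Rj = np.mean(np.exp(-1j * j * thetas) * Phi)
    assert abs(Rj - 0.5 ** abs(j)) < 1e-10
```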
Definition 4.2.

(a) The function $\Phi_x$ defined in terms of the covariance sequence $\{R_\tau;\ \tau \in \mathbb{Z}\}$ of a wide sense stationary stochastic process $x$ is called the spectral density (matrix) of $x$.

(b) A (real coefficient) spectral density matrix $\Phi$ is a complex matrix function such that

(i) $\Phi^T(e^{-i\theta}) = \Phi(e^{i\theta})$,

(ii) $\Phi(e^{i\theta}) = \overline{\Phi(e^{-i\theta})}$, $\forall\theta \in [0, 2\pi]$,

(iii) $\Phi(e^{i\theta}) \ge 0$,

(iv) $\Phi \in L_1[0, 2\pi]$.
We observe that in the scalar case, the defining properties (i) and (ii) of a spectral density imply that $\Phi(\cdot)$ is real and, furthermore, that (i) and (ii) evidently hold for the spectral density matrix $\Phi_x$ of a wide sense stationary process $x$.

Subject to $\sum_{k=-\infty}^{\infty} |k|\,\|R_k\| < \infty$, we may prove that (iii) holds for $\Phi_x$ as follows:
$$\begin{aligned}
\lambda^T \Phi_x(e^{i\theta})\lambda &= \sum_{\tau=-\infty}^{\infty} \lambda^T R_\tau \lambda\, e^{i\tau\theta}, \quad \forall\lambda \in \mathbb{C}^n,\ \forall\theta \in [0,2\pi] \\
&= \lim_{N\to\infty} \sum_{\tau=-2N}^{2N} E\,\lambda^T x_\tau x_0^T \lambda\, e^{i\tau\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\, E \sum_{\tau=-2N}^{2N} \big((2N+1)-|\tau|\big)\,\lambda^T x_\tau x_0^T \lambda\, e^{i\tau\theta} \quad \Big(\text{by } \sum_{k=-\infty}^{\infty}|k|\,\|R_k\| < \infty\Big) \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\, E\Bigg(\sum_{\tau=1}^{2N}\sum_{k=-N}^{N-\tau} + \sum_{\tau=0}\sum_{k=-N}^{N} + \sum_{\tau=-2N}^{-1}\sum_{k=-N-\tau}^{N}\Bigg)\lambda^T x_{\tau+k}\, e^{i(\tau+k)\theta}\, x_k^T \lambda\, e^{-ik\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\,\lambda^T E\Bigg\{\Big(\sum_{s=-N}^{N} x_s e^{is\theta}\Big)\Big(\sum_{s=-N}^{N} x_s^T e^{-is\theta}\Big)\Bigg\}\lambda \ \ge\ 0.
\end{aligned}$$
Example 4.1
The following are examples of spectral density matrices:
(a) $\Phi_1(e^{i\theta}) = \begin{pmatrix} 5 + e^{i\theta} + e^{-i\theta} & 1 + 3e^{-i\theta} \\ 1 + 3e^{i\theta} & 10 + 3e^{i\theta} + 3e^{-i\theta} \end{pmatrix}$,

(b) $\Phi_2(e^{i\theta}) = \begin{pmatrix} (2 + e^{i\theta} + e^{-i\theta}) + \dfrac{1}{\frac{5}{4} - \frac{1}{2}e^{i\theta} - \frac{1}{2}e^{-i\theta}} & \dfrac{1 + 3e^{-i\theta}}{1 - \frac{1}{2}e^{i\theta}} \\[2mm] \dfrac{1 + 3e^{i\theta}}{1 - \frac{1}{2}e^{-i\theta}} & 10 + 3e^{i\theta} + 3e^{-i\theta} \end{pmatrix}$.
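A quick numerical sketch, using the entries of $\Phi_1$ in (a), confirming that it is Hermitian and positive semidefinite at each frequency, i.e. properties (i)-(iii) of Definition 4.2:

```python
import numpy as np

# Check on a grid that Phi_1 of Example 4.1(a) is a spectral density:
# Hermitian at each theta, and positive semidefinite there.
def phi1(th):
    e, ec = np.exp(1j * th), np.exp(-1j * th)
    return np.array([[5 + e + ec,       1 + 3 * ec],
                     [1 + 3 * e,   10 + 3 * e + 3 * ec]])

for th in np.linspace(0, 2 * np.pi, 200):
    P = phi1(th)
    assert np.allclose(P, P.conj().T)         # Hermitian at theta
    assert np.linalg.eigvalsh(P).min() > 0    # Phi(e^{i theta}) >= 0
```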
Weak Spectral Representation Theory for Strictly Stationary Processes
We first construct for any given process x an analogue of the Fourier transform of the
process.
Consider a wss stochastic process $x$ taking values in $\mathbb{R}^1$; define
$$x_N(\theta) = \frac{1}{(2N+1)^{1/2}} \sum_{n=-N}^{N} x_n e^{in\theta}, \quad \theta \in [0, 2\pi], \qquad (7)$$
and assume there exists a random variable $x(\theta)$ whose distribution is the limit of the distributions of $x_N(\theta)$ as $N \to \infty$, i.e. at all points of continuity of the distribution $F_{x(\theta)}(\cdot)$, the distributions $F_{x_N(\theta)}(\cdot)$, $N \in \mathbb{Z}$, converge to the value of $F_{x(\theta)}(\cdot)$.
Let x be a strictly stationary stochastic process which has a summable covariance se-
quence and which, further, is φ-mixing such that the positive square roots of the φ-mixing
coefficients are also summable. (These conditions will hold in the wide sense stationary Gaussian LS3 case; reader check.) Then it is a standard result [Billingsley, 1968] (see e.g. [Caines, 1988, p. 805, Appendix I]) that the sequence $\{x_N(\theta);\ N \in \mathbb{Z}_+\}$ converges in distribution to a normally distributed limiting random variable. Moreover, it follows that the asymptotic normality property holds for vectors $[\ldots, x_N^T(\theta_{i-1}), x_N^T(\theta_i), \ldots]$ of any given length evaluated at any finite set of frequencies (with possibly complex conjugated components $x_N^T(\theta_j)$).
The result of this construction is that at any specified finite set of frequencies we have a compatible family of finite dimensional distributions; hence, by the Daniell-Kolmogorov Theorem, we obtain the spectral representation stochastic process $x^s \triangleq \{x(\theta);\ \theta \in [0, 2\pi]\}$. We note that via this construction the processes $x$ and $x^s$ are not defined on the same probability space, and the expectation operations $E$ which appear below must be interpreted accordingly.
Variance of $x^s$ at $\theta$

Assume that $\sum_{k=-\infty}^{\infty} |k|\,\|R_k\| < \infty$; then at each $\theta \in [0, 2\pi]$ we have the following calculation:
$$\begin{aligned}
E\,x(\theta)x^T(\theta) &= \lim_{N\to\infty} E\,x_N(\theta)x_N^T(\theta) \quad \text{(convergence in distribution and existence of second moments)} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\sum_{n=-N}^{N}\sum_{m=-N}^{N} E\,x_n e^{in\theta}\, x_m^T e^{-im\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\sum_{n=-N}^{N}\sum_{m=-N}^{N} R_{n-m}\, e^{i(n-m)\theta} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\Bigg(\sum_{\tau=1}^{2N}\sum_{k=-N}^{N-\tau} + \sum_{\tau=0}\sum_{k=-N}^{N} + \sum_{\tau=-2N}^{-1}\sum_{k=-N-\tau}^{N}\Bigg) R_\tau e^{i\tau\theta} \\
&= \lim_{N\to\infty} \sum_{\tau=-2N}^{2N}\Big(\frac{(2N+1)-|\tau|}{2N+1}\Big) R_\tau e^{i\tau\theta} \\
&= \sum_{k=-\infty}^{\infty} R_k e^{ik\theta} \quad \Big(\text{by } \sum_{k=-\infty}^{\infty}|k|\,\|R_k\| < \infty \text{ and Kronecker's Lemma}\Big) \\
&= \Phi_x(e^{i\theta}). \qquad (8)
\end{aligned}$$
So the spectral density at $\theta$ is equal to the variance of the spectral representation process $x^s$ at $\theta$; clearly it is a positive matrix, and we note that the proof above essentially reverses the sequence of steps in the original proof, following Definition 4.2, of the positivity of $\Phi_x(e^{i\theta})$.
Values of the process $x^s \triangleq \{x(\theta);\ \theta \in [0, 2\pi]\}$ at distinct frequencies in $[0, 2\pi]$ are orthogonal on $[0, 2\pi]$. This is shown by the following calculation involving the second moments of $x^s$, which are computed via a limiting operation on the second moments of the (converging) distribution of the product $(x_N(\theta)x_N(\psi))$ as $N \to \infty$.
$$\begin{aligned}
E\,x(\theta)x^T(\psi) &= \lim_{N\to\infty} E\,x_N(\theta)x_N^T(\psi) \quad \text{(convergence in distribution and existence of second moments)} \\
&= \lim_{N\to\infty} \frac{1}{2N+1}\sum_{n=-N}^{N}\sum_{m=-N}^{N} R_{n-m}\, e^{in\theta} e^{-im\psi}, \quad \theta, \psi \in [0, 2\pi] \\
&= \lim_{N\to\infty} \sum_{\tau=-2N}^{2N}\Bigg[\frac{1}{2N+1}\sum_{n=(-N+\tau)\vee(-N)}^{(N+\tau)\wedge N} e^{in(\theta-\psi)}\Bigg] R_\tau e^{i\tau\psi}, \quad (\tau \triangleq n-m), \qquad (9)
\end{aligned}$$
where the expression in square brackets above is interpreted in the distributional sense, that is to say,
$$E\,x(\theta)x^T(\psi) = \sum_{\tau=-\infty}^{\infty} \delta(\theta-\psi)\, R_\tau e^{i\tau\psi} = \begin{cases} 0, & \theta \ne \psi, \\ \Phi_x(e^{i\theta}), & \theta = \psi, \end{cases} \qquad (10)$$
where the integration against any L2 function of θ (which necessarily has an L2 convergent
Fourier series) of the expressions on the left or right hand side of the equality gives integrals
of the same value.
Consequently, in this sense, the values x(θ), x(ψ) of the spectral representation process
xs for θ 6= ψ are orthogonal and at θ = ψ the covariance of x(θ), θ ∈ [0, 2π], gives a spectral
density matrix.
Example 4.3 White Noise

Let $w$ be a scalar orthogonal white noise process, i.e. a process such that $E\,w_k = 0$, $E\,w_k w_\ell^T = \Sigma\,\delta(k-\ell)$, $k, \ell \in \mathbb{Z}$. Then $R_\tau = \Sigma\,\delta_\tau$, $\tau \in \mathbb{Z}$. We see that $\Phi_w(e^{i\theta}) = \sum_{\tau=-\infty}^{\infty} R_\tau e^{i\tau\theta} = \Sigma > 0$, for all $\theta$. By definition the approximating spectral process $w_N(\theta)$ is given by
$$w_N(\theta) = \frac{1}{\sqrt{2N+1}}\sum_{n=-N}^{N} w_n e^{in\theta}, \quad N \in \mathbb{Z}_+,\ \theta \in [0, 2\pi],$$
and this satisfies the exact relationship
$$\Phi_{w_N}(\theta) \triangleq E\,w_N(\theta)w_N^T(\theta) = \sum_{\tau=-2N}^{2N}\Big(\frac{(2N+1)-|\tau|}{2N+1}\Big) R_\tau e^{i\tau\theta} = \Sigma.$$

$w$ is called white noise because it has a flat spectrum resembling that of ideal white light.
Example 4.4

Let $w$ be as above and $x_n \triangleq w_n + \frac{1}{2}w_{n-1}$, $n \in \mathbb{Z}$. Then
$$R_\tau = \begin{cases} \frac{5}{4}\Sigma, & \tau = 0, \\ \frac{1}{2}\Sigma, & \tau = \pm 1, \\ 0, & \text{otherwise.} \end{cases}$$
Hence
$$\Phi_x(\theta) = \frac{\Sigma}{4}\big(2e^{i\theta} + 5 + 2e^{-i\theta}\big) = \frac{\Sigma}{4}\big(5 + 4\cos\theta\big).$$
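A one-line numerical check of this computation ($\Sigma = 2$ is an arbitrary choice, not from the notes):

```python
import numpy as np

# Example 4.4 check: for x_n = w_n + (1/2) w_{n-1} with E w_k^2 = Sigma,
# the covariances R_0 = (5/4) Sigma, R_{+-1} = (1/2) Sigma give
# Phi_x(theta) = (Sigma/4)(5 + 4 cos theta).
Sigma = 2.0
for th in np.linspace(0, 2 * np.pi, 100):
    phi = (5 / 4) * Sigma + (1 / 2) * Sigma * (np.exp(1j * th) + np.exp(-1j * th))
    assert abs(phi.imag) < 1e-12
    assert abs(phi.real - (Sigma / 4) * (5 + 4 * np.cos(th))) < 1e-12
```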
The Action of Linear Systems on Strictly Stationary Stochastic Processes
Let $x$ be an $\mathbb{R}^n$ valued wss stochastic process with spectral density matrix $\Phi_x$. Assume there exists $k' < \infty$ such that
$$\Phi_x(e^{i\theta}) < k'I \quad \text{for all } \theta \in [0, 2\pi].$$
Further let $\{A_k;\ k \ge 0\}$ be a sequence of real $m \times n$ matrices such that $A(e^{i\theta}) \triangleq \sum_{k=0}^{\infty} A_k e^{ik\theta}$ exists as an almost everywhere limit on $[0, 2\pi]$ satisfying $\|A(e^{i\theta})\| < k''$ for almost all $\theta \in [0, 2\pi]$, for some $k'' < \infty$. This implies the summability of the norms of the sequence $\{A_k A_k^T;\ k \ge 0\}$ and hence the square summability of the norms of both of the sequences $\{A_k;\ k \ge 0\}$ and $\{A_k^T;\ k \ge 0\}$.
Let
$$y_n = \operatorname{m.s.lim}_{N\to\infty}\sum_{k=0}^{N} A_k x_{n-k} = \sum_{k=0}^{\infty} A_k x_{n-k},$$
where the limit necessarily exists since the uniform bound on the spectral density of $x$ and the summability of the norms of the sequence $\{A_kA_k^T;\ k \ge 0\}$ imply the partial sums indexed by $N$ form a Cauchy sequence in $H^x$. (This is most easily seen by turning the double sum that arises in the calculation of $E(y_n^N - y_n^M)(y_n^N - y_n^M)^T$ into an integral of the corresponding Fourier transforms $A^{NM}(e^{i\theta})$, $\Phi_x(e^{i\theta})$, $(A^{NM})^T(e^{-i\theta})$.) Then a calculation yields the following distributions for the random process $y(\theta)$ constructed via the Daniell-Kolmogorov Theorem (where $\equiv$ denotes the equality of distribution of the random variables on either side of the symbol).
$$\begin{aligned}
y(\theta) &\equiv \operatorname{lim.dist.}_{N\to\infty} \frac{1}{(2N+1)^{1/2}}\sum_{k=-N}^{N} y_k e^{ik\theta} \\
&= \operatorname{lim.dist.}_{N,M\to\infty} \frac{1}{(2N+1)^{1/2}}\sum_{k=-N}^{N}\Big\{\sum_{j=0}^{M} A_j x_{k-j}\, e^{i(k-j)\theta} e^{ij\theta}\Big\} \\
&= \operatorname{lim.dist.}_{M\to\infty} \sum_{j=0}^{M} A_j e^{ij\theta}\Big\{\operatorname{lim.dist.}_{N\to\infty}\Big(\frac{1}{(2N+1)^{1/2}}\sum_{\tau=-N-j}^{N-j} x_\tau e^{i\tau\theta}\Big)\Big\} \equiv A(e^{i\theta})\,x(\theta).
\end{aligned}$$
Hence, using (8),
$$\Phi_y(e^{i\theta}) = E\,y(\theta)y^T(\theta) = A(e^{i\theta})\,E\,x(\theta)x^T(\theta)\,A^T(e^{-i\theta}) = A(e^{i\theta})\,\Phi_x(e^{i\theta})\,A^T(e^{-i\theta}), \qquad (11)$$
where the Euclidean norm of this matrix spectral density is bounded for almost all θ due
to the boundedness a.e. of the spectral density of x and the boundedness a.e. of ‖A(eiθ)‖.
(11) is called the Wiener-Khinchin formula. It may be established independently of the
use of the spectral representation processes by first adopting slightly strengthened hypotheses
and then using the time domain calculation in the subsection below.
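A small numerical sketch of the Wiener-Khinchin formula (11) for a hypothetical scalar example (an AR(1)-type covariance $R^x_\tau = (1/3)^{|\tau|}$ and a two-tap FIR impulse response, neither taken from the notes): the covariance of $y$, computed directly in the time domain from the definition of $y_n$, matches the inverse transform of $|A(e^{i\theta})|^2\,\Phi_x(e^{i\theta})$:

```python
import numpy as np

a = np.array([1.0, -1.0 / 3.0])          # hypothetical FIR impulse response
Rx = lambda t: (1.0 / 3.0) ** abs(t)     # hypothetical covariance of x

def Ry_time(tau):
    # E y_{k+tau} y_k expanded as a double sum over the two filter taps
    return sum(a[j] * Rx(tau - j + l) * a[l]
               for j in range(2) for l in range(2))

T, M = 40, 256
thetas = 2 * np.pi * np.arange(M) / M
Phix = sum(Rx(t) * np.exp(1j * t * thetas) for t in range(-T, T + 1))
A_freq = a[0] + a[1] * np.exp(1j * thetas)     # A(e^{i theta})
Phiy = np.abs(A_freq) ** 2 * Phix              # formula (11), scalar case

for tau in range(-2, 3):
    Ry_freq = np.mean(np.exp(-1j * tau * thetas) * Phiy).real
    assert abs(Ry_freq - Ry_time(tau)) < 1e-6
```

Incidentally, this particular filter whitens the chosen $x$: every $R^y_\tau$ with $\tau \ne 0$ comes out zero.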
The definition of $x^s$ may be taken as a way to introduce a completely heuristic analogue to the Wiener process associated with $x$ on any interval in the frequency domain. We shall denote this totally fictional process by $\int_{\theta_1}^{\theta_2} x(\theta)\sqrt{d\theta}$, $0 \le \theta_1 < \theta_2 \le 2\pi$, and observe that it has zero mean.

In terms of this notation, the spectral density formula (8) and the orthogonality formula (10) yield
$$E\Big(\int_{\theta_1}^{\theta_2} x(\theta)\sqrt{d\theta}\Big)\Big(\int_{\theta_1}^{\theta_2} x^T(\psi)\sqrt{d\psi}\Big) = \int_{\theta_1}^{\theta_2}\!\!\int_{\theta_1}^{\theta_2} E\,x(\theta)x^T(\psi)\,\sqrt{d\theta}\sqrt{d\psi} = \int_{\theta_1}^{\theta_2}\!\!\int_{\theta_1}^{\theta_2} \Phi_x(\theta)\,\delta(\theta-\psi)\,\sqrt{d\theta}\sqrt{d\psi} = \int_{\theta_1}^{\theta_2} \Phi_x(\theta)\,d\theta.$$
In particular this gives
$$E\Big(\int_0^{2\pi} x(\theta)\sqrt{d\theta}\Big)\Big(\int_0^{2\pi} x^T(\theta)\sqrt{d\theta}\Big) = \int_0^{2\pi} \Phi(\theta)\,d\theta = R_0 = E\,x_0 x_0^T.$$
(8) and (10) also yield the orthogonality property for the fictional process in the frequency domain expressed by
$$E\Big(\int_{\theta_1}^{\theta_2+\theta'} x(\theta)\sqrt{d\theta}\Big)\Big(\int_{\theta_2}^{\theta_3} x^T(\psi)\sqrt{d\psi}\Big) = \int_{\theta_2}^{\theta_2+\theta'} \Phi(\theta)\,d\theta, \quad \theta_1 < \theta_2 < \theta_2 + \theta' < \theta_3.$$
The heuristic calculations above show that the fictional measure x(θ)√dθ may be inter-
preted as corresponding to the stochastic measure dζ whose existence and properties are
described in Theorem 4.2 below.
The Action of Linear Systems on Wide Sense Stationary Stochastic Processes
Assume $\sum_{k=0}^{\infty} \|A_k\| < \infty$, which, we note, implies the (everywhere) boundedness of the corresponding matrix transfer function. Assume further that the covariance sequence of $x$ satisfies $\sum_{k=-\infty}^{\infty} \|R_k\| < \infty$; then the spectral density matrix $\Phi_x$ is defined by the Fourier transform of the covariance sequence for all $\theta$, and there exists $k' < \infty$ which bounds (everywhere) the norm of the spectral density matrix $\Phi_x(\cdot)$ of $x$.
Then the covariance function $\{R^y_\tau;\ \tau \in \mathbb{Z}\}$ of the process $y$ is given by
$$\begin{aligned}
R^y_\tau = E\,y_{k+\tau}y_k^T &= E\Big(\operatorname{m.s.lim}_{N\to\infty}\sum_{j=0}^{N} A_j x_{k+\tau-j}\Big)\Big(\operatorname{m.s.lim}_{N\to\infty}\sum_{\ell=0}^{N} A_\ell x_{k-\ell}\Big)^T \\
&= \lim_{N\to\infty} E\Big(\sum_{j=0}^{N} A_j x_{k+\tau-j}\Big)\Big(\sum_{\ell=0}^{N} A_\ell x_{k-\ell}\Big)^T \\
&= \sum_{\ell=0}^{\infty}\sum_{j=0}^{\infty} A_j R^x_{\tau-j+\ell} A_\ell^T, \qquad (12)
\end{aligned}$$
where the double sum converges because (i) the partial sums satisfy the Cauchy condition due to the summability of the $A$ sequence, and (ii) the (spectral) norm of any covariance matrix is bounded by the (spectral) norm of the zero shift covariance matrix.
Then we again obtain the Wiener-Khinchin formula via
$$\begin{aligned}
\Phi_y(e^{i\theta}) &= \sum_{\tau=-\infty}^{\infty} R^y_\tau e^{i\tau\theta}, \quad \theta \in [0, 2\pi] \\
&= \sum_{j=0}^{\infty}\sum_{\ell=0}^{\infty}\sum_{\tau=-\infty}^{\infty} A_j e^{ij\theta}\, R^x_{\tau-j+\ell}\, e^{i(\tau-j+\ell)\theta}\, A_\ell^T e^{-i\ell\theta} \\
&= A(e^{i\theta})\,\Phi_x(e^{i\theta})\,A^T(e^{-i\theta}),
\end{aligned}$$
where the triple sum above converges by the absolute summability of the individual series.
In the most general case, the summability of the impulse responses is not assumed (see LSS, Chapter 2), but only the uniform boundedness of the transforms of the impulse response and of the spectral density. Then the existence of the mean square limit defining the $y$ process, and of the double sums above giving the covariance of the $y$ process, follows from the summability of the norms of the sequence $\{A_kA_k^T;\ k \ge 0\}$ and the assumed uniform bound on $\Phi_x(e^{i\theta})$. This is because (in the case $\tau = 0$, for example) the partial sums indexed by $N$ form a Cauchy sequence in $H^x$. This in turn is most easily seen by turning the double sum that arises in the calculation of $E(y_n^N - y_n^M)(y_n^N - y_n^M)^T$ into an integral of the corresponding Fourier transforms $A^{NM}(e^{i\theta})$, $\Phi_x(e^{i\theta})$, $(A^{NM})^T(e^{-i\theta})$.
Definition 4.3 Complex Hermitian Matrices

A matrix function $Z : [0, 2\pi] \to \mathbb{C}^{n\times n}$ is a (complex) Hermitian matrix (with real coefficients) if
$$Z(e^{i\theta}) = \overline{Z}^T(e^{i\theta}) = Z^T(e^{-i\theta}), \quad \theta \in [0, 2\pi].$$
$Z$ is a positive (respectively strictly positive) (complex) Hermitian matrix if
$$\lambda^T Z(e^{i\theta})\lambda \ge 0\ (> 0 \text{ resp.}), \quad \forall\theta \in [0, 2\pi],\ \forall\lambda \in \mathbb{C}^n\ (\forall\lambda \ne 0, \text{ resp.}).$$
This is denoted by $Z \equiv Z(e^{i\theta}) \ge 0$ (resp. $> 0$).

Note that spectral density matrices $\Phi$ are positive Hermitian matrices, and so are the spectral distribution matrices, associated with a spectral density, given by
$$F(\theta) \equiv F(e^{i\theta}) \triangleq \int_0^{\theta} \Phi(e^{i\lambda})\,d\lambda, \quad \theta \in [0, 2\pi].$$
Such matrices satisfy the following definition.
Definition 4.4

A matrix distribution function $F$ is a bounded right continuous matrix valued function on $[0, 2\pi]$ such that $F(0) = 0$ and such that, for all $\lambda_2 \ge \lambda_1$, $\lambda_1, \lambda_2 \in [0, 2\pi]$, $F(e^{i\lambda_2}) - F(e^{i\lambda_1})$ is a positive complex Hermitian matrix.
Theorem 4.2 Existence of a Spectral Representation Process (Ref: LSS 1988)

Let $x$ be an $\mathbb{R}^n$ valued wss stochastic process with zero mean and covariance matrix sequence $\{R_\tau;\ \tau \in \mathbb{Z}\}$. Then there exists a $\mathbb{C}^n$ valued orthogonal increment process $\zeta$ defined on $[0, 2\pi]$ with associated stochastic measure $d\zeta$, such that
$$x_n = \int_0^{2\pi} e^{-in\theta}\,d\zeta(\theta), \quad n \in \mathbb{Z}, \quad \text{a.s.},$$
and there exists a matrix distribution function $F$ such that
$$R_\tau = E\,x_\tau x_0^T = E\Big(\int_0^{2\pi} e^{-i\tau\theta}\,d\zeta(\theta)\Big)\Big(\int_0^{2\pi} d\overline{\zeta}^T(\theta)\Big) = \frac{1}{2\pi}\int_0^{2\pi} e^{-i\tau\theta}\,dF(\theta), \quad \tau \in \mathbb{Z},$$
where
$$E\big(\zeta(\lambda) - \zeta(0)\big)\overline{\big(\zeta(\lambda) - \zeta(0)\big)}^T = \frac{1}{2\pi}\int_0^{\lambda} dF(\theta) = \frac{1}{2\pi}\big(F(\lambda) - F(0)\big), \quad \lambda \in [0, 2\pi].$$

Whenever a density $F'$ exists for the distribution $F$, it follows that
$$R_\tau = \frac{1}{2\pi}\int_0^{2\pi} e^{-i\tau\theta} F'(\theta)\,d\theta, \quad \tau \in \mathbb{Z},$$
and hence the Fourier series representation of $F'$ shows the a.e. equality of $F'$ and $\Phi_x(e^{i\cdot})$:
$$F'(\theta) = \sum_{\tau=-\infty}^{\infty} R_\tau e^{i\tau\theta} = \Phi_x(e^{i\theta}), \quad \theta \in [0, 2\pi].$$

Hence we see that intuitively $d\zeta_\theta$ "=" $x(\theta)(d\theta)^{1/2}$, so $d\zeta_\theta$ is the "stochastic density" defined before Theorem 4.2.
Using the spectral representation process $\zeta$ defined above, we may give a rigorous mathematical expression to the sequence of equalities above, which correspond to the transformation in Figure 1.

Under the given a.e. boundedness conditions on $A(e^{i\theta})$ and $\Phi_x(e^{i\theta})$,
$$y_n \triangleq \operatorname{m.s.lim}_{N\to\infty}\sum_{k=0}^{N} A_k x_{n-k}, \quad n \in \mathbb{Z},$$
is defined by the sequence of partial sums, which may be verified to form a Cauchy sequence in the Hilbert space spanned by the process $x$. By the calculation in (12), $y$ is a wide sense stationary stochastic process. Substituting the spectral representation of $x$ in the sum above yields the orthogonal increment spectral representation process $\zeta_y(\theta)$ satisfying
$$y_n = \int_0^{2\pi} e^{-in\theta}\,d\zeta_y(\theta), \quad n \in \mathbb{Z},$$
where
$$d\zeta_y(\theta) = A(e^{i\theta})\,d\zeta_x(\theta), \quad \theta \in [0, 2\pi],$$
i.e.
$$y_n = \int_0^{2\pi} e^{-in\theta} A(e^{i\theta})\,d\zeta_x(\theta), \quad n \in \mathbb{Z}.$$
The spectral distribution process of $y$ hence necessarily satisfies
$$E\Big(\int_{\lambda_1}^{\lambda_2} d\zeta_y(\theta)\Big)\Big(\int_{\lambda_1}^{\lambda_2} d\overline{\zeta}_y^T(\theta)\Big) = \int_{\lambda_1}^{\lambda_2} A(e^{i\theta})\,\Phi_x(e^{i\theta})\,A^T(e^{-i\theta})\,d\theta, \quad \lambda_1, \lambda_2 \in [0, 2\pi],$$
that is
$$F_y(\lambda_2) - F_y(\lambda_1) = \int_{\lambda_1}^{\lambda_2} A(e^{i\theta})\,\Phi_x(e^{i\theta})\,A^T(e^{-i\theta})\,d\theta,$$
and hence the Wiener-Khinchin formula
$$\Phi_y(e^{i\theta}) = A(e^{i\theta})\,\Phi_x(e^{i\theta})\,A^T(e^{-i\theta})$$
is satisfied. Consequently,
$$R^y_\tau = \frac{1}{2\pi}\int_0^{2\pi} e^{-i\tau\theta}\,A(e^{i\theta})\,\Phi_x(e^{i\theta})\,A^T(e^{-i\theta})\,d\theta, \quad \tau \in \mathbb{Z}.$$
Furthermore, given analogous hypotheses on a sequence of systems, the following result
holds.
Theorem 4.3 Concatenation Theorem [Caines, 1988]

Let the a.e. boundedness conditions hold on the system transfer functions of a sequence of conformable systems $A_1, A_2, \cdots, A_n$ which act (in that order) on a process $x$ with a.e. bounded spectral density matrix so as to generate a process $y^n$; then
$$d\zeta_{y^n}(\theta) = \prod_{j=0}^{n-1} A_{n-j}(e^{i\theta})\,d\zeta_x(\theta), \quad \theta \in [0, 2\pi], \quad \text{and} \quad \Phi_{y^n} = \Big(\prod_{j=0}^{n-1} A_{n-j}\Big)\,\Phi_x\,\overline{\Big(\prod_{j=0}^{n-1} A_{n-j}\Big)}^T.$$
Let the formal transforms of the Fourier coefficients of the transfer functions $A_1(e^{i\theta}), A_2(e^{i\theta}), \cdots, A_n(e^{i\theta})$ be denoted $A_1(z), A_2(z), \cdots, A_n(z)$; then the formal transform of the output process $y^n$ is given by the formal power-series equation
$$y^n(z) = \sum_{k=-\infty}^{\infty} y^n_k z^k = \prod_{j=0}^{n-1} A_{n-j}(z)\,x(z).$$
Example 4.3

Let
$$Z(e^{i\theta}) = \frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}, \quad |\beta| < 1,$$
and let $x$ be a wide sense stationary stochastic process with spectral density $\Big|\dfrac{1 + e^{i\theta}}{1 - \frac14 e^{i\theta}}\Big|^2$. Then the action of $Z$ on $x$ yields a wide sense stationary stochastic process $y$, where
$$d\zeta_y(\theta) = Z(e^{i\theta})\,d\zeta_x(\theta) = \frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}\,d\zeta_x(\theta),$$
and hence
$$y_n = \int_0^{2\pi} e^{-in\theta}\,\frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}\,d\zeta_x(\theta), \quad n \in \mathbb{Z};$$
with the formal $z$-transform representation of the action of $Z$ on $x$ being given by
$$y(z) = Z(z)\,x(z) = \frac{1 + \alpha z}{1 + \beta z}\,x(z).$$
Further,
$$\Phi_y(e^{i\theta}) = \Big|\frac{1 + \alpha e^{i\theta}}{1 + \beta e^{i\theta}}\Big|^2\,\Big|\frac{1 + e^{i\theta}}{1 - \frac14 e^{i\theta}}\Big|^2,$$
and
$$R^y_n = \frac{1}{2\pi}\int_0^{2\pi} e^{-in\theta}\,\Phi_y\,d\theta, \quad n \in \mathbb{Z}.$$
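A numerical sketch of this example with hypothetical parameter values $\alpha = 1/2$, $\beta = 1/3$ (the notes leave $\alpha, \beta$ symbolic), evaluating $\Phi_y$ on a grid and recovering covariances by the inversion formula:

```python
import numpy as np

# Example 4.3 with hypothetical alpha = 1/2, beta = 1/3:
# Phi_y = |Z|^2 Phi_x, and R^y_n follows from the inversion formula.
alpha, beta = 0.5, 1.0 / 3.0
M = 512
th = 2 * np.pi * np.arange(M) / M
e = np.exp(1j * th)

Z = (1 + alpha * e) / (1 + beta * e)
Phix = np.abs((1 + e) / (1 - e / 4)) ** 2
Phiy = np.abs(Z) ** 2 * Phix

# R^y_n via a Riemann sum for (1/2pi) int e^{-in theta} Phi_y d theta;
# the covariance of this real scalar process is symmetric in n.
Ry = lambda n: np.mean(np.exp(-1j * n * th) * Phiy).real
assert Ry(0) > 0
assert abs(Ry(3) - Ry(-3)) < 1e-9
```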
The Wiener-Khinchin Theorem and its recursive formulation in terms of the Concatenation Theorem provide a calculus for the second order properties of the operation of linear systems on second order processes. This is in direct analogy with the calculus which is obtained for deterministic discrete (respectively, continuous) time functions and systems (or, signals and systems) and their discrete (respectively, Laplace) transforms. We recall [ ] that in the latter case the relatively complex operations of iterated convolutions are replaced by pointwise operations with complex functions (or formal algebraic series; see below and [Caines, 1988]); clearly this is also the operational result obtained from the transform theory for second order processes described by the Concatenation Theorem.
ARMA Processes and General Formal Transforms
Autoregressive Moving Average (ARMA) processes are $\mathbb{R}^n$ valued (output) stochastic processes $y$ which are related to an $\mathbb{R}^m$ valued (input) orthogonal white noise process $w$ by a recursive scheme of the form
$$y_n + A_1 y_{n-1} + \cdots + A_p y_{n-p} = B_0 w_n + B_1 w_{n-1} + \cdots + B_q w_{n-q}, \qquad (13)$$
for some $p, q, n \in \mathbb{Z}_+$ and with initial conditions given at $n = 0$.
In case the input process is another second order stochastic process x, we say that x, y
are related by an ARMA system.
Let us take the formal z transform of the equation above on a positive semi-infinite in-
terval on which the values of the input and output processes (deterministic or stochastic)
are defined, that is to say, on time intervals of the form {k; k+M ∈ Z+} for some M ∈ Z+,
where, unless otherwise stated, we take M = 0, i.e. the interval Z+. Then we obtain the fol-
lowing equation in a possibly infinite series in positive powers of the algebraic indeterminate
(i.e. symbol) z and a finite series (i.e. a polynomial) in negative powers of z:
A(z)y(z) = B(z)w(z) + IC(z); (14)
here $A(z) = \sum_{k=0}^{p} A_k z^k$, and similarly for $B(z)$; $y(z) = \sum_{k=0}^{\infty} y_k z^k$, and similarly for $w(z)$; and $IC(z)$ codes the initial conditions in a polynomial in $z$ and $z^{-1}$ so that equality of the coefficients of all powers of $z$ holds in the equation.
We next generalize this construction by including the class of positive power series operators (p.p.s.o.); this is defined to be the set of infinite positive power series in the indeterminates $z$ and $z^{-1}$ of the form $A(z) = \sum_{k=-m}^{\infty} A_k z^k$, and similarly for $B(z)$. The formal positive power series (p.p.s.) inverse $A^{-1}(z)$ of $A(z) = \sum_{k=-m}^{\infty} A_k z^k$ is clearly uniquely defined (and recursively computable) in case $A_{-m}$ is non-singular. Consequently, only the positive power series inverses of positive power series are considered, and they exist when the non-singular $A_{-m}$ condition holds. (Often $m = 0$.) All inverse formal $z$-transforms will be taken as p.p.s. expansions since then (i) the coefficients of the (finite negative, infinite positive) power series in the equation
$$y(z) = A^{-1}(z)B(z)w(z) + A^{-1}(z)IC(z), \qquad (15)$$
are defined by finite operations on the coefficients of the constituent power series, and (ii) the resulting equations relating the members of the $y$, $w$ and $IC(z)$ sequences are exactly those given by (14).
The inverses of operators $M(z)$ appearing in such equations are obtained by computing the solution to the set of recursive equations given via the coefficients of the powers of $z$ in $M(z)N(z) = I$. When such a solution exists, and when, in addition, $M(z)$ is specified as a meromorphic function of $z$ (taken as a complex variable), then the p.p.s. expansion $N(z) = M(z)^{-1}$ obtained via the recursive equations is also given by the coefficients of the Laurent series (when it exists) converging in a sufficiently small annulus surrounding the origin in the complex plane. (Note that $z = 0$ is excluded from the domain of convergence when non-zero terms in powers of $z^{-1}$ are present.)
Example 4.5

A simple illustration of the notions introduced above is given by the positive power series equation (p.p.s.e.) (i.e. equation in p.p.s. series and operators)
$$(2z^{-1} + \exp(z))\,y(z) = (z^{-2} + z^2)\,w(z) + IC(z), \qquad (16)$$
where we take the irrational function $\exp(z)$ to denote the standard p.p.s. expansion of the exponential function (given by the analytic power series expansion around 0). The initial condition is chosen as $IC(z) = -w_0 z^{-2}$ in order that a solution with zero values $y_k = 0$, for $k = -1, -2, \ldots$, is defined for the input $w$, which is chosen such that $w_k = 0$ for $k = -1, -2, \ldots$.

From (16), the output sequence $\{y_j;\ j \in \mathbb{Z}_+\}$ may be determined from the input sequence $\{w_j;\ j \in \mathbb{Z}_+\}$ via the infinite recursive set of equations indexed by $\{z^{-1}, z^0, z^1, \ldots\}$. This set of equations is found by evaluating the formal positive power series expansion of each side of (16) and equating the coefficients of terms with equal powers. In this example, the set of equations has the form:
$$z^{-1}:\ \ 2y_0 = w_1,$$
$$z^{0}:\ \ 2y_1 + y_0 = w_2, \qquad (17)$$
$$\cdots\cdots\ =\ \cdots\cdots$$

By virtue of the definition of the inverse of a p.p.s. operator, the solution for the output power series $y(z)$ in terms of (i) the input power series $w(z)$, (ii) the initial conditions $IC(z)$, and (iii) the formal power series operators, is given by
$$\begin{aligned}
y(z) &= (2z^{-1} + \exp(z))^{-1}\big[(z^{-2} + z^2)w(z) + IC(z)\big] \\
&= z^{-1}\Big(\tfrac12 - \tfrac{z}{4} - \tfrac{z^2}{8}\cdots\Big)(1 + z^4)\,w(z) + z\Big(\tfrac12 - \tfrac{z}{4} - \tfrac{z^2}{8}\cdots\Big)IC(z), \qquad (18)
\end{aligned}$$
where $IC(z) = -w_0 z^{-2}$. Each of the equivalent schemes (17) and (18) above gives rise to the solution sequence beginning $y(z) = \tfrac12 w_1 + \big(\tfrac12 w_2 - \tfrac14 w_1\big)z + \cdots$.
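The recursive scheme (17) can be implemented directly. The sketch below equates coefficients of $z^n$ for $n \ge -1$ in (16), using the operator coefficients $c_{-1} = 2$ and $c_m = 1/m!$ for $m \ge 0$, and reproduces the solution sequence above (the input values are hypothetical):

```python
import math

# Solve the p.p.s.e. (16) recursively: equate coefficients of z^n for
# n >= -1 in (2 z^{-1} + exp(z)) y(z) = (z^{-2} + z^2) w(z) - w_0 z^{-2}.
# The operator coefficients are c_{-1} = 2 and c_m = 1/m! for m >= 0,
# so 2 y_{n+1} = rhs_n - sum_{k=0}^{n} y_k / (n-k)!.
def solve(w, n_out):
    rhs = lambda n: (w[n + 2] if 0 <= n + 2 < len(w) else 0.0) \
                  + (w[n - 2] if 0 <= n - 2 < len(w) else 0.0)
    y = []
    for n in range(-1, n_out - 1):
        conv = sum(y[k] / math.factorial(n - k) for k in range(n + 1))
        y.append((rhs(n) - conv) / 2.0)
    return y

w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]        # hypothetical input values
y = solve(w, 3)
assert abs(y[0] - w[1] / 2) < 1e-12               # 2 y_0 = w_1
assert abs(y[1] - (w[2] / 2 - w[1] / 4)) < 1e-12  # 2 y_1 + y_0 = w_2
```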
In the case of ARMA systems, just as for the general case, the positive power series expansions of the inverses of the denominators (scalar case), or left, or right, matrix inverse operators (matrix case), exist when the matrix (respectively, scalar coefficient) corresponding to the lowest power of $z$ is invertible (respectively, non-zero) (see [Caines, 1988], Appendix 2). A special feature of the ARMA case is the existence of (matrix) partial fraction expressions which facilitate the evaluation of the inverses of matrix polynomials. The simplicity of the p.p.s. expansions of terms of the form $(I + zA)^{-1}$, and the simplicity of the computation of the annuli of convergence (when $z$ is taken as a complex variable), are useful in solving rational transform system equations to obtain the generated processes.

So far, no issues of convergence (deterministic or stochastic) have arisen in this development since we have considered behaviour on intervals bounded below. However, for deterministic or stochastic behaviour on infinite intervals with no finite initial value, the impulse response (i.e. infinite MA representation) of the system must be given stability properties which fit the class of inputs.
Throughout we shall assume:
(i) For deterministic systems impulse responses will be taken to be in the class of sequences
whose normed elements are summable (i.e. l1 sequences) and inputs will be taken to have
elements whose norms are uniformly bounded, that is to say lie in l∞.
(ii) Stochastic systems shall be subject to hypotheses which make the impulse responses
square summable or the transfer functions a.e. bounded and the input processes shall have
a.e. bounded spectral densities.
In the ARMA case, when the zeroes of the equation $\det(A(z)) = 0$ lie strictly outside the closed unit disc in the complex plane, the p.p.s. expansion of the inverse operator $A(z)^{-1}$ has geometrically decaying coefficients. Hence, the formal positive power series equation (p.p.s.e.)
$$\sum_{k=-\infty}^{\infty} y_k z^k = y(z) = A^{-1}(z)C(z)\,x(z) = A^{-1}(z)C(z)\sum_{k=-\infty}^{\infty} x_k z^k, \qquad (19)$$
describes the action of the asymptotically stable ARMA system $A^{-1}(z)C(z)$ on the doubly infinite series in $z, z^{-1}$ whose coefficients are the values of the process $x$. Evidently, whenever $x$ is a wide sense stationary process, the scheme above corresponds to an infinite set of mean square convergent sums defining the value of the process $y$ at each instant. At this point the reader is recommended to consider a redescription of the operations in Example 4.3 in terms of formal power series in $z$.
Theorem 4.4 The Wold Decomposition

Let $x$ be a wide sense stationary $\mathbb{R}^p$ valued stochastic process with a full rank innovations process $\{e_n = x_n - (x_n \mid H^x_{n-1});\ n \in \mathbb{Z}\}$. Further assume that $x$ is purely linearly non-deterministic, i.e. $\cap_{n\in\mathbb{Z}} H^x_n = \{0\}$. Then
$$x_n = \sum_{k=0}^{\infty} A_k e_{n-k}, \quad n \in \mathbb{Z};$$
the innovations covariance matrix $\Sigma = E\,e_n e_n^T$ is invertible and the unique impulse response coefficients $A_k$, $k \in \mathbb{Z}_+$, are given by
$$E\,x_0 e_{-k}^T = A_k \Sigma, \quad k \in \mathbb{Z}_+, \quad A_0 = I.$$

For a proof of the Wold Decomposition Theorem see pages 23 through 33 of LSS.
Theorem 4.5 Spectral Factorization Theorem (LSS, pp 204-206)

Given a $(p \times p)$ spectral density matrix $\Phi(e^{i\cdot})$ such that $\Phi(e^{i\theta}) < kI < \infty$ and $\Phi^{-1}(e^{i\theta}) < kI < \infty$ for almost all $\theta \in [0, 2\pi]$, there exists a $(p \times p)$ matrix function $Z(z)$, $z \in \mathbb{C}$, such that, for some $\ell < \infty$,

(i) $\|Z(e^{i\theta})\| < \ell$, $\|Z^{-1}(e^{i\theta})\| < \ell$, for almost all $\theta \in [0, 2\pi]$,

(ii) $Z(z)$, $Z^{-1}(z)$ are analytic in $|z| < 1$,

(iii) $Z(e^{i\theta})\,Z^T(e^{-i\theta}) = \Phi(e^{i\theta})$ a.e. $\theta \in [0, 2\pi]$.

Such a matrix factor $Z$ is called a strong spectral factor and is unique up to right multiplication by constant orthogonal matrices when $\Phi$ and $\Phi^{-1}$ possess analytic extensions in a neighbourhood of $|z| = 1$.

If a matrix $Z$ satisfies the asymptotic stability conditions of (i)-(iii) but lacks the inverse asymptotic stability conditions, it is simply called a spectral factor.
Example 4.6 A strong spectral factorization of the spectral density matrices in Example 4.1 is given by
$$\Phi_1(e^{i\theta}) = \begin{pmatrix} (2+\sqrt3)^{1/2} + (2+\sqrt3)^{-1/2}e^{i\theta} & 1 \\ 0 & 3 + e^{i\theta} \end{pmatrix} \times \begin{pmatrix} (2+\sqrt3)^{1/2} + (2+\sqrt3)^{-1/2}e^{-i\theta} & 0 \\ 1 & 3 + e^{-i\theta} \end{pmatrix} \triangleq Z_1(e^{i\theta})\,Z_1^T(e^{-i\theta}),$$
where we note that $Z_1(z)$ is asymptotically stable since it is analytic in $|z| < 1$, and is asymptotically inverse stable since $Z_1^{-1}(z)$ has poles at $-(2+\sqrt3)$ and $-3$.

A spectral factorization of $\Phi_2(e^{i\theta})$ in that example is given by
$$\Phi_2(e^{i\theta}) = \begin{pmatrix} 1 + e^{i\theta} & \dfrac{1}{1 - \frac12 e^{i\theta}} \\[2mm] 0 & 3 + e^{i\theta} \end{pmatrix}\begin{pmatrix} 1 + e^{-i\theta} & 0 \\[2mm] \dfrac{1}{1 - \frac12 e^{-i\theta}} & 3 + e^{-i\theta} \end{pmatrix}.$$
In this case a strong factorization cannot exist because $\Phi_2(z)$ has zeros at $-1$.
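A numerical sketch checking the strong-factor properties of $Z_1$ above: the zeros of $\det Z_1(z)$ lie outside the closed unit disc (so $Z_1^{-1}$ is analytic in $|z| < 1$), and the product $Z_1(e^{i\theta})\,Z_1^T(e^{-i\theta})$ is positive Hermitian at every frequency:

```python
import numpy as np

a = np.sqrt(2 + np.sqrt(3))          # (2 + sqrt(3))^{1/2}

def Z1(z):
    return np.array([[a + z / a, 1.0],
                     [0.0, 3.0 + z]])

# det Z1(z) = (a + z/a)(3 + z) has zeros at -(2+sqrt(3)) and -3,
# both outside the closed unit disc: the inverse stability condition.
zeros = np.array([-(2 + np.sqrt(3)), -3.0])
assert np.all(np.abs(zeros) > 1)

# Z1(e^{i theta}) Z1^T(e^{-i theta}) is positive Hermitian at each theta.
for th in np.linspace(0, 2 * np.pi, 100):
    z = np.exp(1j * th)
    P = Z1(z) @ Z1(1 / z).T          # e^{-i theta} = 1/z on |z| = 1
    assert np.allclose(P, P.conj().T)
    assert np.linalg.eigvalsh(P).min() > 0
```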
Theorem 4.6 Strong Spectral Factors give Wold Decompositions

Let $\Phi(e^{i\cdot})$ be the spectral density of a full rank wide sense stationary stochastic process $x$, and let $\Phi(e^{i\cdot})$ satisfy the conditions of Theorem 4.5. Let $Z(z)$ be the formal $z$-transform of the (Fourier series) matrix coefficient sequence of a strong spectral factor of $\Phi(e^{i\cdot})$ (equivalently, of a Laurent series of a strong spectral factor $Z(z)$) and let $W(z) \triangleq Z(z)Z^{-1}(0)$. Then $W(z)$ is the formal $z$-transform of the matrix coefficient sequence of the Wold decomposition of $x$ and $Z(0)Z^T(0)$ is the innovations covariance of $x$.

In terms of formal $z$-transforms and spectral representation processes, the relation between $e$ and $x$ is given by $x(z) = W(z)e(z)$.

Conversely, subject to the hypotheses of the theorem, the formal transform of the coefficient sequence of the Wold decomposition, multiplied on the right by a matrix square root of the innovations covariance matrix, is a strong spectral factor.

This theorem is proved by a direct application of the results above and by the identification of the space spanned by the process $x$ up to any instant with the space spanned by the orthogonal process $e$, which is then shown to be the innovations process of $x$.
We note that the system $x_k = w_k - w_{k-1}$, $k \in \mathbb{Z}$, where $w$ is a wide sense stationary
orthogonal process, generates the wide sense stationary process $x$; this system is evidently
the Wold decomposition of $x$, but the spectral density $\Phi_x(z)$ of $x$ does not possess a
strong spectral factor.
Hence the significance of the strong spectral factor of the spectral density $\Phi(z)$ of a process
$y$ is that, when $y$ is passed as input through the linear system corresponding to the
inverse of this factor, the output is the innovations process $\nu$ of $y$ (up to normalization by
the coefficient matrix of the zero order term of the factor). That is to say, $\nu$ is an orthogonal
white noise process which generates the same Hilbert space as $y$ over $(-\infty, k]$ for all $k \in \mathbb{Z}$.
This process may be verified to be the prediction error process for the linear least squares
one step ahead prediction of $y$. Finally, when $\nu$ is passed as input through the linear system
corresponding to the strong spectral factor, it not only generates a process with the spectral
density $\Phi(z)$, but the resulting output process is a.s. equal to $y$ at any instant. [See Assignment 7.]
Example 4.7 For the process $x$ generated by

$$LS: \quad x_{k+1} = \tfrac{1}{3}x_k + w_k, \quad k \in \mathbb{Z}_+,$$

where $x_0$ is distributed $N(0, \sigma_x^2)$, $w$ is an i.i.d. process distributed $N(0,1)$, and
$x_0 \amalg w$, we seek:

(i) the Lyapunov equation for $v_k \triangleq E x_k^2$, $k \in \mathbb{Z}_+$,

(ii) the steady state solution $v_\infty$ to the Lyapunov equation.

(iii) Let $\{x_k^\infty;\ k \in \mathbb{Z}_+\}$ be the process generated by the LS with initial condition $x_0^\infty$
distributed $N(0, v_\infty)$. Then we wish to verify that $x^\infty$ is a strictly stationary
process on $\mathbb{Z}_+$.

Let $\{r_k \triangleq E x_k^\infty x_0^\infty;\ k \in \mathbb{Z}\}$ be the covariance sequence for the corresponding process
on $\mathbb{Z} = \{\cdots, -1, 0, 1, \cdots\}$ and let $e^{i\theta}$ correspond to the unit backward shift. We also
wish to use the Wiener-Khinchin theorem (or the Concatenation Theorem) to find:

(iv) the spectral density function $\Phi_x(e^{i\theta})$, $0 \le \theta \le 2\pi$, of $x^\infty$,

(v) the spectral densities of the processes $y, z$ generated by

$$y_k = x_k^\infty - \tfrac{1}{3}x_{k-1}^\infty, \quad k \in \mathbb{Z},$$

and

$$z_k = \tfrac{1}{3}x_k^\infty - x_{k-1}^\infty, \quad k \in \mathbb{Z}.$$

Finally we wish to know what the process $y$ is called in terms of $x$.
The answers to these questions are as follows:
(i) The Lyapunov equation for $v_k = E x_k^2$, $k \in \mathbb{Z}_+$, is given by:

$$v_{k+1} = E x_{k+1}^2 = E\left(\tfrac{1}{3}x_k + w_k\right)^2 = \tfrac{1}{9}E x_k^2 + \tfrac{2}{3}E x_k w_k + E w_k^2 \qquad (w_l \amalg x_l,\ \forall l)$$
$$= \tfrac{1}{9}v_k + 0 + 1 = \tfrac{1}{9}v_k + 1, \qquad v_0 = E x_0^2 = \sigma_x^2.$$
(ii) The steady state solution is given by:

$$v_\infty = \tfrac{1}{9}v_\infty + 1 \;\Rightarrow\; v_\infty = \tfrac{9}{8}.$$
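The recursion in (i) and the fixed point in (ii) are easy to confirm numerically; a minimal sketch:

```python
# Iterate the scalar Lyapunov recursion v_{k+1} = v_k/9 + 1 from Example 4.7(i)
# and confirm convergence to the steady state v_inf = 9/8, independent of v_0.
def lyapunov_limit(v0, steps=60):
    v = v0
    for _ in range(steps):
        v = v / 9 + 1
    return v

for v0 in (0.0, 1.0, 100.0):        # several initial variances sigma_x^2
    assert abs(lyapunov_limit(v0) - 9 / 8) < 1e-12
```

The contraction factor is $1/9$ per step, so the initial condition is forgotten geometrically, exactly as in the general argument following (3).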
(iii) Let $\{x_k^\infty\}_0^\infty$ be generated by LS with initial condition $x_0^\infty \sim N(0, v_\infty)$.

First, $E x_{k+1}^\infty = \tfrac{1}{3}E x_k^\infty + E w_k = \tfrac{1}{3}E x_k^\infty$. Then $E x_0^\infty = 0$ gives $E x_k^\infty = 0$, $\forall k \in \mathbb{Z}_+$.

Second, for all $k \in \mathbb{Z}_+$ and all $l \in \mathbb{Z}$ such that $k + l \in \mathbb{Z}_+$,

$$E x_{k+l}^\infty x_l^\infty = E\left[\left(\tfrac{1}{3}\right)^k x_l^\infty + \sum_{i=0}^{k-1}\left(\tfrac{1}{3}\right)^{k-1-i} w_{l+i}\right] x_l^\infty$$
$$= \left(\tfrac{1}{3}\right)^k E\left(x_l^\infty\right)^2 \quad \text{since } w_l \amalg x_l^\infty,\ \forall l,$$
$$= \left(\tfrac{1}{3}\right)^k v_\infty \quad \text{since } E\left(x_l^\infty\right)^2 = E\left(x_0^\infty\right)^2 = v_\infty.$$

Hence $r_{k+l,l} \triangleq E x_{k+l}^\infty x_l^\infty = E x_l^\infty x_{k+l}^\infty = r_{l,k+l} = \left(\tfrac{1}{3}\right)^k v_\infty \triangleq r_k$, $k, k+l \in \mathbb{Z}_+$, is
shift invariant w.r.t. $l \in \mathbb{Z}_+$.

Hence the zero mean Gaussian process $x^\infty$ is strictly stationary on $\mathbb{Z}_+$. This is
because the strict stationarity of a Gaussian process is established by the shift
invariance of its first two moments (i.e. its wide sense stationarity).
(iv) Let $\{r_k\}_{-\infty}^\infty$ denote the covariance sequence of the corresponding process on $\mathbb{Z}$.
Then

$$\Phi_x(e^{i\theta}) = Z(e^{i\theta}) \cdot 1 \cdot Z(e^{-i\theta}) = \left(e^{-i\theta} - \tfrac{1}{3}\right)^{-1}\left(e^{i\theta} - \tfrac{1}{3}\right)^{-1} = \frac{1}{\left(1 - \frac{e^{i\theta}}{3}\right)\left(1 - \frac{e^{-i\theta}}{3}\right)}.$$
This may be deduced in two ways. First, we may simply take the Fourier transform
of the covariance sequence and verify that it is given by the spectral factorization
formula above. Second, we may treat $x^\infty$ as the (instant by instant) mean square
limit of the process generated by

$$LS_\infty: \quad x_{k+1}^\infty = \tfrac{1}{3}x_k^\infty + w_k, \quad k \ge -N,$$

with the initial condition $x_{-N}^\infty \sim N(0, v_\infty)$, where $N \to \infty$. The limiting process
is then generated by the ARMA scheme

$$LS_\infty: \quad x_{k+1}^\infty = \tfrac{1}{3}x_k^\infty + w_k, \quad k \in \mathbb{Z}.$$

Then the MA($\infty$) formal power series system equation relating $w(z)$ and $x(z)$
is solvable (term by term) by the Concatenation Theorem (which includes the
relevant special case of the Wiener-Khinchin Theorem) to give the wide sense
stationary process $x$ with the spectral density as given above, i.e.

$$\left(1 - \tfrac{z}{3}\right)x(z) = z\,w(z) \;\Rightarrow\; x(z) = Z(z)w(z), \quad \text{where } Z(z) \triangleq z\left(1 - \tfrac{z}{3}\right)^{-1} = \left(e^{-i\theta} - \tfrac{1}{3}\right)^{-1} \text{ at } z = e^{i\theta}.$$
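The first route can be checked numerically: the truncated Fourier series of the covariance sequence $r_k = (1/3)^{|k|} v_\infty$ matches the closed-form density above at every $\theta$. A sketch:

```python
import numpy as np

v_inf = 9 / 8                       # steady-state variance from Example 4.7(ii)
K = 60                              # truncation order of the covariance series

for theta in np.linspace(0.0, 2 * np.pi, 9):
    z = np.exp(1j * theta)
    # Truncated Fourier series of the covariance sequence r_k = (1/3)^{|k|} v_inf
    series = sum(v_inf * (1 / 3) ** abs(k) * np.exp(-1j * k * theta)
                 for k in range(-K, K + 1))
    closed_form = 1 / ((1 - z / 3) * (1 - np.conj(z) / 3))
    assert abs(series - closed_form) < 1e-12
```

The truncation error is of order $(1/3)^{K}$, so sixty terms are already far below floating point tolerance.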
(v) (a) If $y_k = x_k^\infty - \tfrac{1}{3}x_{k-1}^\infty$, then

$$y(z) = \left(1 - \tfrac{z}{3}\right)x^\infty(z) = \left(1 - \tfrac{z}{3}\right)Z(z)w(z).$$

And so

$$\Phi_y(e^{i\theta}) = \left(1 - \tfrac{e^{i\theta}}{3}\right)\Phi_x(e^{i\theta})\left(1 - \tfrac{e^{-i\theta}}{3}\right) = \left(1 - \tfrac{e^{i\theta}}{3}\right)\left(1 - \tfrac{e^{i\theta}}{3}\right)^{-1} e^{i\theta}e^{-i\theta}\left(1 - \tfrac{e^{-i\theta}}{3}\right)^{-1}\left(1 - \tfrac{e^{-i\theta}}{3}\right) = 1.$$
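There is also a direct time-domain reading of $\Phi_y \equiv 1$: substituting the recursion for $x^\infty$ gives $y_k = x_k^\infty - \tfrac{1}{3}x_{k-1}^\infty = w_{k-1}$, i.e. $y$ reproduces the driving noise delayed one step. A simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)       # i.i.d. N(0, 1) driving noise
x = np.zeros(1001)
x[0] = rng.standard_normal() * np.sqrt(9 / 8)   # x_0 ~ N(0, v_inf)
for k in range(1000):
    x[k + 1] = x[k] / 3 + w[k]      # LS: x_{k+1} = x_k/3 + w_k

# y_k = x_k - x_{k-1}/3 recovers the noise exactly one step late: y_k = w_{k-1}
y = x[1:] - x[:-1] / 3
assert np.allclose(y, w)            # an orthogonal (white) process of variance 1
```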
(b) Similarly,

$$z_k = \tfrac{1}{3}x_k^\infty - x_{k-1}^\infty \;\Rightarrow\; z(z) = \left(\tfrac{1}{3} - z\right)x(z).$$

And so

$$\Phi_z(e^{i\theta}) = \left(\tfrac{1}{3} - e^{i\theta}\right)\left(e^{-i\theta} - \tfrac{1}{3}\right)^{-1}\left(e^{i\theta} - \tfrac{1}{3}\right)^{-1}\left(\tfrac{1}{3} - e^{-i\theta}\right) = 1.$$
Here $y$ is the innovations process of $x$, but $z$ is not, since in the first case the span of
the space generated by $x$ is equal to that generated by the orthogonal process $y$,
but the spaces are not equal in the case of $x$ and $z$. (Notice that $(Z(z), w(z))$ is not
the Wold decomposition of $x$, but $(z^{-1}Z(z), zw(z))$ is the Wold decomposition.)
The first assertion is true since, by the asymptotic stability of the system
in (v)(a) (obvious) and of its inverse, $H_n^y \subset H_n^x$ and $H_n^x \subset H_n^y$, $n \in \mathbb{Z}_+$; while in
the latter case the system in (v)(b) is not non-anticipatively invertible.

$\square$
Second Order Processes

Just as for a discrete time process, a continuous time wide sense stationary $\mathbb{R}^n$ valued
process is defined as a second order process for which

$$E x_t = \mu, \quad \forall t \in \mathbb{R},$$
$$R(t, s) = E(x_t - \mu)(x_s - \mu)^T = E(x_{t-s} - \mu)(x_0 - \mu)^T \triangleq R(t - s), \quad \forall t, s \in \mathbb{R}.$$
Example 5.6

As in Theorem 5.2, the scalar LS$^3$ process $x$ generated by

$$dx_t = \alpha x_t\,dt + q\,dw_t, \quad \alpha < 0,\ t \in \mathbb{R}_+,$$

with $x_0 \sim N(0, \pi)$, $x_0 \amalg w$, and $\pi$ satisfying $0 = \alpha\pi + \pi\alpha + q^2$, is a wss process on $\mathbb{R}_+$. From
the explicit calculation for the system (20), $E x_t x_s = R(t - s) = -\tfrac{q^2}{2\alpha}e^{\alpha|t-s|}$, and for any
distribution $N(0, \sigma_0^2)$ for the initial condition $x_0$,

$$E x_{T+t}x_{T+s} \to R(t - s)$$

as $T \to \infty$.
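The stationary variance $\pi = -q^2/(2\alpha)$ can also be reached by sampling the system at a fixed interval $\Delta$ and iterating the resulting discrete time Lyapunov recursion; a sketch (the values $\alpha = -1/2$, $q = 1$, $\Delta = 0.01$ are illustrative assumptions, not taken from the example):

```python
import math

alpha, q, dt = -0.5, 1.0, 0.01      # illustrative parameters, alpha < 0
a = math.exp(alpha * dt)            # exact one-step transition over dt
# variance of the sampled noise increment over an interval of length dt
qd2 = q * q * (math.exp(2 * alpha * dt) - 1) / (2 * alpha)

v = 4.0                             # arbitrary initial variance sigma_0^2
for _ in range(5000):
    v = a * a * v + qd2             # sampled-time Lyapunov recursion

pi = -q * q / (2 * alpha)           # continuous-time steady state: 0 = 2*alpha*pi + q^2
assert abs(v - pi) < 1e-8
```

The fixed point of the sampled recursion is $q_\Delta^2/(1 - e^{2\alpha\Delta})$, which equals $-q^2/(2\alpha)$ exactly, for every $\Delta > 0$.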
Definition 5.6

An $\mathbb{R}^n$ valued stochastic process $\{x_t;\ t \in \mathbb{R}\}$ is quadratic mean (q.m.) continuous if
$E\|x_{t+h} - x_t\|^2 \to 0$ as $h \to 0$, for all $t \in \mathbb{R}$.

When it exists, the q.m. limit

$$\frac{d}{dt}x_t \triangleq \mathrm{q.m.}\lim_{h\to 0}\frac{1}{h}(x_{t+h} - x_t) \qquad (20)$$

of a second order process $x$ is called the quadratic mean (q.m.) derivative of $x$ at $t \in \mathbb{R}$. If
$\frac{d}{dt}x_t$ exists for all $t \in \mathbb{R}$ then $x$ is called a q.m. differentiable process.
It may be verified that a zero mean second order process for which the covariance $R(t, s)$
is jointly continuous in $(t, s)$ at $(u, u)$ for any $u \in \mathbb{R}$ is q.m. continuous for all $t \in \mathbb{R}$. Conversely,
a second order process $x$ which is q.m. continuous for all $t \in \mathbb{R}$ has a covariance
function which is continuous in $(t, s)$.
Theorem 5.6

(a) A second order process $x$ is q.m. differentiable if $\frac{\partial^2 R(t,s)}{\partial t\,\partial s}$ exists and is continuous at
$(t, t)$ for all $t \in \mathbb{R}$.

(b) If a second order process $x$ is q.m. differentiable, then the second partial derivatives of $R(t,s)$
taken in either order exist.
Proof

For notational convenience we shall only consider scalar processes in this proof.

(a) To begin, observe that the limit (20) exists if and only if the Cauchy condition

$$\lim_{h,h'\to 0} E\left(\frac{1}{h}(x_{t+h} - x_t) - \frac{1}{h'}(x_{t+h'} - x_t)\right)^2 = 0 \qquad (21)$$

holds.

Now the existence and continuity at $(t, t)$ of the mixed partial derivative $\frac{\partial^2 R(t,s)}{\partial t\,\partial s}$ of $R(t, s)$
implies that its value is independent of the order of evaluation of the partial derivatives.
This implies that a unique limit

$$\lim_{(h,h')\to(0,0)} h'^{-1}\left\{h^{-1}[R(t+h', t+h) - R(t+h', t)] - h^{-1}[R(t, t+h) - R(t, t)]\right\}$$
$$= \lim_{(h,h')\to(0,0)} (hh')^{-1}\left[(E x_{t+h'}x_{t+h} - E x_{t+h'}x_t) - (E x_t x_{t+h} - E x_t x_t)\right] = \lim_{(h,h')\to(0,0)} E\Delta(h, h')$$

exists for all sequences $\{(h, h')\}$ converging to $(0, 0)$, where

$$\Delta(h, h') \triangleq (hh')^{-1}(x_{t+h} - x_t)(x_{t+h'} - x_t).$$

Hence

$$\lim_{(h,h')\to(0,0)} E[\Delta(h, h) + \Delta(h', h') - 2\Delta(h, h')]$$
$$= \lim_{h\to 0} E\Delta(h, h) + \lim_{h'\to 0} E\Delta(h', h') - \lim_{(h,h')\to(0,0)} 2E\Delta(h, h') = 0.$$

However, the left-most expression above is equal to $E\left(\frac{1}{h}(x_{t+h} - x_t) - \frac{1}{h'}(x_{t+h'} - x_t)\right)^2$
(reader check), and hence the proven convergence to zero establishes that the differences
$\{h^{-1}(x_{t+h} - x_t);\ h \in \mathbb{R}\}$ form an $L_2$ Cauchy sequence as $h \to 0$. This establishes the existence
of the q.m. derivative of the process $x$ for $t \in \mathbb{R}$.

(b) If the process $x$ is q.m. differentiable, the approximating differences form an $L_2$
Cauchy sequence as $h \to 0$. Hence the approximating differences for the second partial derivatives
in either order exist, as required.
For processes which are q.m. differentiable we evidently have

$$E\left(\frac{d}{dt}x_t\right)^2 = \left.\frac{\partial^2 R(t,s)}{\partial s\,\partial t}\right|_{s=t}, \quad \forall t \in \mathbb{R}.$$
Example 5.6
A Wiener process is q.m. continuous at any $t \in \mathbb{R}_+$, and this fact is implied by the joint
continuity of the covariance function $\min(t, s)I$; $t, s \in \mathbb{R}_+$. However, the second partial
derivatives do not exist for the covariance function of a Wiener process, and this implies
that it is not q.m. differentiable. This, of course, is evident by a direct calculation involving
approximating differences of the Wiener process.
Spectral Theory for WSS Continuous Time Stochastic Processes

Let $x$ be an $\mathbb{R}^n$ valued zero mean q.m. continuous wss stochastic process with matrix
covariance function $R(\tau)$; $\tau \in \mathbb{R}$. Assume $\int_{-\infty}^{\infty}\|R(\tau)\|\,d\tau < \infty$.

Since $x$ is q.m. continuous, $R(\cdot)$ is continuous and so $R(\cdot) \in L_1 \cap C_0$; hence we may define
the spectral density matrix of $x$ via

$$\Phi(\omega) \triangleq \int_{-\infty}^{\infty} e^{-2\pi i\omega t}R(t)\,dt, \quad \omega \in \mathbb{R}.$$
When $\omega \in \mathbb{R}$ replaces $e^{i\theta}$ in the matrix function $\{\Phi(\omega);\ \omega \in \mathbb{R}\}$, it can be seen that the three
conjugate transpose, conjugate and positivity properties of a spectral density matrix hold as
given in Definition 5.3 for (discrete time) spectral density matrices; namely, $\Phi(\cdot)$ is a positive
complex Hermitian matrix.
Since $R(\cdot) \in L_1 \cap C_0$, it may be verified that $\Phi(\cdot) \in L_1 \cap C_0$, and so the inversion formula

$$E x_{t+\tau}x_t^T = R(\tau) = \int_{-\infty}^{\infty} e^{2\pi i\omega\tau}\Phi(\omega)\,d\omega, \quad \tau \in \mathbb{R},$$

holds. Next, in analogy with the discrete time case, we may consider the action of a (not
necessarily non-anticipative) linear system with $L_2$ matrix impulse response $\{H(\tau);\ \tau \in \mathbb{R}\}$
on a wss stochastic process satisfying the hypotheses of this section.

In the time domain we may define the output process $y$ by

$$y_t = \int_{-\infty}^{\infty} H(\tau)x(t - \tau)\,d\tau, \quad t \in \mathbb{R}.$$
Since $H(\cdot) \in L_2$, the transfer function

$$H(\omega) \triangleq \int_{-\infty}^{\infty} e^{-2\pi i\omega t}H(t)\,dt, \quad \omega \in \mathbb{R},$$

also lies in $L_2$. Finally this gives the fundamental relation

$$R_y(\tau) = E y_{t+\tau}y_t^T = E\left(\int_{-\infty}^{\infty} H(p)x(t+\tau-p)\,dp\right)\left(\int_{-\infty}^{\infty} H(q)x(t-q)\,dq\right)^T = \int_{-\infty}^{\infty} e^{2\pi i\omega\tau}H(\omega)\Phi_x(\omega)\overline{H(\omega)}^T\,d\omega, \quad \tau \in \mathbb{R}.$$

The formula above immediately yields the continuous time version of the Wiener-Khinchin
formula for the class of processes under consideration; namely

$$\Phi_y(\omega) = H(\omega)\Phi_x(\omega)\overline{H(\omega)}^T, \quad \omega \in \mathbb{R}.$$
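The $e^{-2\pi i\omega t}$ transform convention used here can be checked on a concrete scalar covariance; a sketch assuming $R(\tau) = e^{-|\tau|}$ (an illustrative choice, not from the text), for which the convention above gives $\Phi(\omega) = 2/(1 + (2\pi\omega)^2)$:

```python
import numpy as np

# Illustrative scalar covariance R(tau) = exp(-|tau|); with the e^{-2 pi i omega t}
# convention its spectral density is Phi(omega) = 2 / (1 + (2 pi omega)^2).
tau = np.linspace(-40.0, 40.0, 160001)
dt = tau[1] - tau[0]
R = np.exp(-np.abs(tau))

for omega in (0.0, 0.1, 0.3):
    f = (R * np.exp(-2j * np.pi * omega * tau)).real
    Phi_num = ((f[:-1] + f[1:]) / 2).sum() * dt    # trapezoid rule on [-40, 40]
    Phi_exact = 2 / (1 + (2 * np.pi * omega) ** 2)
    assert abs(Phi_num - Phi_exact) < 1e-4
```

The truncation tail is of order $e^{-40}$ and the quadrature error of order $\Delta\tau^2$, so the discretized transform matches the closed form to well within the stated tolerance.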
Orthogonal Representations: The Karhunen-Loeve Expansion

Again in this section we assume all processes to have zero mean. Let $x$ be a q.m.
continuous second order (not necessarily stationary) stochastic process taking values in the
Hilbert space $H$. For convenience we take $x$ to be a scalar process. We seek a set of random
second order time independent basis functions for $H$, say $\{\psi_n(\omega);\ n \in \mathbb{Z}_+\}$, such that a
representation of the form

$$x_t(\omega) = \sum_{n=0}^{\infty}\alpha_{n,t}\psi_n(\omega), \quad t \in \mathbb{R}_+, \qquad (22)$$

holds for some set of time dependent coefficients $\{\alpha_{n,t};\ n \in \mathbb{Z}_+\}$ which are non-random.
Moreover, we seek an orthonormal (o.n.) complex valued family $\{Z_n(\omega);\ n \in \mathbb{Z}_+\}$ such that

$$E|Z_n(\omega)|^2 = 1, \quad n \in \mathbb{Z}_+,$$

and

$$\langle Z_n, Z_m\rangle \triangleq E Z_n(\omega)\overline{Z_m(\omega)} = 0, \quad n \ne m,\ n, m \in \mathbb{Z}_+.$$

Consider the non-random set of time functions

$$\sigma_n(t) \triangleq \langle x_t, Z_n\rangle = E x_t(\omega)\overline{Z_n(\omega)}, \quad n \in \mathbb{Z}_+,$$
where it is assumed that $\{Z_n;\ n \in \mathbb{Z}_+\}$ is an o.n. family.

Then

$$0 \le E\Big|x_t - \sum_{n=0}^{\infty}\langle x_t, Z_n\rangle Z_n\Big|^2 = E|x_t|^2 - 2\sum_{n=0}^{\infty}\langle x_t, Z_n\rangle\overline{\langle x_t, Z_n\rangle} + \sum_{n=0}^{\infty}|\langle x_t, Z_n\rangle|^2 = E|x_t|^2 - \sum_{n=0}^{\infty}|\langle x_t, Z_n\rangle|^2.$$

Consider the Hilbert space $H$ and assume that for all $x_t \in H$

$$\|x_t\|^2 = \langle x_t, x_t\rangle = \sum_{n=0}^{\infty}|\langle x_t, Z_n\rangle|^2;$$

then we say $\{Z_n;\ n \in \mathbb{Z}_+\}$ is a complete o.n. (c.o.n.) family for the space $H$.
In this case we necessarily have

$$x_t(\omega) = \mathrm{q.m.}\lim_{N\to\infty}\sum_{n=0}^{N}\sigma_n(t)Z_n(\omega) \triangleq \sum_{n=0}^{\infty}\sigma_n(t)Z_n(\omega).$$

The functions $\{\sigma_n(t);\ t \in \mathbb{R}\}$ may be verified to be continuous because $x$ is q.m. continuous.
(Reader check.)
Assume $x_t$ is not contained in any proper subspace of the span of $\{Z_n\}_{n=0}^{\infty}$. Then the functions $\sigma_n(\cdot)$
are linearly independent; otherwise

$$\sum_{n=0}^{N} a_n\sigma_n(t) = 0, \quad \text{for some nonzero } \{a_n\}_0^N,$$

and hence

$$0 = \sum_{n=0}^{N} a_n\langle x_t, Z_n\rangle = \Big\langle x_t, \sum_{n=0}^{N} a_n Z_n\Big\rangle;$$

which implies

$$x_t \perp \Big\{\sum_{n=0}^{N} a_n Z_n\Big\} \quad \text{while} \quad x_t = \sum_{n=0}^{\infty}\sigma_n(t)Z_n.$$

But this contradicts the proper subspace condition.
Separable Covariance Function

Suppose $x_t = \sum_{n=0}^{\infty}\sigma_n(t)Z_n$, $t \in \mathbb{R}$, where $\{Z_n;\ n \in \mathbb{Z}_+\}$ is a c.o.n. family. Then

$$R(t, s) = E x_t x_s = E\Big[\sum_{n=0}^{\infty}\sigma_n(t)Z_n\Big]\Big[\sum_{m=0}^{\infty}\sigma_m(s)Z_m\Big] = \sum_{n=0}^{\infty}\sigma_n(t)\sigma_n(s), \quad \forall t, s \in \mathbb{R}. \qquad (23)$$

A covariance function of the form (23) is called separable.

Mercer's Theorem [ ] states that continuous positive functions $R(s, t)$; $s, t \in \mathbb{R}$ on $L_2 \times L_2$
possess complete orthonormal families of eigenfunctions with respect to which they have
expansions of the form (23) which converge uniformly over compact sets of the form $\{-M \le s, t \le M;\ s, t \in \mathbb{R}\}$. Hence, in particular, continuous covariance functions are separable.
Clearly, if

$$R(s, t) = \sum_{n=-\infty}^{\infty}\sigma_n(s)\sigma_n(t), \quad s, t \in \mathbb{R}, \qquad (24)$$

with $\int_{-\infty}^{\infty}\sigma_n(s)\sigma_m(s)\,ds = \lambda_n\delta_{nm}$, $n, m \in \mathbb{Z}$, then

$$\int_{-\infty}^{\infty} R(s, t)\sigma_n(t)\,dt = \int_{-\infty}^{\infty}\Big(\sum_{k}\sigma_k(s)\sigma_k(t)\Big)\sigma_n(t)\,dt = \sum_{k}\sigma_k(s)\lambda_n\delta_{k,n} = \lambda_n\sigma_n(s), \quad \forall s \in \mathbb{R}.$$
Theorem 5.7 Karhunen-Loeve

Let $x$ be a q.m. continuous second order process with covariance function $R(t, s)$, $t, s \in \mathbb{R}$.

(a) Let $\{\psi_n;\ n \in \mathbb{Z}_+\}$ be the set of orthonormal eigenfunctions of $R(\cdot, \cdot)$ such that

$$\int_{-\infty}^{\infty} R(t, s)\psi_n(s)\,ds = \lambda_n\psi_n(t), \quad \forall t \in \mathbb{R},$$

i.e. for which $\{\lambda_n;\ n \in \mathbb{Z}_+\}$ is the set of eigenvalues. Further let

$$b_n(\omega) = \lambda_n^{-1/2}\int_{-\infty}^{\infty} x(\omega, t)\psi_n(t)\,dt,$$

for which, necessarily,

$$E b_n b_m = \delta_{m,n}, \quad m, n \in \mathbb{Z}_+.$$

Then

$$x(\omega, t) = \mathrm{q.m.}\lim_{N\to\infty}\sum_{n=0}^{N}\sqrt{\lambda_n}\,\psi_n(t)b_n(\omega) \qquad (25)$$

uniformly on compact intervals.

(b) Conversely, if $x(\omega, t)$ has an expansion of the form (25), where

$$\int_{-\infty}^{\infty}\psi_m(t)\psi_n(t)\,dt = \delta_{mn} = E b_m b_n, \quad m, n \in \mathbb{Z}_+,$$

then $\{\psi_n;\ n \in \mathbb{Z}_+\}$ and the associated $\{\lambda_n;\ n \in \mathbb{Z}_+\}$ are the eigenfunctions and eigenvalues
respectively of $R(\cdot, \cdot)$.

Note that for real processes $x$, the $\psi_n$ and $\lambda_n$ are real.
Proof

(a) From the hypotheses of the theorem and the definition of the terms, we see that
$\{b_n;\ n \in \mathbb{Z}_+\}$ is a c.o.n. family for the Hilbert space $H$ spanned by $x$. Consequently, the
result follows directly from

$$E\Big|x_t - \sum_{n=0}^{N}\sqrt{\lambda_n}\,\psi_n(t)b_n\Big|^2 = R(t, t) - \sum_{n=0}^{N}\lambda_n|\psi_n(t)|^2 \to 0$$

as $N \to \infty$, where the latter convergence follows from Mercer's Theorem.

(b) If $x_t = \sum_{n=0}^{\infty}\sqrt{\lambda_n}\,\psi_n(t)b_n(\omega)$, then

$$R(t, s) = E x_t x_s = \sum_{n=0}^{\infty}\lambda_n\psi_n(t)\psi_n(s).$$

Hence

$$\int_{-\infty}^{\infty} R(t, s)\psi_m(s)\,ds = \int_{-\infty}^{\infty}\sum_{n=0}^{\infty}\lambda_n\psi_n(t)\psi_n(s)\psi_m(s)\,ds = \lambda_m\psi_m(t), \quad \forall m \in \mathbb{Z}_+,\ \forall t \in \mathbb{R}.$$
Example 5.7 (Wong [1971, p.87])

Let $R(t, s) = \min(t, s)$, $t, s \in \mathbb{R}$; in other words $R(\cdot, \cdot)$ is the covariance of the Wiener
process. Consider

$$\int_{0}^{T}\min(t, s)\psi(s)\,ds = \lambda\psi(t), \quad 0 \le t \le T,$$

or, equivalently,

$$\int_{0}^{t} s\psi(s)\,ds + t\int_{t}^{T}\psi(s)\,ds = \lambda\psi(t), \quad 0 \le t \le T.$$

Differentiating with respect to $t$ gives

$$t\psi(t) - t\psi(t) + \int_{t}^{T}\psi(s)\,ds = \lambda\frac{d}{dt}\psi(t),$$

and so, differentiating once more,

$$\ddot{\psi}(t) = -\frac{1}{\lambda}\psi(t), \quad \lambda \ne 0,$$

with $\psi(0) = 0$, $\frac{d}{dt}\psi(T) = 0$. This gives

$$\psi(t) = A\sin\frac{t}{\sqrt{\lambda}}, \quad \text{with } \cos\frac{T}{\sqrt{\lambda}} = 0, \quad \text{implying } \sqrt{\lambda} = \frac{2T}{(2n+1)\pi}, \quad n \in \mathbb{Z}_+,$$

where here, since $\sin(-x) = -\sin x$, we do not need to consider the negative integers $n \in \mathbb{Z}$
in order to find additional eigenfunctions.
So, on $\mathbb{Z}_+$, the set of normalized eigenfunctions is

$$\psi_n(t) = \left(\frac{2}{T}\right)^{1/2}\sin\left(\frac{(2n+1)\pi t}{2T}\right), \quad n \in \mathbb{Z}_+,$$

and hence the process $x$ with the specified covariance $R(t, s)$ satisfies

$$x_t = \mathrm{q.m.}\lim_{N\to\infty}\sum_{n=0}^{N}\frac{2T}{(2n+1)\pi}\left(\frac{2}{T}\right)^{1/2}\sin\left(\frac{(2n+1)\pi t}{2T}\right)b_n(\omega),$$

where

$$b_n(\omega) = \left(\frac{2T}{(2n+1)\pi}\right)^{-1}\int_{0}^{T}\left(\frac{2}{T}\right)^{1/2}\sin\left(\frac{(2n+1)\pi t}{2T}\right)x(\omega, t)\,dt$$

in q.m. Incidentally, as observed by Wong, Mercer's Theorem in this case gives the expansion

$$\min(t, s) = \frac{2}{T}\sum_{n=0}^{\infty}\frac{T^2}{\pi^2\left(n+\frac{1}{2}\right)^2}\sin\left(\frac{\left(n+\frac{1}{2}\right)\pi t}{T}\right)\sin\left(\frac{\left(n+\frac{1}{2}\right)\pi s}{T}\right),$$

where the convergence above is uniform on $[0, T]^2$.
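The eigenvalues $\lambda_n = \big(2T/((2n+1)\pi)\big)^2$ found above can be checked by discretizing the integral operator with kernel $\min(t, s)$ on a midpoint grid (a Nyström-type approximation; $T = 1$ is an illustrative choice):

```python
import numpy as np

T, N = 1.0, 800
dt = T / N
t = (np.arange(N) + 0.5) * dt                 # midpoint grid on [0, T]
K = np.minimum.outer(t, t)                    # Wiener covariance min(t, s)

# Nystrom approximation: eigenvalues of K * dt approximate those of the
# integral operator with kernel min(t, s) on [0, T]
eig = np.sort(np.linalg.eigvalsh(K * dt))[::-1]

for n in range(3):
    lam_exact = (2 * T / ((2 * n + 1) * np.pi)) ** 2
    assert abs(eig[n] - lam_exact) / lam_exact < 1e-3
```

The discretization error shrinks like $O(1/N^2)$, so the leading eigenvalues agree with $T^2/\big((n+\tfrac12)\pi\big)^2$ to a fraction of a percent already at this grid size.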
Wiener-Kolmogorov Filtering
Figure 1: Block diagram of a linear system: input $x$, transfer function $A(e^{i\theta})$, output $y$.