Advances in numerical projection methods for
MOR of large-scale linear dynamical systems
V. Simoncini
Dipartimento di Matematica, Universita di Bologna
1
Model Order Reduction
Given the continuous-time system
Σ =
A B
C
, A ∈ Cn×n
Analyse the construction of a reduced system
Σ =
A B
C
with A of size m ≪ n, and issues associated with its accuracy.
Applications: signal processing, system and control theory
2
Projection methods and Linear Dynamical Systems
• Solvers for the Lyapunov matrix equation
• Approximation of the matrix Transfer function
• Approximation of Hankel singular values by balanced truncation
3
Solving the Lyapunov equation. The problem
Approximate soln X to:
AX + XA⊤ + BB⊤ = 0
A ∈ Rn×n positive real B ∈ Rn×s, s ≥ 1
————————————
Time-invariant linear system:
x′(t) = Ax(t) + Bu(t), x(0) = x0
Analytic solution:
X =
∫ ∞
0
e−tABB⊤e−tA⊤
dt =
∫ ∞
0
xx⊤dt with x = exp(−tA)B.
see, e.g., Antoulas ’05, Benner ’06
4
Solving the Lyapunov equation. The problem
Approximate soln X to:
AX + XA⊤ + BB⊤ = 0
A ∈ Rn×n positive real B ∈ Rn×s, s ≥ 1
————————————
Time-invariant linear system:
x′(t) = Ax(t) + Bu(t), x(0) = x0
Analytic solution:
X =
∫ ∞
0
e−tABB⊤e−tA⊤
dt =
∫ ∞
0
xx⊤dt with x = exp(−tA)B
see, e.g., Antoulas ’05, Benner ’06
5
Standard Krylov subspace projection
X ≈ Xm Xm ∈ K
Galerkin condition: R := AXm + XmA⊤ + bb⊤ ⊥ K
V ⊤
m RVm = 0 K = range(Vm)
————————————
Assume V ⊤m Vm = Im and let Xm := VmYmV ⊤
m .
Projected Lyapunov equation:
(V ⊤
m AVm)Ym + Ym(V ⊤
m A⊤Vm) + V ⊤
m bb⊤Vm = 0
mTmYm + YmT⊤
m + e1e⊤
1 = 0
with b = Vme1 (Saad, ’90, for K = Km(A, b) = span{b, Ab, . . . , Am−1b})
6
Standard Krylov subspace projection
X ≈ Xm Xm ∈ K
Galerkin condition: R := AXm + XmA⊤ + bb⊤ ⊥ K
V ⊤
m RVm = 0 K = range(Vm)
————————————
Assume V ⊤m Vm = Im and let Xm := VmYmV ⊤
m .
Projected Lyapunov equation:
(V ⊤
m AVm)Ym + Ym(V ⊤
m A⊤Vm) + V ⊤
m bb⊤Vm = 0
mTmYm + YmT⊤
m + e1e⊤
1 = 0
with b = Vme1 (Saad, ’90, for K = Km(A, b) = span{b, Ab, . . . , Am−1b})
7
Standard Krylov projection. In quest of error bounds
AX + XA⊤ + BB⊤ = 0, X ≈ Xm ∈ Km(A, B)
‖X − Xm‖ ≤ ??
————————————
Simoncini & Druskin ’09. Analytic solution:
X =
∫ ∞
0
e−tABB⊤e−tA⊤
dt =
∫ ∞
0
xx⊤dt
with x = exp(−tA)B, B = b, ‖b‖ = 1
Let αmin = λmin((A + A⊤)/2) > 0. Then
‖x‖ ≤ exp(−tαmin)‖B‖
8
Standard Krylov projection. In quest of error bounds
AX + XA⊤ + BB⊤ = 0, X ≈ Xm ∈ Km(A, B)
‖X − Xm‖ ≤ ??
————————————
(Simoncini & Druskin ’09). Analytic solution:
X =
∫ ∞
0
e−tABB⊤e−tA⊤
dt =
∫ ∞
0
xx⊤dt
with x = exp(−tA)B, B = b, ‖b‖ = 1
Let αmin = λmin((A + A⊤)/2) > 0. Then
‖x‖ ≤ exp(−tαmin)‖B‖
9
First (key) step
Krylov subspace proj.: Xm = VmYmV ⊤m , range(Vm) = Km(A, b)
TmYm + YmT⊤
m + e1e⊤
1 = 0
Clearly,
Xm = Vm
(∫ ∞
0
e−tTme1e⊤
1 e−tT⊤
m dt
)
V ⊤
m =
∫ ∞
0
xmx⊤
mdt
————————————
II step: ‖X − Xm‖ = ‖∫∞
0(xx⊤ − xmx⊤
m)dt‖, so that
‖X − Xm‖ ≤∫ ∞
0
‖xx⊤ − xmx⊤
m‖dt ≤ 2
∫ ∞
0
e−tαmin ‖x − xm‖dt
10
First (key) step
Krylov subspace proj.: Xm = VmYmV ⊤m , range(Vm) = Km(A, b)
TmYm + YmT⊤
m + e1e⊤
1 = 0
Clearly,
Xm = Vm
(∫ ∞
0
e−tTme1e⊤
1 e−tT⊤
m dt
)
V ⊤
m =
∫ ∞
0
xmx⊤
mdt
————————————
II step: ‖X − Xm‖ = ‖∫∞
0(xx⊤ − xmx⊤
m)dt‖, so that
‖X − Xm‖ ≤∫ ∞
0
‖xx⊤ − xmx⊤
m‖dt ≤ 2
∫ ∞
0
e−tαmin ‖x − xm‖dt
11
The case of A symmetric
A symmetric ⇒ αmin = λmin(A)
Let 0 < λmin ≤ . . . ≤ λmax eigs of A + λminI, κ := λmax
λmin
Then
‖X − Xm‖ ≤√
κ + 1
λmin
√κ
(√κ − 1√κ + 1
)m
Note: same rate as CG for (A + λminI)z = b
12
The case of A symmetric. An example
0 5 10 15 20 25 3010
−12
10−10
10−8
10−6
10−4
10−2
100
dimension of Krylov subspace
ab
so
lute
err
or
no
rm
error norm ||X−X
m||
estimate of Proposition 3.1
A: 400 × 400 diagonal with uniformly distributed eigenvalues in [1, 10]
(αmin = λmin = 1)
13
The case of W (A) in an ellipse
Assume W (A) ⊆ E ⊂ C+
(E ellipse of center (c, 0), foci (c ± d, 0) and major semi-axis a)
Then
‖X − Xm‖ ≤ 4
αmin
r2
r2 − r
(r
r2
)m
where
r =a
d+
√(a
d
)2
− 1, r2 =c + αmin
d+
√(
c + αmin
d
)2
− 1
Note: same rate as FOM for (A + αminI)z = b
14
The case of W (A) in an ellipse.
0 2 4 6 8 10 12 14 1610
−14
10−12
10−10
10−8
10−6
10−4
10−2
100
dimension of Krylov subspace
ab
so
lute
err
or
no
rm
error norm ||X−X
m||
estimate of Corollary 4.2
A normal with eigenvalues on an elliptic curve
15
The case of W (A) in a wedge-shaped set. An example
Generalization to a wedge-shaped convex set of C+.
2 3 4 5 6 7 8 9−1.5
−1
−0.5
0
0.5
1
1.5
ℜ (λ)
ℑ(λ
)
0 5 10 15 2010
−12
10−10
10−8
10−6
10−4
10−2
100
dimension of Krylov subspace
ab
so
lute
err
or
no
rm
error norm ||X−X
m||
asymp. estimate of Corollary 4.6
A: diagonal (normal) matrix on the wedge-shaped curve.
16
Cyclic low rank Smith method
(ADI made efficient)
(see, e.g., Li 2000, Penzl 2000)
X0 = 0, Xj = −2pj(A + pjI)−1BB⊤(A + pjI)−⊤ j = 1, . . . , ℓ
+(A + pjI)−1(A − pjI)Xj−1(A − pjI)⊤(A + pjI)−⊤
with
rℓ(t) = Πℓj=1(t − pj), {p1, . . . , pℓ} = argmin max
t∈Λ(A)
˛
˛
˛
˛
rℓ(t)
rℓ(−t)
˛
˛
˛
˛
Convergence considerations:
Convergence depends on choice of {pj}. For A spd:
‖X − Xℓ‖ ≈(√
κadi − 2√κadi + 2
)ℓ
, κadi =λmax
λmin
17
Cyclic low rank Smith method
(ADI made efficient)
(see, e.g., Li 2000, Penzl 2000)
X0 = 0, Xj = −2pj(A + pjI)−1BB⊤(A + pjI)−⊤ j = 1, . . . , ℓ
+(A + pjI)−1(A − pjI)Xj−1(A − pjI)⊤(A + pjI)−⊤
with
rℓ(t) = Πℓj=1(t − pj), {p1, . . . , pℓ} = argmin max
t∈Λ(A)
˛
˛
˛
˛
rℓ(t)
rℓ(−t)
˛
˛
˛
˛
Convergence considerations:
Convergence depends on choice of {pj}. For A spd:
‖X − Xℓ‖ ≈(√
κadi − 2√κadi + 2
)ℓ
, κadi =λmax
λmin
18
Extended Krylov subspace method
Galerkin condition: Xm ∈ K s.t.
R := AXm + XmA⊤ + bb⊤ ⊥ K
K = Km(A, B) + Km(A−1, A−1B), range(Vm) = K(Druskin & Knizhnerman ’98, Simoncini ’07) Xm = VmYmV⊤
m
Projected Lyapunov equation:
(V⊤
mAVm)Ym + Ym(V⊤
mA⊤Vm) + V⊤
mbb⊤Vm = 0
mTmYm + YmT ⊤
m + e1e⊤
1 = 0
19
Performance evaluation. I
x′ = xxx + xyy + xzz − 10xxx − 1000yxy − 10zxz + b(x, y)u(t)
A matrix 183 × 183
0 5 10 15 20 2510
−11
10−10
10−9
10−8
10−7
10−6
10−5
10−4
10−3
10−2
CPU Time
no
rm o
f re
latv
e r
esid
ua
l
Extended Krylov
Standard Krylov
approximation space dim.: 146 (Standard Krylov) 112 (Extended Krylov)
20
Performance evaluation. II
Stopping criterion: norm of difference in solution
s EKSM CF-ADI
time(#its) dim.space time (#its) dim.space
Example 1 5.95 (12) 24 31.66 (6) 120
rail 5177 2 8.08 (10) 40 30.83 (5) 200
tol=10−5 4 11.11 ( 7) 56 40.20 (5) 400
7 18.12 ( 6) 84 54.22 (5) 700
Example (*) 1 38.95 (34) 68 588.68 (5) 150
tol=10−8 2 50.50 (33) 132 633.41 (5) 300
4 90.69 (33) 264 722.92 (5) 600
7 204.91 (32) 448 857.57 (5) 1050
x′ = xxx + xyy + xzz − 10xxx − 1000yxy − 10zxz + b(x, y)u(t) (∗)
21
Convergence analysis of Extended Krylov
General considerations
AX + XA⊤ + BB⊤ = 0
A−1X + XA−⊤ + A−1BB⊤A−⊤ = 0
Summing up for any γ ∈ R, we obtain yet a Lyapunov equation:
(A+γA−1)X+X(A⊤+γA−⊤)+[B,√
γA−1B][B⊤;√
γB⊤A−⊤] = 0
with Km(A + γA−1, [B,√
γA−1B]) ( Km(A, B) + Km(A−1, A−1B)
AX + XA⊤ + BB⊤ = 0
22
Convergence analysis of Extended Krylov
General considerations
AX + XA⊤ + BB⊤ = 0
A−1X + XA−⊤ + A−1BB⊤A−⊤ = 0
Summing up for any γ ∈ R, we obtain yet a Lyapunov equation:
(A+γA−1)X+X(A⊤+γA−⊤)+[B,√
γA−1B][B⊤;√
γB⊤A−⊤] = 0
with Km(A + γA−1, [B,√
γA−1B]) ( Km(A, B) + Km(A−1, A−1B)
AX + XA⊤ + BB⊤ = 0
23
Convergence analysis of Extended Krylov: A symmetric pos.def.
Kressner & Tobler tr’09:
‖X − Xm‖ .
(4√
κ − 14√
κ − 1
) 1
2
︸ ︷︷ ︸
ρ
m
κ =λmax
λmin
Knizhnerman & Simoncini: ‖X − Xm‖ . ρm
m1
2
24
A symmetric. An example
True error norm and asymptotic estimates for A symmetric.
0 2 4 6 8 10 12 14 16 18 2010
−12
10−10
10−8
10−6
10−4
10−2
100
number of iterations
no
rm o
f e
rro
r
true errorρm
ρm/m1/2
0 10 20 30 40 50 6010
−12
10−10
10−8
10−6
10−4
10−2
100
number of iterations
no
rm o
f e
rro
r
true error
ρm
ρm/m1/2
Left: Spectrum in [0.1, 10]. Right: Spectrum in [0.01, 100]
Knizhnerman & Simoncini (in prep.)
25
Convergence analysis of Extended Krylov: A nonsymmetricW (A) ⊂ C+ a disk of center c and radius r.
‖X − Xm‖ .1√m
(r2
4c2 − 3r2
)m
0 5 10 15 20 25 30 35 4010
−12
10−11
10−10
10−9
10−8
10−7
10−6
10−5
10−4
10−3
number of iterations
no
rm o
f e
rro
r
||X−Xm
||
ρm/m1/2
Knizhnerman & Simoncini (in prep.)
26
Comparison of convergence rates: A symmetric
ADI iteration: εadi,j ≈“√
κadi−2√κadi+2
”jStandard Krylov: εkr,j ≈
“√κ−1√κ−1
”j
Extended Krylov: εek,ℓ ≈
„
“
4√
κ−14√
κ−1
”1/2«ℓ
0 1 2 3 4 5 6 7 8 9 10
x 104
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
value of λmax
co
nve
rge
nce
ra
te
ADIKrylovExtended Krylov
λmin = 1, λmax ∈ [102, 105]
27
Transfer function approximation (cf. MOR)
h(σ) = c∗(A − iσI)−1b, σ ∈ [α, β]
Given space K and V s.t. K=range(V ),
h(σ) ≈ (V ∗c)∗(V ∗AV − σI)−1(V ∗b)
For K = Km(A, b) (standard Krylov):
hm(σ) = (V ∗mc)∗(Hm − σI)−1e1‖b‖
For K = Km(A, b) + Km(A−1, A−1b) (EKSM):
hm(σ) = (U∗mc)∗(Tm − σI)−1e1‖b‖
Alternative: Rational Krylov (Grimme-Gallivan-VanDooren etc. )
choosing the poles unresolved issue (A nonsymmetric)
28
An example: CD Player, |h(σ)| = |C∗:,i(A − iσI)−1B:,j |
10−1
100
101
102
103
104
105
106
10−8
10−6
10−4
10−2
100
102
104
106
108
CDplayer, m=20, (1,1)
truearnoldiextended
10−1
100
101
102
103
104
105
106
10−8
10−6
10−4
10−2
100
102
104
CDplayer, m=20, (1,2)
truearnoldiextended
10−1
100
101
102
103
104
105
106
10−7
10−6
10−5
10−4
10−3
10−2
10−1
100
101
102
CDplayer, m=20, (2,1)
truearnoldiextended
10−1
100
101
102
103
104
105
106
10−5
10−4
10−3
10−2
10−1
100
101
102
103
104
CDplayer, m=20, (2,2)
truearnoldiextended
29
An example: CD Player, |h(σ)| = |C∗:,i(A − iσI)−1B:,j |
10−1
100
101
102
103
104
105
106
10−8
10−6
10−4
10−2
100
102
104
106
108
CDplayer, m=20, (1,1)
truearnoldiextended
10−1
100
101
102
103
104
105
106
10−8
10−7
10−6
10−5
10−4
10−3
10−2
10−1
100
101
102
CDplayer, m=50, (1,2)
truearnoldiextended
10−1
100
101
102
103
104
105
106
10−7
10−6
10−5
10−4
10−3
10−2
10−1
100
101
102
CDplayer, m=20, (2,1)
truearnoldiextended
10−1
100
101
102
103
104
105
106
10−5
10−4
10−3
10−2
10−1
100
101
102
103
104
CDplayer, m=20, (2,2)
truearnoldiextended
30
Other related problems
• Projected generalized Lyapunov equation
EXA⊤ + AXE⊤ = −PlBB⊤P⊤l , X = PrXP⊤
r
(Stykel & Simoncini (in prep.))
• Sylvester equation: AX + XB + C = 0 (Heyouni ’08)
• Riccati equation: AX + XA⊤ − XGX + C = 0
(Heyouni & Jbilou ’08)
• Shifted systems: (A − σI)x = b with many σ’s
(..., Simoncini tr’09)
• Special Sylvester equation: AX + XΣ = [b(σ1), ..., b(σs)]
(Simoncini tr’09)
Approximation space: Extended Krylov subspace
31
Balanced reduction.
Balancing matrix transformation. Given
AP + PA⊤ + BB⊤ = 0, QA + A⊤Q + C⊤C = 0.
Find Tr, Tℓ such that T⊤ℓ PTℓ = Σ = T⊤
r QTr
The matrix
Σ = diag(σ1, σ2, σ3, . . .)
contains the Hankel singular values of the system
——————————–
Large body of literature, and various possibilities, even in the
small-scale case (cf., e.g., Antoulas ’05)
Error estimate for the reduced system:
‖Σ − Σ‖H∞≤ 2(σk+1 + · · · + σn),
32
An iterative procedure. Joint work in progress with T. Stykel
Given K0, L0.
For k = 1, 2, . . .
1. Update approx. spaces Kk−1 → Kk=range(Vk), Lk−1 → Lk=range(Wk)
2. Compute approximate Gramians Xk, Yk s.t.
P ≈ Pk = VkXkV⊤
k , Q ≈ Qk = WkYkW⊤k
with W⊤k Vk = I
3. Approximate Hankel singular values:p
λj(PQ) ≈ σj(L⊤XLY ), Xk = LXL
⊤X , Yk = LY L
⊤Y
UΣZ⊤ = svd(L⊤XLY )
4. If satisfied, compute truncated balancing transformation matrices:
Tr = VkLXUΣ−1/2, Tℓ = WkLY ZΣ−1/2 and stop
What spaces Kk, Lk to obtain accurate and small size Tr, Tℓ ?
33
Truncated balancing
What spaces Kk, Lk to obtain accurate and small size Tr, Tℓ ?
Two possible choices we are exploring:
⋆ Kk = Lk = EKk(A, [B, C⊤])
(Related to cross-Gramians for A symmetric)
⋆ Kk = EKk(A, B) Lk = EKk(A⊤, C⊤)
bi-orthogonal bases (a la Lanczos)
EKk: Extended Krylov subspace
34
Example
Penzl’s example (408 × 408): A = blkdiag(A1, A2, A3, A4, D)
A1 =
2
4
−0.01 −200
200 0.001
3
5 A2 =
2
4
−0.2 −300
300 −0.1
3
5
A3 =
2
4
−0.02 −500
500 0
3
5 A4 =
2
4
−0.01 −520
520 −0.01
3
5
and D =diag(1:400)
B = C⊤. Vector (s = 1) with large projection onto nonsym part.
35
Example. cont’ed. Error |H(σ) − Hk(σ)|
10−4
10−2
100
102
104
106
108
10−25
10−20
10−15
10−10
10−5
100
105
Frequency w
Ma
gn
itu
de
Absolute error
Error ELanczError EKSMError INV−Arnoldi
Balancing Ext’d Krylov: Tr of size 19 (space of max size 40)
Balancing Ext’d Lanczos: Tr of size 17 (left-right spaces of max size 20 each)
Inverted-Arnoldi: space of size 40
36
Convergence of Hankel singular values
0 2 4 6 8 10 12 14 16 18 2010
−8
10−7
10−6
10−5
10−4
10−3
10−2
10−1
100
101
102
number of iterations
Qu
an
tity
τk f
or
sto
pp
ing
te
st
EK: At each iteration k
τk =X
j∈Jk
|σ(k−1)j − σ
(k)j |
σ(k)1
where Jk = {index j : σj/σ1 > 10−10}
Residuals of Lyapunov equations: ‖Rk‖ = ‖Sk‖ = O(10−3)
37
One more example. Error |H(σ) − Hk(σ)|iss case. (tiny: 270 × 270), B 6= C⊤ vectors
10−4
10−2
100
102
104
106
108
10−25
10−20
10−15
10−10
10−5
100
Frequency w
Ma
gn
itu
de
Absolute error
Error ELanczError EKSMError INV−Arnoldi
Balancing Ext’d Krylov: Tr of size 35 (space of max size 40)
Balancing Ext’d Lanczos: Tr of size 20 (left-right spaces of max size 20 each)
Inverted-Arnoldi: space of size 40
38
Convergence of Hankel singular values
2 4 6 8 10 12 14 16 18 2010
−3
10−2
10−1
100
101
number of iterations
Qu
an
tity
τk f
or
sto
pp
ing
te
st
EK Lanczos: At each iteration k
τk =X
j∈Jk
|σ(k−1)j − σ
(k)j |
σ(k)1
where Jk = {index j : σj/σ1 > 10−10}
Ext’d Lanczos. Residuals of Lyapunov equations: ‖Rk‖ = O(1), ‖Sk‖ = O(103)
39
Conclusions
• Great potential of enriched projection spaces
• Exploit low cost of using A and A−1
• Projection combined with matrix function theory for proofs
http://www.dm.unibo.it/~ simoncin
40
References
1. Leonid Knizhnerman and V. Simoncini, Convergence analysis of the
Extended Krylov Subspace Method for the Lyapunov equation, In
preparation.
2. V. Simoncini and Tatjana Stykel, Iterative methods for balanced truncation
of large-scale linear dynamical systems, In preparation.
3. V. Simoncini, The Extended Krylov subspace for parameter dependent
systems, August 2009, pp.1-13.
4. V. Simoncini and Vladimir Druskin, Convergence analysis of projection
methods for the numerical solution of large Lyapunov equations, SIAM J.
Numerical Analysis. Volume 47, Issue 2,pp. 828-843 (2009).
5. V. Simoncini, A new iterative method for solving large-scale Lyapunov
matrix equations, SIAM J. Scient. Computing, v.29, n.3 (2007), pp.
1268-1288.
41
Top Related