New Identiﬁcation of complex systems · 2019. 9. 30. · Istationary point = ,2! Improvements in...

Numerically optimal identification of complex systems

Towards instrumental variables and δ-parameterizations

Tom Oomen

joint work with Robbert van Herpen, Robbert Voorhoeve, and Jurgen van Zundert

www.toomen.eu

Department of Mechanical Engineering

Eindhoven University of Technology

September 24, 2019

Identification of complex systems

System identification: data, parameterization, criterion → (algorithm) → model

Many examples of increasing complexity
- networked systems (this ERNSI)
- mechanical systems
- SYSID15 invited session (https://www.kth.se/social/group/system-identificatio/page/17th-ifac-symposium-on-system-identifica/)
- many industries (particularly around Eindhoven)


Controller

Traditional motion control
- de Callafon & Van den Hof (2001): 3 motion DOFs ⇒ 3 inputs, 3 outputs
- van de Wal et al. (2002): 6 motion DOFs ⇒ 6 inputs, 6 outputs

Increasing complexity: active control of flexible modes
- van Herpen et al. (2014): more than 6 inputs and 6 outputs
- currently (2019): 14 inputs, 14 outputs

Algorithms for complex systems: high order, low damping, large dynamic range, many inputs, many outputs?


Typical identification algorithms (SISO)
- frequency-domain data: $G_o(\xi_k)$, $\xi_k \in \xi$ ($s$-domain or $\mathcal{Z}$-domain)
- parameterization: $G(\xi) = \dfrac{n(\xi,\theta)}{d(\xi,\theta)}$ (nonlinear in $\theta$)
- criterion: $V = \sum_{k=1}^{m} \left| w_k \left( G_o(\xi_k) - G(\xi_k) \right) \right|^2$ ($w_k$: maximum likelihood, control-relevant, ...)

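Recent developments in identification algorithms
- Levy (1959): drop the factor $1/d(\xi_k,\theta)$, which makes the problem linear in $\theta$:
  $$V = \sum_{k=1}^{m} \left| w_k \begin{bmatrix} G_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d(\xi_k,\theta) \\ n(\xi_k,\theta) \end{bmatrix} \right|^2 \;\Rightarrow\; A\theta = b$$
- Sanathanan & Koerner (1963), 'SK':
  $$V = \sum_{k=1}^{m} \left| \frac{1}{d^{\langle i-1\rangle}(\xi_k,\theta)}\, w_k \begin{bmatrix} G_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d^{\langle i\rangle}(\xi_k,\theta) \\ n^{\langle i\rangle}(\xi_k,\theta) \end{bmatrix} \right|^2 \;\Rightarrow\; A^{\langle i-1\rangle}\theta^{\langle i\rangle} = b^{\langle i-1\rangle}$$
  'SK': stationary point $\neq$ (local) optimum (Whitfield 1987)
- 'Instrumental variable (IV)' (Blom & Van den Hof 2010):
  $$C^{\langle i-1\rangle H} A^{\langle i-1\rangle}\theta^{\langle i\rangle} = C^{\langle i-1\rangle H} b^{\langle i-1\rangle}$$
  stationary point $=$ (local) optimum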
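The Levy and SK steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the talk: it assumes a SISO model with a monic denominator in a plain monomial basis, and the function name `sk_fit`, the second-order test system, and the frequency grid are all hypothetical choices made for the example.

```python
import numpy as np

def sk_fit(xi, Go, w, nd, nn, iters=10):
    """SK iterations for G = n/d, monic d of degree nd, n of degree nn."""
    pd_ = xi[:, None] ** np.arange(nd)       # monomials for free denominator coefficients
    pn_ = xi[:, None] ** np.arange(nn + 1)   # monomials for numerator coefficients
    d_prev = np.ones_like(xi)                # first pass has no 1/d weighting (Levy)
    for _ in range(iters):
        s = w / d_prev
        # Residual (w/d_prev) * (Go*d - n) written as A @ theta - b.
        A = s[:, None] * np.hstack([Go[:, None] * pd_, -pn_])
        b = -s * Go * xi**nd
        # Parameters are real: stack real and imaginary parts before solving.
        A_ri = np.vstack([A.real, A.imag])
        b_ri = np.concatenate([b.real, b.imag])
        theta = np.linalg.lstsq(A_ri, b_ri, rcond=None)[0]
        d_prev = xi**nd + pd_ @ theta[:nd]   # denominator used as weight in the next pass
    return theta[:nd], theta[nd:]

# Hypothetical noise-free data from G(s) = 1 / (s^2 + 0.1 s + 1).
omega = np.logspace(-1, 1, 50)
xi = 1j * omega
Go = 1.0 / (xi**2 + 0.1 * xi + 1.0)
a_hat, b_hat = sk_fit(xi, Go, np.ones_like(omega), nd=2, nn=0)
print(a_hat, b_hat)
```

On noise-free data the linear problem is solved exactly in the first pass, so the interesting differences between SK and IV only show up once noise is added.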



My IV implementation on SYSID2015 benchmark (Voorhoeve et al. 2015)

[Figure: cost V (log scale, 10^4 to 10^9) versus iteration (0 to 200)]

What's wrong?
- algorithm itself?
- implementation?


Is there a numerical issue?

[Figure: condition number κ (log scale, 10^20 to 10^200) versus iteration (0 to 200)]

Same algorithm, different implementation

[Figure: cost V (log scale, 10^4 to 10^9) versus iteration (0 to 200)]

Indeed, many ad hoc fixes
- QR factorization (Bayard 1994)
- special bases (Bayard 1994)
- frequency/amplitude scaling (Pintelon & Kollár 2005)
- amplitude scaling (Hakvoort & Van den Hof 1994)
- scaled monomials (Voorhoeve et al. 2015)
- orthonormal bases (Ninness et al. 2000) (Ninness & Hjalmarsson 2001)
- discard part of SVD (Wills & Ninness 2008)
- FLBF (Welsh & Rojas 2007) (Gilson et al. 2017)
- rational bases (Gustavsen & Semlyen 1999)
- ...


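$$\Phi = \begin{bmatrix} \phi_0(\xi_1) & \phi_1(\xi_1) & \cdots & \phi_{l-1}(\xi_1) \\ \phi_0(\xi_2) & \phi_1(\xi_2) & \cdots & \phi_{l-1}(\xi_2) \\ \vdots & \vdots & & \vdots \\ \phi_0(\xi_m) & \phi_1(\xi_m) & \cdots & \phi_{l-1}(\xi_m) \end{bmatrix}$$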

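Deeper into the 'SK' case: $A^{\langle i-1\rangle}\theta^{\langle i\rangle} = b^{\langle i-1\rangle}$
- criterion:
  $$V = \sum_{k=1}^{m} \left| \frac{1}{d^{\langle i-1\rangle}(\xi_k,\theta)}\, w_k \begin{bmatrix} G_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d^{\langle i\rangle}(\xi_k,\theta) \\ n^{\langle i\rangle}(\xi_k,\theta) \end{bmatrix} \right|^2$$
- general polynomial basis:
  $$\begin{bmatrix} d(\xi,\theta) \\ n(\xi,\theta) \end{bmatrix} = \sum_{j=0}^{l} \phi_j(\xi)\,\theta_j$$
  with $\phi_j(\xi) \in \mathbb{R}^{2\times 2}[\xi]$ a degree-$j$ block polynomial and $\theta_j \in \mathbb{R}^{2\times 1}$ a coefficient vector ($\theta_l$ constrained)
- traditional: pick some basis (monomial), then put in matrix
  $$\underbrace{W_1 \Phi}_{A}\,\theta = \underbrace{W_1 \Phi_l \theta_l}_{b}$$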

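Optimal conditioning of $A\theta = b$
- what if we pick
  $$\langle \phi_i(\xi), \phi_j(\xi) \rangle = \sum_{k=1}^{m} \phi_j(\xi_k)^H w_{1k}^H w_{1k}\, \phi_i(\xi_k) = \delta_{ij}$$
- then the normal equations $A^H A\theta = A^H b$ become
  $$\underbrace{\Phi^H W_1^H W_1 \Phi}_{= I,\ \text{hence } \kappa(A^H A) = 1!}\,\theta = \Phi^H W_1^H W_1 \Phi_l \theta_l$$
- many results on computing $\phi(\xi)$ (Rutishauser 1963) (Reichel et al. 1991) (Bultheel & Van Barel 1995): more today!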
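The effect of choosing a basis that is orthonormal with respect to the weighted discrete inner product can be illustrated numerically. The sketch below is a stand-in, not one of the dedicated algorithms cited above: it uses scalar polynomials rather than the 2×2 block polynomials of the slides and obtains the orthonormal basis from a QR factorization of the weighted monomial matrix. The nodes, weights, and degree are hypothetical.

```python
import numpy as np

# Hypothetical data: nodes on the imaginary axis and a smooth weight.
omega = np.logspace(-1, 1, 200)
xi = 1j * omega
w = 1.0 / (1.0 + omega**2)
l = 10                                   # number of basis polynomials (degrees 0..l-1)

V = xi[:, None] ** np.arange(l)          # monomial basis evaluated at the nodes
A_mono = w[:, None] * V                  # A = W1 * Phi with monomials

# QR: A_mono = Q R with R upper triangular, so column i of Q is w_k * phi_i(xi_k)
# for a polynomial phi_i of degree i, orthonormal in the weighted inner product.
Q, R = np.linalg.qr(A_mono)

kappa_mono = np.linalg.cond(A_mono)
kappa_orth = np.linalg.cond(Q)
gram = Q.conj().T @ Q                    # Phi^H W1^H W1 Phi for the new basis

print(f"kappa(monomial)    = {kappa_mono:.2e}")
print(f"kappa(orthonormal) = {kappa_orth:.6f}")
```

Because the Gram matrix of the new basis is the identity, the normal equations need no inversion at all. The QR route is only meant to show the conditioning gap; it still starts from the ill-conditioned monomial matrix.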


Condition number κ: worst-case amplification of errors

Perturb $db$ in the system of equations: $A(\theta + d\theta) = (b + db)$

True system: $G_o$

Model: $G = \dfrac{n_o}{d(\xi,\theta)}$

[Figure: $P_o(\xi)$ versus $\xi$, log-log axes]

Monomial basis: $\kappa(A) \sim 4 \cdot 10^{11}$

[Figure: $P(\xi,\theta)$ versus $\xi$, log-log axes, with zoomed inset]

Optimal basis: $\kappa(A) \sim 1$

[Figure: $P(\xi,\theta)$ versus $\xi$, log-log axes, with zoomed inset]

Optimal conditioning reduces sensitivity to round-off errors!
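The phrase "worst-case amplification" can be made precise with the standard perturbation bound. The statement below assumes a square, nonsingular $A$ and an induced matrix norm; both assumptions are made here for simplicity and are not stated on the slide.

```latex
A\theta = b, \quad A(\theta + d\theta) = b + db
\;\Longrightarrow\;
\frac{\|d\theta\|}{\|\theta\|} \;\le\; \kappa(A)\,\frac{\|db\|}{\|b\|},
\qquad \kappa(A) = \|A\|\,\|A^{-1}\|.
```

As a rough reading: with $\kappa(A) \sim 4 \cdot 10^{11}$, a relative round-off of $10^{-16}$ in $b$ can grow to about $4 \cdot 10^{-5}$ in $\theta$ in the worst case, whereas with $\kappa(A) \sim 1$ it stays near $10^{-16}$.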


Improvement in identification algorithms
- traditional 'SK' approach: $A^{\langle i-1\rangle}\theta^{\langle i\rangle} = b^{\langle i-1\rangle}$, stationary point $\neq$ (local) optimum
- improved 'IV': $C^{\langle i-1\rangle H}A^{\langle i-1\rangle}\theta^{\langle i\rangle} = C^{\langle i-1\rangle H}b^{\langle i-1\rangle}$, stationary point $=$ (local) optimum

How does $C^{\langle i-1\rangle H}A^{\langle i-1\rangle}\theta^{\langle i\rangle} = C^{\langle i-1\rangle H}b^{\langle i-1\rangle}$ fit with $\kappa = 1$?
- on the level of normal equations, 'SK' case: $\theta = (A^H A)^{-1}A^H b$ ($\kappa(A)^2$!)
- optimal conditioning of the $A$ matrix alone is not sufficient

Today
1. improved 'IV' algorithm explained and demonstrated
2. improved 'IV' algorithm and $\kappa = 1$
3. is $\kappa = 1$ all there is to numerics? (no: δ-operators!)

Focus on main ideas; for detailed math and algorithms see papers
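The remark that forming normal equations squares the condition number is easy to check numerically. The sketch below builds a hypothetical real matrix with prescribed singular values, so that $\kappa(A) = 10^4$ by construction, and compares it with $\kappa(A^H A)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 20x6 matrix A = U diag(s) V^T with singular values from 1 down to 1e-4.
U, _ = np.linalg.qr(rng.standard_normal((20, 6)))   # orthonormal columns
Vt, _ = np.linalg.qr(rng.standard_normal((6, 6)))   # orthogonal matrix
s = np.logspace(0, -4, 6)
A = U @ np.diag(s) @ Vt.T

kappa_A = np.linalg.cond(A)               # ratio of largest to smallest singular value
kappa_normal = np.linalg.cond(A.T @ A)    # condition number of the normal equations

print(f"kappa(A)     = {kappa_A:.3e}")
print(f"kappa(A^T A) = {kappa_normal:.3e}")
```

This is why a QR-based least-squares solve works with $\kappa(A)$, while any formulation that multiplies by a second matrix from the left, as the IV equations do, tends to work with roughly the square.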

IV for improved frequency domain identification

IV and κ = 1

Beyond κ = 1

Summary

IV for improved frequency domain identification

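Back to classical SK iterations
- iterate over $i$:
  $$V = \sum_{k=1}^{m} \left| \frac{1}{d(\xi_k,\theta^{\langle i-1\rangle})}\, w_k \begin{bmatrix} G_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d(\xi_k,\theta^{\langle i\rangle}) \\ n(\xi_k,\theta^{\langle i\rangle}) \end{bmatrix} \right|^2$$
- key issue: stationary point $\neq$ (local) optimum ($\partial V / \partial\theta^T \neq 0$) (Whitfield 1987)
- in practice: initial guess for Gauss-Newton (next slide)

Central idea in Blom & Van den Hof (2010)
- first compute $\partial V / \partial\theta^T$ and set this to zero:
  $$\sum_{k=1}^{m} \zeta^H(\xi_k,\theta)\Big|_{\theta=\theta^{\langle i-1\rangle}} \frac{1}{d(\xi_k,\theta^{\langle i-1\rangle})}\, w_k \begin{bmatrix} G_o(\xi_k) & -1 \end{bmatrix} \begin{bmatrix} d(\xi_k,\theta^{\langle i\rangle}) \\ n(\xi_k,\theta^{\langle i\rangle}) \end{bmatrix} = 0$$
  with $\zeta(\xi_k,\theta) = -w_k \dfrac{\partial G(\xi_k,\theta)}{\partial\theta^T}$
- and iterate! Stationary point $=$ (local) optimum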
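The difference between the two stationary points can be made visible on a small example. The sketch below is an illustration under stated assumptions, not the implementation behind the slides: a SISO model with monic denominator in a monomial basis, a hypothetical second-order system with 1% multiplicative noise, and normal-equation solves for brevity. For this parameterization $\zeta$ works out to the SK regressor with $G_o$ replaced by the model $G$ of the previous iterate, which is what the helper `regressor` exploits.

```python
import numpy as np

def poly_eval(xi, theta, nd, nn):
    """Evaluate monic d (degree nd) and n (degree nn) at the nodes xi."""
    d = xi**nd + (xi[:, None] ** np.arange(nd)) @ theta[:nd]
    n = (xi[:, None] ** np.arange(nn + 1)) @ theta[nd:]
    return d, n

def regressor(xi, w, G, d_prev, nd, nn):
    """Rows (w/d_prev) * [G*xi^0 .. G*xi^(nd-1), -xi^0 .. -xi^nn]."""
    scale = (w / d_prev)[:, None]
    return scale * np.hstack([G[:, None] * xi[:, None] ** np.arange(nd),
                              -(xi[:, None] ** np.arange(nn + 1))])

def iterate(xi, Go, w, nd, nn, use_iv, iters=30):
    d_prev, G_prev = np.ones_like(xi), Go         # first pass equals Levy's method
    for _ in range(iters):
        A = regressor(xi, w, Go, d_prev, nd, nn)  # built from measured data
        b = -(w / d_prev) * Go * xi**nd
        # Instruments: model-based regressor for IV, A itself for SK.
        C = regressor(xi, w, G_prev, d_prev, nd, nn) if use_iv else A
        theta = np.linalg.solve((C.conj().T @ A).real, (C.conj().T @ b).real)
        d_prev, n_prev = poly_eval(xi, theta, nd, nn)
        G_prev = n_prev / d_prev
    return theta

def gradient(xi, Go, w, theta, nd, nn):
    """Gradient of V = sum |w (Go - G)|^2 with respect to theta."""
    d, n = poly_eval(xi, theta, nd, nn)
    G = n / d
    zeta = regressor(xi, w, G, d, nd, nn)         # equals -w * dG/dtheta^T
    return 2.0 * (zeta.conj().T @ (w * (Go - G))).real

rng = np.random.default_rng(0)
omega = np.logspace(-1, 1, 50)
xi = 1j * omega
noise = 0.01 * (rng.standard_normal(50) + 1j * rng.standard_normal(50))
Go = (1.0 + noise) / (xi**2 + 0.1 * xi + 1.0)     # hypothetical noisy measurements
w = np.ones(50)

theta_sk = iterate(xi, Go, w, 2, 0, use_iv=False)
theta_iv = iterate(xi, Go, w, 2, 0, use_iv=True)
g_sk = np.linalg.norm(gradient(xi, Go, w, theta_sk, 2, 0))
g_iv = np.linalg.norm(gradient(xi, Go, w, theta_iv, 2, 0))
print(f"|dV/dtheta| at SK point: {g_sk:.2e}, at IV point: {g_iv:.2e}")
```

If the IV iteration converges, its fixed point satisfies the gradient condition by construction, while the SK fixed point generally does not once the data are noisy.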


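1D example (van Zundert et al. 2016)
- one free parameter
- 2 minima $\theta_{*,1}$, $\theta_{*,2}$
- Gauss-Newton, initial $\theta_0$
  - $\theta_0 < 6 \cdot 10^{-4} \Rightarrow \theta_{*,1}$
  - $\theta_0 > 6 \cdot 10^{-4} \Rightarrow \theta_{*,2}$
- 10 traditional SK iterations
  - for any $\theta_0$ ⇒ ◦ (marker in the figure)
  - close to minimum $\theta_{*,1}$ ...
- 10 improved IV iterations
  - for any $\theta_0$ ⇒ same point (marker in the figure)
  - stationary point $= \theta_{*,2}$!

Improvements in algorithms!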


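Improvements in algorithms: SK, IV

Conditioning
- traditional SK
  - $A^{\langle i-1\rangle}\theta^{\langle i\rangle} = b^{\langle i-1\rangle}$
  - QR factorization: $\kappa(A)$
- improved IV
  - $C^{\langle i-1\rangle H}A^{\langle i-1\rangle}\theta^{\langle i\rangle} = C^{\langle i-1\rangle H}b^{\langle i-1\rangle}$, with typically $\kappa(C^H A) \approx \kappa(A)^2$

[Figure: condition number κ (log scale, 10^20 to 10^200) versus iteration (0 to 200)]

Improvements in algorithms ⇒ explosion of numerical conditioning?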



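IV and κ = 1

[Diagram: inner product, indefinite inner product, bilinear form]

- IV algorithm $(C^H A)\theta = C^H b$ is essentially
  $$\underbrace{\Phi^H W_2^H W_1 \Phi}_{\text{we want this } = I\,!}\,\theta = \Phi^H W_2^H W_1 \Phi_l \theta_l$$
- recall from Slide 7: if $W_2 = W_1$ ('SK' case)
  $$\langle \phi_i(\xi), \phi_j(\xi) \rangle = \sum_{k=1}^{m} \phi_j(\xi_k)^H w_{1k}^H w_{1k}\, \phi_i(\xi_k) \;\Rightarrow\; \Phi^H W_1^H W_1 \Phi = I$$
- what if for the IV case we pick orthonormal polynomials w.r.t.
  $$\langle \phi_i(\xi), \psi_j(\xi) \rangle = \sum_{k=1}^{m} \psi_j(\xi_k)^H w_{2k}^H w_{1k}\, \phi_i(\xi_k) \quad \text{so that} \quad \Phi^H W_2^H W_1 \Phi = I\,?$$
  ✗ this is not an inner product! (just a bilinear form)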

Bilinear form: $\langle \phi_i(\xi), \psi_j(\xi) \rangle = \sum_{k=1}^{m} \psi_j(\xi_k)^H w_{2k}^H w_{1k}\, \phi_i(\xi_k)$

Key idea: introduce additional freedom!
- pick two distinct bases $\psi$, $\phi$:
  $$\Psi^H W_2^H W_1 \Phi\,\theta = \Psi^H W_2^H W_1 \Phi_l \theta_l$$
- such that these are bi-orthonormal: $\langle \phi_i(\xi), \psi_j(\xi) \rangle = \delta_{ij} I$
- interpretation
  - oblique projection
  - $\Psi$ transforms the 'left' basis ⇒ transforms instrumental variables

Key result I (van Herpen et al. 2016)

bi-orthonormal ⇒ $\Psi^H W_2^H W_1 \Phi = I$ ⇒ $\kappa(\Psi^H W_2^H W_1 \Phi) = 1$: optimal conditioning!

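It becomes even more interesting: no need to form
$$\underbrace{\Psi^H W_2^H W_1 \Phi}_{= I}\,\theta = \Psi^H W_2^H W_1 \Phi_l \theta_l$$
- note that
  $$\Psi = \begin{bmatrix} \psi_0(\xi_1) & \psi_1(\xi_1) & \cdots & \psi_{l-1}(\xi_1) \\ \psi_0(\xi_2) & \psi_1(\xi_2) & \cdots & \psi_{l-1}(\xi_2) \\ \vdots & \vdots & & \vdots \\ \psi_0(\xi_m) & \psi_1(\xi_m) & \cdots & \psi_{l-1}(\xi_m) \end{bmatrix}, \quad \Phi_l = \begin{bmatrix} \phi_l(\xi_1) \\ \phi_l(\xi_2) \\ \vdots \\ \phi_l(\xi_m) \end{bmatrix}, \quad \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_{l-1} \end{bmatrix}$$

Key result II (for each iteration $\langle i \rangle$) (van Herpen et al. 2016)
- bi-orthonormal ⇒ $\Psi^H W_2^H W_1 \Phi_l = 0$ ⇒ $\theta = 0$
- II.a. Optimal approximant: $\begin{bmatrix} d(\xi,\theta) \\ n(\xi,\theta) \end{bmatrix} = \phi_l(\xi)\theta_l$
- II.b. All lower order approximants are also obtained (polynomial basis)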
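Key results I and II can be checked on a toy problem. The sketch below is not the algorithm of van Herpen et al.; as a stand-in it builds bi-orthonormal scalar bases from an unpivoted LU factorization of the monomial cross-Gram matrix, which is only sensible for small, well-conditioned cases. The nodes on the unit circle, the two weights, and the helper `lu_nopivot` are hypothetical.

```python
import numpy as np

def lu_nopivot(M):
    """Doolittle LU without pivoting: M = L @ U with unit lower-triangular L."""
    n = M.shape[0]
    L = np.eye(n, dtype=complex)
    U = M.astype(complex).copy()
    for j in range(n - 1):
        L[j + 1:, j] = U[j + 1:, j] / U[j, j]
        U[j + 1:, :] -= np.outer(L[j + 1:, j], U[j, :])
    return L, np.triu(U)

m, l = 64, 6
xi = np.exp(2j * np.pi * np.arange(m) / m)    # hypothetical nodes on the unit circle
w1 = 1.0 / (xi - 0.5)                         # hypothetical weight W1
w2 = 1.0 / (xi - 0.4)                         # hypothetical, different weight W2

V = xi[:, None] ** np.arange(l + 1)           # monomials of degree 0..l
M = (w2[:, None] * V).conj().T @ (w1[:, None] * V)   # V^H W2^H W1 V (not Hermitian)

L, U = lu_nopivot(M)
Phi_all = V @ np.linalg.inv(U)                # right basis; triangular map keeps degrees
Psi_all = V @ np.linalg.inv(L.conj().T)       # left basis; triangular map keeps degrees

# Bilinear form between all left and right basis polynomials.
B = (w2[:, None] * Psi_all).conj().T @ (w1[:, None] * Phi_all)

kappa = np.linalg.cond(B[:l, :l])             # Key result I: Psi^H W2^H W1 Phi = I
rhs_norm = np.linalg.norm(B[:l, l])           # Key result II: Psi^H W2^H W1 Phi_l = 0
print(f"kappa = {kappa:.6f}, |rhs| = {rhs_norm:.2e}")
```

Both bases are related to the monomials by triangular matrices, so each $\phi_i$ and $\psi_i$ is a polynomial of degree $i$. Forming $M$ explicitly would defeat the purpose for large or ill-conditioned problems, which is the motivation for computing the bases through recurrences instead.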

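Status so far
- (identification) problem solved if we can compute bi-orthonormal bases $\phi_i(\xi)$, $\psi_j(\xi)$

Computing the polynomial bases
- for $\phi_i(\xi), \psi_j(\xi) \in \mathbb{R}[\xi]$ these follow from two 3-term recurrence relations
  $$\phi_i(\xi) = \frac{1}{\gamma_i}\Big( (\xi - \alpha_i)\,\phi_{i-1}(\xi) - \beta_{i-1}\,\phi_{i-2}(\xi) \Big)$$
  $$\psi_i(\xi) = \frac{1}{\beta_i}\Big( (\xi - \alpha_i)\,\psi_{i-1}(\xi) - \gamma_{i-1}\,\psi_{i-2}(\xi) \Big)$$
- recursion coefficients follow from a new 'chasing-down-the-diagonal' algorithm
  $$\begin{bmatrix} & w_{21} & w_{22} & \cdots & w_{2m} \\ w_{11} & \xi_1 & & & \\ w_{12} & & \xi_2 & & \\ \vdots & & & \ddots & \\ w_{1m} & & & & \xi_m \end{bmatrix} \xrightarrow{\text{'similarity'}} \begin{bmatrix} & \beta_0 & & & \\ \gamma_0 & \alpha_1 & \beta_1 & & \\ & \gamma_1 & \alpha_2 & \ddots & \\ & & \ddots & \ddots & \beta_{m-1} \\ & & & \gamma_{m-1} & \alpha_m \end{bmatrix}$$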

Illustrative example

[Figure: Bode plot, magnitude |.| [dB] (−40 to 30) and phase ∠(.) [°] (−180 to 180) versus f [Hz] (10^0 to 10^2)]

Alg. | Basis | $V(\theta^\star)$ | $\left\| \partial V(\theta)/\partial\theta \,\big|_{\theta=\theta^\star} \right\|_2$ | $\kappa$
SK | Monomial | 30.47937 | $1.97 \cdot 10^{-2}$ | $8.09 \cdot 10^{2}$
SK | Orthonormal w.r.t. inner product | 30.47937 | $1.97 \cdot 10^{-2}$ | 1.00
IV | Monomial | 30.47901 | $< 10^{-13}$ | $6.56 \cdot 10^{5}$
IV | Bi-orthonormal w.r.t. bi-linear form | 30.47901 | 0 | 1



Beyond κ = 1

Summary so far
- improvements in iterative identification algorithms: IV-based
  $$A\theta = b \;\Rightarrow\; (C^H A)\theta = C^H b$$
- new bi-orthonormal approach leads to $C^H A = I \Rightarrow \kappa(C^H A) = 1$
  - special case $A^H A = I$: orthonormal (Rutishauser 1963) (Reichel et al. 1991) (Bultheel & Van Barel 1995)
- essentially solves the problem in the polynomial domain through bases $\Phi$, $\Psi$
  - algorithms for measurements on the unit disc ($\mathcal{Z}$-domain), real line, imaginary axis ($s$-domain)
- Done?

Need to compute (bi-)orthonormal bases $\Phi$ (, $\Psi$) reliably
- back to the classical 'SK' case: focus on a single orthonormal basis $\Phi$

κ = 1 for fast-sampled discrete time systems? (SYSID2015 benchmark)

[Figure: κ(A) − 1 (log scale, 10^-4 to 10^10) versus sampling frequency [Hz] (10^2 to 10^6)]

Loss of orthonormality for increasing sampling frequency
- conditioning of $A\theta = b$ deteriorates ...
- let's look at this specific case a bit deeper

[Figure: pole locations in the complex plane (axes Re, Im) moving towards the point 1 as $f_s \to \infty$]

Fast-sampled discrete-time systems
- sampling frequency $f_s \to \infty$
- poles tend to $z = 1$
- unity is disastrous: $(1 + \varepsilon_1) - (1 + \varepsilon_2) = 0$

δ-operator (Middleton & Goodwin 1986)
- $\delta = f_s (z - 1)$

Final topics of this talk
- using the δ-operator in identification (in particular with $\kappa = 1$?)
- efficient computation of δ-domain orthonormal polynomials
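The cancellation problem near $z = 1$ and the benefit of $\delta = f_s(z - 1)$ can be seen by storing a single pole in reduced precision. The sketch below assumes a hypothetical continuous-time pole $p = -1$ sampled at $f_s = 1$ MHz, and uses single precision only to make the effect visible; the same mechanism applies in double precision at higher orders and sampling rates.

```python
import numpy as np

p = -1.0      # hypothetical continuous-time pole [rad/s]
fs = 1.0e6    # hypothetical sampling frequency [Hz]

# z-domain pole: extremely close to 1, so most stored digits describe the "1".
z = np.exp(p / fs)
z32 = np.float32(z)
p_from_z = fs * np.log(np.float64(z32))

# delta-domain pole: delta = fs * (z - 1), computed without cancellation via expm1.
delta = fs * np.expm1(p / fs)
d32 = np.float32(delta)
p_from_delta = fs * np.log1p(np.float64(d32) / fs)

err_z = abs(p_from_z - p)
err_delta = abs(p_from_delta - p)
print(f"pole recovered from z:     {p_from_z:.6f} (error {err_z:.1e})")
print(f"pole recovered from delta: {p_from_delta:.6f} (error {err_delta:.1e})")
```

The stored $z$ keeps only a couple of significant digits of the dynamics, whereas the stored $\delta$ keeps essentially all of them.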

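Back to the traditional 'SK' case: $A\theta = b$, with $A = W\Phi$
- pick $\Phi$ as orthonormal polynomials w.r.t. the data-dependent discrete inner product
  $$\langle \phi_i(\xi), \phi_j(\xi) \rangle = \sum_{k=1}^{m} \phi_j(\xi_k)^H w_{1k}^H w_{1k}\, \phi_i(\xi_k) \;\Rightarrow\; \kappa(A) = 1$$

Computation of $\phi_i$ for arbitrary points $\xi_k$
- follow from an $i$-term recurrence relation $\xi\phi_i = \sum_{j=0}^{k} \phi_j(\xi)\, h_{j,i}$
- with the $h_{j,i}$'s from
  $$\begin{bmatrix} w_1 & \xi_1 & & & \\ w_2 & & \xi_2 & & \\ \vdots & & & \ddots & \\ w_m & & & & \xi_m \end{bmatrix} \xrightarrow{\text{similarity}} \begin{bmatrix} h_{0,0} & h_{0,1} & h_{0,2} & \cdots & h_{0,m} \\ & h_{1,1} & h_{1,2} & \cdots & h_{1,m} \\ & & h_{2,2} & \cdots & h_{2,m} \\ & & & \ddots & \vdots \\ & & & & h_{m,m} \end{bmatrix}$$

Essentially an inverse eigenvalue problem, computational explosion $O(m^3)$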
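A long recurrence of this kind can be generated with an Arnoldi iteration on $\operatorname{diag}(\xi)$ started from the weight vector. This is a standard numerical linear algebra technique used here as a stand-in, not the structured algorithm discussed in the talk, and it orthogonalizes against all previous basis vectors at every step. The nodes, weights, and the name `arnoldi_basis` are hypothetical.

```python
import numpy as np

def arnoldi_basis(xi, w, n):
    """Arnoldi on diag(xi) with start vector w.

    Returns Q (m x (n+1)) whose column i holds w_k * phi_i(xi_k) for orthonormal
    polynomials phi_i, and the (n+1) x n Hessenberg matrix H of recurrence
    coefficients, with diag(xi) @ Q[:, :n] == Q @ H.
    """
    m = len(xi)
    Q = np.zeros((m, n + 1), dtype=complex)
    H = np.zeros((n + 1, n), dtype=complex)
    Q[:, 0] = w / np.linalg.norm(w)
    for i in range(n):
        v = xi * Q[:, i]                     # multiply current polynomial by xi
        for _ in range(2):                   # two Gram-Schmidt passes for robustness
            h = Q[:, :i + 1].conj().T @ v
            v = v - Q[:, :i + 1] @ h
            H[:i + 1, i] += h
        H[i + 1, i] = np.linalg.norm(v)
        Q[:, i + 1] = v / H[i + 1, i]
    return Q, H

# Hypothetical nodes on the upper unit circle with a peaked weight.
m, n = 60, 8
xi = np.exp(1j * np.linspace(0.01, np.pi, m))
w = 1.0 / np.abs(xi - 0.95)

Q, H = arnoldi_basis(xi, w, n)
kappa = np.linalg.cond(Q[:, :n])             # weighted basis matrix A = W * Phi
recurrence_error = np.linalg.norm(xi[:, None] * Q[:, :n] - Q @ H)
print(f"kappa(A) = {kappa:.6f}, recurrence error = {recurrence_error:.1e}")
```

Column $i$ of `H` holds the coefficients that express $\xi\phi_i$ in terms of $\phi_0, \ldots, \phi_{i+1}$. The cost per step grows with the number of earlier basis functions, which is the overhead that recognizing structure in the Hessenberg matrix removes.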

[Matrix diagram: the similarity transformation of $\begin{bmatrix} w & \operatorname{diag}(\xi) \end{bmatrix}$ yields an upper Hessenberg matrix; which entries above the subdiagonal are nonzero (×) or zero (0) depends on the domain]

It's all about recognizing structure: $O(m^3) \Rightarrow O(mn)$
- continuous time: tridiagonal (as on Slide 16): obvious 3-term recurrence
- discrete time: Schur coefficients (a bit more complicated 3-term recurrence)
- delta domain: ?

New result (Voorhoeve & Oomen 2019)

For any generalized circle in the complex plane, the Hessenberg matrix $H$ is a (H – 1)-quasiseparable matrix

Implications
- $O(mn)$ complexity
- unified approach, special cases
  - tridiagonal ($s$-domain)
  - unitary Hessenberg/Schur ($\mathcal{Z}$-domain)
  - δ-domain!
- so what is this (H – 1)-quasiseparable matrix?

[Matrix diagram: upper Hessenberg matrix with zeros below the subdiagonal, × entries on the diagonal and subdiagonal, and specially marked entries above the diagonal]

Example: fast-sampled discrete time systems (SYSID2015 benchmark)

[Figure: κ(A) − 1 (log scale, 10^-4 to 10^10) versus sampling frequency [Hz] (10^2 to 10^6)]

- traditional z-domain Schur-based parameterization: loss of orthonormality
- z-domain via unified (H – 1)-quasiseparable: also improved
- δ-domain: perfect conditioning, invariant under increasing sampling frequency



Summary

Numerical aspects in identification and control
- lots of issues when implementing algorithms ...
- ... seldom mentioned in application papers
- increasingly important for increasing complexity

Results on the intersection of identification, numerical linear algebra, and orthonormal polynomials
- new instrumental variable (IV) algorithms (Blom & Van den Hof 2010)
  - interesting results in frequency domain identification and beyond (van Zundert et al. 2016)
- new algorithm at the expense of conditioning?
  - κ = 1 for IV-type problems (van Herpen et al. 2016)
- there's more to our algorithms than κ = 1
  - δ-operator from control theory promising (Voorhoeve & Oomen 2019)

References

Bayard, D. S. (1994), ‘High-order multivariable transfer function curve fitting: Algorithms, sparse matrix methods and experimental results’, Automatica 30(9), 1439–1444.

Blom, R. S. & Van den Hof, P. M. J. (2010), Multivariable frequency domain identification using IV-based linear regression, in ‘Proceedings of the 49th Conference on Decision and Control’, pp. 1148–1153.

Bultheel, A. & Van Barel, M. (1995), ‘Vector orthogonal polynomials and least squares approximation’, SIAM Journal on Matrix Analysis and Applications 16(3), 863–885.

de Callafon, R. A. & Van den Hof, P. M. J. (2001), ‘Multivariable feedback relevant system identification of a wafer stepper system’, IEEE Transactions on Control Systems Technology 9(2), 381–390.

Gilson, M., Welsh, J. S. & Garnier, H. (2017), ‘Frequency localizing basis function-based IV method for wideband system identification’, IEEE Transactions on Control Systems Technology 26(1), 329–335.

Gustavsen, B. & Semlyen, A. (1999), ‘Rational approximation of frequency domain responses by vector fitting’, IEEE Transactions on Power Delivery 14(3), 1052–1061.

Hakvoort, R. G. & Van den Hof, P. M. J. (1994), ‘Frequency domain curve fitting with maximum amplitude criterion and guaranteed stability’, International Journal of Control 60(5), 809–825.

van Herpen, R., Bosgra, O. & Oomen, T. (2016), ‘Bi-orthonormal polynomial basis function framework with applications in system identification’, IEEE Transactions on Automatic Control 61(11), 3285–3300.

van Herpen, R., Oomen, T., Kikken, E., van de Wal, M., Aangenent, W. & Steinbuch, M. (2014), Exploiting additional actuators and sensors for nano-positioning robust motion control, in ‘Proceedings of the 2014 American Control Conference’, Portland, Oregon, United States, pp. 984–990.

Levy, E. C. (1959), ‘Complex-curve fitting’, IRE Transactions on Automatic Control 4(1), 37–43.

Middleton, R. H. & Goodwin, G. C. (1986), ‘Improved finite word length characteristics in digital control using delta operators’, IEEE Transactions on Automatic Control 31(11), 1015–1021.

Ninness, B., Gibson, S. & Weller, S. (2000), Practical aspects of using orthonormal system parameterisations in estimation problems, in ‘2000 IFAC Symposium on System Identification’, Santa Barbara, California, United States, pp. 463–468.

Ninness, B. & Hjalmarsson, H. (2001), ‘Model structure and numerical properties of normal equations’, IEEE Transactions on Circuits and Systems 48(4), 425–437.

Pintelon, R. & Kollár, I. (2005), ‘On the frequency scaling in continuous-time modeling’, IEEE Transactions on Instrumentation and Measurement 54(1), 318–321.

Reichel, L., Ammar, G. & Gragg, W. (1991), ‘Discrete least squares approximation by trigonometric polynomials’, Mathematics of Computation 57(195), 273–289.

Rutishauser, H. (1963), On Jacobi rotation patterns, in ‘Proceedings of the AMS Symposium in Applied Mathematics’, Vol. 15, pp. 219–239.

Sanathanan, C. K. & Koerner, J. (1963), ‘Transfer function synthesis as a ratio of two complex polynomials’, IEEE Transactions on Automatic Control 8(1), 56–58.

Voorhoeve, R. & Oomen, T. (2019), ‘Data-dependent orthogonal polynomials on generalized circles: A unified approach applied to δ-domain identification’, In preparation.

Voorhoeve, R., van Rietschoten, A., Geerardyn, E. & Oomen, T. (2015), Identification of high-tech motion systems: An active vibration isolation benchmark, in ‘17th IFAC Symposium on System Identification’, Beijing, China, pp. 1250–1255.

van de Wal, M., van Baars, G., Sperling, F. & Bosgra, O. (2002), ‘Multivariable H∞/µ feedback control design for high-precision wafer stage motion’, Control Engineering Practice 10(7), 739–755.

Welsh, J. S. & Rojas, C. R. (2007), Frequency localising basis functions for wide-band system identification: A condition number bound for output error systems, in ‘Proceedings of the 2007 European Control Conference’, pp. 4618–4624.

Whitfield, A. H. (1987), ‘Asymptotic behaviour of transfer function synthesis methods’, International Journal of Control 45(3), 1083–1092.

Wills, A. & Ninness, B. (2008), ‘On gradient-based search for multivariable system estimates’, IEEE Transactions on Automatic Control 53, 1.

van Zundert, J., Bolder, J. & Oomen, T. (2016), ‘Optimality and flexibility in iterative learning control for varying tasks’, Automatica 67, 295–302.
