SYSTEMS Identification - UM


Transcript of SYSTEMS Identification - UM

Page 1: SYSTEMS Identification - UM

Systems Identification

Ali Karimpour, Assistant Professor

Ferdowsi University of Mashhad

Reference: “System Identification Theory For The User” Lennart Ljung

Page 2: SYSTEMS Identification - UM


Lecture 10

Computing the Estimate

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 3: SYSTEMS Identification - UM


Introduction

In Chapter 7, three basic parameter estimation methods were considered:

1- The prediction-error approach, in which a certain function V_N(θ, Z^N) is minimized with respect to θ.

2- The correlation approach, in which a certain equation f_N(θ, Z^N) = 0 is solved for θ.

3- The subspace approach to estimating state-space models.

In this chapter we shall discuss how these problems are best solved numerically.

Page 4: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 5: SYSTEMS Identification - UM


Linear Regression and Least Squares.

For linear regression we have:

ŷ(t|θ) = φᵀ(t) θ

The least-squares criterion leads to

θ̂_N^LS = arg min_θ V_N(θ, Z^N) = [ (1/N) Σ_{t=1}^N φ(t) φᵀ(t) ]⁻¹ (1/N) Σ_{t=1}^N φ(t) y(t) = R(N)⁻¹ f(N)

with R(N) = (1/N) Σ φ(t)φᵀ(t) of dimension d×d and f(N) = (1/N) Σ φ(t)y(t) of dimension d×1. An alternative form is

R(N) θ̂_N^LS = f(N)   (normal equations)

Note that the basic equation for the IV method is quite analogous, so most of what is said in this section about the LS method also applies to the IV method.

Page 6: SYSTEMS Identification - UM


Linear Regression and Least Squares.

R(N) θ̂_N^LS = f(N)   (normal equations)

R(N) may be ill-conditioned, especially when its dimension is high. The underlying idea in the methods below is that the d×d matrix

R(N) = (1/N) Σ_{t=1}^N φ(t) φᵀ(t)

should not be formed explicitly; instead, a d×d matrix R̄ is constructed with the property

R̄ᵀ R̄ = R(N)

This class of methods is commonly known as "square-root algorithms", but the term "quadratic methods" would be more appropriate.

How to derive R̄?
• Householder transformations
• Gram-Schmidt procedure
• Björck decomposition
• Cholesky decomposition
• QR decomposition

Page 7: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Solving for the LS estimates by QR factorization. The QR factorization of an n×d matrix A is defined as

A = QR,   QᵀQ = I,   R upper triangular

Here Q is a unitary n×n matrix and R is n×d. (The numeric values below are rounded as on the slides; the signs depend on the QR convention used.)

Example 1:
A = [1 2; 3 4] = [-0.3162 -0.9487; -0.9487 0.3162] [-3.1623 -4.4272; 0 -0.6325]

Example 2:
A = [1 2; 3 4; 5 6]
  = [-0.1690 0.8971 0.4082; -0.5071 0.2760 -0.8165; -0.8452 -0.3450 0.4082] [-5.9161 -7.4374; 0 0.8281; 0 0]

Example 3:
A = [1 0 1; 0 0 1; 1 3 1; 1 2 0]
  = [0.58 -0.77 0.17 0.21; 0 0 0.78 -0.63; 0.58 0.62 0.33 0.42; 0.58 0.15 -0.50 -0.63] [1.73 2.89 1.15; 0 2.16 -0.15; 0 0 1.28]
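Example 1 can be checked in a few lines of numpy (the signs of Q and R may differ between implementations, as noted above):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
Q, R = np.linalg.qr(A)          # Q unitary, R upper triangular
print(Q)                        # approx [[-0.3162, -0.9487], [-0.9487, 0.3162]]
print(R)                        # approx [[-3.1623, -4.4272], [0, -0.6325]]
print(np.allclose(Q @ R, A))    # True: A = QR
```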

Page 8: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Solving for the LS estimates by QR factorization: A = QR, QᵀQ = I, R upper triangular.

Example 3 (as above):
A = [1 0 1; 0 0 1; 1 3 1; 1 2 0],   R = [1.73 2.89 1.15; 0 2.16 -0.15; 0 0 1.28]

Example 4 (the same A without its last column):
A = [1 0; 0 0; 1 3; 1 2]
  = [0.58 -0.77 0.12 0.24; 0 0 0.90 -0.44; 0.58 0.62 0.23 0.48; 0.58 0.15 -0.35 -0.72] [1.73 2.89; 0 2.16; 0 0; 0 0]

Example 5 (the first column only):
A = [1; 0; 1; 1],   first column of Q = [0.58; 0; 0.58; 0.58],   R = [1.73; 0; 0; 0]

(the remaining columns of Q complete an orthonormal basis). Note that the leading part of R is unchanged when trailing columns of A are dropped.

Page 9: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Solving for the LS estimates by QR factorization: A = QR, QᵀQ = I, R upper triangular.

V_N(θ, Z^N) = Σ_{t=1}^N |y(t) − φᵀ(t)θ|²

Define

Y = [y(1); … ; y(N)]   (Np × 1),   Φ = [φᵀ(1); … ; φᵀ(N)]   (Np × d)

so that V_N(θ, Z^N) = |Y − Φθ|². Let Q be a unitary matrix; then

V_N(θ, Z^N) = |Y − Φθ|² = |Qᵀ(Y − Φθ)|²

Page 10: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Solving for the LS estimates by QR factorization: A = QR, QᵀQ = I, R upper triangular.

V_N(θ, Z^N) = |Y − Φθ|² = |Qᵀ(Y − Φθ)|²

Now introduce the QR factorization of the Np × (d+1) matrix [Φ, Y]:

[Φ, Y] = Q R,   R = [R0; 0],   R0 = [R1 R2; 0 R3]

where Q is Np×Np, R0 is (d+1)×(d+1), R1 is d×d, R2 is d×1, and R3 is a scalar. This means that

Qᵀ(Y − Φθ) = [R2; R3] − [R1; 0] θ

V_N(θ, Z^N) = |Qᵀ(Y − Φθ)|² = |R1 θ − R2|² + R3²

which clearly is minimized for

R1 θ̂_N = R2,   giving   V_N(θ̂_N, Z^N) = R3²

Page 11: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Consider the simple model for a system:

y(t) + a1 y(t−1) + a2 y(t−2) = b u(t−1)

Exercise: Suppose for t = 1 to 11 the values of u and y are as listed on the slide (11 samples each).

1) Derive θ̂_N from eq. (I), θ̂_N = R(N)⁻¹ f(N), and find the condition number of R(N).

2) Derive θ̂_N from eq. (II), R1 θ̂_N = R2, and find the condition number of R1.

Answer: θ̂_N = [0.5 0 1]ᵀ

Page 12: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Solving for the LS estimates by QR factorization:

V_N(θ, Z^N) = |Y − Φθ|² = |Qᵀ(Y − Φθ)|²,   R1 θ̂_N = R2,   V_N(θ̂_N, Z^N) = R3²

There are three important advantages with this way of solving for the LS estimate:

1- Since R(N) = ΦᵀΦ = R1ᵀR1, the condition number of R1 is the square root of that of R(N); therefore R1 is much better conditioned than R(N).

2- R1 is a triangular matrix, so the equation is easy to solve by back-substitution.

3- If the QR factorization is performed for a regressor of size d*, then the solutions for all models with fewer parameters are easily obtained from R0.

Note that the big matrix Q never needs to be formed: all the information is contained in the "small" matrix R0.

Page 13: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Initial condition: "Windowed" Data

The regression vector φ(t) is

φ(t) = [z(t−1); … ; z(t−n)]

Here z(t−1) is an r-dimensional vector. For example, for the ARX model

z(t) = [−y(t); u(t)]

and for the AR model

z(t) = −y(t)

R(N) will then consist of the blocks

R_ij(N) = (1/N) Σ_{t=1}^N z(t−i) zᵀ(t−j)

Page 14: SYSTEMS Identification - UM


Linear Regression and Least Squares.

Initial condition: "Windowed" Data

R(N) consists of the blocks

R_ij(N) = (1/N) Σ_{t=1}^N z(t−i) zᵀ(t−j)

If we have knowledge of z(t) only for 1 ≤ t ≤ N, the question arises of how to deal with the unknown initial conditions. Two common choices:

1 - Start the summation at t = n+1 rather than t = 1.

2 - Replace the unknown initial values by zeros.

Page 15: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 16: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Numerical minimization

In general, neither the function

V_N(θ, Z^N) = (1/N) Σ_{t=1}^N l(ε(t, θ, Z^N))

nor the equation

0 = f_N(θ, Z^N) = (1/N) Σ_{t=1}^N ζ(t, θ) α(ε(t, θ))

can be minimized or solved by analytical methods. Methods for numerical minimization of a function V(θ) update the minimizing point iteratively by

θ̂^(i+1) = θ̂^(i) + α f^(i)

where f^(i) is a search direction based on information about V(θ) and α is a positive constant (the step size). Depending on the information used to determine f^(i), there are three groups:

1- Methods using function values only.
2- Methods using values of the function as well as of its gradient.
3- Methods using values of the function, its gradient, and its Hessian.

Page 17: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Depending on the information used to determine f^(i), there are three groups:

• Methods using function values only.
• Methods using values of the function V as well as of its gradient.
• Methods using values of the function, its gradient, and its Hessian.

Newton algorithms use the Hessian itself:

f^(i) = −[V″(θ̂^(i))]⁻¹ V′(θ̂^(i))

Quasi-Newton algorithms: an estimate of the Hessian is formed and then used in the Newton step. If only function values are available, an estimate of the gradient is used and a quasi-Newton algorithm is applied.

Page 18: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

In general, consider the function

V_N(θ, Z^N) = (1/N) Σ_{t=1}^N l(ε(t, θ, Z^N))

The gradient is:

V′_N(θ, Z^N) = −(1/N) Σ_{t=1}^N [ ψ(t, θ) l′_ε(ε(t, θ), θ) − l′_θ(ε(t, θ), θ) ]

Here ψ(t, θ) is the d×q matrix (θ being d×1 and ε being q×1)

ψ(t, θ) = −(∂/∂θ) ε(t, θ) = (∂/∂θ) ŷ(t|θ)

Page 19: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Some explicit search schemes

Consider the special case

V_N(θ, Z^N) = (1/N) Σ_{t=1}^N ½ ε²(t, θ)

The gradient is:

V′_N(θ, Z^N) = −(1/N) Σ_{t=1}^N ψ(t, θ) ε(t, θ)

A general family of search routines is given by

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [R_N^(i)]⁻¹ V′_N(θ̂_N^(i), Z^N)

where θ̂_N^(i) denotes the i-th iterate, R_N^(i) is a d×d matrix that modifies the search direction, and μ_N^(i) denotes the step size, chosen so that

V_N(θ̂_N^(i+1), Z^N) ≤ V_N(θ̂_N^(i), Z^N)

Page 20: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Some explicit search schemes

Consider the special case

V_N(θ, Z^N) = (1/N) Σ_{t=1}^N ½ ε²(t, θ),   θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [R_N^(i)]⁻¹ V′_N(θ̂_N^(i), Z^N)

where μ_N^(i) denotes the step size, chosen so that V_N(θ̂_N^(i+1), Z^N) ≤ V_N(θ̂_N^(i), Z^N).

Page 21: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Some explicit search schemes

Consider the special case

V_N(θ, Z^N) = (1/N) Σ_{t=1}^N ½ ε²(t, θ),   θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [R_N^(i)]⁻¹ V′_N(θ̂_N^(i), Z^N)

Let R_N^(i) = I; then we have

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) V′_N(θ̂_N^(i), Z^N)

This is the gradient or steepest-descent method. It is fairly inefficient close to the minimum.
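A minimal sketch of the steepest-descent iteration on a toy criterion, chosen ill-conditioned to show the slow progress near the minimum (everything here is invented for the illustration):

```python
import numpy as np

def V_grad(theta):              # gradient of V(theta) = 0.5 t1^2 + 5 t2^2
    return np.array([theta[0], 10.0 * theta[1]])

theta = np.array([4.0, 1.0])    # initial estimate
mu = 0.05                       # fixed step size
for i in range(200):
    theta = theta - mu * V_grad(theta)   # R = I: steepest descent
print(theta)                    # creeps slowly towards the minimum at 0
```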

Page 22: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Tangent (Newton-Raphson) iteration for solving f(x) = 0

(Recall that steepest descent is fairly inefficient close to the minimum; Newton-type iterations address this.)

• Make an initial guess x0.

• Draw the tangent line at x0; its equation is y = f(x0) + f′(x0)(x − x0).

• Let x1 be the x-intercept of the tangent line. This intercept is given by the formula

x1 = x0 − f(x0)/f′(x0)

• Now repeat with x1 as the new guess, obtaining x2, and so on.
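The tangent iteration in a few lines of Python (f here is an arbitrary example function, not from the lecture):

```python
def f(x):
    return x**3 - 2.0 * x - 5.0

def fprime(x):
    return 3.0 * x**2 - 2.0

x = 2.0                        # initial guess x0
for _ in range(6):
    x = x - f(x) / fprime(x)   # x-intercept of the tangent line
print(x)                       # approx 2.0946, a root of f
```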

Page 23: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Tangent (Newton-Raphson) iteration for solving f(x) = 0

x1 = x0 − f(x0)/f′(x0)

Some difficulties of the method:

• Zero derivatives: f′(x0) ≈ 0 makes the step blow up.
• Divergence: the iterates x0, x1, x2, … may move away from the root.

Page 24: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Gradient or steepest-descent method for finding minimum of f(x)

Page 25: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Gradient or steepest-descent method for finding the minimum of f(x)

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) V′_N(θ̂_N^(i), Z^N)

Page 26: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Some explicit search schemes

Consider the special case

V_N(θ, Z^N) = (1/N) Σ_{t=1}^N ½ ε²(t, θ),   θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [R_N^(i)]⁻¹ V′_N(θ̂_N^(i), Z^N)

The gradient or steepest-descent method is fairly inefficient close to the minimum. The gradient and the Hessian of V_N are:

V′_N(θ, Z^N) = −(1/N) Σ_{t=1}^N ψ(t, θ) ε(t, θ)

V″_N(θ, Z^N) = (1/N) Σ_{t=1}^N ψ(t, θ) ψᵀ(t, θ) − (1/N) Σ_{t=1}^N ψ′(t, θ) ε(t, θ)

Let R_N^(i) = V″_N(θ̂_N^(i), Z^N); then we have

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [V″_N(θ̂_N^(i), Z^N)]⁻¹ V′_N(θ̂_N^(i), Z^N)

This is the Newton method. But it is not an easy task to compute the Hessian, because of the term ψ′(t, θ).

Page 27: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Some explicit search schemes

Consider the special case V_N(θ, Z^N) = (1/N) Σ ½ ε²(t, θ) and the Newton iteration

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [V″_N(θ̂_N^(i), Z^N)]⁻¹ V′_N(θ̂_N^(i), Z^N)

But it is not an easy task to compute the Hessian, because of ψ′(t, θ):

V″_N(θ, Z^N) = (1/N) Σ ψ(t, θ) ψᵀ(t, θ) − (1/N) Σ ψ′(t, θ) ε(t, θ)

Suppose that there is a value θ0 such that ε(t, θ0) = e0(t), white noise. Then near θ0 the second term averages to zero, since ε(t, θ0) is independent of ψ′(t, θ0), which depends only on past data. Hence

V″_N(θ, Z^N) ≈ (1/N) Σ_{t=1}^N ψ(t, θ) ψᵀ(t, θ) =: H_N(θ)

Page 28: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Newton method

V″_N(θ, Z^N) = (1/N) Σ ψψᵀ − (1/N) Σ ψ′ε ≈ (1/N) Σ_{t=1}^N ψ(t, θ) ψᵀ(t, θ) = H_N(θ)

So, in the vicinity of the minimum, the choice

R_N^(i) = H_N(θ̂_N^(i))

is a good estimate of the Hessian, and the iteration

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [H_N(θ̂_N^(i))]⁻¹ V′_N(θ̂_N^(i), Z^N)

is known as the Gauss-Newton method. In the statistical literature it is called the "method of scoring"; in the control literature the terms "modified Newton-Raphson" and "quasilinearization" have also been used.
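A minimal Gauss-Newton sketch on a toy output-error problem with model ŷ(t) = g·f^t (the model, data and all names are assumptions made for the illustration; in general a step size μ or damping is needed):

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(1, 26)
y = 2.0 * 0.8**t + 0.02 * rng.standard_normal(t.size)  # "true" g=2, f=0.8

theta = np.array([1.5, 0.7])                  # initial estimate [g, f]
for _ in range(15):
    g, f = theta
    eps = y - g * f**t                        # prediction errors
    psi = np.column_stack([f**t,              # d yhat / d g
                           g * t * f**(t - 1)])  # d yhat / d f
    H = psi.T @ psi / t.size                  # H_N(theta)
    grad = -psi.T @ eps / t.size              # V'_N(theta)
    theta = theta - np.linalg.solve(H, grad)  # mu = 1 Gauss-Newton step
print(theta)                                  # close to [2.0, 0.8]
```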

Page 29: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Newton method

For

R_N^(i) = H_N(θ̂_N^(i)),   μ_N^(i) adjustable,

the term "damped Gauss-Newton" has been used. Dennis and Schnabel reserve the term "Gauss-Newton" for

R_N^(i) = H_N(θ̂_N^(i)),   μ_N^(i) = 1.

Page 30: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Newton method

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [R_N^(i)]⁻¹ V′_N(θ̂_N^(i), Z^N),   R_N^(i) = H_N(θ̂_N^(i)) = (1/N) Σ_{t=1}^N ψ(t, θ̂_N^(i)) ψᵀ(t, θ̂_N^(i))

Even though R_N^(i) is assured to be positive semidefinite, it may be singular or close to singular (for example, if the model is over-parameterized or the data are not informative enough). Various ways to overcome this problem exist and are known as "regularization techniques".

Goldfeld, Quandt and Trotter suggest

R_N^(i) = (1/(1+λ)) [ (1/N) Σ_{t=1}^N ψ(t, θ̂_N^(i)) ψᵀ(t, θ̂_N^(i)) + λI ],   where λ is a positive scalar.

Levenberg and Marquardt suggest

R_N^(i) = (1/N) Σ_{t=1}^N ψ(t, θ̂_N^(i)) ψᵀ(t, θ̂_N^(i)) + λI,   where λ is a positive scalar.

With λ = 0 we have the Gauss-Newton case; increasing λ means that the step size is decreased and the search direction is turned towards the gradient.
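A sketch of one Levenberg-Marquardt-regularized step, assuming psi (N×d) and eps (length N) have been computed as in the Gauss-Newton example:

```python
import numpy as np

def lm_direction(psi, eps, lam):
    """Search direction with R = (1/N) sum psi psi^T + lam * I (lam >= 0)."""
    N, d = psi.shape
    R = psi.T @ psi / N + lam * np.eye(d)
    grad = -psi.T @ eps / N        # V'_N(theta)
    return -np.linalg.solve(R, grad)

# lam = 0 reproduces the Gauss-Newton direction; a large lam shrinks the
# step and turns it towards the negative gradient.
```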

Page 31: SYSTEMS Identification - UM


Numerical Solution by Iterative Search Method

Remember that we want to

minimize V_N(θ, Z^N)   (I)

or solve

0 = f_N(θ, Z^N) = (1/N) Σ_{t=1}^N ζ(t, θ) α(ε(t, θ))   (II)

Newton method to solve (I):

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [V″_N(θ̂_N^(i), Z^N)]⁻¹ V′_N(θ̂_N^(i), Z^N)

This leads to V′_N(θ̂_N, Z^N) = 0.

Correlation equation: solving equation (II) is quite analogous to the minimization of (I).

Newton-Raphson method to solve (II):

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [f′_N(θ̂_N^(i), Z^N)]⁻¹ f_N(θ̂_N^(i), Z^N)

Substitution method to solve (II):

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) f_N(θ̂_N^(i), Z^N)

Page 32: SYSTEMS Identification - UM


Computing Gradients

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 33: SYSTEMS Identification - UM


Computing Gradients

The amount of work required to compute ψ(t, θ) depends highly on the model structure, and sometimes one may have to resort to numerical differentiation.

Example 10.1 Consider the ARMAX model

A(q) y(t) = B(q) u(t) + C(q) e(t)

The predictor is:

C(q) ŷ(t|θ) = B(q) u(t) + [C(q) − A(q)] y(t)

Differentiation with respect to a_k gives:

C(q) (∂/∂a_k) ŷ(t|θ) = −q^(−k) y(t)

and similarly

C(q) (∂/∂b_k) ŷ(t|θ) = q^(−k) u(t)

For c_k, differentiating the predictor equation gives q^(−k) ŷ(t|θ) + C(q) (∂/∂c_k) ŷ(t|θ) = q^(−k) y(t), so

C(q) (∂/∂c_k) ŷ(t|θ) = q^(−k) [ y(t) − ŷ(t|θ) ]

Page 34: SYSTEMS Identification - UM


Computing Gradients

Now, with

ψ(t, θ) = −(∂/∂θ) ε(t, θ) = (∂/∂θ) ŷ(t|θ)

the three relations

C(q) (∂/∂a_k) ŷ(t|θ) = −q^(−k) y(t),   C(q) (∂/∂b_k) ŷ(t|θ) = q^(−k) u(t),   C(q) (∂/∂c_k) ŷ(t|θ) = q^(−k) [y(t) − ŷ(t|θ)]

can be collected as

C(q) ψ(t, θ) = φ(t, θ)

where

φ(t, θ) = [−y(t−1), …, −y(t−na), u(t−1), …, u(t−nb), ε(t−1, θ), …, ε(t−nc, θ)]ᵀ

so each component of ψ(t, θ) is obtained by filtering the corresponding component of φ(t, θ) through 1/C(q).
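Since C(q)ψ(t,θ) = φ(t,θ), each column of ψ is obtained by filtering the corresponding column of φ through 1/C(q). A minimal scipy sketch (the polynomial and the regressors are placeholders, not lecture data):

```python
import numpy as np
from scipy.signal import lfilter

C = np.array([1.0, -0.5])            # C(q) = 1 - 0.5 q^-1 (assumed, stable inverse)
rng = np.random.default_rng(3)
phi = rng.standard_normal((100, 3))  # placeholder columns: -y(t-k), u(t-k), eps(t-k)

psi = lfilter([1.0], C, phi, axis=0) # psi(t, theta) = (1/C(q)) phi(t, theta)
```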

Page 35: SYSTEMS Identification - UM


Computing Gradients

SISO black-box model

The general model structure and its predictor are:

A(q) y(t) = (B(q)/F(q)) u(t) + (C(q)/D(q)) e(t)

ŷ(t|θ) = (D(q)B(q))/(C(q)F(q)) u(t) + [ 1 − (D(q)A(q))/C(q) ] y(t)

so we have

(∂/∂a_k) ŷ(t|θ) = −(D(q)/C(q)) y(t−k)

(∂/∂b_k) ŷ(t|θ) = (D(q)/(C(q)F(q))) u(t−k)

(∂/∂c_k) ŷ(t|θ) = −(D(q)B(q))/(C²(q)F(q)) u(t−k) + (D(q)A(q))/C²(q) y(t−k) = (1/C(q)) ε(t−k, θ)

(∂/∂d_k) ŷ(t|θ) = (B(q)/(C(q)F(q))) u(t−k) − (A(q)/C(q)) y(t−k) = −(1/C(q)) v(t−k, θ),   where v(t, θ) = A(q) y(t) − (B(q)/F(q)) u(t)

(∂/∂f_k) ŷ(t|θ) = −(D(q)B(q))/(C(q)F²(q)) u(t−k) = −(D(q)/(C(q)F(q))) w(t−k, θ),   where w(t, θ) = (B(q)/F(q)) u(t)

Page 36: SYSTEMS Identification - UM


Computing Gradients

SISO black-box model

General model structure and its predictor:

A(q) y(t) = (B(q)/F(q)) u(t) + (C(q)/D(q)) e(t)

ŷ(t|θ) = (D(q)B(q))/(C(q)F(q)) u(t) + [ 1 − (D(q)A(q))/C(q) ] y(t)

(∂/∂a_k) ŷ(t|θ) = −(D(q)/C(q)) y(t−k)
(∂/∂b_k) ŷ(t|θ) = (D(q)/(C(q)F(q))) u(t−k)
(∂/∂c_k) ŷ(t|θ) = (1/C(q)) ε(t−k, θ)
(∂/∂d_k) ŷ(t|θ) = −(1/C(q)) v(t−k, θ)
(∂/∂f_k) ŷ(t|θ) = −(D(q)/(C(q)F(q))) w(t−k, θ)

with ψ(t, θ) = −(∂/∂θ) ε(t, θ) = (∂/∂θ) ŷ(t|θ). As a special case, consider the OE (output error) model:

A(q) = C(q) = D(q) = 1

Page 37: SYSTEMS Identification - UM


Computing Gradients

SISO black-box model

As a special case consider the OE model: A(q) = C(q) = D(q) = 1. Then

(∂/∂b_k) ŷ(t|θ) = (1/F(q)) u(t−k),   (∂/∂f_k) ŷ(t|θ) = −(1/F(q)) w(t−k, θ)

With ψ(t, θ) = −(∂/∂θ) ε(t, θ) = (∂/∂θ) ŷ(t|θ), this can be written

F(q) ψ(t, θ) = φ(t, θ)

where

φ(t, θ) = [u(t−1), …, u(t−nb), −w(t−1, θ), …, −w(t−nf, θ)]ᵀ

Page 38: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 39: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Linear regression and least squares:
• Efficient methods with an analytic solution.

Numerical solution by iterative search:
• Guaranteed convergence only to a local minimum.
• Applicable to general model structures.

Two-stage and multistage methods combine the efficiency of the former with the generality of the latter.

Page 40: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Why are we interested in this topic?

• It helps to understand the identification literature.
• It is useful for providing initial estimates to the iterative methods.

Some important Two-Stage or Multistage Methods:

1- Bootstrap Methods.
2- Bilinear Parameterization.
3- Separate Least Squares.
4- High Order AR(X) Models.
5- Separating Dynamics and Noise Models.
6- Determining ARMA Models.
7- Subspace Methods for Estimating State Space Models.

Page 41: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Bootstrap Methods

Consider the correlation formulation

0 = f_N(θ, Z^N) = (1/N) Σ_{t=1}^N ζ(t, θ) ( y(t) − φᵀ(t, θ) θ )

This formulation contains a number of common situations:

• IV (instrumental variable) methods, where φ(t, θ) = φ(t) and the instruments are

ζ(t, θ) = K(q) [−x(t−1, θ), …, −x(t−na, θ), u(t−1), …, u(t−nb)]ᵀ

• PLR (pseudo-linear regression) methods: ζ(t, θ) = φ(t, θ), i.e.

θ̂_N^PLR = sol { (1/N) Σ_{t=1}^N φ(t, θ) ( y(t) − φᵀ(t, θ) θ ) = 0 }

• Minimizing the quadratic criterion V_N(θ, Z^N) = (1/N) Σ ½ ε²(t, θ), whose gradient is V′_N(θ, Z^N) = −(1/N) Σ ψ(t, θ) ε(t, θ): here ζ(t, θ) = ψ(t, θ).

Page 42: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Bootstrap Methods

Consider the correlation formulation

0 = f_N(θ, Z^N) = (1/N) Σ_{t=1}^N ζ(t, θ) ( y(t) − φᵀ(t, θ) θ )

and iterate by solving, at step i,

(1/N) Σ_{t=1}^N ζ(t, θ̂^(i−1)) ( y(t) − φᵀ(t, θ̂^(i−1)) θ̂^(i) ) = 0

that is,

θ̂^(i) = [ (1/N) Σ_{t=1}^N ζ(t, θ̂^(i−1)) φᵀ(t, θ̂^(i−1)) ]⁻¹ (1/N) Σ_{t=1}^N ζ(t, θ̂^(i−1)) y(t)

It is called a bootstrap method since it alternates between θ and (ζ(t, θ), φ(t, θ)). It does not necessarily converge to a solution; convergence analyses are available in the literature.
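A minimal PLR bootstrap sketch for a first-order ARMAX model (the system, noise level and names are invented for the illustration):

```python
import numpy as np

# Simulate y(t) + a y(t-1) = b u(t-1) + e(t) + c e(t-1), theta = [a, b, c].
a0, b0, c0 = -0.7, 1.0, 0.5
rng = np.random.default_rng(4)
N = 500
u, e, y = rng.standard_normal(N), 0.1 * rng.standard_normal(N), np.zeros(N)
for t in range(1, N):
    y[t] = -a0 * y[t-1] + b0 * u[t-1] + e[t] + c0 * e[t-1]

theta, eps = np.zeros(3), np.zeros(N)
for it in range(10):                        # alternate theta <-> phi(t, theta)
    Phi = np.zeros((N, 3))
    for t in range(1, N):
        Phi[t] = [-y[t-1], u[t-1], eps[t-1]]
    theta = np.linalg.lstsq(Phi, y, rcond=None)[0]
    eps = y - Phi @ theta                   # refresh eps(t, theta)
print(theta)                                # approaches [a0, b0, c0]
```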

Page 43: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Bilinear Parameterization

For some models the predictor is bilinear in the parameters. For example, consider the ARARX model

A(q) y(t) = B(q) u(t) + (1/D(q)) e(t)

Now the predictor is

ŷ(t|θ) = [ 1 − D(q) A(q) ] y(t) + D(q) B(q) u(t)

θ = [a1 … a_na  b1 … b_nb  d1 … d_nd]ᵀ

Let θ = [ρᵀ ηᵀ]ᵀ, with ρ containing the a- and b-parameters and η the d-parameters. Bilinear means that ŷ(t|θ) is linear in ρ for fixed η, and linear in η for fixed ρ.

Page 44: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Bilinear Parameterization

In the ARARX model, let θ = [ρᵀ ηᵀ]ᵀ and write ŷ(t|θ) = ŷ(t|ρ, η). In this situation, a natural way of minimizing is to treat the problem as a sequence of LS problems: fix η and solve for ρ by LS, then fix ρ and solve for η by LS, and so on.

Exercise 10T.3: Show that this minimization problem is a special case of (10.40).

According to Exercise 10T.3, bilinear parameterization is thus indeed a descent method; it converges to a local minimum.

Page 45: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Separate Least Squares

A more general situation than the bilinear case is when one set of parameters enters linearly and another set nonlinearly in the predictor:

ŷ(t|θ, η) = φᵀ(t, η) θ

The identification criterion then becomes

V_N(θ, η) = Σ_{t=1}^N [ y(t) − φᵀ(t, η) θ ]²

For given η this criterion is an LS criterion and is minimized with respect to θ by

θ̂(η) = [ Σ φ(t, η) φᵀ(t, η) ]⁻¹ Σ φ(t, η) y(t)

We can thus insert θ̂(η) into V_N and define the problem as

η̂ = arg min_η V_N(θ̂(η), η)

Page 46: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Separate Least Squares

The identification criterion V_N(θ, η) is then handled in three steps (see the sketch below):

1- For each value of η, compute the inner LS estimate θ̂(η).
2- Minimize the concentrated criterion V_N(θ̂(η), η) with respect to η (a search of lower dimension).
3- Take θ̂ = θ̂(η̂).

The method is called separate least squares since the LS part has been separated out, and the problem reduced to a minimization problem of lower dimension.
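A minimal separate-least-squares sketch with one linear parameter θ and one nonlinear parameter η, using the toy predictor ŷ(t) = θ·exp(−ηt) (model and data invented for the illustration):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
t = np.linspace(0.0, 5.0, 60)
y = 2.0 * np.exp(-0.8 * t) + 0.05 * rng.standard_normal(t.size)

def theta_of_eta(eta):                 # step 1: inner LS solution
    phi = np.exp(-eta * t)
    return (phi @ y) / (phi @ phi)

def V_conc(eta):                       # step 2: concentrated criterion
    phi = np.exp(-eta * t)
    return np.sum((y - theta_of_eta(eta) * phi) ** 2)

eta_hat = minimize_scalar(V_conc, bounds=(0.01, 5.0), method="bounded").x
theta_hat = theta_of_eta(eta_hat)      # step 3
print(theta_hat, eta_hat)              # close to 2.0 and 0.8
```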

Page 47: SYSTEMS Identification - UM


Two-Stage and Multistage Method

High Order AR(X) Models

Suppose the true system is:

y(t) = G0(q) u(t) + H0(q) e(t)

An ARX structure of order M is used:

A_M(q) y(t) = B_M(q) u(t) + e(t)

Hannan and Kavalieris, and Ljung and Wahlberg, show that as the order M tends to infinity (at a suitable rate with N), B̂_M/Â_M → G0 and 1/Â_M → H0. So a high-order ARX model is capable of approximating any linear system arbitrarily well.
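A minimal sketch of fitting an ARX(M, M) model by least squares, usable for large M (function and variable names are assumptions):

```python
import numpy as np

def arx_fit(y, u, M):
    """LS estimate of [a_1..a_M, b_1..b_M] in A(q) y = B(q) u + e."""
    rows = [np.concatenate([-y[t-M:t][::-1], u[t-M:t][::-1]])
            for t in range(M, len(y))]   # phi(t) = [-y(t-1..t-M), u(t-1..t-M)]
    theta, *_ = np.linalg.lstsq(np.array(rows), y[M:], rcond=None)
    return theta
```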

Page 48: SYSTEMS Identification - UM


Two-Stage and Multistage Method

High Order AR(X) Models

So a high-order ARX model is capable of approximating any linear system arbitrarily well. It is of course desirable to reduce this high-order model to a more tractable version.

Page 49: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Separating Dynamics and Noise Models

The general model structure is:

A(q) y(t) = (B(q)/F(q)) u(t) + (C(q)/D(q)) e(t)

First estimate the dynamics Â(q), B̂(q), F̂(q); then form the residual

v̂(t) = Â(q) y(t) − (B̂(q)/F̂(q)) u(t)

and model it separately as an ARMA process:

v̂(t) = (C(q)/D(q)) e(t)

Page 50: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Determining ARMA Models

v̂(t) = (C(q)/D(q)) e(t)

Page 51: SYSTEMS Identification - UM


Two-Stage and Multistage Method

Subspace Methods For Estimating State Space Models.

The subspace methods can also be regarded as two-stage methods, being built up from two LS steps.

Page 52: SYSTEMS Identification - UM


Local Solutions and Initial Values

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 53: SYSTEMS Identification - UM


Local Solutions and Initial Values

Local Minima

The general numerical schemes in Section 10.2 typically have the property that, with suitably chosen step lengths μ, they will converge to a solution. For the root-finding iterations

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [f′_N(θ̂_N^(i), Z^N)]⁻¹ f_N(θ̂_N^(i), Z^N)   or   θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) f_N(θ̂_N^(i), Z^N)

we get θ̂_N^(i) → θ*_N such that f_N(θ*_N, Z^N) = 0, while for positive definite R,

θ̂_N^(i+1) = θ̂_N^(i) − μ_N^(i) [R_N^(i)]⁻¹ V′_N(θ̂_N^(i), Z^N)

gives θ̂_N^(i) → θ*_N such that θ*_N is a local minimum of V_N(θ, Z^N).

These conditions may have several solutions, while it is the global minimum that interests us. To find the global solution, one must start the iterations at different feasible initial values. An important possibility is to use some preliminary estimation procedure to produce a good initial value.

Local minima do not necessarily create problems in practice, if the model passes the validation tests (Sections 16.5 and 16.6).

Page 54: SYSTEMS Identification - UM


Local Solutions and Initial Values

Remember from Chapter 8: V_N(θ*_N, Z^N) (figure on slide).

Page 55: SYSTEMS Identification - UM


Local Solutions and Initial Values

Results from SISO Black-box Models

The general model structure is:

A(q) y(t) = (B(q)/F(q)) u(t) + (C(q)/D(q)) e(t)

Consider the assumption that the system can be described within the model set: S ∈ M.

The results listed below hold for the general SISO model set and refer to

V̄(θ) = ½ E ε²(t, θ)

Page 56: SYSTEMS Identification - UM


Local Solutions and Initial Values

Results from SISO Black-box Models

Page 57: SYSTEMS Identification - UM


Local Solutions and Initial Values

Initial parameter values

Due to the possible occurrence of undesired local minima in the criterion function, it is worthwhile to put some effort into producing good initial values. Also, since Newton-type methods have a good local convergence rate, good initial values make them pay off.

1- For a physically parameterized model structure: use your physical insight.

2- For a linear black-box model structure:

A(q) y(t) = (B(q)/F(q)) u(t) + (C(q)/D(q)) e(t)

Page 58: SYSTEMS Identification - UM


Local Solutions and Initial Values

Initial filter conditions

In some configurations we need the initial values φ(0, θ).

1 - Start the summation at t = n+1 rather than t = 1.
2 - Take the initial conditions into account explicitly (for example, estimate them).

Page 59: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Topics to be covered include:

• Linear Regression and Least Squares.
• Numerical Solution by Iterative Search Method.
• Computing Gradients.
• Two-Stage and Multistage Method.
• Local Solutions and Initial Values.
• Subspace Methods for Estimating State Space Models.

Page 60: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Let us now consider how to estimate the system matrices A, B, C and D in the state-space model

x(t+1) = A x(t) + B u(t) + w(t)
y(t) = C x(t) + D u(t) + v(t)

Let the output y(t) be a p-dimensional column vector and the input u(t) an m-dimensional column vector; the order of the system is n. We also assume that this state-space representation is a minimal realization.

We know that many other representations describe the same system:

x̃(t+1) = T⁻¹AT x̃(t) + T⁻¹B u(t) + w̃(t)
y(t) = CT x̃(t) + D u(t) + ṽ(t)

where T is any invertible matrix and x̃(t) = T⁻¹ x(t).

Page 61: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Let the state-space model be:

x(t+1) = A x(t) + B u(t) + w(t)
y(t) = C x(t) + D u(t) + v(t)

I) If Â and Ĉ are known, it is easy to find B and D. But Â and Ĉ are not known.

II) If the (extended) observability matrix O_r for the system is known, it is easy to find A and C. But O_r is not known.

III) The extended observability matrix can be consistently estimated.

IV) Once the observability matrix has been estimated, the states can be constructed.

► Estimating B and D
► Finding A and C from the observability matrix
► Estimating the extended observability matrix
► Finding the states and estimating the noise statistics

Page 62: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Let the state-space model be:

x(t+1) = A x(t) + B u(t) + w(t)
y(t) = C x(t) + D u(t) + v(t)

The subspace procedure:

► Estimating B and D: suppose Â and Ĉ are known → try to find B and D.
► Finding A and C from the observability matrix: suppose O_r is known → try to find A and C.
► Estimating the extended observability matrix: suppose input and output data are available → try to find O_r.
► Finding the states and estimating the noise statistics.

Page 63: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating B and D: suppose Â and Ĉ are known → try to find B and D.

For given and fixed Â and Ĉ, the model structure

x(t+1) = Â x(t) + B u(t) + w(t)
y(t) = Ĉ x(t) + D u(t) + v(t)

is clearly linear in B and D. If the system operates in open loop, we can thus consistently estimate B and D according to Theorem 8.4, even if the noise sequence is non-white.

Page 64: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating B and D: suppose Â and Ĉ are known → try to find B and D.

Let us write the predictor in the standard linear regression form. With B = [b1 … bm] and D = [d1 … dm] (columns) and u(t) = (u1(t), …, um(t))ᵀ:

x(t+1) = Â x(t) + b1 u1(t) + … + bm um(t)
y(t) = Ĉ x(t) + d1 u1(t) + … + dm um(t)

so that

ŷ(t) = Ĉ (qI − Â)⁻¹ [ b1 u1(t) + … + bm um(t) ] + d1 u1(t) + … + dm um(t)

which is linear in the b_k (n×1) and d_k (p×1): a linear regression with nm + pm parameters and a p × (nm + pm) regression matrix.

Page 65: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating B and D: suppose Â and Ĉ are known → try to find B and D.

ŷ(t) = Ĉ (qI − Â)⁻¹ [ b1 u1(t) + … + bm um(t) ] + d1 u1(t) + … + dm um(t)

Each regressor (the input components filtered through the fixed dynamics, and the input components themselves) can be formed by simulation, and B and D then follow from an ordinary LS fit over the nm + pm unknowns.

Page 66: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating B and D

If desired, the initial state x0 = x(0) can also be estimated in an analogous way, since the predictor with the initial values taken into account,

ŷ(t) = Ĉ (qI − Â)⁻¹ [ B u(t) + x0 δ(t) ] + D u(t)

is linear also in x0. Here δ(t) is the unit pulse at time 0.

Page 67: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Finding A and C from the observability matrix: suppose O_r is known → try to find A and C.

Suppose that a pr × n* matrix G is given that is related to the extended observability matrix

O_r = [C; CA; … ; CA^(r−1)]

We have to determine A and C from G. There are two situations:

• Known system order.
• Unknown system order.

Known system order: suppose first we know that n* = n. To find C is then immediate: C is the first block row (the first p rows) of O_r.

Page 68: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Finding A and C from the observability matrix

Similarly, we can find Â from the shift structure of

O_r = [C; CA; CA²; … ; CA^(r−1)]

which gives np(r−1) equations in the n² unknowns of Â.

Page 69: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Finding A and C from the observability matrix

Let O_up denote O_r without its last block row and O_down denote O_r without its first block row. Then

O_up Â = O_down   ⟹   Â = (O_upᵀ O_up)⁻¹ O_upᵀ O_down

Role of the state-space basis: the extended observability matrix depends on the choice of basis in the state-space representation; it is easy to verify that for the transformed representation x̃ = T⁻¹x the observability matrix would be O_r T.

Note: a large r leads to numerical problems.
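A minimal numpy sketch of recovering C and A from an extended observability matrix via the shift structure, written in least-squares form so it also applies to noisy estimates (names are assumptions):

```python
import numpy as np

def a_c_from_obsv(Or, p):
    """Or is rp x n; p is the output dimension."""
    C = Or[:p, :]                            # first block row is C
    O_up, O_down = Or[:-p, :], Or[p:, :]
    A, *_ = np.linalg.lstsq(O_up, O_down, rcond=None)   # O_up A = O_down
    return A, C

# Self-check on an exactly known observability matrix:
A0 = np.array([[0.9, 0.1], [0.0, 0.7]])
C0 = np.array([[1.0, 0.0]])
Or = np.vstack([C0 @ np.linalg.matrix_power(A0, k) for k in range(5)])
A_hat, C_hat = a_c_from_obsv(Or, p=1)        # recovers A0, C0
```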

Page 70: SYSTEMS Identification - UM


Unknown system order

Suppose now that the true order of the system is unknown, and that n* (the number of columns of G) is just an upper bound for the order.

Subspace Methods for Estimating State Space Models

Page 71: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Page 72: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

A numerical illustration (values as recoverable from the slide): the SVD of a 4×3 matrix G,

G = U S Vᵀ,   S = diag(24.39, 1.81, 0)

Since only two singular values are nonzero, G has rank 2, and

G = U1 S1 V1ᵀ,   S1 = [24.39 0; 0 1.81]

where U1 and V1 consist of the first two columns of U and V.
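A minimal sketch of the rank decision and truncation by SVD (the tolerance is an assumption; in practice n is chosen where the singular values drop sharply):

```python
import numpy as np

def truncate_svd(G, tol=1e-3):
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    n = int(np.sum(s > tol * s[0]))   # singular values "significantly" > 0
    return U[:, :n], np.diag(s[:n]), Vt[:n, :].T   # U1, S1, V1
```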

Page 73: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Multiplying G = U1 S1 V1ᵀ by V1 from the right gives G V1 = U1 S1.

Multiplying this by S1⁻¹ from the right gives G V1 S1⁻¹ = U1.

So the column space of G is spanned by U1, and for some invertible matrix R:

O_r = U1 R

Page 74: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Using a noisy estimate of the extended observability matrix

Let us now assume that the given pr × n* matrix G is a noisy estimate of the true observability matrix,

G = O_r T̃ + E_N

where E_N is small and tends to zero as N → ∞. It is reasonable to proceed as above and perform an SVD on G. The rank of O_r is not known, while the noise matrix E_N is likely to be of full rank; due to the noise, S will typically have all its singular values nonzero.

Page 75: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

The first n singular values will be supported by O_r, while the remaining ones stem from E_N. If the noise is small, one should expect the latter to be significantly smaller than the former. Therefore, determine n as the number of singular values that are significantly larger than 0.

Then use Ô_r to determine Â and Ĉ as before. However, in the noisy case Ô_r will not be exactly subject to the shift structure, so the system of equations O_up Â = O_down should be solved in a least-squares sense.

Page 76: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Using weighting matrices in the SVD

For more flexibility we can pre- and post-multiply G before performing the SVD:

Ĝ = W1 G W2

and then use

Ô_r = W1⁻¹ U1 R

to determine Â and Ĉ. Here R is an arbitrary invertible matrix that sets the coordinate basis for the state representation.

In the noiseless case E = 0, these weightings are without consequence. However, when noise is present, they have an important influence on the space spanned by U1, and hence on the quality of the estimates Â and Ĉ.

Remark: post-multiplying W2 by an orthogonal matrix does not affect the U1 matrix in the decomposition; the post-multiplication by W2 just corresponds to a change of basis in the state space, and the pre-multiplication by W1 is eliminated in Ô_r = W1⁻¹ U1 R.

Exercise: prove the remark above (10.E10).

Page 77: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

Remember that iterating the state equations gives

y(t+k) = C A^k x(t) + Σ_{j=0}^{k−1} C A^(k−1−j) [ B u(t+j) + w(t+j) ] + D u(t+k) + v(t+k)

Now, collect these relations for k = 0, …, r−1.

Page 78: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

Now form the stacked vectors

Y_r(t) = [y(t); y(t+1); … ; y(t+r−1)]   (rp × 1)

and similarly U_r(t) from the inputs. Stacking the relation above gives

Y_r(t) = O_r x(t) + S_r U_r(t) + V(t)

where S_r is a block-Toeplitz matrix of impulse-response coefficients and V(t) collects the noise contributions. The scalar r is the maximum prediction horizon.

Page 79: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

The k-th block component of V(t) is

V_k(t) = Σ_{j=0}^{k−2} C A^(k−2−j) w(t+j) + v(t+k−1)

Page 80: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

Y_r(t) = O_r x(t) + S_r U_r(t) + V(t)

To extract O_r from this relation, we must eliminate the U term and make the noise influence disappear asymptotically.

Page 81: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

We must eliminate the U term and make the noise influence disappear asymptotically.

Removing the U term: collect Y_r(t), x(t), U_r(t) and V(t) for t = 1, …, N as the columns of matrices Y, X, U and V, so that Y = O_r X + S_r U + V, and form the N×N projection matrix

Π = I − Uᵀ(UUᵀ)⁻¹U

Multiplying from the right by Π removes the U term:

Y Π = O_r X Π + V Π

Now only the noise term remains. Since it is made up of noise contributions, the idea is to correlate it away with a suitable matrix.

Page 82: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

Removing the noise term: since the last term is made up of noise contributions, the idea is to correlate it away with a suitable s × N instrument matrix Φ (s ≥ n):

G = (1/N) Y Π Φᵀ = O_r (1/N) X Π Φᵀ + (1/N) V Π Φᵀ = O_r T̃_N + V̄_N

Here Φ acts as an instrument, and we must define it such that V̄_N vanishes and T̃_N keeps full rank as N → ∞.

Page 83: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

► Estimating the extended observability matrix

G = O_r T̃_N + V̄_N

Here Φ acts as an instrument, and we must define it such that

V̄_N = (1/N) V Π Φᵀ → 0   and   T̃_N → T̃, a full-rank matrix

Then G → O_r T̃, so the pr × s matrix G can be seen as a noisy estimate of the extended observability matrix (in the basis determined by T̃). But we still need to define Φ.

Page 84: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Finding good instruments: the only remaining question is how to achieve

(1/N) V Π Φᵀ → 0   with T̃_N of full rank.

Remember the instrumental-variable idea: correlate with quantities that are uncorrelated with the noise. The law of large numbers states that the sample sums converge to their respective expected values.

Page 85: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Finding good instruments (continued): assume the input u is generated in open loop, so that it is independent of the noise V. Now let the instruments be built from past data,

Φ = [φ_s(1) … φ_s(N)],   φ_s(t) built from y(t−1), y(t−2), … and u(t−1), u(t−2), …

The k-th block component of V(t) involves only w(t+j), j ≥ 0, and v(t+k−1), so E φ_s(t) Vᵀ(t) = 0.

Page 86: SYSTEMS Identification - UM


Subspace Methods for Estimating State Space Models

Finding good instruments (continued): similarly we then have (1/N) V Π Φᵀ → 0. A formal proof that T̃ has full rank is not immediate and will involve properties of the input; see Problem 10G.6 and Van Overschee and De Moor (1996).

Page 87: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

Some background from Chapter 7: let a system be given by the impulse response representation

y(t) = Σ_{j=0}^∞ [ h_u(j) u(t−j) + h_e(j) e(t−j) ]   (I)

Let the formal k-step-ahead predictors be defined by simply deleting the terms with j < k:

ŷ(t|t−k) = Σ_{j=k}^∞ [ h_u(j) u(t−j) + h_e(j) e(t−j) ]

Define

Ŷ_r(t) = [ŷ(t|t−1); … ; ŷ(t+r−1|t−1)],   Ŷ = [Ŷ_r(1) … Ŷ_r(N)]

Page 88: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

With (I), the predictors ŷ(t|t−k) and the matrix Ŷ defined as above, the following is true as N → ∞ (see Chapter 4, Appendix 4A):

1- The system (I) has an n-th order minimal state-space description if and only if rank Ŷ = n for all r ≥ n.

2- The state vector of any minimal realization can be chosen as a linear combination of Ŷ_r(t):

x(t) = L Ŷ_r(t)

Page 89: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

ŷ(t+k−1|t−1) = Σ_{j=k}^∞ [ h_u(j) u(t+k−1−j) + h_e(j) e(t+k−1−j) ]

For practical reasons we use a finite parameterization:

ŷ(t+k−1|t−1) = α1 y(t−1) + … + α_{s1} y(t−s1) + β1 u(t−1) + … + β_{s2} u(t−s2)

This predictor can be determined effectively from the linear regression

y(t+k−1) = θ_kᵀ φ_s(t) + γ_kᵀ U_l(t) + ε(t+k−1)

or by dealing with all r predictors simultaneously.

Page 90: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

According to Chapter 7, the stacked predictors are obtained from this regression by least squares, and the matrix inversion lemma can be used to evaluate the resulting expressions.

Page 91: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

According to Chapter 7, the predicted outputs ŷ(t+k−1|t−1), stacked into Ŷ, satisfy Ŷ ≈ O_r X̂. Factoring Ŷ by the SVD as before and writing O_r = U1 R, we have

X̂ = R⁻¹ S1 V1ᵀ

Page 92: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

So let

X̂ = R⁻¹ S1 V1ᵀ = [x̂(1) … x̂(N)]

Page 93: SYSTEMS Identification - UM


Finding the States and Estimating the Noise Statistics

With the states given, we can estimate the process and measurement noises as

ŵ(t) = x̂(t+1) − Â x̂(t) − B̂ u(t),   v̂(t) = y(t) − Ĉ x̂(t) − D̂ u(t)

and their sample covariances estimate the noise statistics.
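A minimal sketch of the residual computation, assuming the state sequence Xhat (n×N) and the estimated matrices are available (all names are assumptions):

```python
import numpy as np

def noise_estimates(Xhat, Y, U, A, B, C, D):
    """Xhat: n x N, Y: p x N, U: m x N; returns sample noise covariances."""
    W = Xhat[:, 1:] - A @ Xhat[:, :-1] - B @ U[:, :-1]   # w(t) residuals
    V = Y[:, :-1] - C @ Xhat[:, :-1] - D @ U[:, :-1]     # v(t) residuals
    N = W.shape[1]
    return W @ W.T / N, V @ V.T / N, W @ V.T / N         # Q, R, S estimates
```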

Page 94: SYSTEMS Identification - UM


Putting It All Together

The family of subspace algorithms:

1. From the input-output data, form

G = (1/N) Y Π Φᵀ

Remember: the scalar r is the maximal prediction horizon, and many algorithms use r = s. Many algorithms choose φ_s(t) to consist of past inputs and outputs with s1 = s2 = s, so the scalar s is a design variable.

Page 95: SYSTEMS Identification - UM


Putting It All Together

2. Select weighting matrices W1 and W2 and perform the SVD

W1 G W2 = U S Vᵀ ≈ U1 S1 V1ᵀ

The choice of the weighting matrices W1 and W2 is perhaps the most important one; the existing algorithms differ essentially in this choice.

Page 96: SYSTEMS Identification - UM


Putting It All Together

3. Select a full-rank matrix R and define the rp × n matrix

Ô_r = W1⁻¹ U1 R

then solve for Ĉ and Â from the shift structure of Ô_r; the latter equation should be solved in a least-squares sense. Typical choices for R are R = I, R = S1, or R = S1^(1/2).

4. Estimate B̂, D̂ and x̂0 from the linear regression problem described under "Estimating B and D".

Page 97: SYSTEMS Identification - UM


Putting It All Together

5. If a noise model is sought, form X̂ = R⁻¹ S1 V1ᵀ as above, and estimate the noise contributions ŵ(t), v̂(t) and their covariances as above.