14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning...

12
2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories for Experiment Design Key Requirements for System ID Consistent (unbiased) estimate* 0 ˆ θ θ N Small covariance ( ) N θ ˆ cov *) Unbiased estimate and consistent estimate are equivalent if the system is ergodic. (1) [ ] 0 ˆ θ θ = N E The ensemble mean of is equal to the true value N θ ˆ 0 θ …..unbiased. (2) The limit of as N tends to infinity is the true value 0 ˆ lim θ θ N N N θ ˆ 0 θ ..consistent Major Theoretical Results i) Informative data sets Is data set Z informative enough to distinguish any two different models and in the same model set ) ( 1 q W ) ( 2 q W () M θ ? ( ) ) ( ) ( 0 ) ( ) ( ) ( 2 1 2 1 ω ω ω ω i i i i e W e W t Z e W e W (3) The data set is Informative (for any linear time-invariant model), if the Spectrum Matrix: Φ Φ Φ Φ = Φ ) ( ) ( ) ( ) ( ) ( ω ω ω ω ω y yu uy u z : Positive definite for almost all ω . (4) ) (t u () yt ) ( θ M ii) Persistence of excitation Any (n-1)-st order MA filter cannot filter the input to zero. ) ( q M n ) (t u ) (t v n n n q m q m q M + + = 1 1 ) ( 1

Transcript of 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning...

Page 1: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

2.160 System Identification, Estimation, and Learning Lecture Notes No. 19

May 1, 2006

14 Experiment Design

14.1 Review of System ID Theories for Experiment Design

Key Requirements for System ID • Consistent (unbiased) estimate* 0

ˆ θθ →N

• Small covariance ( )Nθ̂cov *) Unbiased estimate and consistent estimate are equivalent if the system is ergodic.

(1) [ ] 0

ˆ θθ =NE The ensemble mean of is equal to the true value Nθ̂ 0θ …..unbiased.

(2) The limit of as N tends to infinity is the true value0ˆlim θθ →

∞→ NN Nθ̂ 0θ ..consistent

Major Theoretical Results i) Informative data sets

Is data set ∞Z informative enough to distinguish any two different models and in the same model set

)(1 qW)(2 qW ( )M θ ?

( ) )()(0)()()( 2121

ωωωω iiii eWeWtZeWeW ≡→≡− (3) The data set is Informative (for any linear time-invariant model), if the Spectrum Matrix:

⎥⎦

⎤⎢⎣

⎡Φ−ΦΦΦ

=Φ)()()()(

)(ωωωω

ωyyu

uyuz : Positive definite for almost all ω . (4)

)(tu ( )y t )(θM ii) Persistence of excitation

Any (n-1)-st order MA filter cannot filter the input to zero.

)(qM n )(tu )(tv n

nn qmqmqM −− ++= 11)(

1

Page 2: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

Persistently Exciting of order n

⎥⎥⎥⎥

⎢⎢⎢⎢

−−

=↔

)0()1(

)2()0()1()1()1()0(

uu

uuu

uuu

n

RnR

nRRRnRRR

R (5)

is Non-singular

In particular for ( )

f

f

b

b

k

nn

nn

n

qfqf

qbqbbqqG −−

+−−

+++

+++=

11

1121

1),( θ (6)

An open-loop experiment with an input sequence u(t) that is persistently exciting of order is informative enough to distinguish any two transfer functions of the above form. fb nn +

0 1 2 3 4 5 6-1

0

1

0 1 2 3 4 5 6-1

0

1

0 1 2 3 4 5 6-1

0

1

Informative enough w.r.t.

),( θqG having at most n poles and n zeros

n sinusoids w/different frequencies persistently exciting of order 2n

iii) Signal to Noise Ratio The Prediction-Error Method (PEM) determines parameter vector θ in such a way that a frequency-weighted squared error norm be minimized. For a fixed noise model:

( ) )(, qHqH fixed=θ the following signal-to-noise ratio is the weight of the norm:

ωωθθω

π

π

ωω

θd

eHeGeG

ifixed

uii2

2

0)(

)(),()(minargˆ Φ⋅−= ∫− (7)

Error of transfer function

Signal to Noise Ratio

2

Page 3: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

iv) Asymptotic Variance

Consider PEM with ∑=

=N

t

NN t

NZV

1

2 ),(211),( θεθ . If the true model is involved in the

model set MD∈0θ , the variance of parameter estimation error approaches the following Asymptotic Variance, as the number of data tends to infinity,

( )[ ] 100

0 ),(),(1~ˆ −= θψθψλθ θ ttE

NP

NCov T

N (8)

which depends on:

• Number of data; N • Noise level: 0λ

• Sensitivity: ( )0

)(ˆ, 0 θθθ

θψ tyddt = (9)

The Punch Line

Using many parameters (large d) may provide an accurate model with a small value of bias (

dR∈θMD∈0θ ), but a large d tends to worsen the variance and

convergence speed, if some parameters in θ have small sensitivity values,

)(ˆ θθ

tydd .

The above asymptotic variance can be expressed in the frequency-domain as

1

00 ),(')(),(')(

1211~ˆ

0

− ⎥⎦

⎤⎢⎣

⎡Φ

Φ∫π

π

ωω ωθωθωπ

θ deTeTN

Cov iTx

i

vN (10)

where ( ) ( ) ( )[ ]θθθθ

θ ,',,'),(,' qHqGqTddqT ==

2

00 ),()( θλω ωiv eH=Φ

)(00 ωλ eΦ=

14.2 Design Space of System ID Experiments

1) Data Acquisition • Sensors: What to measure and how to measure Output variables Disturbances: If feasible, it is better to measure disturbances • Actuators: What to manipulate Inputs physical limit and constraints must be taken into account • Sampling rate… 2.161 Signal Processing

3

Page 4: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

High sampling rate means More data, but Capturing noise rather than signals

Anti-aliasing Pre-filter design 2) Model Structure Selection

• Linear/Non-linear a) Linear parameterization of nonlinear systems b) Hammerstein and Wiener Models

Linear Nonlinear y(t) )

• L

• T

3) Input Howdata

The Pun

Bothproppowefreed

14.3 Inp

In

0Φ ue

Co

u(t

f [u(t)] Dynamic

Model Map f (u) Hammerstein Model

Linear Nonlinear y(t) )

u(t Dynamic

Model Map f (z) Wiener Model

inear Time-Invariant Models FIR, ARX, ARMA, OE, J-B, State Space

rade-off between bias (accuracy) and variance (reliability) System order

Signal Design do we design an input sequence (waveform) to be given to the system so that are informative and that the covariance of estimate may be small?

ch Line bias and variance conditions are provided in terms of the second-order erties of input signals. See eqs. (4), (5),(7), (8) and (10). They hinge on signal r spectra, but signal waveforms are free to choose. This gives a great degree of om in designing an input sequence.

ut Design for Open-Loop Experiments

an open-loop system, u(t) and eo(t) are independent. Therefore, , in eq. (10).

00=Φue

0=

θ

π

π

ωω ωθωθωπ

θ PN

deTeTN

v iTx

i

vN

1),(')(),(')(

1211~ˆ

1

00 0=⎥

⎤⎢⎣

⎡Φ

Φ

−∫

4

Page 5: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

This asymptotic variance can be broken down to the term comprising the input spectrum

θP)(ωuΦ and the one having no input spectrum. Taking inverse of , θP

BdAP u +Φ= ∫−− ωωω

π

πθ )()(1 (12)

where

),('),(')(2

1)( 00 θθωπ

ω ωω iTi

v

eGeGAΦ

= (13)

ωθθωπ

λ π

π

ωω deHeHB iTi

v∫− Φ

= ),('),(')(

12 00

0 (14)

The asymptotic variance must be small for fast convergence. This can be

achieved by allocating the input spectrum such that the first term in (12) becomes large. That is, the input spectrum should be large at frequencies where the squared parameter sensitivity of the transfer function,

θP

20'( , )iG e ω θ , is large. See the figure

below. To achieve fast convergence, choose the input spectrum )(ωuΦ such that )(ωuΦ

has a large magnitude at frequencies where )(ωA in (13) has a large magnitude. Note )(ωA is proportional to G’G’T; the sensitivity of the transfer function to

parameter changes)( ωieG

θ . Plotting for perturbed parameter vector ),( θθω ∆+ieG θ depicts where to allocate large )(ωuΦ .

This guide line for input design is only for the spectrum; the actual waveform can

be determined freely by considering other requirements.

High sensitivity Large G )(' ωie

Low sensitivity Small

θθω

dedG i ),(

),( θθω ∆+ieG

True Transfer Function G )(0ωie

Bode plots of )( ωieG for perturbed parameter vector θθ ∆+

)( ωieG

5

Page 6: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

14.4 Practical Requirements for Input Design

Actuator Plant Sensor Physical

Limit Physical Constraints

Dynamic Range

When conducting experiments, various constraints due to physical limits of the process must be satisfied. Among others, the input amplitude limit is a common constraint

utuu ≤≤ )( (15)

where u and u are lower and upper limits. In general, the larger the input magnitude )(tu becomes, the smaller the

asymptotic variance becomes. Therefore the input sequence should have large amplitude most of the time.

CP

Max )(tu Max )(tu)(tu

To evaluate this aspect of input signal, the follomean signals:

∑=

≤≤≡ N

t

Ntr

tuN

tuC

1

2

2

12

)(1

)(max,

Note ; Smaller is better.1≥rC

14.5 System ID Using Random Signals

The best crest factor (1, the lowest value) is ach

)(tu

Undesirable time

wing crest factor is often used for zero-

(16)

ieved with zero mean binary signals:

Desirable time

6

Page 7: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

utu ±=)( . The signal shown below is an example of random binary sequence with an auto-correlation: ( ) [ ( ) ( )] 1 ( )uR E u t u tτ τ δ= − = ⋅ τ (17)

e

In system identification, random signals are often used for input sequences because otheir superb noise reduction properties. The following is to show this property. Consider the model description based on impulse response given by

(18) 1

( ) ( ) ( ) ( )k

y t g k u t k v t∞

=

= − +∑ where is noise, which is uncorrelated with a random input , ( )v t ( )u t [ ( ) ( )] 0E u t v t τ− ≡ , for all τ . (19) Evaluating the correlation between the input and the output yields

1

1

1

( ) [ ( ) ( )] [{ ( ) ( ) ( )} ( )]

( ) [ ( ) ( )] [ ( ) ( )]0

( ) ( ) ( )

yuk

k

uk

R E y t u t E g k u t k v t u t

g k E u t k u t E v t u t

g k R k g

τ τ τ

τ τ

τ τ

=

=

=

= − = − +

= − − + −

= − =

}

(20)

Note that the noise term is averaged out when the input-output correlation is computed. This is particularly useful for noisy systems. If the input is a random sequwith unit variance, the input-output correlation { ( ), 1,2,yuR k k = gives the impulsresponse { ( . ), 1,2, }g k k = The advantage of the random input signal is clear when it is compared with astandard impulse response test using the following input:

u t (2,

( )0, 0

t 0t

α =⎧= ⎨ ≠⎩

The observed output is given by

tim

f

ence e

1)

7

Page 8: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

( ) ( ) ( )y t g t v tα= ⋅ + If one determines the impulse response from this output:

( ) ( )ˆ( ) ( )y t v tg t g tα α

= = + (22)

Note that noise shows up in the output . To eliminate the effect of , the magnitude of the input

( )v t ( )y t ( )v tα must be very large, which is prohibited in most applications.

Thus, the input-output correlation method using a random input sequence is a powerful system identification method particularly for noisy systems.

14.6 Pseudo-Random Binary Signal (PRBS)

Now that random binary signals are useful for system identification, how can we generate them? A Pseudo-Random Binary Signal (PRBS) is a periodic, deterministic signal with white noise like properties. It has been widely used for system identification as well as for spread spectrum wireless communication and GPS.

PRBS can be generated easily with a shift register that circulates its output to the input gate and thereby generates a periodic, long-sequence binary signal. Mathematically, the signal is written as the following difference equation:

)2),()1(()2),()(()(

1 ntuatuaremtuqAremtu

n −++−==

(23)

where rem(x,2) is the remainder as x is divided by 2, i.e. module 2 of x. The output is therefore binary. Example: A 5 bit shift register

+ +

5

Initial state1 1 1 1 1 00011011101010000100101100 1111100011011101010000100101100

Repeat the same 31 bits31 bits=2 1−

1 1 1 1 1 0 0 1 0 1 Output u t ( )Output u t ( ) ( 1)u t + Initial state

Binary shift register with feedback

8

Page 9: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

• The sequence repeats itself after every sequence of 31 bits. • This sequence is uniquely determined by the feedback law A(q) • M = 2n-1 is the maximum-length sequence.

All the maximum-length Pseudo-Random Binary Signals (PRBS) have the following mean, covariance, and power spectrum.

Mean: Mutu

M

M

t

=∑=1

)(1 (24)

Covariance: ⎪⎩

⎪⎨⎧

±±==+∑

= elsewhereMu

MMkuktutu

M

M

t

2

2

1

,2,,0)()(1 (25)

Power Spectrum: πωπωδπω 2022)(1

1

2

≤≤⎟⎠⎞

⎜⎝⎛ −=Φ ∑

=

M

ku M

kMu (26)

Features of PRBS

1) The power spectrum of n-th order PRBS with maximum length M = 2n-1 has M-1 frequency peaks between πωπ ≤≤− , and is persistently exciting of order M-1; high-density, white noise.

2) Crest factor of 1; Maximum strength input sequence 3) Periodic

• Quasi-stationary • Noise reduction by taking average of samples y(t), y(t+T), y(t+2T),…,

which are outputs to the same input, u(t), u(t+T),…

9

Image removed for copyright reasons.

Page 10: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

An alternative to PRBS is to generate white, zero-mean Gaussian noise, and take the sign of the noise for generating a binary sequence. Compared with this regular random binary signal, PRBS has the following features

4) The regular random signal needs a long-sequence (n>>1) to secure the second order properties, while PRBS needs a much shorter bit length.

5) PRBS’s spectrum has a special pattern, which facilitates analytic calculations. PRBS has been used for

Global Positioning System Spread-Spectrum, Code-Division Multiple Access (CDMA)

…Cellular Phones, Bluetooth Wireless Networking etc.

14.7 Sinusoidal Inputs.

Considering the requirements for input sequence design, a natural choice of input is to form it as a sum of sinusoids with different frequencies:

(∑=

+=p

kkkk tatu

1

cos)( φω ) (27)

At steady state ( : large) this multi-sine function has the following power spectrum: t

( ) ( ) ([∑=

++−=Φp

kkk

ku

a1

2

42 ωωδωωδπω )] (28)

( )ωuΦ

2

2 kaπ

These should be placed at frequencies where )(ωA in (13) is large

ωkω− pk ωωωω 21 1ω−

For each frequency, a pair of non-zero components is generated at kω± . If ( )θ,qG has at most n poles and n zeros, np ≥ frequencies satisfy the persistently exciting condition. The problem with the multi-sin input is that, depending on phase angle kφ , the crest factor often becomes vary large. If all the sinusoids are in phase,

0=kφ ,

10

Page 11: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

pkaaforpa

a

powersignaltuofAmplitude

tuN

tuC kp

k

k

p

kk

N

t

Ntr ,1,2

2

)(

)(1

)(max

1

21

1

2

2

12 ==⇒===

∑=

=

=

≤≤

(29) If all the sinusoids have the same amplitude, the crest factor is p2 , which becomes large as p increases.

Figure 1 Multi-Sine inputs

As shown in Figure 1 (Figure 13.5 in Ljung’s book), the magnitude becomes very large 9a) when 10 sinusoids with zero phase angles 0=kφ are superimposed; Figure(a). To

11

Figure 13-5 from Ljung, Lennart. System Identification: Theory for the User. 2nd ed.Upper Saddle River, NJ: Prentice-Hall, 1999, chapter 13. ISBN: 0136566952. Image removed for copyright purposes.��

Page 12: 14 Experiment Design - MIT OpenCourseWare · 2.160 System Identification, Estimation, and Learning Lecture Notes No. 19 May 1, 2006 14 Experiment Design 14.1 Review of System ID Theories

overcome this problem, the phase angles kφ ( pk 1= ) should be randomized (See Figure (c) ), or placed at particular angles that are as much out of phase as possible:

πφφp

kkk

)1(1

−−= pk ≤≤2 (30)

=1φ arbitrary

This is called the Schroeder phase choice. Swept Sinusoids (Chirp Signals) Continuously varying the frequency between 1ω and 2ω over a time period of

creates a Chirp Signal: Tt ≤≤0

( ) ⎟⎟⎠

⎞⎜⎜⎝

⎛−+=

TttAtu2

cos)(2

1211 ωωω (31)

The instantaneous frequency, iω , is obtained by differentiating )(tu

( 121 ωωωω −+=Tt

i ) (32)

This Chirp signal has the same crest factor as that of a single sinusoid, that is 2 . See Figure 2 below.

Figure 2 Chirp Signal and its spectrum

12

Figure 13-6 from Ljung, Lennart. System Identification: Theory for the User. 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 1999, chapter 13. ISBN: 0136566952.Image removed for copyright purposes.