SIPE - Lecture 9. Optimization methods


Transcript of SIPE - Lecture 9. Optimization methods

Page 1: SIPE - Lecture 9. Optimization methods

System Identification and Parameter Estimation

Wb 2301, Frans van der Helm
April 11, 2006

Lecture 9: Optimization methods

Page 2: SIPE - Lecture 9. Optimization methods

Identification: time-domain vs. frequency-domain

[Diagram: time-domain data u(t), y(t) lead either to a 'non-parametric' model (ARX, ARMA, etc.) or to a parametric model; frequency-domain data U(ω), Y(ω) lead either to a non-parametric model, the Frequency Response Function (FRF), or to a parametric model.]

Page 3: SIPE - Lecture 9. Optimization methods

Contents: parameter estimation

• Parameter estimation in time domain:
  • 'Non-parametric' models: ARMA, OE, etc.
  • Models with physical parameters:
    • Input-output data
    • Model structure & model parameters
    • Linear and non-linear models
    • Simulation of model structure
    • Optimization algorithms: adapt model parameters for best fit to simulation
• Parameter estimation in frequency domain:
  • Non-parametric models: phase and amplitude
    • Can be derived from non-parametric time-domain models
  • Models with physical parameters:
    • Result of the non-parametric model: phase and amplitude
    • Model structure & model parameters
    • Linear models
    • Optimization algorithms: adapt model parameters for best fit in frequency domain

Page 4: SIPE - Lecture 9. Optimization methods

Contents: parameter estimation

• Optimization algorithms:
  • Grid search
  • Gradient search:
    • Steepest descent / Newton
    • Quasi-Newton
    • Levenberg-Marquardt
  • Random search:
    • Bremermann optimizer
  • Genetic algorithms

Page 5: SIPE - Lecture 9. Optimization methods

Contents: parameter estimation

• Special model structures:
  • Neural networks
  • (Expert systems and fuzzy sets)

Page 6: SIPE - Lecture 9. Optimization methods

Parameter estimation in time domain

• Parameter estimation in time domain:
  • 'Non-parametric' models: ARMA, OE, etc.
    • Parameters are not physically interpretable
    • No physical parameter fitting afterwards
    • Only for control purposes
    • By transition to the frequency domain: parameter estimation
  • Parametric models:
    • Input-output data
    • Model structure & model parameters
    • Linear and non-linear models
    • Simulation of model structure, e.g. in Matlab/Simulink
    • Criterion function: model predictions vs. recorded data
    • Optimization algorithms: adapt model parameters for best fit to simulation

Page 7: SIPE - Lecture 9. Optimization methods

Linear and non-linear models

• Parameter estimation by iterative search:
  • Static systems
  • Dynamic systems
• Criterion function
• Optimization procedure:
  • Grid search
  • Gradient search
  • Random search
  • Genetic algorithms
• Validation

Page 8: SIPE - Lecture 9. Optimization methods

Static and dynamic systems

• Model: y(k) = f(θ, u(k)) + n(k)
  • θ: parameter vector
  • k = 1 .. N data points / time samples
  • Z^N = [y(k) u(k)]
• Error definition: e(k) = y(k) - f(θ, u(k))
• Criterion function (least squares): J(Z^N, θ) = 0.5·∑ e(k)²
  • Summation over k realizations / time instants
• Subject to constraints:
  • Linear / non-linear
  • Equality constraints / inequality constraints
  • Constraints define the 'feasible region' for the parameters
• Find the minimum of J(Z^N, θ)
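As a concrete illustration, here is a minimal Matlab sketch of this criterion for a static model; the model function, the data, and all numbers are hypothetical placeholders, not taken from the lecture.

    % Least-squares criterion J(Z^N, theta) for a static model (placeholder
    % model y = theta1*u + theta2*u^2; data are simulated for illustration).
    f = @(theta, u) theta(1)*u + theta(2)*u.^2;    % assumed model structure
    u = (0:0.1:10)';                               % measured input u(k)
    y = f([2; 0.5], u) + 0.1*randn(size(u));       % recorded output with noise n(k)

    J = @(theta) 0.5 * sum((y - f(theta, u)).^2);  % J = 0.5*sum(e(k)^2)
    J([2; 0.5])   % near the generating parameters: small (noise only)
    J([1; 1])     % at a wrong guess: considerably larger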

Page 9: SIPE - Lecture 9. Optimization methods

Dynamic systems

• Non-linear function: y(t) = f(x(t), u(t), θ, t)
• Correct model structure
• Known (measured) input u(t)
• Initial guess of parameter vector θ
• Simulation: ŷ(t) = f(x(t), u(t), θ, t)
• Error function: e(t) = y(t) - ŷ(t)
• Iterative search requires many simulations!!
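A minimal sketch of one such simulation, assuming a hypothetical first-order non-linear model; in practice the simulation would run in Matlab/Simulink and y(t) would be recorded data.

    % One simulation of a dynamic model and its error function (hypothetical
    % system dx/dt = -theta1*x + theta2*u(t)^2; y is simulated, not recorded).
    u = @(t) sin(t);                                     % known (measured) input
    f = @(t, x, theta) -theta(1)*x + theta(2)*u(t).^2;   % assumed model structure
    tMeas = linspace(0, 10, 200)';                       % measurement instants

    [~, y]    = ode45(@(t, x) f(t, x, [1.0; 0.5]), tMeas, 0);  % 'recorded' data
    theta0    = [0.8; 0.7];                              % initial parameter guess
    [~, yHat] = ode45(@(t, x) f(t, x, theta0), tMeas, 0);      % one simulation
    e = y - yHat;                                        % e(t) = y(t) - yhat(t)
    J = 0.5 * sum(e.^2)                                  % criterion for theta0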

Page 10: SIPE - Lecture 9. Optimization methods

Grid search

• Systematically search the parameter space and find the minimum
• Very elaborate (computationally expensive)
• Accuracy depends on the resolution of the grid
• Likely to find the 'global' minimum
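A sketch of a two-parameter grid search; the criterion J here is a simple placeholder standing in for any simulation-based criterion.

    % Grid search over a 2-D parameter space (placeholder criterion).
    J = @(theta) 0.5 * sum((theta - [2; 0.5]).^2);
    [t1, t2] = meshgrid(0:0.1:4, -1:0.1:2);        % grid over the feasible region
    Jgrid = arrayfun(@(a, b) J([a; b]), t1, t2);   % evaluate J at every grid node
    [Jmin, idx] = min(Jgrid(:));                   % lowest criterion value found
    thetaBest = [t1(idx); t2(idx)]                 % grid point of the minimum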

Page 11: SIPE - Lecture 9. Optimization methods

Gradient search

• Starting point θ_i in the feasible region
• The optimal parameter vector θ* is defined at the minimum of J(Z^N, θ):

    ∂J(Z^N, θ*)/∂θ = 0

• Iterative search:

    θ_{i+1} = θ_i + α_i·f_i

  • α: step size
  • f: search direction
• Newton algorithms:

    f_i = -[∂²J(Z^N, θ_i)/∂θ²]^{-1} · ∂J(Z^N, θ_i)/∂θ
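A sketch of this Newton iteration on a quadratic placeholder criterion with a known gradient and Hessian (all numbers illustrative):

    % Newton iteration theta_{i+1} = theta_i + alpha_i * f_i (placeholder
    % quadratic criterion with minimum at [2; 0.5]).
    gradJ = @(theta) [4 1; 1 2] * (theta - [2; 0.5]);  % dJ/dtheta
    hessJ = @(theta) [4 1; 1 2];                       % d2J/dtheta2 (constant)

    theta = [0; 0];                                    % starting point
    alpha = 1;                                         % step size
    for i = 1:10
        fi = -hessJ(theta) \ gradJ(theta);             % Newton search direction
        theta = theta + alpha * fi;                    % parameter update
    end
    theta   % for a quadratic J, Newton reaches [2; 0.5] in a single step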

Page 12: SIPE - Lecture 9. Optimization methods

First and second gradient (least squares criterion)

With J(Z^N, θ) = 0.5·∑ e(k)² and φ(Z^N, θ) = ∂e(Z^N, θ)/∂θ:

First derivative:

    ∂J(Z^N, θ*)/∂θ = e(Z^N, θ*) · ∂e(Z^N, θ*)/∂θ = φ(Z^N, θ*) · e(Z^N, θ*)

Second derivative:

    ∂²J(Z^N, θ*)/∂θ² = φ(Z^N, θ*) · φ^T(Z^N, θ*) + e(Z^N, θ*) · ∂φ(Z^N, θ*)/∂θ
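A numeric sketch of the first derivative ∂J/∂θ = φ·e for a model that is linear in its parameters (same placeholder model as before):

    % Gradient dJ/dtheta = phi * e for y = theta1*u + theta2*u^2 (placeholder).
    u = (0:0.1:10)'; y = 2*u + 0.5*u.^2 + 0.1*randn(size(u));
    theta = [1.5; 0.8];
    e = y - (theta(1)*u + theta(2)*u.^2);   % error vector e(k)
    phi = -[u, u.^2]';                      % phi = de/dtheta, one column per k
    gradJ = phi * e                         % first derivative of 0.5*sum(e.^2)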

Page 13: SIPE - Lecture 9. Optimization methods

Gradient search

[Figure: the criterion J(Z^N, θ) plotted against θ, showing the current iterate θ_i with its gradient ∂J(Z^N, θ_i)/∂θ, and the minimum at the optimal parameter vector θ*.]

Page 14: SIPE - Lecture 9. Optimization methods

Gradient search

• Steepest descent: search direction depends only on the first derivative
  • Slow close to the minimum
• Newton: search direction depends on the first and second derivative (Hessian matrix)
  • Takes the dependency between parameters into account
  • Fast close to the minimum
  • Expensive to calculate the Hessian
• Quasi-Newton: uses an approximation of the Hessian, e.g. 'Gauss-Newton':

    ∂²J(Z^N, θ*)/∂θ² ≈ φ(Z^N, θ*) · φ^T(Z^N, θ*)

  because the second term e(Z^N, θ*) · ∂φ(Z^N, θ*)/∂θ vanishes near the optimum.
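A sketch of Gauss-Newton iterations using the φ·φ^T Hessian approximation (same placeholder model; because the model is linear in θ, one step already reaches the least-squares solution):

    % Gauss-Newton: Hessian approximated by phi*phi' (placeholder model).
    u = (0:0.1:10)'; y = 2*u + 0.5*u.^2 + 0.1*randn(size(u));
    theta = [1.5; 0.8];
    for i = 1:5
        e = y - (theta(1)*u + theta(2)*u.^2);   % error at the current theta
        phi = -[u, u.^2]';                      % phi = de/dtheta
        H = phi * phi';                         % Gauss-Newton approximation
        theta = theta - H \ (phi * e);          % theta - H^{-1} * dJ/dtheta
    end
    theta   % close to the generating parameters [2; 0.5]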

Page 15: SIPE - Lecture 9. Optimization methods

Gradient search

• Levenberg-Marquardt algorithm:

    ∂²J(Z^N, θ*)/∂θ² ≈ φ(Z^N, θ*) · φ^T(Z^N, θ*) + δ·I

  • Strengthens the diagonal of the Hessian
  • Decreases the interaction between parameters (e.g. when the model is overparameterized or badly parameterized)
  • Better convergence, more robust
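Continuing the Gauss-Newton sketch above, the Levenberg-Marquardt modification only changes the Hessian approximation (δ is a tuning constant, chosen here arbitrarily):

    % Levenberg-Marquardt step: damping term delta*I on the Hessian diagonal.
    delta = 0.1;                                  % illustrative damping value
    H = phi * phi' + delta * eye(numel(theta));   % strengthened diagonal
    theta = theta - H \ (phi * e);                % more robust update step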

Page 16: SIPE - Lecture 9. Optimization methods

Incorporation of constraints

• Criterion: J(Z^N, θ) = 0.5·∑ e(k)²
• Subject to:
  • Linear equality constraints: A·θ - B = 0
  • Linear inequality constraints: A·θ - B < 0
  • Non-linear equality constraints: f(θ) - C = 0
  • Non-linear inequality constraints: f(θ) - C < 0
• Equality constraints are incorporated into the criterion:
  • J*(Z^N, θ) = J(Z^N, θ) + λ1·(A·θ - B) + λ2·(f(θ) - C)
  • λ1, λ2: Lagrange multipliers, adaptive weight factors
  • ∂J*/∂λ1 = 0 → A·θ - B = 0
  • ∂J*/∂λ2 = 0 → f(θ) - C = 0
• Inequality constraints are incorporated into the criterion via slack variables:
  • J*(Z^N, θ) = J(Z^N, θ) + λ3·(A·θ - B + s1) + λ4·(f(θ) - C + s2)
  • s1, s2: slack variables, s1, s2 > 0, which turn the inequalities into equalities
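A small sketch of an augmented criterion J* with one linear equality constraint (the criterion, A, and B are placeholders):

    % Augmented criterion J* = J + lambda*(A*theta - B) for one linear
    % equality constraint (all values illustrative).
    J = @(theta) 0.5 * sum((theta - [2; 0.5]).^2);   % unconstrained criterion
    A = [1 1]; B = 3;                                % constraint theta1 + theta2 = 3
    Jstar = @(theta, lambda) J(theta) + lambda * (A * theta - B);
    % At a stationary point of J*, dJ*/dlambda = A*theta - B = 0, so the
    % equality constraint is satisfied automatically.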

Page 17: SIPE - Lecture 9. Optimization methods

Gradient methods

• Very costly in calculating derivatives:
  • "Much information about only one point in parameter space"
• Algorithms are tuned to converge (if possible)
• Sensitive to local minima
  • Result might depend on the initial parameter guess
• Most often used!!

Page 18: SIPE - Lecture 9. Optimization methods

Random search methods

• Random search direction f_i in the parameter space:

    θ_{i+1} = θ_i + α_i·f_i

• Calculate n criterion values along the search direction
• Fit an (n-1)th-order polynomial through the criterion values
• Calculate the minimum of the polynomial
• Check whether the minimum is lower than the previous minimum
• Determine a new search direction
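A sketch of one random-search iteration as described above (placeholder criterion; n = 5 samples along the direction):

    % One random-search iteration: sample J along a random direction, fit a
    % polynomial, and move to its minimum (placeholder criterion).
    J = @(theta) 0.5 * sum((theta - [2; 0.5]).^2);
    theta = [0; 0];
    fdir = randn(2, 1); fdir = fdir / norm(fdir);   % random search direction
    alphas = linspace(-2, 4, 5);                    % n = 5 trial step sizes
    Jvals = arrayfun(@(a) J(theta + a * fdir), alphas);
    p = polyfit(alphas, Jvals, 4);                  % (n-1)th-order polynomial
    aFine = linspace(-2, 4, 1000);
    [~, k] = min(polyval(p, aFine));                % minimum of the polynomial
    theta = theta + aFine(k) * fdir                 % keep if lower than before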

Page 19: SIPE - Lecture 9. Optimization methods

Genetic algorithms

• Generate multiple parameter vectors
• Evaluate the criterion function for each parameter vector
  • Constraints must be fulfilled
• Keep the best 50% of the parameter vectors
• Generate children from these parameter vectors, e.g. by linear interpolation and small mutations
• Evaluate ... etc.
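A minimal sketch of this scheme (population size, mutation level, and the criterion are arbitrary placeholders):

    % Minimal genetic algorithm: select best 50%, create children by linear
    % interpolation plus small mutations (placeholder criterion).
    J = @(theta) 0.5 * sum((theta - [2; 0.5]).^2);
    pop = randn(2, 20);                               % 20 random parameter vectors
    for gen = 1:50
        Jvals = arrayfun(@(i) J(pop(:, i)), 1:size(pop, 2));
        [~, order] = sort(Jvals);                     % rank by criterion value
        parents = pop(:, order(1:10));                % keep the best 50%
        ii = randi(10, 1, 10); jj = randi(10, 1, 10); % random parent pairs
        w = rand(1, 10); W = [w; w];                  % interpolation weights
        children = parents(:, ii) .* W + parents(:, jj) .* (1 - W) ...
                   + 0.05 * randn(2, 10);             % small mutations
        pop = [parents, children];                    % next generation
    end
    pop(:, 1)   % best retained parameter vector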

Page 20: SIPE - Lecture 9. Optimization methods

Optimization algorithms: Matlab

• lsqnonlin:
  • Gradient search; a least-squares criterion function is assumed
  • User-supplied error function returns: error vector
  • Upper and lower bounds on the parameters
  • No other constraints
• fminunc:
  • Gradient search, any criterion function
  • User-supplied error function returns: criterion value
  • No upper and lower bounds on the parameters (unconstrained)
  • No constraints
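A hedged usage sketch of lsqnonlin (Optimization Toolbox) for the placeholder static model used earlier; note that the user function returns the error vector, not the summed criterion:

    % lsqnonlin: pass an error-VECTOR function, an initial guess, and bounds.
    u = (0:0.1:10)'; y = 2*u + 0.5*u.^2 + 0.1*randn(size(u));   % placeholder data
    errFun = @(theta) y - (theta(1)*u + theta(2)*u.^2);         % error vector e
    theta0 = [1; 1];                                            % initial guess
    lb = [0; 0]; ub = [10; 10];                                 % parameter bounds
    thetaHat = lsqnonlin(errFun, theta0, lb, ub)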

Page 21: SIPE - Lecture 9. Optimization methods

Optimization algorithms: Matlab

• fminsearch:
  • Nelder-Mead simplex (direct search) method, any criterion function
  • User-supplied error function returns: criterion value
  • No upper and lower bounds on the parameters
  • No constraints
• fmincon:
  • Gradient search, any criterion function
  • User-supplied error function returns: criterion value
  • Upper and lower bounds on the parameters
  • Linear and non-linear, equality and inequality constraints
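And a hedged usage sketch of fmincon with bounds and one linear equality constraint (criterion and constraint values are illustrative):

    % fmincon: scalar criterion plus linear (in)equality constraints and bounds.
    J = @(theta) 0.5 * sum((theta - [2; 0.5]).^2);   % returns criterion VALUE
    A = []; b = [];                                  % no inequality constraints
    Aeq = [1 1]; beq = 3;                            % equality: theta1 + theta2 = 3
    lb = [0; 0]; ub = [10; 10];                      % parameter bounds
    theta0 = [1; 1];
    thetaHat = fmincon(J, theta0, A, b, Aeq, beq, lb, ub)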

Page 22: SIPE - Lecture 9. Optimization methods

Optimization algorithms: non-Matlab

• levmar.m:
  • Gradient search, least-squares criterion
  • User-supplied error function returns: error vector
  • Levenberg-Marquardt search: very robust against interaction between parameters
  • Turbo parameters for the steepest-descent search
  • No upper and lower bounds on the parameters
  • No constraints