Coupled-least-squares identification for multivariable systems


www.ietdl.org

Published in IET Control Theory and Applications. Received on 1st March 2012. Revised on 17th September 2012. Accepted on 17th October 2012. doi: 10.1049/iet-cta.2012.0171

ISSN 1751-8644

Feng Ding*

Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, People's Republic of China

*Control Science and Engineering Research Center, Jiangnan University, Wuxi 214122, People's Republic of China

E-mail: [email protected]

Abstract: This article studies identification problems of multiple linear regression models, which may describe a class of multi-input multi-output systems (i.e. multivariable systems). Based on the coupling identification concept, a novel coupled-least-squares (C-LS) parameter identification algorithm is introduced for the purpose of avoiding the matrix inversion in the multivariable recursive least-squares (RLS) algorithm for estimating the parameters of multiple linear regression models. The analysis indicates that the C-LS algorithm does not involve a matrix inversion and requires less computational effort than the multivariable RLS algorithm, and that the parameter estimates given by the C-LS algorithm converge to their true values. Simulation results confirm the presented convergence theorems.

1 Introduction

Mathematical models are basic to control problems and play an important role in control system analysis, system synthesis, control system design and state filtering [1–5]. System identification is the theory and method for establishing the mathematical models of (dynamic) systems [6–9]. For decades, exploring new identification methods has received many control scientists' attention. Many new parameter estimation methods and their performance analyses have been reported for linear or non-linear systems, for example [10–13], especially the parameter fitting and convergence rates for linear regression models [14, 15] and pseudo-linear systems [16, 17].

A scalar linear system described by a difference equation can be written as a linear regression model

y(t) = φ^T(t)θ + v(t)    (1)

where y(t) ∈ ℝ is the observation output of the system, θ ∈ ℝ^n is the parameter vector to be identified, φ(t) ∈ ℝ^n is the (regressive) information vector consisting of the input–output data u(t − i) and y(t − i) of the system, and v(t) ∈ ℝ is a stochastic noise with zero mean.

It is worth noting that the linear regression models in (1) include but are not limited to linear systems. For example, the linear-parameter system or non-linear system

y(t) = a_1 y(t − 1) + a_2 y(t − 2)y(t − 3) + b_1 u(t − 1) + b_2 u^2(t − 2) + v(t)

can be written in the linear form of (1) by defining θ := [a_1, a_2, b_1, b_2]^T and φ(t) := [y(t − 1), y(t − 2)y(t − 3), u(t − 1), u^2(t − 2)]^T, which is non-linear in the input–output data u(t − i) and y(t − i).

For the linear regression models in (1), many methods can estimate the parameter vector θ. Two typical identification methods are the stochastic gradient (SG) algorithm [18, 19] and the recursive least-squares (RLS) algorithm [20, 21]. The SG algorithm requires a lower computational cost, but the RLS algorithm has a faster convergence rate than the SG algorithm. Much early work studied the convergence of the RLS algorithm under different conditions, such as the assumptions that the input and output signals (or observation data) of the system under consideration have finite non-zero power and that the noise is an independent and identically distributed random sequence with finite fourth-order moments [22], or that the observation data {y(t), φ(t)} are stationary and ergodic [23].

An important breakthrough on the strong consistency of the RLS algorithm was achieved by Lai and Wei [14], who obtained the convergence rate of the RLS parameter estimation by assuming that the observation noise v(t) has zero mean and finite second-order moment E[v^2(t)|ℱ_{t−1}] = σ^2 < ∞, a.s., where the symbol E denotes the expectation operator and {v(t)} is a martingale difference sequence with respect to an increasing sequence of σ-fields {ℱ_t}; that is, v(t) is ℱ_t-measurable and E[v(t)|ℱ_{t−1}] = 0 (here ℱ_t is the σ-algebra generated by the observations up to and including time t). Under such assumptions, Lai and Wei [14] proved that the parameter estimation error θ̂(t) − θ satisfies

‖θ̂(t) − θ‖^2 = O({ln λ_max[P_0^{-1}(t)]}^c / λ_min[P_0^{-1}(t)]), a.s., c > 1    (2)

IET Control Theory Appl., 2013, Vol. 7, Iss. 1, pp. 68–79, doi: 10.1049/iet-cta.2012.0171


where θ̂(t) represents the estimate of θ at time t, ‖X‖^2 := tr[XX^T], λ_max[X] and λ_min[X] represent the maximum and minimum eigenvalues of the non-negative definite matrix X, respectively, and the covariance matrix P_0(t) is defined by (9) in the next section. When λ_min[P_0^{-1}(t)] → ∞ and ln λ_max[P_0^{-1}(t)] = o(λ_min[P_0^{-1}(t)]), a.s., the estimation errors approach zero, that is, ‖θ̂(t) − θ‖^2 → 0 (see Corollary 3 in [14]). This conclusion holds under the generalised persistent excitation condition [24].

Furthermore, supposing that the noise {v(t)} has a finite higher-order moment E[|v(t)|^γ | ℱ_{t−1}] = σ^2, a.s., for some γ > 2, Lai and Wei [14] derived the convergence rate of the RLS algorithm. Later, some convergence results of least-squares or SG (based adaptive control) algorithms used such an assumption that the higher-order moment exists, including the work of Lai and Wei [25], Wei [26], Lai and Ying [27], Toussi and Ren [28], and Ren and Kumar [29].

A direct extension of the scalar linear regression model in (1) is the multiple linear regression model

y(t) = Φ(t)ϑ + v(t)    (3)

where y(t) ∈ ℝ^m is the observation output vector of the system, ϑ ∈ ℝ^n is the parameter vector to be identified, Φ(t) ∈ ℝ^{m×n} is the information matrix consisting of the past input–output data, and v(t) ∈ ℝ^m is the observation noise vector with zero mean.

Equation (3) may describe a multivariable linear system (see (4) in [30] or Example 1 later) or a multivariable non-linear system in which y(t) is linear in the parameter ϑ ∈ ℝ^n while the information matrix Φ(t) is non-linear in the system input–output data (see Example 2 later).

Although the multivariable RLS algorithm can be applied to (3), it requires computing a matrix inversion (see Remark 1 in the next section), resulting in a large computational burden. This motivates us to study new coupled-least-squares (C-LS) algorithms that involve no matrix inversion. Recently, a partially coupled SG algorithm has been proposed for non-uniformly sampled-data systems to improve the identification accuracy of the SG algorithm [31]. To the best of the author's knowledge, coupled parameter estimation methods and their convergence for multiple linear regression models have not been fully investigated, especially C-LS parameter estimation algorithms and their convergence properties, which are the focus of this work. The main contributions of this paper lie in the following.

† Derive a C-LS parameter estimation algorithm to avoid computing the matrix inversion in the multivariable RLS algorithm, for the purpose of reducing the computational load.
† Analyse the performance of the C-LS estimation algorithm and prove that the parameter estimation errors given by the C-LS algorithm converge to zero.

The rest of the paper is organised as follows. Section 2 describes the identification problem related to multiple linear regression models or multivariable systems. Section 3 derives a C-LS algorithm for multivariable systems and discusses the relation between the RLS algorithm and the C-LS algorithm. Section 4 studies the convergence of the proposed C-LS algorithm. Section 5 provides an illustrative example to validate the proposed methods. Finally, Section 6 offers some concluding remarks.


2 Problem formulation

Let us introduce some notation first. The symbol I (or I_n) stands for an identity matrix of appropriate size (or of size n × n); the superscript T denotes the matrix/vector transpose; the norm of a matrix X is defined by ‖X‖^2 := tr[XX^T]; |X| := det[X] denotes the determinant of a square matrix X; 1_n represents an n-dimensional column vector whose elements are all 1; p_0 is a large positive number, e.g. p_0 = 10^6; ⊗ denotes the Kronecker product: if A = [a_ij] ∈ ℝ^{m×n} and B = [b_ij] ∈ ℝ^{p×q}, then A ⊗ B = [a_ij B] ∈ ℝ^{(mp)×(nq)}; col[X] denotes the vector formed by the columns of the matrix X, that is, if X = [x_1, x_2, …, x_n] ∈ ℝ^{m×n}, then col[X] = [x_1^T, x_2^T, …, x_n^T]^T ∈ ℝ^{mn}. The relation f(t) = O(g(t)) means that there exist positive constants δ_1 and t_0 such that |f(t)| ≤ δ_1 g(t) for t ≥ t_0. θ̂(t) and ϑ̂(t) represent the estimates of θ and ϑ at time t.
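As a quick check of this notation, the following NumPy snippet (an illustrative sketch; the matrix X and all values are hypothetical, not taken from the paper) computes ‖X‖^2 = tr[XX^T], col[X] and a Kronecker product:

```python
import numpy as np

# Hypothetical 2 x 2 example illustrating the notation of Section 2.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# ||X||^2 := tr[X X^T]  (the squared Frobenius norm)
norm_sq = np.trace(X @ X.T)

# col[X]: stack the columns of X into one long vector (column-major order)
col_X = X.flatten(order="F")

# Kronecker product A (x) X = [a_ij X]
A = np.eye(2)
kron = np.kron(A, X)

print(norm_sq)     # 1 + 4 + 9 + 16 = 30.0
print(col_X)       # columns stacked: [1. 3. 2. 4.]
print(kron.shape)  # (2*2, 2*2) = (4, 4)
```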

2.1 Scalar systems

Referring to [18, 32], the following RLS algorithm can estimate the parameter vector θ of the scalar system in (1):

θ̂(t) = θ̂(t − 1) + P_0(t)φ(t)[y(t) − φ^T(t)θ̂(t − 1)]    (4)

P_0^{-1}(t) = P_0^{-1}(t − 1) + φ(t)φ^T(t),  P_0(0) = p_0 I_n    (5)

In order to avoid computing the inverse matrix P_0^{-1}(t) in (5), defining the gain vector L_0(t) := P_0(t)φ(t) ∈ ℝ^n and applying the matrix inversion formula

(A + BC)^{-1} = A^{-1} − A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}    (6)

to (5), we can obtain the following equivalent expression of the RLS algorithm in (4)–(5):

θ̂(t) = θ̂(t − 1) + L_0(t)[y(t) − φ^T(t)θ̂(t − 1)]    (7)

L_0(t) = P_0(t − 1)φ(t)/[1 + φ^T(t)P_0(t − 1)φ(t)]    (8)

P_0(t) = [I_n − L_0(t)φ^T(t)]P_0(t − 1),  P_0(0) = p_0 I_n    (9)
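The inversion-free recursion (7)–(9) can be sketched in a few lines of NumPy. This is an illustrative simulation, not the paper's code: the function name `rls_step`, the true parameter vector, the noise level and the random regressors are all assumptions made here.

```python
import numpy as np

def rls_step(theta, P, phi, y):
    """One recursion of the scalar-output RLS algorithm (7)-(9).

    theta: current estimate (n,); P: covariance matrix (n, n);
    phi: information vector (n,); y: scalar observation.
    Only a scalar division is needed -- no matrix inversion.
    """
    denom = 1.0 + phi @ P @ phi              # scalar 1 + phi^T P(t-1) phi
    L = (P @ phi) / denom                    # gain vector, cf. (8)
    theta = theta + L * (y - phi @ theta)    # estimate update, cf. (7)
    P = P - np.outer(L, phi @ P)             # (I - L phi^T) P, cf. (9)
    return theta, P

# Assumed simulation: y(t) = phi^T(t) theta + v(t) with Gaussian regressors.
rng = np.random.default_rng(0)
n, p0 = 3, 1e6
theta_true = np.array([0.8, -0.5, 1.2])
theta, P = np.zeros(n), p0 * np.eye(n)
for t in range(2000):
    phi = rng.standard_normal(n)
    y = phi @ theta_true + 0.1 * rng.standard_normal()
    theta, P = rls_step(theta, P, phi, y)
print(np.round(theta, 2))
```

With 2000 noisy observations the estimate should be close to the assumed true vector [0.8, −0.5, 1.2].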

2.2 Multivariable systems

Consider a multivariable system described by the multiple linear regression model in (3), rewritten as

y(t) = Φ(t)ϑ + v(t)    (10)

where y(t) = [y_1(t), y_2(t), …, y_m(t)]^T ∈ ℝ^m is the output vector of the system, ϑ ∈ ℝ^n is the parameter vector to be identified, Φ(t) ∈ ℝ^{m×n} is the information matrix consisting of the input–output data u(t − i) and y(t − i), and v(t) ∈ ℝ^m is the observation noise vector with zero mean. Without loss of generality, we assume that y(t) = 0, Φ(t) = 0 and v(t) = 0 for t ≤ 0.

In order to show the advantages of the proposed C-LS algorithm in the next section, the following simply gives the RLS algorithm for estimating ϑ in (10) for comparison. The following multivariable RLS algorithm can generate the estimate ϑ̂(t) of the parameter vector ϑ for the multiple linear regression model in (3) [33–35]:

ϑ̂(t) = ϑ̂(t − 1) + P(t)Φ^T(t)[y(t) − Φ(t)ϑ̂(t − 1)]    (11)

P^{-1}(t) = P^{-1}(t − 1) + Φ^T(t)Φ(t),  P(0) = p_0 I_n    (12)


Similarly, to avoid computing P^{-1}(t) in (12), defining the gain matrix L(t) := P(t)Φ^T(t) ∈ ℝ^{n×m} and applying the matrix inversion formula (6) to (12) give the equivalent expression of the multivariable RLS algorithm in (11)–(12) for the multivariable system in (10):

ϑ̂(t) = ϑ̂(t − 1) + L(t)[y(t) − Φ(t)ϑ̂(t − 1)]    (13)

L(t) = P(t − 1)Φ^T(t)[I_m + Φ(t)P(t − 1)Φ^T(t)]^{-1}    (14)

P(t) = [I_n − L(t)Φ(t)]P(t − 1),  P(0) = p_0 I_n    (15)
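The multivariable recursion (13)–(15) can likewise be sketched in NumPy; note the explicit m × m inversion in the gain computation, which the C-LS algorithm of Section 3 is designed to avoid. The simulation data, function name and dimensions below are assumptions for illustration only.

```python
import numpy as np

def mrls_step(theta, P, Phi, y):
    """One recursion of the multivariable RLS algorithm (13)-(15).

    theta: estimate (n,); P: covariance (n, n);
    Phi: information matrix (m, n); y: output vector (m,).
    The gain (14) inverts an m x m matrix at every step.
    """
    m, n = Phi.shape
    S = np.eye(m) + Phi @ P @ Phi.T           # I_m + Phi P(t-1) Phi^T
    L = P @ Phi.T @ np.linalg.inv(S)          # gain matrix, cf. (14)
    theta = theta + L @ (y - Phi @ theta)     # estimate update, cf. (13)
    P = (np.eye(n) - L @ Phi) @ P             # covariance update, cf. (15)
    return theta, P

# Assumed simulation setup: y(t) = Phi(t) theta + v(t)
rng = np.random.default_rng(3)
m, n, p0 = 2, 3, 1e6
theta_true = np.array([1.0, -0.7, 0.4])
theta, P = np.zeros(n), p0 * np.eye(n)
for t in range(2000):
    Phi = rng.standard_normal((m, n))
    y = Phi @ theta_true + 0.1 * rng.standard_normal(m)
    theta, P = mrls_step(theta, P, Phi, y)
print(np.round(theta, 2))
```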

Remark 1: For the multivariable RLS algorithm in (13)–(15), we can see from (14) that it requires computing the matrix inversion [I_m + Φ(t)P(t − 1)Φ^T(t)]^{-1} ∈ ℝ^{m×m} at each step, resulting in a heavy computational load, especially for large m (the number of outputs). This is the drawback of the multivariable RLS algorithm in (13)–(15), and it motivates us to study a new coupled parameter identification method. This is the objective of this paper.

3 C-LS estimation algorithm

3.1 C-LS algorithm

Referring to the partially coupled SG identification methods in [31, 32], this section derives a C-LS algorithm for multivariable systems on the basis of the coupling identification concept.

Let φ_i^T(t) ∈ ℝ^{1×n} be the ith row of Φ(t). From (10), we obtain m identification models (subsystems)

y_i(t) = φ_i^T(t)ϑ + v_i(t),  i = 1, 2, …, m    (16)

each of which contains the common parameter vector ϑ ∈ ℝ^n. Obviously, only one subsystem is sufficient to estimate ϑ, but using all the subsystems to estimate ϑ can enhance the parameter estimation accuracy. This is also the motivation of this work. According to the least-squares principle, we can obtain m RLS algorithms for estimating ϑ, the subsystem least-squares (S-LS) algorithms for short:

ϑ̂(t) = ϑ̂(t − 1) + P_i(t)φ_i(t)[y_i(t) − φ_i^T(t)ϑ̂(t − 1)]    (17)

P_i^{-1}(t) = P_i^{-1}(t − 1) + φ_i(t)φ_i^T(t),  i = 1, 2, 3, …, m    (18)

Here, P_i(t) ∈ ℝ^{n×n} is the covariance matrix of subsystem i.


Note from (17)–(18) that each S-LS algorithm contains a common parameter estimation vector ϑ̂(t) for each subsystem – see (17). However, they are independent; namely, the estimation vector of subsystem i does not depend on that of subsystem j for i ≠ j. For the sake of clarity, we write ϑ̂_i(t) for the ϑ̂(t) in (17) of subsystem i, and then the S-LS algorithm in (17) and (18) can be equivalently written as

ϑ̂_i(t) = ϑ̂_i(t − 1) + P_i(t)φ_i(t)[y_i(t) − φ_i^T(t)ϑ̂_i(t − 1)]    (19)

P_i^{-1}(t) = P_i^{-1}(t − 1) + φ_i(t)φ_i^T(t),  i = 1, 2, 3, …, m    (20)

This implies that there is no coupling between the parameter estimates ϑ̂_i(t) of the subsystems. The schematic diagram of the S-LS algorithm in (19) and (20) is shown in Fig. 1.
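The uncoupled S-LS scheme (19)–(20) amounts to running m independent scalar RLS recursions that share the same data stream. The sketch below illustrates this redundancy (each subsystem produces its own estimate of the common ϑ); the data, dimensions and function name are assumptions made for illustration.

```python
import numpy as np

def rls_update(theta, P, phi, y):
    # Standard inversion-free scalar RLS step, shared by all subsystems.
    L = (P @ phi) / (1.0 + phi @ P @ phi)
    return theta + L * (y - phi @ theta), P - np.outer(L, phi @ P)

# Assumed data: m = 2 subsystems sharing the common parameter vector.
rng = np.random.default_rng(4)
m, n, p0 = 2, 3, 1e6
theta_true = np.array([0.6, -0.2, 1.1])
thetas = [np.zeros(n) for _ in range(m)]   # one independent estimate per subsystem
Ps = [p0 * np.eye(n) for _ in range(m)]
for t in range(2000):
    Phi = rng.standard_normal((m, n))
    y = Phi @ theta_true + 0.1 * rng.standard_normal(m)
    for i in range(m):                     # (19)-(20): no coupling across i
        thetas[i], Ps[i] = rls_update(thetas[i], Ps[i], Phi[i], y[i])
print([np.round(th, 2) for th in thetas])
```

Every subsystem converges to the same ϑ separately, which is exactly the redundancy Remark 2 points out.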

Remark 2: For i = 1, 2, 3, …, m, we obtain from (19)–(20) m estimation vectors ϑ̂_i(t), all of which are estimates of the common parameter vector ϑ, because ϑ is estimated once in each subsystem; this results in a large amount of redundancy in the estimate of ϑ. Next, we derive a new C-LS method so as to avoid the redundant estimates of ϑ.

For recursive estimation algorithms, it is desired that the parameter estimates approach their true values as the data length t increases. Thus, one may replace ϑ̂_i(t − 1) with ϑ̂_{i−1}(t) [15, 16, 36].

Referring to the partially coupled SG algorithm in [31] and by means of the idea of the Jacobi and Gauss–Seidel iterations [37], replacing ϑ̂_i(t − 1) on the right-hand side of (19) with ϑ̂_{i−1}(t) for i = 2, 3, …, m, and replacing ϑ̂_1(t − 1) on the right-hand side of (19) with ϑ̂_m(t − 1) for i = 1, give the following C-LS algorithm [35, 38]:

ϑ̂_i(t) = ϑ̂_{i−1}(t) + L_i(t)[y_i(t) − φ_i^T(t)ϑ̂_{i−1}(t)]    (21)

L_i(t) = P_i(t)φ_i(t)    (22)

P_i^{-1}(t) = P_{i−1}^{-1}(t) + φ_i(t)φ_i^T(t),  i = 2, 3, …, m    (23)

and

ϑ̂_1(t) = ϑ̂_m(t − 1) + L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)]    (24)

L_1(t) = P_1(t)φ_1(t)    (25)

P_1^{-1}(t) = P_m^{-1}(t − 1) + φ_1(t)φ_1^T(t)    (26)

Applying the matrix inversion lemma (6) to (23) and (26),

Fig. 1 Schematic diagram of the S-LS algorithm


the C-LS algorithm can be equivalently expressed as [38]

ϑ̂_i(t) = ϑ̂_{i−1}(t) + L_i(t)[y_i(t) − φ_i^T(t)ϑ̂_{i−1}(t)]    (27)

L_i(t) = P_{i−1}(t)φ_i(t)/[1 + φ_i^T(t)P_{i−1}(t)φ_i(t)]    (28)

P_i(t) = [I − L_i(t)φ_i^T(t)]P_{i−1}(t),  i = 2, 3, …, m    (29)

and

ϑ̂_1(t) = ϑ̂_m(t − 1) + L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)]    (30)

L_1(t) = P_m(t − 1)φ_1(t)/[1 + φ_1^T(t)P_m(t − 1)φ_1(t)]    (31)

P_1(t) = [I − L_1(t)φ_1^T(t)]P_m(t − 1),  P_m(0) = p_0 I_n    (32)

where ϑ̂_i(t) ∈ ℝ^n, L_i(t) ∈ ℝ^n and P_i(t) ∈ ℝ^{n×n} are the parameter estimation vector, the gain vector and the covariance matrix of the ith subsystem at time t, respectively; ϑ̂_{i−1}(t) and P_{i−1}(t) are the parameter estimation vector and the covariance matrix of the (i − 1)th subsystem at time t, respectively; and ϑ̂_m(t − 1) and P_m(t − 1) are the parameter estimation vector and the covariance matrix of the mth subsystem at time t − 1, respectively.

The C-LS algorithm can be obtained in a similar way to [38], and the schematic diagram of the C-LS algorithm in (21)–(26) is shown in Fig. 2. In Fig. 2, the parameter estimate ϑ̂_1(t) of Subsystem 1 equals the estimate ϑ̂_m(t − 1) of Subsystem m at the preceding time t − 1 plus the modification term L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)] – see (24) or (30), and the covariance matrix P_1(t) of Subsystem 1 at time t is computed from the covariance matrix P_m(t − 1) of Subsystem m at the preceding time t − 1 and the gain vector L_1(t) and information vector φ_1(t) of Subsystem 1 – see (26) or (32). Similarly, the parameter estimate ϑ̂_2(t) of Subsystem 2 equals the estimate ϑ̂_1(t) of Subsystem 1 plus the modification term L_2(t)[y_2(t) − φ_2^T(t)ϑ̂_1(t)] – see (21) or (27) with i = 2, and the covariance matrix P_2(t) of Subsystem 2 is computed from the covariance matrix P_1(t) of Subsystem 1 and the gain vector L_2(t) and information vector φ_2(t) of Subsystem 2 – see (23) or (29) with i = 2. A similar procedure is conducted as i increases.

The steps of computing the estimate ϑ̂_m(t) by the C-LS algorithm in (27)–(32) are listed in the following.

1. Set the initial values: let t = 1, ϑ̂_m(0) = 1_n/p_0, P_m(0) = p_0 I_n, p_0 = 10^6.
2. Collect the observation data y(t) and Φ(t), and let φ_i^T(t) ∈ ℝ^{1×n} be the ith row of Φ(t).
3. Compute the gain vector L_1(t) by (31) and the covariance matrix P_1(t) by (32), and update the estimate ϑ̂_1(t) by (30).
4. For i = 2, 3, …, m, compute the gain vector L_i(t) by (28) and the covariance matrix P_i(t) by (29), and update the estimate ϑ̂_i(t) by (27).
5. Increase t by 1 and go to Step 2.
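The steps above can be sketched as a short NumPy routine: one call processes the m subsystems of a single time step, and only scalar divisions appear, with no m × m matrix inversion. The simulation data, dimensions and function name are assumptions for illustration, not the paper's Example 1 or 2.

```python
import numpy as np

def cls_step(theta, P, Phi, y):
    """One time step (Steps 2-4) of the C-LS algorithm (27)-(32).

    theta, P: estimate and covariance left by Subsystem m at time t-1;
    Phi: (m, n) information matrix at time t; y: (m,) output vector.
    Subsystem 1 starts from (theta_m(t-1), P_m(t-1)); each later
    subsystem starts from its predecessor at the same time t.
    """
    for i in range(Phi.shape[0]):          # subsystems i = 1, ..., m
        phi = Phi[i]
        denom = 1.0 + phi @ P @ phi        # scalar, cf. (28)/(31)
        L = (P @ phi) / denom              # gain vector
        theta = theta + L * (y[i] - phi @ theta)   # cf. (27)/(30)
        P = P - np.outer(L, phi @ P)       # cf. (29)/(32)
    return theta, P

# Assumed simulation: identify theta from y(t) = Phi(t) theta + v(t).
rng = np.random.default_rng(1)
m, n, p0 = 2, 4, 1e6
theta_true = np.array([0.5, -1.0, 0.3, 2.0])
theta, P = np.ones(n) / p0, p0 * np.eye(n)   # Step 1 initial values
for t in range(3000):
    Phi = rng.standard_normal((m, n))
    y = Phi @ theta_true + 0.1 * rng.standard_normal(m)
    theta, P = cls_step(theta, P, Phi, y)
print(np.round(theta, 2))
```

Carrying (theta, P) across both the subsystem loop and the time loop is what implements the coupling: Subsystem 1 at time t + 1 picks up exactly where Subsystem m left off.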

Remark 3: Comparing (27) with (19), the C-LS algorithm in (27)–(32) uses the estimate ϑ̂_{i−1}(t) on the right-hand side of (27) instead of ϑ̂_i(t − 1) on the right-hand side of (19) for i = 2, 3, …, m. When computing ϑ̂_1(t), the C-LS algorithm uses the estimate ϑ̂_m(t − 1) on the right-hand side of (30) instead of ϑ̂_1(t − 1) on the right-hand side of (19) with i = 1, as can be seen by comparing (30) with (19). Thus, the C-LS algorithm is different from the S-LS algorithm.

Remark 4: The RLS algorithm in (13)–(15) requires computing the matrix inversion [I_m + Φ(t)P(t − 1)Φ^T(t)]^{-1} – see (14) – but the C-LS algorithm in (27)–(32) does not involve this matrix inversion. Thus, the C-LS algorithm is computationally superior to the multivariable RLS algorithm in (13)–(15).

3.2 Relation between the RLS and C-LS algorithms

Regarding the parameter estimate ϑ̂_m(t) and the covariance matrix P_m(t) of Subsystem m, we have the following theorem [38].

Theorem 1: The parameter estimate ϑ̂_m(t) and the covariance matrix P_m(t) of Subsystem m in (27)–(29) with i = m are equivalent to the estimate ϑ̂(t) and covariance matrix P(t) in (13)–(15); that is, ϑ̂(t) = ϑ̂_m(t) and P(t) = P_m(t).

Proof: From (21)–(23) with i = 2, we have

ϑ̂_2(t) = ϑ̂_1(t) + L_2(t)[y_2(t) − φ_2^T(t)ϑ̂_1(t)]    (33)

L_2(t) = P_2(t)φ_2(t)    (34)

P_2^{-1}(t) = P_1^{-1}(t) + φ_2(t)φ_2^T(t)    (35)

Fig. 2 Schematic diagram of the C-LS algorithm [38]


Substituting (24) into (33) yields

ϑ̂_2(t) = ϑ̂_1(t) + L_2(t)[y_2(t) − φ_2^T(t)ϑ̂_1(t)]
  = ϑ̂_m(t − 1) + L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)] + L_2(t)(y_2(t) − φ_2^T(t){ϑ̂_m(t − 1) + L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)]})
  = ϑ̂_m(t − 1) + [I − L_2(t)φ_2^T(t)]L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)] + L_2(t)[y_2(t) − φ_2^T(t)ϑ̂_m(t − 1)]    (36)

Substituting (26) into (35) yields

P_2^{-1}(t) = P_m^{-1}(t − 1) + Σ_{i=1}^{2} φ_i(t)φ_i^T(t)    (37)

Successive substitution gives

ϑ̂_m(t) = ϑ̂_m(t − 1) + {∏_{i=2}^{m} [I − L_i(t)φ_i^T(t)]} L_1(t)[y_1(t) − φ_1^T(t)ϑ̂_m(t − 1)]
  + {∏_{i=3}^{m} [I − L_i(t)φ_i^T(t)]} L_2(t)[y_2(t) − φ_2^T(t)ϑ̂_m(t − 1)] + ⋯
  + [I − L_m(t)φ_m^T(t)]L_{m−1}(t)[y_{m−1}(t) − φ_{m−1}^T(t)ϑ̂_m(t − 1)]
  + L_m(t)[y_m(t) − φ_m^T(t)ϑ̂_m(t − 1)]    (38)

L_m(t) = P_m(t)φ_m(t)    (39)

P_m^{-1}(t) = P_m^{-1}(t − 1) + Σ_{i=1}^{m} φ_i(t)φ_i^T(t)    (40)

From (29), we have

P_i(t) = [I − L_i(t)φ_i^T(t)]P_{i−1}(t)
  = [I − L_i(t)φ_i^T(t)][I − L_{i−1}(t)φ_{i−1}^T(t)]P_{i−2}(t)
  = [I − L_i(t)φ_i^T(t)][I − L_{i−1}(t)φ_{i−1}^T(t)][I − L_{i−2}(t)φ_{i−2}^T(t)]P_{i−3}(t)
  = ⋯
  = [I − L_i(t)φ_i^T(t)][I − L_{i−1}(t)φ_{i−1}^T(t)][I − L_{i−2}(t)φ_{i−2}^T(t)] ⋯ [I − L_2(t)φ_2^T(t)]P_1(t)

Let i = m. Post-multiplying the resulting equations by φ_{m−j}(t) (j = 1, 2, …, m − 1) and using (25), (34) and (39) yield

P_m(t)φ_{m−1}(t) = [I − L_m(t)φ_m^T(t)]P_{m−1}(t)φ_{m−1}(t)
  = [I − L_m(t)φ_m^T(t)]L_{m−1}(t)    (41)

P_m(t)φ_{m−2}(t) = [I − L_m(t)φ_m^T(t)][I − L_{m−1}(t)φ_{m−1}^T(t)]P_{m−2}(t)φ_{m−2}(t)
  = [I − L_m(t)φ_m^T(t)][I − L_{m−1}(t)φ_{m−1}^T(t)]L_{m−2}(t)    (42)

⋮

P_m(t)φ_2(t) = {∏_{i=3}^{m} [I − L_i(t)φ_i^T(t)]}P_2(t)φ_2(t)
  = {∏_{i=3}^{m} [I − L_i(t)φ_i^T(t)]}L_2(t)    (43)

P_m(t)φ_1(t) = {∏_{i=2}^{m} [I − L_i(t)φ_i^T(t)]}P_1(t)φ_1(t)
  = {∏_{i=2}^{m} [I − L_i(t)φ_i^T(t)]}L_1(t)    (44)

Using (41)–(44), (38)–(40) can be written as

ϑ̂_m(t) = ϑ̂_m(t − 1) + P_m(t)Φ^T(t)[y(t) − Φ(t)ϑ̂_m(t − 1)]    (45)

L_m(t) = P_m(t)φ_m(t)    (46)

P_m^{-1}(t) = P_m^{-1}(t − 1) + Φ^T(t)Φ(t)    (47)

The algorithm in (45)–(47) is equivalent to the algorithm in (13)–(15). This proves Theorem 1.

Remark 5: Theorem 1 indicates that the parameter estimates given by the C-LS algorithm in (27)–(32) equal those of the RLS algorithm in (13)–(15), but the C-LS algorithm does not involve the matrix inversion [I_m + Φ(t)P(t − 1)Φ^T(t)]^{-1} – compare the C-LS algorithm in (27)–(32) with the multivariable RLS algorithm in (13)–(15). Theorem 1 and its proof are taken from reference [38]. The following studies the convergence of the C-LS algorithm.
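The equivalence stated in Theorem 1 and Remark 5 can be checked numerically: running the multivariable RLS recursion (13)–(15) and the C-LS recursion (27)–(32) on the same data from the same initial values should give identical estimates up to floating-point error. The sketch below is an independent check written for this purpose; the random data, function names and dimensions are assumptions, not taken from the paper's simulation example.

```python
import numpy as np

def mrls_step(theta, P, Phi, y):
    # Multivariable RLS (13)-(15): inverts an m x m matrix per step.
    m, n = Phi.shape
    L = P @ Phi.T @ np.linalg.inv(np.eye(m) + Phi @ P @ Phi.T)
    return theta + L @ (y - Phi @ theta), (np.eye(n) - L @ Phi) @ P

def cls_step(theta, P, Phi, y):
    # C-LS (27)-(32): one scalar-division update per subsystem row.
    for i in range(Phi.shape[0]):
        phi = Phi[i]
        L = (P @ phi) / (1.0 + phi @ P @ phi)
        theta = theta + L * (y[i] - phi @ theta)
        P = P - np.outer(L, phi @ P)
    return theta, P

# Hypothetical random data; both algorithms share the same initial values.
rng = np.random.default_rng(2)
m, n, p0 = 3, 4, 1e6
theta_true = rng.standard_normal(n)
th_rls, P_rls = np.ones(n) / p0, p0 * np.eye(n)
th_cls, P_cls = np.ones(n) / p0, p0 * np.eye(n)
for t in range(200):
    Phi = rng.standard_normal((m, n))
    y = Phi @ theta_true + 0.1 * rng.standard_normal(m)
    th_rls, P_rls = mrls_step(th_rls, P_rls, Phi, y)
    th_cls, P_cls = cls_step(th_cls, P_cls, Phi, y)
print(np.allclose(th_rls, th_cls), np.allclose(P_rls, P_cls))
```

Both the estimates and the covariance matrices agree at every step, as Theorem 1 predicts, while the C-LS branch never forms an m × m inverse.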

4 Performance analysis

The C-LS algorithm differs from the RLS algorithm, and thus its convergence is important. This section analyses the convergence properties of the proposed C-LS algorithm in (21)–(26). The convergence of the RLS algorithm in (11)–(12) has been reported in [14, 18, 39].

Assume that {v_i(t), ℱ_t} (i = 1, 2, …, m) is a martingale difference sequence defined on a probability space {Ω, ℱ, P}, where {ℱ_t} is the σ-algebra sequence generated by the observations up to and including time t [18]. The noise sequence {v_i(t)} satisfies the following assumptions [18]:

(A1) E[v_i(t)|ℱ_{t−1}] = 0;

(A2) E[v_i^2(t)|ℱ_{t−1}] ≤ σ^2 < ∞, a.s. (a.s.: almost surely)

Lemma 1 (martingale convergence theorem; Lemma D.5.3 in [18]): If T_t, a_t and b_t are non-negative random variables, measurable with respect to a non-decreasing sequence of σ-algebras ℱ_{t−1}, and satisfy

E[T_t|ℱ_{t−1}] ≤ T_{t−1} + a_t − b_t, a.s.,

then when Σ_{t=1}^{∞} a_t < ∞, a.s., we have Σ_{t=1}^{∞} b_t < ∞, a.s., and T_t → T, a.s., where T is a finite random variable.

Lemma 2: For the C-LS algorithm in (21)–(26), the following inequalities hold for any c > 1, a.s.:

1. Σ_{t=1}^{∞} Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/[ln |P_i^{-1}(t)|]^c < ∞

2. Σ_{t=1}^{∞} Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/(ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c) < ∞

3. Σ_{t=1}^{∞} Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/(ln |P_i^{-1}(t)| ln ln |P_i^{-1}(t)| [ln ln ln |P_i^{-1}(t)|]^c) < ∞

4. Σ_{t=1}^{∞} Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/(ln |P_i^{-1}(t)| ln ln |P_i^{-1}(t)| ln ln ln |P_i^{-1}(t)| [ln ln ln ln |P_i^{-1}(t)|]^c) < ∞

Proof: According to (23), we have

P_{i−1}^{-1}(t) = P_i^{-1}(t) − φ_i(t)φ_i^T(t) = P_i^{-1}(t)[I − P_i(t)φ_i(t)φ_i^T(t)]

Taking the determinant of both sides of the above equation and using the formula det[I_m + DE] = det[I_n + ED] give

|P_{i−1}^{-1}(t)| = |P_i^{-1}(t)| |I − P_i(t)φ_i(t)φ_i^T(t)| = |P_i^{-1}(t)| [1 − φ_i^T(t)P_i(t)φ_i(t)]

Solving for φ_i^T(t)P_i(t)φ_i(t) gives

φ_i^T(t)P_i(t)φ_i(t) = (|P_i^{-1}(t)| − |P_{i−1}^{-1}(t)|)/|P_i^{-1}(t)|,  i = 2, 3, …, m    (48)

Similarly, according to (26), we have

φ_1^T(t)P_1(t)φ_1(t) = (|P_1^{-1}(t)| − |P_m^{-1}(t − 1)|)/|P_1^{-1}(t)|    (49)

† For part 1, dividing both sides of (48) by [ln |P_i^{-1}(t)|]^c and summing for i from 2 to m give

Σ_{i=2}^{m} φ_i^T(t)P_i(t)φ_i(t)/[ln |P_i^{-1}(t)|]^c = Σ_{i=2}^{m} (|P_i^{-1}(t)| − |P_{i−1}^{-1}(t)|)/(|P_i^{-1}(t)| [ln |P_i^{-1}(t)|]^c)    (50)

Dividing both sides of (49) by [ln |P_1^{-1}(t)|]^c gives

φ_1^T(t)P_1(t)φ_1(t)/[ln |P_1^{-1}(t)|]^c = (|P_1^{-1}(t)| − |P_m^{-1}(t − 1)|)/(|P_1^{-1}(t)| [ln |P_1^{-1}(t)|]^c)    (51)

Adding (51) to (50), we have

Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/[ln |P_i^{-1}(t)|]^c
  = (|P_1^{-1}(t)| − |P_m^{-1}(t − 1)|)/(|P_1^{-1}(t)| [ln |P_1^{-1}(t)|]^c) + Σ_{i=2}^{m} (|P_i^{-1}(t)| − |P_{i−1}^{-1}(t)|)/(|P_i^{-1}(t)| [ln |P_i^{-1}(t)|]^c)
  = ∫_{|P_m^{-1}(t−1)|}^{|P_1^{-1}(t)|} dx/(|P_1^{-1}(t)| [ln |P_1^{-1}(t)|]^c) + Σ_{i=2}^{m} ∫_{|P_{i−1}^{-1}(t)|}^{|P_i^{-1}(t)|} dx/(|P_i^{-1}(t)| [ln |P_i^{-1}(t)|]^c)
  ≤ ∫_{|P_m^{-1}(t−1)|}^{|P_1^{-1}(t)|} dx/(x[ln x]^c) + Σ_{i=2}^{m} ∫_{|P_{i−1}^{-1}(t)|}^{|P_i^{-1}(t)|} dx/(x[ln x]^c)
  = ∫_{|P_m^{-1}(t−1)|}^{|P_m^{-1}(t)|} dx/(x[ln x]^c)

Summing for t from 1 to ∞ gives

Σ_{t=1}^{∞} Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/[ln |P_i^{-1}(t)|]^c = Σ_{t=1}^{∞} ∫_{|P_m^{-1}(t−1)|}^{|P_m^{-1}(t)|} dx/(x[ln x]^c) = ∫_{|P_m^{-1}(0)|}^{|P_m^{-1}(∞)|} dx/(x[ln x]^c)
  = −(1/(c − 1)) [ln x]^{1−c} evaluated from |P_m^{-1}(0)| to |P_m^{-1}(∞)|
  = (1/(c − 1)) {1/[ln |P_m^{-1}(0)|]^{c−1} − 1/[ln |P_m^{-1}(∞)|]^{c−1}} < ∞, a.s.

This proves part 1 of Lemma 2.

† For part 2, dividing both sides of (48) by ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c and summing for i from 2 to m give

Σ_{i=2}^{m} φ_i^T(t)P_i(t)φ_i(t)/(ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c) = Σ_{i=2}^{m} (|P_i^{-1}(t)| − |P_{i−1}^{-1}(t)|)/(|P_i^{-1}(t)| ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c)    (52)

Dividing both sides of (49) by ln |P_1^{-1}(t)| [ln ln |P_1^{-1}(t)|]^c gives

φ_1^T(t)P_1(t)φ_1(t)/(ln |P_1^{-1}(t)| [ln ln |P_1^{-1}(t)|]^c) = (|P_1^{-1}(t)| − |P_m^{-1}(t − 1)|)/(|P_1^{-1}(t)| ln |P_1^{-1}(t)| [ln ln |P_1^{-1}(t)|]^c)    (53)


Adding (53) to (52), we have

Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/(ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c)
  = (|P_1^{-1}(t)| − |P_m^{-1}(t − 1)|)/(|P_1^{-1}(t)| ln |P_1^{-1}(t)| [ln ln |P_1^{-1}(t)|]^c) + Σ_{i=2}^{m} (|P_i^{-1}(t)| − |P_{i−1}^{-1}(t)|)/(|P_i^{-1}(t)| ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c)
  = ∫_{|P_m^{-1}(t−1)|}^{|P_1^{-1}(t)|} dx/(|P_1^{-1}(t)| ln |P_1^{-1}(t)| [ln ln |P_1^{-1}(t)|]^c) + Σ_{i=2}^{m} ∫_{|P_{i−1}^{-1}(t)|}^{|P_i^{-1}(t)|} dx/(|P_i^{-1}(t)| ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c)
  ≤ ∫_{|P_m^{-1}(t−1)|}^{|P_1^{-1}(t)|} dx/(x ln x [ln ln x]^c) + Σ_{i=2}^{m} ∫_{|P_{i−1}^{-1}(t)|}^{|P_i^{-1}(t)|} dx/(x ln x [ln ln x]^c)
  = ∫_{|P_m^{-1}(t−1)|}^{|P_m^{-1}(t)|} dx/(x ln x [ln ln x]^c)

Summing for t from 1 to ∞ gives

Σ_{t=1}^{∞} Σ_{i=1}^{m} φ_i^T(t)P_i(t)φ_i(t)/(ln |P_i^{-1}(t)| [ln ln |P_i^{-1}(t)|]^c) = Σ_{t=1}^{∞} ∫_{|P_m^{-1}(t−1)|}^{|P_m^{-1}(t)|} dx/(x ln x [ln ln x]^c) = ∫_{|P_m^{-1}(0)|}^{|P_m^{-1}(∞)|} dx/(x ln x [ln ln x]^c)
  = −(1/(c − 1)) [ln ln x]^{1−c} evaluated from |P_m^{-1}(0)| to |P_m^{-1}(∞)|
  = (1/(c − 1)) {1/[ln ln |P_m^{-1}(0)|]^{c−1} − 1/[ln ln |P_m^{-1}(∞)|]^{c−1}} < ∞, a.s.

This proves part 2 of Lemma 2.

† Similarly, we can prove parts 3 and 4 of Lemma 2. This completes the proof of Lemma 2.

Theorem: For the identification model in (16) and the C-LS algorithm in (21)–(26), suppose that (A1) and (A2) hold. Then for any c > 1, the parameter estimate ϑ̂(t) = ϑ̂_m(t) satisfies, a.s.:

1. ‖ϑ̂(t) − ϑ‖^2 = O({ln tr[P_m^{-1}(t)]}^c / λ_min[P_m^{-1}(t)])

2. ‖ϑ̂(t) − ϑ‖^2 = O(ln tr[P_m^{-1}(t)] {ln ln tr[P_m^{-1}(t)]}^c / λ_min[P_m^{-1}(t)])

3. ‖ϑ̂(t) − ϑ‖^2 = O(ln tr[P_m^{-1}(t)] ln ln tr[P_m^{-1}(t)] {ln ln ln tr[P_m^{-1}(t)]}^c / λ_min[P_m^{-1}(t)])

4. ‖ϑ̂(t) − ϑ‖^2 = O(ln tr[P_m^{-1}(t)] ln ln tr[P_m^{-1}(t)] ln ln ln tr[P_m^{-1}(t)] {ln ln ln ln tr[P_m^{-1}(t)]}^c / λ_min[P_m^{-1}(t)])

These give the convergence rates of the parameter estimates.


Proof: Define the parameter estimation error vectors

ϑ̃_i(t) := ϑ̂_i(t) − ϑ,  i = 1, 2, …, m

Using (21), (22) and (16), it follows that

ϑ̃_i(t) = ϑ̃_{i−1}(t) + P_i(t)φ_i(t)[−φ_i^T(t)ϑ̃_{i−1}(t) + v_i(t)]
  =: ϑ̃_{i−1}(t) + P_i(t)φ_i(t)[−ξ_i(t) + v_i(t)],  i = 2, 3, …, m    (54)

where

ξ_i(t) := φ_i^T(t)ϑ̂_{i−1}(t) − φ_i^T(t)ϑ = φ_i^T(t)ϑ̃_{i−1}(t),  i = 2, 3, …, m    (55)

Using (24) and (16), it follows that

ϑ̃_1(t) = ϑ̃_m(t − 1) + P_1(t)φ_1(t)[−φ_1^T(t)ϑ̃_m(t − 1) + v_1(t)]
  =: ϑ̃_m(t − 1) + P_1(t)φ_1(t)[−ξ_1(t) + v_1(t)]    (56)

where

ξ_1(t) := φ_1^T(t)ϑ̂_m(t − 1) − φ_1^T(t)ϑ = φ_1^T(t)ϑ̃_m(t − 1)    (57)

Define the non-negative functions

V_i(t) := ϑ̃_i^T(t)P_i^{-1}(t)ϑ̃_i(t),  i = 1, 2, …, m

Using (54), (55) and (23) together with tr[AB] = tr[BA] and tr[A^T] = tr[A], it follows that

V_i(t) = ϑ̃_i^T(t)P_i^{-1}(t)ϑ̃_i(t)
  = {ϑ̃_{i−1}(t) + P_i(t)φ_i(t)[−ξ_i(t) + v_i(t)]}^T P_i^{-1}(t) {ϑ̃_{i−1}(t) + P_i(t)φ_i(t)[−ξ_i(t) + v_i(t)]}
  = ϑ̃_{i−1}^T(t)P_i^{-1}(t)ϑ̃_{i−1}(t) + 2ϑ̃_{i−1}^T(t)φ_i(t)[−ξ_i(t) + v_i(t)] + φ_i^T(t)P_i(t)φ_i(t)[−ξ_i(t) + v_i(t)]^2
  = ϑ̃_{i−1}^T(t)[P_{i−1}^{-1}(t) + φ_i(t)φ_i^T(t)]ϑ̃_{i−1}(t) + 2ξ_i(t)[−ξ_i(t) + v_i(t)] + φ_i^T(t)P_i(t)φ_i(t)[ξ_i^2(t) + v_i^2(t) − 2ξ_i(t)v_i(t)]
  = V_{i−1}(t) − [1 − φ_i^T(t)P_i(t)φ_i(t)]ξ_i^2(t) + φ_i^T(t)P_i(t)φ_i(t)v_i^2(t) + 2[1 − φ_i^T(t)P_i(t)φ_i(t)]ξ_i(t)v_i(t)
  ≤ V_{i−1}(t) + φ_i^T(t)P_i(t)φ_i(t)v_i^2(t) + 2[1 − φ_i^T(t)P_i(t)φ_i(t)]ξ_i(t)v_i(t),  i = 2, 3, …, m    (58)

IET Control Theory Appl., 2013, Vol. 7, Iss. 1, pp. 68–79, doi: 10.1049/iet-cta.2012.0171


Here, we have used the inequality

$$1 - \varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t) = [1 + \varphi_i^{\mathrm T}(t)P_{i-1}(t)\varphi_i(t)]^{-1} > 0, \quad i = 2, 3, \ldots, m$$

From (58) with i = m, successive substitution yields

$$V_m(t) \le V_1(t) + \sum_{i=2}^{m}\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)v_i^2(t) + \sum_{i=2}^{m} 2[1 - \varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)]\xi_i(t)v_i(t) \quad (59)$$

Similarly, using (56), (57) and (26), we have

$$
\begin{aligned}
V_1(t) &= \tilde\theta_1^{\mathrm T}(t)P_1^{-1}(t)\tilde\theta_1(t) \\
&= \tilde\theta_m^{\mathrm T}(t-1)P_1^{-1}(t)\tilde\theta_m(t-1) + 2\tilde\theta_m^{\mathrm T}(t-1)\varphi_1(t)[-\xi_1(t) + v_1(t)] + \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)[-\xi_1(t) + v_1(t)]^2 \\
&= \tilde\theta_m^{\mathrm T}(t-1)[P_m^{-1}(t-1) + \varphi_1(t)\varphi_1^{\mathrm T}(t)]\tilde\theta_m(t-1) + 2\xi_1(t)[-\xi_1(t) + v_1(t)] + \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)[\xi_1^2(t) + v_1^2(t) - 2\xi_1(t)v_1(t)] \\
&= V_m(t-1) - [1 - \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)]\xi_1^2(t) + \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)v_1^2(t) + 2[1 - \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)]\xi_1(t)v_1(t) \\
&\le V_m(t-1) + \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)v_1^2(t) + 2[1 - \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t)]\xi_1(t)v_1(t) \quad (60)
\end{aligned}
$$

Here, we have used the inequality

$$1 - \varphi_1^{\mathrm T}(t)P_1(t)\varphi_1(t) = [1 + \varphi_1^{\mathrm T}(t)P_m(t-1)\varphi_1(t)]^{-1} > 0$$

Substituting (60) into (59) gives

$$V_m(t) \le V_m(t-1) + \sum_{i=1}^{m}\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)v_i^2(t) + \sum_{i=1}^{m} 2[1 - \varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)]\xi_i(t)v_i(t) \quad (61)$$

Since $\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)$ and $\xi_i(t)$ are uncorrelated with $v_i(t)$ and are $\mathcal F_{t-1}$-measurable, taking the conditional expectation of both sides of (61) with respect to $\mathcal F_{t-1}$ and using (A1) gives

$$\mathrm E[V_m(t)\,|\,\mathcal F_{t-1}] \le V_m(t-1) + \sum_{i=1}^{m}\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)\,\sigma^2, \quad \text{a.s.} \quad (62)$$

Let

$$Z(t) = \frac{V_m(t)}{[\ln|P_m^{-1}(t)|]^c}, \quad c > 1$$

Since $\ln|P_m^{-1}(t)|$ is non-decreasing and

$$P_m^{-1}(t) \ge P_{m-1}^{-1}(t) \ge \cdots \ge P_1^{-1}(t) \ge P_m^{-1}(t-1)$$


we have

$$
\begin{aligned}
\mathrm E[Z(t)\,|\,\mathcal F_{t-1}] &\le \frac{V_m(t-1)}{[\ln|P_m^{-1}(t)|]^c} + \sum_{i=1}^{m}\frac{2\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)}{[\ln|P_m^{-1}(t)|]^c}\,\sigma^2 \\
&\le \frac{V_m(t-1)}{[\ln|P_m^{-1}(t-1)|]^c} + \sum_{i=1}^{m}\frac{2\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)}{[\ln|P_i^{-1}(t)|]^c}\,\sigma^2 \\
&\le Z(t-1) + \sum_{i=1}^{m}\frac{2\varphi_i^{\mathrm T}(t)P_i(t)\varphi_i(t)}{[\ln|P_i^{-1}(t)|]^c}\,\sigma^2, \quad \text{a.s.} \quad (63)
\end{aligned}
$$

Using Lemma 2, the sum of the second term on the right-hand side of (63) over t from t = 1 to t = ∞ is finite; thus, applying the martingale convergence theorem in Lemma 1, we conclude that Z(t) converges a.s. to a finite random variable, say Z₀, that is,

$$Z(t) = \frac{V_m(t)}{[\ln|P_m^{-1}(t)|]^c} \to Z_0 < \infty, \quad \text{a.s.}$$

or

$$V_m(t) = O([\ln|P_m^{-1}(t)|]^c), \quad \text{a.s.} \quad (64)$$

According to the definition of $V_m(t)$, we have

$$\|\tilde\theta_m(t)\|^2 \le \frac{\operatorname{tr}[\tilde\theta_m^{\mathrm T}(t)P_m^{-1}(t)\tilde\theta_m(t)]}{\lambda_{\min}[P_m^{-1}(t)]} = \frac{V_m(t)}{\lambda_{\min}[P_m^{-1}(t)]} \quad (65)$$

Using (64) and (65), we can obtain

$$\|\hat\theta(t) - \theta\|^2 = \|\tilde\theta_m(t)\|^2 = O\!\left(\frac{[\ln|P_m^{-1}(t)|]^c}{\lambda_{\min}[P_m^{-1}(t)]}\right) = O\!\left(\frac{\{\ln\operatorname{tr}[P_m^{-1}(t)]\}^c}{\lambda_{\min}[P_m^{-1}(t)]}\right), \quad \text{a.s.}, \quad c > 1$$

This proves Conclusion 1 of Theorem 2. Similarly, letting

$$Z_2(t) = \frac{V_m(t)}{\ln|P_m^{-1}(t)|\,[\ln\ln|P_m^{-1}(t)|]^c}$$

$$Z_3(t) = \frac{V_m(t)}{\ln|P_m^{-1}(t)|\,\ln\ln|P_m^{-1}(t)|\,[\ln\ln\ln|P_m^{-1}(t)|]^c}$$

$$Z_4(t) = \frac{V_m(t)}{\ln|P_m^{-1}(t)|\,\ln\ln|P_m^{-1}(t)|\,\ln\ln\ln|P_m^{-1}(t)|\,[\ln\ln\ln\ln|P_m^{-1}(t)|]^c}$$

we can obtain Conclusions 2–4 of Theorem 2.

Since $\ln\lambda_{\max}[P_m^{-1}(t)] = O(\ln\operatorname{tr}[P_m^{-1}(t)])$, the conclusions of Theorem 2 can be expressed as

1. $\|\hat\theta(t)-\theta\|^2 = O\!\left(\dfrac{\{\ln\lambda_{\max}[P_m^{-1}(t)]\}^c}{\lambda_{\min}[P_m^{-1}(t)]}\right)$, a.s.

2. $\|\hat\theta(t)-\theta\|^2 = O\!\left(\dfrac{\ln\lambda_{\max}[P_m^{-1}(t)]\,\{\ln\ln\lambda_{\max}[P_m^{-1}(t)]\}^c}{\lambda_{\min}[P_m^{-1}(t)]}\right)$, a.s.

3. $\|\hat\theta(t)-\theta\|^2 = O\!\left(\dfrac{\ln\lambda_{\max}[P_m^{-1}(t)]\,\{\ln\ln\lambda_{\max}[P_m^{-1}(t)]\}\,\{\ln\ln\ln\lambda_{\max}[P_m^{-1}(t)]\}^c}{\lambda_{\min}[P_m^{-1}(t)]}\right)$, a.s.

4. $\|\hat\theta(t)-\theta\|^2 = O\!\left(\dfrac{\ln\lambda_{\max}[P_m^{-1}(t)]\,\{\ln\ln\lambda_{\max}[P_m^{-1}(t)]\}\,\{\ln\ln\ln\lambda_{\max}[P_m^{-1}(t)]\}\,\{\ln\ln\ln\ln\lambda_{\max}[P_m^{-1}(t)]\}^c}{\lambda_{\min}[P_m^{-1}(t)]}\right)$, a.s.

Remark 6: Theorem 2 indicates that the bounds in the four conclusions are successively tighter. Under the persistent excitation condition

$$c_1 I_n \le \frac{1}{t}\sum_{j=1}^{t}\sum_{i=1}^{m}\varphi_i(j)\varphi_i^{\mathrm T}(j) \le c_2 I_n, \quad \text{a.s.}, \quad c_1, c_2 > 0, \ \text{for large } t$$

the parameter estimation error converges to zero. In this case, Conclusions 1 and 4 in Theorem 2 can be expressed as

1. $\|\hat\theta(t)-\theta\|^2 = O\!\left(\dfrac{(\ln t)^c}{t}\right)$, a.s.

4. $\|\hat\theta(t)-\theta\|^2 = O\!\left(\dfrac{\ln t\,(\ln\ln t)\,(\ln\ln\ln t)\,(\ln\ln\ln\ln t)^c}{t}\right)$, a.s.

Since $(\ln t)^c/t \to 0$, both bounds vanish, and the bound in Conclusion 4 goes to zero faster than that in Conclusion 1 as t increases.
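The ordering of these two rates can be checked numerically. The sketch below is purely illustrative: it evaluates the two bounds up to constant factors (function name and the choice c = 2 are this sketch's own), in the persistent-excitation case where both tr[P_m^{-1}(t)] and λ_min[P_m^{-1}(t)] grow like t; note that the quadruple logarithm requires t larger than roughly 4 × 10⁶ to be positive.

```python
import math

def rate_bounds(t, c=2.0):
    """Evaluate the rate bounds of Conclusions 1 and 4 up to constants.

    b1 ~ (ln t)^c / t              (Conclusion 1)
    b4 ~ ln t * lnln t * lnlnln t * (lnlnlnln t)^c / t   (Conclusion 4)
    Requires t large enough that ln ln ln ln t > 0.
    """
    b1 = math.log(t) ** c / t
    lll = math.log(math.log(math.log(t)))
    llll = math.log(math.log(math.log(math.log(t))))
    b4 = math.log(t) * math.log(math.log(t)) * lll * llll ** c / t
    return b1, b4
```

Evaluating at, say, t = 10¹⁰ and t = 10¹⁴ shows both bounds decreasing with t and the Conclusion 4 bound strictly below the Conclusion 1 bound.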

5 Examples

Example 1: Consider the following two-input two-output linear multivariable system in [30]

$$a(z)y(t) = \Theta(z)u(t) + v(t) \quad (68)$$

$$a(z) = 1 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} = 1 - 1.15z^{-1} + 0.425z^{-2} - 0.05z^{-3}$$

$$\Theta(z) = \Theta_1 z^{-1} + \Theta_2 z^{-2} + \Theta_3 z^{-3}
= \begin{bmatrix} 1.0 & 1.0 \\ 1.2 & 1.2 \end{bmatrix} z^{-1}
+ \begin{bmatrix} -0.900 & -0.750 \\ -1.080 & -0.780 \end{bmatrix} z^{-2}
+ \begin{bmatrix} 0.200 & 0.125 \\ 0.240 & 0.120 \end{bmatrix} z^{-3}$$

$$a = [a_1, a_2, a_3]^{\mathrm T} = [-1.15, 0.425, -0.05]^{\mathrm T}$$

$$\Theta^{\mathrm T} = [\Theta_1, \Theta_2, \Theta_3]
= \begin{bmatrix} 1.0 & 1.0 & -0.90 & -0.75 & 0.20 & 0.125 \\ 1.2 & 1.2 & -1.08 & -0.78 & 0.24 & 0.120 \end{bmatrix}$$

Here, $z^{-1}$ is the unit backward shift operator: $z^{-1}y(t) = y(t-1)$. Referring to [30], the example system in (68) can be written as the multiple linear regression model in (10):

$$y(t) = \Phi(t)\theta + v(t)$$

$$y(t) = \begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} \in \mathbb R^{2} \ (m = 2), \quad
\theta = \begin{bmatrix} a \\ \operatorname{col}[\Theta] \end{bmatrix} \in \mathbb R^{15}$$

$$\Phi(t) = [-y(t-1), -y(t-2), -y(t-3), I_2 \otimes w^{\mathrm T}(t)] \in \mathbb R^{2\times 15}$$

$$w(t) = [u^{\mathrm T}(t-1), u^{\mathrm T}(t-2), u^{\mathrm T}(t-3)]^{\mathrm T} \in \mathbb R^{6}$$

Here, $u(t) = [u_1(t), u_2(t)]^{\mathrm T}$ is taken as a persistently exciting vector sequence with zero mean and unit variance, and $v(t) = [v_1(t), v_2(t)]^{\mathrm T}$ as a white-noise vector sequence with zero mean and variances $\sigma_1^2 = 0.40^2$ for $v_1(t)$ and $\sigma_2^2 = 0.50^2$ for $v_2(t)$. Taking the initial values $\hat\theta_m(0) = 10^{-6}\mathbf 1_{15}$ and $P_m(0) = 10^6 I_{15}$, we apply the C-LS algorithm to estimate the parameters of this example system. The parameter estimates and their errors for different data lengths t are shown in Table 1, the parameter estimates $\hat\theta_i(t)$ against t are shown in Figs. 3 and 4, and the parameter estimation errors $\delta := \|\hat\theta(t)-\theta\|/\|\theta\| \times 100\%$ against t are shown in Fig. 5.
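This experiment can be reproduced with a short simulation. The sketch below is illustrative rather than the paper's exact code: it assumes the C-LS gain takes the rank-one least-squares form, uses the stated initial values, and generates fresh random data, so the resulting numbers follow the trend of Table 1 without matching it digit for digit.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.array([-1.15, 0.425, -0.05])
# Theta^T = [Theta_1, Theta_2, Theta_3] in R^{2x6}
ThetaT = np.array([[1.0, 1.0, -0.90, -0.75, 0.20, 0.125],
                   [1.2, 1.2, -1.08, -0.78, 0.24, 0.120]])
theta_true = np.concatenate([a, ThetaT.reshape(-1)])  # theta = [a; col[Theta]] in R^15

theta = np.full(15, 1e-6)          # initial estimate, 1e-6 * ones(15)
P = 1e6 * np.eye(15)               # P_m(0) = 1e6 * I_15
ys, us = [np.zeros(2)] * 3, [np.zeros(2)] * 3   # zero initial history
sigma = np.array([0.40, 0.50])     # noise standard deviations for v1, v2
for t in range(3000):
    w = np.concatenate([us[-1], us[-2], us[-3]])               # w(t)
    Phi = np.hstack([np.column_stack([-ys[-1], -ys[-2], -ys[-3]]),
                     np.kron(np.eye(2), w)])                   # Phi(t) in R^{2x15}
    y = Phi @ theta_true + sigma * rng.standard_normal(2)      # simulate the system
    for i in range(2):                                         # C-LS sweep over the m = 2 rows
        phi = Phi[i]
        gain = P @ phi / (1.0 + phi @ P @ phi)
        theta = theta + gain * (y[i] - phi @ theta)
        P = P - np.outer(gain, phi @ P)
    ys.append(y)
    us.append(rng.standard_normal(2))                          # u(t): zero mean, unit variance
delta = np.linalg.norm(theta - theta_true) / np.linalg.norm(theta_true)
```

With noise levels as above, the relative error δ after 3000 samples should be on the order of a few per cent, consistent with the trend shown in Table 1.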

Fig. 3 C-LS parameter estimates $\hat\theta_1$, $\hat\theta_3$, $\hat\theta_4$ against t for Example 1

Table 1 C-LS estimates and errors of Example 1

Parameter (true)    t=100      t=200      t=500      t=1000     t=2000     t=3000
θ1 = −1.15000      −1.16794   −1.15772   −1.15401   −1.14562   −1.15091   −1.15737
θ2 = 0.42500        0.39626    0.36311    0.41306    0.41037    0.41438    0.42064
θ3 = −0.05000      −0.02040   −0.02497   −0.04809   −0.05522   −0.04926   −0.04926
θ4 = 1.00000        1.02124    1.02646    1.04943    1.03720    1.01908    1.01250
θ5 = 1.00000        1.01556    1.01599    1.02328    1.01286    1.00220    0.99628
θ6 = −0.90000      −1.00549   −0.95900   −0.92778   −0.89941   −0.89944   −0.91008
θ7 = −0.75000      −0.80942   −0.79887   −0.76220   −0.74874   −0.74525   −0.75443
θ8 = 0.20000        0.20365    0.14421    0.19814    0.20167    0.19542    0.19865
θ9 = 0.12500        0.10642    0.07419    0.14422    0.13381    0.12476    0.13014
θ10 = 1.20000       1.13451    1.21313    1.18654    1.20116    1.18895    1.19340
θ11 = 1.20000       1.13329    1.18220    1.20673    1.20385    1.19122    1.19445
θ12 = −1.08000     −1.10567   −1.07436   −1.09769   −1.08397   −1.08090   −1.08941
θ13 = −0.78000     −0.79925   −0.76553   −0.73406   −0.77486   −0.79328   −0.79670
θ14 = 0.24000       0.14586    0.11403    0.18240    0.18985    0.20089    0.21674
θ15 = 0.12000       0.04817    0.06852    0.11613    0.11624    0.12557    0.12492
δ (%)               6.55609    6.12850    3.28471    2.15333    1.59692    1.20391


Example 2: Consider the multiple linear regression model $y(t) = \Phi(t)\theta + v(t)$ with

$$\Phi(t) = \begin{bmatrix}
-y_1(t-1) & y_1(t-2)\sin(y_2(t-2)) & y_2(t-1) & y_2(t-2)u_1(t-2) & u_1(t-1) & u_1(t-2)u_2(t-2) & u_2(t-1)\cos(t) \\
-y_1(t-1) & y_1(t-2)\sin(t/\pi) & y_2(t-1) & y_1(t-2)u_2(t-2) & u_1^2(t-1) & \sin(u_2(t-2)) & u_1(t-1)+u_2(t-2)
\end{bmatrix} \in \mathbb R^{2\times 7}$$
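The entries of Φ(t) above can be assembled directly from the signal histories. The helper below is an illustrative sketch (the function name and the indexing convention, with the last list element holding the (t−1) value, are this sketch's own assumptions):

```python
import numpy as np

def phi_example2(y1, y2, u1, u2, t):
    """Build Phi(t) in R^{2x7} for Example 2 from signal histories.

    Convention: y1[-1] denotes y1(t-1), y1[-2] denotes y1(t-2), and
    likewise for y2, u1, u2.
    """
    row1 = [-y1[-1], y1[-2] * np.sin(y2[-2]), y2[-1], y2[-2] * u1[-2],
            u1[-1], u1[-2] * u2[-2], u2[-1] * np.cos(t)]
    row2 = [-y1[-1], y1[-2] * np.sin(t / np.pi), y2[-1], y1[-2] * u2[-2],
            u1[-1] ** 2, np.sin(u2[-2]), u1[-1] + u2[-2]]
    return np.array([row1, row2])
```

Note that the regressors are nonlinear in the data (products, squares, sines), yet the model remains linear in the parameter vector θ, which is what makes the C-LS recursion applicable.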

Fig. 4 C-LS parameter estimates $\hat\theta_5$, $\hat\theta_6$, $\hat\theta_9$ against t for Example 1

Fig. 5 C-LS estimation errors δ against t for Example 1


Simulation conditions are similar to those of Example 1, and $v(t) = [v_1(t), v_2(t)]^{\mathrm T}$ is taken as a white-noise vector sequence with zero mean and variance $\sigma^2$. The simulation results for $\sigma^2 = 0.10^2$ and $\sigma^2 = 0.50^2$ are shown in Tables 2 and 3 and Figs. 6–8.

From Tables 1–3 and Figs. 3–8, we can see that the parameter estimation errors δ become (in general) smaller as t increases, and that a lower noise level leads to more accurate parameter estimates.

Fig. 6 C-LS parameter estimates $\hat\theta_1$, $\hat\theta_2$, $\hat\theta_3$, $\hat\theta_4$ against t for Example 2 ($\sigma^2 = 0.50^2$)

Fig. 7 C-LS parameter estimates $\hat\theta_5$, $\hat\theta_6$, $\hat\theta_7$ against t for Example 2 ($\sigma^2 = 0.50^2$)

Table 2 C-LS estimates and errors of Example 2 (σ² = 0.10²)

t            θ1        θ2        θ3         θ4        θ5         θ6        θ7       δ (%)
100          0.85043   0.45006   0.10043   −0.57774   0.22998   −0.48421   1.35311   0.93964
200          0.84932   0.45065   0.09901   −0.57763   0.22748   −0.49671   1.35853   0.27999
500          0.84926   0.45096   0.09873   −0.57884   0.22882   −0.50169   1.35853   0.17798
1000         0.84968   0.45051   0.09960   −0.58007   0.22854   −0.50247   1.35854   0.17852
2000         0.84990   0.45030   0.10000   −0.57992   0.22952   −0.50110   1.35790   0.13190
3000         0.84991   0.45017   0.10002   −0.57992   0.22890   −0.49989   1.35828   0.11111
true values  0.85000   0.45000   0.10000   −0.58000   0.23000   −0.50000   1.36000

Table 3 C-LS estimates and errors of Example 2 (σ² = 0.50²)

t            θ1        θ2        θ3         θ4        θ5         θ6        θ7       δ (%)
100          0.86049   0.44792   0.11783   −0.58436   0.22370   −0.42426   1.32419   4.68368
200          0.85153   0.45229   0.10136   −0.57570   0.21527   −0.47891   1.35129   1.49557
500          0.85049   0.45356   0.09954   −0.57671   0.22210   −0.50540   1.35084   0.76348
1000         0.85139   0.45060   0.09956   −0.57902   0.22220   −0.51147   1.35195   0.87228
2000         0.85062   0.45015   0.09957   −0.57954   0.22747   −0.50501   1.34969   0.63630
3000         0.85074   0.44945   0.10035   −0.57870   0.22457   −0.49919   1.35151   0.55354
true values  0.85000   0.45000   0.10000   −0.58000   0.23000   −0.50000   1.36000


6 Conclusions

Referring to [38], a C-LS algorithm has been developed for multivariable systems in order to avoid computing the matrix inversion. The convergence of the C-LS algorithm has been studied using the martingale convergence theorem. The C-LS algorithm has the following properties.

† The parameter estimates given by the C-LS algorithm converge to their true values as the data length increases.
† The proposed C-LS algorithm requires a lower computational load and achieves highly accurate parameter estimates.
† As the noise-to-signal ratio decreases, the convergence rate of the parameter estimation of the C-LS algorithm becomes faster.

The basic idea of the proposed coupled identificationmethods can be extended to linear multivariable systems[40–45], non-linear multivariable systems [46, 47],non-uniformly sampled systems [48, 49] or other systemswith colored noises [50–52].

7 Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (grant no. 61273194), the NaturalScience Foundation of Jiangsu Province China (grant no.BK2012549) and by the 111 Project (grant no. B12018).

8 References

1 Yan, M., Shi, Y.: ‘Robust discrete-time sliding mode control foruncertain systems with time-varying state delay’, IET Control TheoryAppl., 2008, 2, (8), pp. 662–674

2 Shi, Y., Yu, B.: ‘Output feedback stabilization of networked controlsystems with random delays modeled by Markov chains’, IEEE Trans.Autom. Control, 2009, 54, (7), pp. 1668–1674

3 Shi, Y., Fang, H.: ‘Kalman filter based identification for systems withrandomly missing measurements in a network environment’,Int. J. Control, 2010, 83, (3), pp. 538–551

4 Ding, F., Chen, T.: ‘Performance bounds of the forgetting factor leastsquares algorithm for time-varying systems with finite measurementdata’, IEEE Trans. Circuits Syst. I, Regul. Pap., 2005, 52, (3),pp. 555–566

5 Ding, F., Chen, T.: ‘Hierarchical identification of lifted state-spacemodels for general dual-rate systems’, IEEE Trans. Circuits Syst. I,Regul. Pap., 2005, 52, (6), pp. 1179–1187

6 Yuz, J.I., Alfaro, J., Agüero, J.C., Gooodwin, G.C.: ‘Identification ofcontinuous-time state-space models from non-uniform fast-sampleddata’, IET Control Theory Appl., 2011, 5, (7), pp. 842–855

Fig. 8 C-LS estimation errors δ against t for Example 2 with σ² = 0.10² and σ² = 0.50²


7 Herrera, J., Ibeas, A., Alcántara, S., Sen, M.D.L.: ‘Multimodel-basedtechniques for the identification and adaptive control of delayedmulti-input multi-output systems’, IET Control Theory Appl., 2011, 5,(1), pp. 188–202

8 Wang, D.Q.: ‘Least squares-based recursive and iterative estimation foroutput error moving average systems using data filtering’, IET ControlTheory Appl., 2011, 5, (14), pp. 1648–1657

9 Xie, L., Liu, Y.J., Yang, H.Z., et al.: ‘Modeling and identification fornon-uniformly periodically sampled-data systems’, IET ControlTheory Appl., 2010, 4, (5), pp. 784–794

10 Bai, E., Cai, Z.: ‘How nonlinear parametric Wiener system identification is under Gaussian inputs?’, IEEE Trans. Autom. Control, 2012, 57, (3), pp. 738–742

11 Ding, J., Ding, F., Liu, X.P., et al.: ‘Hierarchical least squaresidentification for linear SISO systems with dual-rate sampled-data’,IEEE Trans. Autom. Control, 2011, 56, (11), pp. 2677–2683

12 Mercére, G., Bako, L.: ‘Parameterization and identification ofmultivariable state-space systems: a canonical approach’, Automatica,2011, 47, (8), pp. 1547–1555

13 Schön, T., Wills, A., Ninness, B.: ‘System identification of nonlinearstate-space models’, Automatica, 2011, 47, (1), pp. 39–49

14 Lai, T.L., Wei, C.Z.: ‘Least squares estimates in stochastic regressionmodels with applications to identification and control of dynamicsystems’, Ann. Stat., 1982, 10, (1), pp. 154–166

15 Ding, F., Chen, T.: ‘Performance analysis of multi-innovation gradienttype identification methods’, Automatica, 2007, 43, (1), pp. 1–14

16 Ding, F., Liu, X.P., Liu, G.: ‘Auxiliary model based multi-innovationextended stochastic gradient parameter estimation with coloredmeasurement noises’, Signal Process., 2009, 89, (10), pp. 1883–1890

17 Ding, J., Ding, F.: ‘Bias compensation based parameter estimation foroutput error moving average systems’, Int. J. Adapt. Control SignalProcess., 2011, 25, (12), pp. 1100–1111

18 Goodwin, G.C., Sin, K.S.: ‘Adaptive filtering prediction and control’,(Prentice-Hall: Englewood Cliffs, NJ, 1984)

19 Ding, F., Yang, H.Z., Liu, F.: ‘Performance analysis of stochasticgradient algorithms under weak conditions’, Sci. China Ser. F, Inf.Sci., 2008, 51, (9), pp. 1269–1280

20 Ljung, L.: ‘System identification: theory for the user’, (Prentice-Hall,Englewood Cliffs, New Jersey, 1999, 2nd edn.)

21 Ding, F., Liu, Y.J., Bao, B.: ‘Gradient based and least squares basediterative estimation algorithms for multi-input multi-output systems’,Proc. Inst. Mech. Eng., I, J. Syst. Control Eng., 2012, 226, (1),pp. 43–55

22 Ljung, L.: ‘Consistency of the least-squares identification method’,IEEE Trans. Autom. Control, 1976, 21, (5), pp. 779–781

23 Solo, V.: ‘The Convergence of AML’, IEEE Trans. Autom. Control,1979, 24, (6), pp. 958–962

24 Ding, F., Chen, T.: ‘Combined parameter and output estimation ofdual-rate systems using an auxiliary model’, Automatica, 2004, 40,(10), pp. 1739–1748

25 Lai, T.L., Wei, C.Z.: ‘Extended least squares and their applications toadaptive control and prediction in linear systems’, IEEE Trans. Autom.Control, 1986, 31, (10), pp. 898–906

26 Wei, C.Z.: ‘Adaptive prediction by least squares prediction in stochasticregression models’, Ann. Stat., 1987, 15, (4), pp. 1667–1682

27 Lai, T.L., Ying, Z.L.: ‘Recursive identification and adaptive predictionin linear stochastic systems’, SIAM J. Control Optim., 1991, 29, (5),pp. 1061–1090

28 Toussi, K., Ren, W.: ‘On the convergence of least squares estimates in white noise’, IEEE Trans. Autom. Control, 1994, 39, (2), pp. 364–368

29 Ren, W., Kumar, P.R.: ‘Stochastic adaptive prediction and model reference control’, IEEE Trans. Autom. Control, 1994, 39, (10), pp. 2047–2060

30 Ding, F., Chen, T.: ‘Hierarchical gradient-based identification ofmultivariable discrete-time systems’, Automatica, 2005, 41, (2),pp. 315–325

31 Ding, F., Liu, G., Liu, X.: ‘Partially coupled stochastic gradientidentification methods for non-uniformly sampled systems’, IEEETrans. Autom. Control, 2010, 55, (8), pp. 1976–1981

32 Ding, F.: ‘System identification–new theory and methods’ (SciencePress, Beijing, 2013)

33 Sen, A., Sinha, N.K.: ‘On-line estimation of the parameters of amultivariable system using matrix pseudo-inverse’, Int. J. Syst. Sci.,1976, 7, (4), pp. 461–471

34 Liu, Y.J., Sheng, J., Ding, R.F.: ‘Convergence of stochastic gradientestimation algorithm for multivariable ARX-like systems’, Comput.Math. Appl., 2010, 59, (8), pp. 2615–2627

35 Ding, F.: ‘System identification–Part H: coupling identification conceptand methods’, J. Nanjing Univ. Inf. Sci. Technol. (Natural ScienceEdn.), 2012, 4, (3), pp. 193–212.


36 Ding, F., Liu, G., Liu, X.P.: ‘Parameter estimation with scarce measurements’, Automatica, 2011, 47, (8), pp. 1646–1655

37 Golub, G.H., Van Loan, C.F.: ‘Matrix computations’ (Johns Hopkins University Press, Baltimore, MD, 1996, 3rd edn.)

38 Fang, C.Z., Xiao, D.Y.: ‘Process identification’ (Tsinghua University Press, Beijing, 1988)

39 Xiao, Y.S., Ding, F., Zhou, Y., Li, M., Dai, J.Y.: ‘On consistency of recursive least squares identification algorithms for controlled auto-regression models’, Appl. Math. Model., 2008, 32, (11), pp. 2207–2215

40 Liu, Y.J., Xiao, Y.S., Zhao, X.L.: ‘Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model’, Appl. Math. Comput., 2009, 215, (4), pp. 1477–1483

41 Wang, D.Q., Yang, G.W., Ding, R.F.: ‘Gradient-based iterativeparameter estimation for Box-Jenkins systems’, Comput. Math. Appl.,2010, 60, (5), pp. 1200–1208

42 Liu, Y.J., Wang, D.Q., Ding, F.: ‘Least-squares based iterativealgorithms for identifying Box-Jenkins models with finitemeasurement data’, Digit. Signal Process., 2010, 20, (5), pp. 1458–1467

43 Ding, F., Gu, Y.: ‘Performance analysis of the auxiliary model basedleast squares identification algorithm for one-step state delay systems’,Int. J. Comput. Math., 2012, 89, (15), pp. 2019–2028

44 Ding, F.: ‘Two-stage least squares based iterative estimation algorithmfor CARARMA system modeling’, Appl. Math. Model., 2013, 37;http://dx.doi.org/10.1016/j.apm.2012.10.014


45 Ding, F., Gu, Y.: ‘Performance analysis of the auxiliary model-basedstochastic gradient parameter estimation algorithm for state spacesystems with one-step state delay’, Circuits Syst. Signal Process.,2013, 32; doi: 10.1007/s00034-012-9463-5

46 Ding, F., Liu, X.P., Liu, G.: ‘Identification methods for Hammersteinnonlinear systems’, Digit. Signal Process., 2011, 21, (2), pp. 215–238

47 Ding, F.: ‘Hierarchical multi-innovation stochastic gradient algorithmfor Hammerstein nonlinear system modeling’, Appl. Math. Model.,2013, 37, (4), pp. 1694–1704

48 Liu, Y.J., Xie, L., et al.: ‘An auxiliary model based recursive least squares parameter estimation algorithm for non-uniformly sampled multirate systems’, Proc. Inst. Mech. Eng. I, J. Syst. Control Eng., 2009, 223, (4), pp. 445–454

49 Ding, F., Qiu, L., Chen, T.: ‘Reconstruction of continuous-time systemsfrom their non-uniformly sampled discrete-time systems’, Automatica,2009, 45, (2), pp. 324–332

50 Wang, W., Ding, F., Dai, J.Y.: ‘Maximum likelihood least squaresidentification for systems with autoregressive moving average noise’,Appl. Math. Model., 2012, 36, (5), pp. 1842–1853

51 Li, J.H., Ding, F., Yang, G.W.: ‘Maximum likelihood least squaresidentification method for input nonlinear finite impulse responsemoving average systems’, Math. Comput. Model., 2012, 55, (3–4),pp. 442–450

52 Ding, F.: ‘Decomposition based fast least squares algorithm for outputerror systems’, Signal Process., 2013, 93, http://dx.doi.org/10.1016/j.sigpro.2012.12.013
