Coupled-least-squares identification for multivariable systems
www.ietdl.org
Published in IET Control Theory and Applications
Received on 1st March 2012
Revised on 17th September 2012
Accepted on 17th October 2012
doi: 10.1049/iet-cta.2012.0171
ISSN 1751-8644
Feng Ding*
Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122,
People’s Republic of China
*Control Science and Engineering Research Center, Jiangnan University, Wuxi 214122, People’s Republic of China
E-mail: [email protected]
Abstract: This article studies identification problems of multiple linear regression models, which can describe a class of multi-input multi-output systems (i.e. multivariable systems). Based on the coupling identification concept, a novel coupled-least-squares (C-LS) parameter identification algorithm is introduced for the purpose of avoiding the matrix inversion in the multivariable recursive least-squares (RLS) algorithm for estimating the parameters of multiple linear regression models. The analysis indicates that the C-LS algorithm does not involve a matrix inversion and requires less computational effort than the multivariable RLS algorithm, and that the parameter estimates given by the C-LS algorithm converge to their true values. Simulation results confirm the presented convergence theorems.
1 Introduction
Mathematical models are basic to control problems and play an important role in control system analysis, system synthesis, control system design and state filtering [1–5]. System identification is the theory and methodology of establishing mathematical models of (dynamic) systems [6–9]. For decades, the exploration of new identification methods has received much attention from control scientists. Many new parameter estimation methods and their performance analyses have been reported for linear and non-linear systems, for example [10–13], especially the parameter fitting and convergence rates for linear regression models [14, 15] and pseudo-linear systems [16, 17].
A scalar linear system described by a difference equation can be written as the linear regression model
y(t) = φᵀ(t)θ + v(t)   (1)
where y(t) ∈ ℝ is the observation output of the system, θ ∈ ℝⁿ is the parameter vector to be identified, φ(t) ∈ ℝⁿ is the (regressive) information vector consisting of the input–output data u(t − i) and y(t − i) of the system, and v(t) ∈ ℝ is a stochastic noise with zero mean.
It is worth noting that the linear regression models in (1) include but are not limited to linear systems. For example, the linear-in-parameters system or non-linear system
y(t) = a₁y(t − 1) + a₂y(t − 2)y(t − 3) + b₁u(t − 1) + b₂u²(t − 2) + v(t)
can be written in the linear form of (1) by defining θ := [a₁, a₂, b₁, b₂]ᵀ and φ(t) := [y(t − 1), y(t − 2)y(t − 3), u(t − 1), u²(t − 2)]ᵀ, which is non-linear in the input–output data u(t − i) and y(t − i).
© The Institution of Engineering and Technology 2013
For the linear regression models in (1), many methods can
estimate the parameter vector θ. Two typical identification methods are the stochastic gradient (SG) algorithm [18, 19] and the recursive least-squares (RLS) algorithm [20, 21]. The SG algorithm requires a lower computational cost, but the RLS algorithm has a faster convergence rate than the SG algorithm. Much early work studied the convergence of the RLS algorithm under different conditions, such as the assumptions that the input and output signals (or observation data) of the system under consideration have finite non-zero power and that the noise is an independent and identically distributed random sequence with finite fourth-order moments [22], or that the observation data {y(t), φ(t)} are stationary and ergodic [23].
An important breakthrough regarding the strong consistency of
the RLS algorithm was achieved by Lai and Wei [14], who obtained the convergence rate of the RLS parameter estimates by assuming that the observation noise v(t) has zero mean and finite second-order moment E[v²(t)|𝓕ₜ₋₁] = σ² < ∞, a.s., where the symbol E denotes the expectation operator and {v(t)} is a martingale difference sequence with respect to an increasing sequence of σ-fields 𝓕ₜ; that is, v(t) is 𝓕ₜ-measurable and E[v(t)|𝓕ₜ₋₁] = 0, where {𝓕ₜ} is the σ-algebra generated by the observations up to and including time t. Under such assumptions, Lai and Wei [14] proved that the parameter estimation error θ̂(t) − θ satisfies
‖θ̂(t) − θ‖² = O({ln λmax[P₀⁻¹(t)]}^c / λmin[P₀⁻¹(t)]), a.s., c > 1   (2)
IET Control Theory Appl., 2013, Vol. 7, Iss. 1, pp. 68–79, doi: 10.1049/iet-cta.2012.0171
where θ̂(t) represents the estimate of θ at time t, ‖X‖² := tr[XXᵀ], λmax[X] and λmin[X] represent the maximum and minimum eigenvalues of the non-negative definite matrix X, respectively, and the covariance matrix P₀(t) is defined by (9) in the next section. When λmin[P₀⁻¹(t)] → ∞ and ln λmax[P₀⁻¹(t)] = o(λmin[P₀⁻¹(t)]), a.s., the estimation errors approach zero, that is, ‖θ̂(t) − θ‖² → 0 (see Corollary 3 in [14]). This conclusion holds under the generalised persistent excitation condition [24].
Furthermore, supposing that the noise {v(t)} has a finite higher-order moment E[|v(t)|^γ|𝓕ₜ₋₁] ≤ σ² < ∞, a.s., for some γ > 2, Lai and Wei [14] derived the convergence rate of the RLS algorithm. Later, several convergence results for least-squares or SG (based adaptive control) algorithms used the assumption that such a higher-order moment exists, including the work of Lai and Wei [25], Wei [26], Lai and Ying [27], Toussi and Ren [28], and Ren and Kumar [29].
A direct extension of the scalar linear regression model in
(1) is a multiple linear regression model
y(t) = Φ(t)ϑ + v(t)   (3)
where y(t) ∈ ℝᵐ is the observation output vector of the system, ϑ ∈ ℝⁿ is the parameter vector to be identified, Φ(t) ∈ ℝ^{m×n} is the information matrix consisting of the past input–output data, and v(t) ∈ ℝᵐ is the observation noise vector with zero mean.
Equation (3) may describe a multivariable linear system (see (4) in [30] or Example 1 later) or a multivariable non-linear system where y(t) is linear in the parameter ϑ ∈ ℝⁿ and the information matrix Φ(t) is non-linear in the system input–output data (see Example 2 later). Although the multivariable RLS algorithm can be applied to (3), it requires computing a matrix inversion (see Remark 1 in the next section), resulting in a large computational burden. This motivates us to study a new coupled-least-squares (C-LS) algorithm that does not involve matrix inversion. Recently, a partially coupled SG algorithm has been proposed for non-uniformly sampled-data systems to improve the identification accuracy of the SG algorithm [31]. To the best of the author's knowledge, coupled parameter estimation methods and their convergence for multiple linear regression models have not been fully investigated, especially C-LS parameter estimation algorithms and their convergence properties, which are the focus of this work. The main contributions of this paper lie in the following.
† Derive a C-LS parameter estimation algorithm to avoid computing the matrix inversion in the multivariable RLS algorithm, for the purpose of reducing the computational load.
† Analyse the performance of the C-LS estimation algorithm and prove that the parameter estimation errors given by the C-LS algorithm converge to zero.
The rest of the paper is organised as follows. Section 2 describes the identification problem related to multiple linear regression models or multivariable systems. Section 3 derives a C-LS algorithm for multivariable systems and discusses the relation between the RLS algorithm and the C-LS algorithm. Section 4 studies the convergence of the proposed C-LS algorithm. Section 5 provides an illustrative example to validate the proposed methods. Finally, Section 6 offers some concluding remarks.
2 Problem formulation
Let us introduce some notation first. The symbol I (or Iₙ) stands for an identity matrix of appropriate size (or of size n × n); the superscript T denotes the matrix/vector transpose; the norm of a matrix X is defined by ‖X‖² := tr[XXᵀ]; |X| := det[X] denotes the determinant of a square matrix X; 1ₙ represents an n-dimensional column vector whose elements are all 1; p₀ is a large positive number, e.g. p₀ = 10⁶; ⊗ denotes the Kronecker product: if A = [aᵢⱼ] ∈ ℝ^{m×n} and B = [bᵢⱼ] ∈ ℝ^{p×q}, then A ⊗ B = [aᵢⱼB] ∈ ℝ^{(mp)×(nq)}; col[X] denotes the vector formed by stacking the columns of the matrix X, that is, if X = [x₁, x₂, …, xₙ] ∈ ℝ^{m×n}, then col[X] = [x₁ᵀ, x₂ᵀ, …, xₙᵀ]ᵀ ∈ ℝ^{mn}. The relation f(t) = O(g(t)) means that there exist positive constants δ₁ and t₀ such that |f(t)| ≤ δ₁g(t) for t ≥ t₀. θ̂(t) and ϑ̂(t) represent the estimates of θ and ϑ at time t.
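Two pieces of this notation, the Kronecker product ⊗ and the column-stacking operator col[·], map directly onto NumPy primitives. The following is my own small illustration, not part of the original paper:

```python
import numpy as np

# col[X]: stack the columns of X into a single vector (Fortran order).
def col(X):
    return X.flatten(order="F")

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])            # A in R^{2x2}
B = np.eye(3)                          # B in R^{3x3}
K = np.kron(A, B)                      # A (x) B in R^{6x6}: block (i,j) is a_ij * B

X = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])             # X = [x1, x2] in R^{3x2}
v = col(X)                             # col[X] = [x1^T, x2^T]^T in R^6
```

Here `K[0:3, 0:3]` is a₁₁B = B and `v` is the concatenation of the two columns of `X`, matching the definitions above.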
2.1 Scalar systems
Referring to [18, 32], the following RLS algorithm can estimate the parameter vector θ of the scalar system in (1)
θ̂(t) = θ̂(t − 1) + P₀(t)φ(t)[y(t) − φᵀ(t)θ̂(t − 1)]   (4)
P₀⁻¹(t) = P₀⁻¹(t − 1) + φ(t)φᵀ(t), P₀(0) = p₀Iₙ   (5)
In order to avoid computing the inverse matrix P₀⁻¹(t) in (5), defining the gain vector L₀(t) := P₀(t)φ(t) ∈ ℝⁿ and applying the matrix inversion formula
(A + BC)⁻¹ = A⁻¹ − A⁻¹B(I + CA⁻¹B)⁻¹CA⁻¹   (6)
to (5), we can obtain the following equivalent expression of the RLS algorithm in (4)–(5)
θ̂(t) = θ̂(t − 1) + L₀(t)[y(t) − φᵀ(t)θ̂(t − 1)]   (7)
L₀(t) = P₀(t − 1)φ(t)/[1 + φᵀ(t)P₀(t − 1)φ(t)]   (8)
P₀(t) = [Iₙ − L₀(t)φᵀ(t)]P₀(t − 1), P₀(0) = p₀Iₙ   (9)
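The inversion-free recursion (7)–(9) can be sketched in a few lines of NumPy. This is my own illustration (not the author's code): `theta_hat`, `P` and `phi` mirror θ̂(t), P₀(t) and φ(t), and the synthetic data follow model (1):

```python
import numpy as np

def rls_step(theta_hat, P, phi, y):
    """One step of the scalar-output RLS algorithm (7)-(9)."""
    denom = 1.0 + phi @ P @ phi              # 1 + phi^T P(t-1) phi, a scalar
    L = (P @ phi) / denom                    # gain vector, (8)
    theta_hat = theta_hat + L * (y - phi @ theta_hat)   # estimate update, (7)
    P = P - np.outer(L, phi @ P)             # covariance update, (9)
    return theta_hat, P

rng = np.random.default_rng(0)
n = 4
theta = np.array([0.8, -0.5, 1.2, 0.3])      # true parameter vector (synthetic)
theta_hat = np.zeros(n)
P = 1e6 * np.eye(n)                          # P0(0) = p0 * I_n, p0 = 10^6

for t in range(2000):
    phi = rng.standard_normal(n)             # information vector
    y = phi @ theta + 0.1 * rng.standard_normal()   # model (1) with noise
    theta_hat, P = rls_step(theta_hat, P, phi, y)
```

Only scalar divisions appear; no matrix is ever inverted, which is the point of rewriting (4)–(5) as (7)–(9).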
2.2 Multivariable systems
Consider a multivariable system described by the multiple linear regression model in (3), rewritten here as
y(t) = Φ(t)ϑ + v(t)   (10)
where y(t) = [y₁(t), y₂(t), …, yₘ(t)]ᵀ ∈ ℝᵐ is the output vector of the system, ϑ ∈ ℝⁿ is the parameter vector to be identified, Φ(t) ∈ ℝ^{m×n} is the information matrix consisting of the input–output data u(t − i) and y(t − i), and v(t) ∈ ℝᵐ is the observation noise vector with zero mean. Without loss of generality, we assume that y(t) = 0, Φ(t) = 0 and v(t) = 0 for t ≤ 0.
In order to show the advantages of the proposed C-LS algorithm in the next section, the following simply gives the RLS algorithm for estimating ϑ in (10) for comparison. The multivariable RLS algorithm generates the estimate ϑ̂(t) of the parameter vector ϑ for the multiple linear regression model in (3) [33–35]:
ϑ̂(t) = ϑ̂(t − 1) + P(t)Φᵀ(t)[y(t) − Φ(t)ϑ̂(t − 1)]   (11)
P⁻¹(t) = P⁻¹(t − 1) + Φᵀ(t)Φ(t), P(0) = p₀Iₙ   (12)
Similarly, to avoid computing P⁻¹(t) in (12), defining the gain matrix L(t) := P(t)Φᵀ(t) ∈ ℝ^{n×m} and applying the matrix inversion formula (6) to (12) give the equivalent expression of the multivariable RLS algorithm in (11)–(12) for the multivariable system in (10):
ϑ̂(t) = ϑ̂(t − 1) + L(t)[y(t) − Φ(t)ϑ̂(t − 1)]   (13)
L(t) = P(t − 1)Φᵀ(t)[Iₘ + Φ(t)P(t − 1)Φᵀ(t)]⁻¹   (14)
P(t) = [Iₙ − L(t)Φ(t)]P(t − 1), P(0) = p₀Iₙ   (15)
Remark 1: For the multivariable RLS algorithm in (13)–(15), we can see from (14) that it requires computing the matrix inversion [Iₘ + Φ(t)P(t − 1)Φᵀ(t)]⁻¹ ∈ ℝ^{m×m} at each step, resulting in a heavy computational load, especially for a large number of outputs m. This is the drawback of the multivariable RLS algorithm in (13)–(15), and it motivates us to study a new coupled parameter identification method, which is the objective of this paper.
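For comparison, the multivariable RLS recursion (13)–(15) can be sketched directly in NumPy. This is my own illustration; note the explicit m × m inverse that Remark 1 objects to:

```python
import numpy as np

def mrls_step(vartheta_hat, P, Phi, y):
    """One step of the multivariable RLS algorithm (13)-(15).

    Phi is the m x n information matrix, y the m-vector output.
    The m x m inverse below is exactly the cost Remark 1 points out.
    """
    m, n = Phi.shape
    S = np.eye(m) + Phi @ P @ Phi.T          # I_m + Phi P(t-1) Phi^T
    L = P @ Phi.T @ np.linalg.inv(S)         # gain matrix, (14)
    vartheta_hat = vartheta_hat + L @ (y - Phi @ vartheta_hat)  # (13)
    P = (np.eye(n) - L @ Phi) @ P            # covariance update, (15)
    return vartheta_hat, P

rng = np.random.default_rng(1)
m, n = 3, 5
vartheta = rng.standard_normal(n)            # true parameter vector (synthetic)
vartheta_hat = np.zeros(n)
P = 1e6 * np.eye(n)                          # P(0) = p0 * I_n

for t in range(1500):
    Phi = rng.standard_normal((m, n))        # information matrix
    y = Phi @ vartheta + 0.1 * rng.standard_normal(m)   # model (3)
    vartheta_hat, P = mrls_step(vartheta_hat, P, Phi, y)
```

The `np.linalg.inv` call runs once per time step and dominates the cost as m grows; the C-LS algorithm derived next removes it.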
3 C-LS estimation algorithm
3.1 C-LS algorithm
Referring to the partially coupled SG identification methods in [31, 32], this section derives a C-LS algorithm for multivariable systems on the basis of the coupling identification concept.
Let φᵢᵀ(t) ∈ ℝ^{1×n} be the ith row of Φ(t). From (10), we obtain m identification models (subsystems)
yᵢ(t) = φᵢᵀ(t)ϑ + vᵢ(t), i = 1, 2, …, m   (16)
each of which contains the common parameter vector ϑ ∈ ℝⁿ. In principle a single subsystem is sufficient to estimate ϑ, but we use all subsystems so as to enhance the parameter estimation accuracy; this is also the motivation of this work. According to the least-squares principle, we can obtain m RLS algorithms for estimating ϑ, called the subsystem least-squares (S-LS) algorithms for short:
ϑ̂(t) = ϑ̂(t − 1) + Pᵢ(t)φᵢ(t)[yᵢ(t) − φᵢᵀ(t)ϑ̂(t − 1)]   (17)
Pᵢ⁻¹(t) = Pᵢ⁻¹(t − 1) + φᵢ(t)φᵢᵀ(t), i = 1, 2, 3, …, m   (18)
Here, Pᵢ(t) ∈ ℝ^{n×n} is the covariance matrix of subsystem i.
Note from (17)–(18) that each S-LS algorithm contains the common parameter estimation vector ϑ̂(t) (see (17)). However, the m algorithms are independent: the estimation vector ϑ̂(t) of subsystem i does not depend on that of subsystem j for i ≠ j. For the sake of clarity, we write ϑ̂ᵢ(t) for the ϑ̂(t) in (17) of subsystem i; the S-LS algorithm in (17) and (18) can then be equivalently written as
ϑ̂ᵢ(t) = ϑ̂ᵢ(t − 1) + Pᵢ(t)φᵢ(t)[yᵢ(t) − φᵢᵀ(t)ϑ̂ᵢ(t − 1)]   (19)
Pᵢ⁻¹(t) = Pᵢ⁻¹(t − 1) + φᵢ(t)φᵢᵀ(t), i = 1, 2, 3, …, m   (20)
This implies that there is no coupling between the parameter estimates ϑ̂ᵢ(t) of the subsystems. The schematic diagram of the S-LS algorithm in (19) and (20) is shown in Fig. 1.
Remark 2: For i = 1, 2, 3, …, m, we obtain m estimation vectors ϑ̂ᵢ(t) from (19)–(20), all of which estimate the common parameter vector ϑ, because ϑ is estimated once in each subsystem; this results in a large amount of redundancy in the estimation of ϑ. Next, we derive a new C-LS method so as to avoid these redundant estimates of ϑ.
For recursive estimation algorithms, it is desired that the parameter estimates approach their true values as the data length t increases. Thus, one may replace ϑ̂ᵢ(t − 1) with ϑ̂ᵢ₋₁(t) [15, 16, 36].
Referring to the partially coupled SG algorithm in [31] and by means of the idea of the Jacobi and Gauss–Seidel iterations [37], replacing ϑ̂ᵢ(t − 1) on the right-hand side of (19) with ϑ̂ᵢ₋₁(t) for i = 2, 3, …, m, and replacing ϑ̂₁(t − 1) on the right-hand side of (19) with ϑ̂ₘ(t − 1) for i = 1, give the following C-LS algorithm [35, 38]
ϑ̂ᵢ(t) = ϑ̂ᵢ₋₁(t) + Lᵢ(t)[yᵢ(t) − φᵢᵀ(t)ϑ̂ᵢ₋₁(t)]   (21)
Lᵢ(t) = Pᵢ(t)φᵢ(t)   (22)
Pᵢ⁻¹(t) = Pᵢ₋₁⁻¹(t) + φᵢ(t)φᵢᵀ(t), i = 2, 3, …, m   (23)
and
ϑ̂₁(t) = ϑ̂ₘ(t − 1) + L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)]   (24)
L₁(t) = P₁(t)φ₁(t)   (25)
P₁⁻¹(t) = Pₘ⁻¹(t − 1) + φ₁(t)φ₁ᵀ(t)   (26)
Applying the matrix inversion lemma (6) to (23) and (26),
Fig. 1 Schematic diagram of the S-LS algorithm
the C-LS algorithm can be equivalently expressed as [38]
ϑ̂ᵢ(t) = ϑ̂ᵢ₋₁(t) + Lᵢ(t)[yᵢ(t) − φᵢᵀ(t)ϑ̂ᵢ₋₁(t)]   (27)
Lᵢ(t) = Pᵢ₋₁(t)φᵢ(t)/[1 + φᵢᵀ(t)Pᵢ₋₁(t)φᵢ(t)]   (28)
Pᵢ(t) = [I − Lᵢ(t)φᵢᵀ(t)]Pᵢ₋₁(t), i = 2, 3, …, m   (29)
and
ϑ̂₁(t) = ϑ̂ₘ(t − 1) + L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)]   (30)
L₁(t) = Pₘ(t − 1)φ₁(t)/[1 + φ₁ᵀ(t)Pₘ(t − 1)φ₁(t)]   (31)
P₁(t) = [I − L₁(t)φ₁ᵀ(t)]Pₘ(t − 1), Pₘ(0) = p₀Iₙ   (32)
where ϑ̂ᵢ(t) ∈ ℝⁿ, Lᵢ(t) ∈ ℝⁿ and Pᵢ(t) ∈ ℝ^{n×n} are the parameter estimation vector, the gain vector and the covariance matrix of the ith subsystem at time t, respectively; ϑ̂ᵢ₋₁(t) and Pᵢ₋₁(t) are the parameter estimation vector and the covariance matrix of the (i − 1)th subsystem at time t; and ϑ̂ₘ(t − 1) and Pₘ(t − 1) are the parameter estimation vector and the covariance matrix of the mth subsystem at time t − 1.
The C-LS algorithm can be obtained in a similar way to [38], and the schematic diagram of the C-LS algorithm in (21)–(26) is shown in Fig. 2. In Fig. 2, the parameter estimate ϑ̂₁(t) of Subsystem 1 equals the estimate ϑ̂ₘ(t − 1) of Subsystem m at the preceding time t − 1 plus the correction term L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)] (see (24) or (30)), and the covariance matrix P₁(t) of Subsystem 1 at time t is computed from the covariance matrix Pₘ(t − 1) of Subsystem m at time t − 1 together with the gain vector L₁(t) and information vector φ₁(t) of Subsystem 1 (see (26) or (32)). Similarly, the parameter estimate ϑ̂₂(t) of Subsystem 2 equals the estimate ϑ̂₁(t) of Subsystem 1 plus the correction term L₂(t)[y₂(t) − φ₂ᵀ(t)ϑ̂₁(t)] (see (21) or (27) with i = 2), and the covariance matrix P₂(t) of Subsystem 2 is computed from the covariance matrix P₁(t) of Subsystem 1 and the gain vector L₂(t) and information vector φ₂(t) of Subsystem 2 (see (23) or (29) with i = 2). A similar procedure is carried out as i increases.
The steps of computing the estimate ϑ̂ₘ(t) by the C-LS algorithm in (27)–(32) are listed in the following.
1. Set the initial values: let t = 1, ϑ̂ₘ(0) = 1ₙ/p₀, Pₘ(0) = p₀Iₙ, p₀ = 10⁶.
2. Collect the observation data y(t) and Φ(t), and let φᵢᵀ(t) ∈ ℝ^{1×n} be the ith row of Φ(t).
3. Compute the gain vector L₁(t) by (31) and the covariance matrix P₁(t) by (32), and update the estimate ϑ̂₁(t) by (30).
4. For i = 2, 3, …, m: compute the gain vector Lᵢ(t) by (28) and the covariance matrix Pᵢ(t) by (29), and update the estimate ϑ̂ᵢ(t) by (27).
5. Increase t by 1 and go to Step 2.
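The steps above translate almost line by line into code. The following is my own minimal NumPy rendering of the C-LS recursion (27)–(32): one sweep over the m subsystem rows per time step, with no matrix inversion anywhere:

```python
import numpy as np

def cls_step(vartheta_hat, P, Phi, y):
    """One C-LS time step: propagate (vartheta_m(t-1), P_m(t-1))
    through subsystems i = 1..m using (27)-(32)."""
    m = Phi.shape[0]
    for i in range(m):
        phi = Phi[i]                       # i-th row phi_i(t)
        denom = 1.0 + phi @ P @ phi        # scalar; no matrix inverse needed
        L = (P @ phi) / denom              # gain vector, (28)/(31)
        vartheta_hat = vartheta_hat + L * (y[i] - phi @ vartheta_hat)  # (27)/(30)
        P = P - np.outer(L, phi @ P)       # covariance update, (29)/(32)
    return vartheta_hat, P                 # = (vartheta_m(t), P_m(t))

rng = np.random.default_rng(2)
m, n = 3, 5
vartheta = rng.standard_normal(n)          # true parameter vector (synthetic)
vartheta_hat = np.ones(n) / 1e6            # vartheta_m(0) = 1_n / p0
P = 1e6 * np.eye(n)                        # P_m(0) = p0 * I_n

for t in range(1500):
    Phi = rng.standard_normal((m, n))
    y = Phi @ vartheta + 0.1 * rng.standard_normal(m)   # model (3)
    vartheta_hat, P = cls_step(vartheta_hat, P, Phi, y)
```

Each subsystem update costs only vector–matrix products and a scalar division, which is the computational saving over (13)–(15).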
Remark 3: Comparing (27) with (19), the C-LS algorithm in (27)–(32) uses the estimate ϑ̂ᵢ₋₁(t) on the right-hand side of (27) instead of the ϑ̂ᵢ(t − 1) on the right-hand side of (19) for i = 2, 3, …, m. When computing ϑ̂₁(t), the C-LS algorithm uses the estimate ϑ̂ₘ(t − 1) on the right-hand side of (30) instead of the ϑ̂₁(t − 1) on the right-hand side of (19) with i = 1. Thus, the C-LS algorithm is different from the S-LS algorithm.
Remark 4: The RLS algorithm in (13)–(15) requires computing the matrix inversion [Iₘ + Φ(t)P(t − 1)Φᵀ(t)]⁻¹ (see (14)), but the C-LS algorithm in (27)–(30) does not involve this matrix inversion. Thus the C-LS algorithm is superior to the multivariable RLS algorithm in (13)–(15).
3.2 Relation between the RLS and C-LS algorithms
Regarding the parameter estimate ϑ̂ₘ(t) and the covariance matrix Pₘ(t) of Subsystem m, we have the following theorem [38].
Theorem 1: The parameter estimate ϑ̂ₘ(t) and the covariance matrix Pₘ(t) of Subsystem m in (27)–(29) with i = m are equal to the estimate ϑ̂(t) and covariance matrix P(t) in (13)–(15), that is, ϑ̂(t) = ϑ̂ₘ(t) and P(t) = Pₘ(t).
Proof: From (21)–(23) with i = 2, we have
ϑ̂₂(t) = ϑ̂₁(t) + L₂(t)[y₂(t) − φ₂ᵀ(t)ϑ̂₁(t)]   (33)
L₂(t) = P₂(t)φ₂(t)   (34)
P₂⁻¹(t) = P₁⁻¹(t) + φ₂(t)φ₂ᵀ(t)   (35)
Fig. 2 Schematic diagram of the C-LS algorithm [38]
Substituting (24) into (33) yields
ϑ̂₂(t) = ϑ̂₁(t) + L₂(t)[y₂(t) − φ₂ᵀ(t)ϑ̂₁(t)]
 = ϑ̂ₘ(t − 1) + L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)] + L₂(t)(y₂(t) − φ₂ᵀ(t){ϑ̂ₘ(t − 1) + L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)]})
 = ϑ̂ₘ(t − 1) + [I − L₂(t)φ₂ᵀ(t)]L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)] + L₂(t)[y₂(t) − φ₂ᵀ(t)ϑ̂ₘ(t − 1)]   (36)
Substituting (26) into (35) yields
P₂⁻¹(t) = Pₘ⁻¹(t − 1) + Σ_{i=1}^{2} φᵢ(t)φᵢᵀ(t)   (37)
Successive substitution gives
ϑ̂ₘ(t) = ϑ̂ₘ(t − 1) + {∏_{i=2}^{m} [I − Lᵢ(t)φᵢᵀ(t)]}L₁(t)[y₁(t) − φ₁ᵀ(t)ϑ̂ₘ(t − 1)]
 + {∏_{i=3}^{m} [I − Lᵢ(t)φᵢᵀ(t)]}L₂(t)[y₂(t) − φ₂ᵀ(t)ϑ̂ₘ(t − 1)] + ⋯
 + [I − Lₘ(t)φₘᵀ(t)]Lₘ₋₁(t)[yₘ₋₁(t) − φₘ₋₁ᵀ(t)ϑ̂ₘ(t − 1)]
 + Lₘ(t)[yₘ(t) − φₘᵀ(t)ϑ̂ₘ(t − 1)]   (38)
Lₘ(t) = Pₘ(t)φₘ(t)   (39)
Pₘ⁻¹(t) = Pₘ⁻¹(t − 1) + Σ_{i=1}^{m} φᵢ(t)φᵢᵀ(t)   (40)
From (29), we have
Pᵢ(t) = [I − Lᵢ(t)φᵢᵀ(t)]Pᵢ₋₁(t)
 = [I − Lᵢ(t)φᵢᵀ(t)][I − Lᵢ₋₁(t)φᵢ₋₁ᵀ(t)]Pᵢ₋₂(t)
 = [I − Lᵢ(t)φᵢᵀ(t)][I − Lᵢ₋₁(t)φᵢ₋₁ᵀ(t)][I − Lᵢ₋₂(t)φᵢ₋₂ᵀ(t)]Pᵢ₋₃(t)
 = ⋯
 = [I − Lᵢ(t)φᵢᵀ(t)][I − Lᵢ₋₁(t)φᵢ₋₁ᵀ(t)][I − Lᵢ₋₂(t)φᵢ₋₂ᵀ(t)] ⋯ [I − L₂(t)φ₂ᵀ(t)]P₁(t)
Let i = m. Post-multiplying the above expansions by φₘ₋ᵢ(t) (i = 1, 2, …, m − 1) and using (25),
(34) and (39) yield
Pₘ(t)φₘ₋₁(t) = [I − Lₘ(t)φₘᵀ(t)]Pₘ₋₁(t)φₘ₋₁(t) = [I − Lₘ(t)φₘᵀ(t)]Lₘ₋₁(t)   (41)
Pₘ(t)φₘ₋₂(t) = [I − Lₘ(t)φₘᵀ(t)][I − Lₘ₋₁(t)φₘ₋₁ᵀ(t)]Pₘ₋₂(t)φₘ₋₂(t) = [I − Lₘ(t)φₘᵀ(t)][I − Lₘ₋₁(t)φₘ₋₁ᵀ(t)]Lₘ₋₂(t)   (42)
⋮
Pₘ(t)φ₂(t) = ∏_{i=3}^{m} [I − Lᵢ(t)φᵢᵀ(t)]P₂(t)φ₂(t) = ∏_{i=3}^{m} [I − Lᵢ(t)φᵢᵀ(t)]L₂(t)   (43)
Pₘ(t)φ₁(t) = ∏_{i=2}^{m} [I − Lᵢ(t)φᵢᵀ(t)]P₁(t)φ₁(t) = ∏_{i=2}^{m} [I − Lᵢ(t)φᵢᵀ(t)]L₁(t)   (44)
Using (41)–(44), (38)–(40) can be written as
ϑ̂ₘ(t) = ϑ̂ₘ(t − 1) + Pₘ(t)Φᵀ(t)[y(t) − Φ(t)ϑ̂ₘ(t − 1)]   (45)
Lₘ(t) = Pₘ(t)φₘ(t)   (46)
Pₘ⁻¹(t) = Pₘ⁻¹(t − 1) + Φᵀ(t)Φ(t)   (47)
The algorithm in (45)–(47) is equivalent to the algorithm in (13)–(15). This proves Theorem 1.
Remark 5: Theorem 1 indicates that the parameter estimates given by the C-LS algorithm in (27)–(32) equal those of the multivariable RLS algorithm in (13)–(15), but the C-LS algorithm does not involve the matrix inversion [Iₘ + Φ(t)P(t − 1)Φᵀ(t)]⁻¹; compare the C-LS algorithm in (27)–(30) with the multivariable RLS algorithm in (13)–(15).
Theorem 1 and its proof are taken from reference [38]. The following studies the convergence of the C-LS algorithm.
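Theorem 1 can also be spot-checked numerically: driving the multivariable RLS recursion (13)–(15) and the C-LS recursion (27)–(32) with the same data from the same initial values should produce identical estimates and covariance matrices at every t, up to floating-point rounding. A sketch of such a check (my own code, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 4
vartheta = rng.standard_normal(n)                  # true parameter vector

th_rls = np.zeros(n); P_rls = 1e6 * np.eye(n)      # RLS state, (13)-(15)
th_cls = th_rls.copy(); P_cls = P_rls.copy()       # C-LS state, (27)-(32)

for t in range(200):
    Phi = rng.standard_normal((m, n))
    y = Phi @ vartheta + 0.1 * rng.standard_normal(m)

    # multivariable RLS step, with the explicit m x m inverse
    S = np.eye(m) + Phi @ P_rls @ Phi.T
    L = P_rls @ Phi.T @ np.linalg.inv(S)
    th_rls = th_rls + L @ (y - Phi @ th_rls)
    P_rls = (np.eye(n) - L @ Phi) @ P_rls

    # C-LS step: one sweep over the m subsystem rows, no matrix inverse
    for i in range(m):
        phi = Phi[i]
        Li = (P_cls @ phi) / (1.0 + phi @ P_cls @ phi)
        th_cls = th_cls + Li * (y[i] - phi @ th_cls)
        P_cls = P_cls - np.outer(Li, phi @ P_cls)
```

At the end of the run, `th_rls`/`th_cls` and `P_rls`/`P_cls` agree to numerical precision, as Theorem 1 asserts.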
4 Performance analysis
The C-LS algorithm differs from the RLS algorithm, and thus its convergence deserves study. This section analyses the convergence properties of the proposed C-LS algorithm in (21)–(26); the convergence of the RLS algorithm in (11)–(12) has been reported in [14, 18, 39].
Assume that {vᵢ(t), 𝓕ₜ} (i = 1, 2, …, m) is a martingale difference sequence defined on a probability space {Ω, 𝓕, P}, where {𝓕ₜ} is the σ-algebra sequence generated by the observations up to and including time t [18]. The noise sequence {vᵢ(t)} satisfies the following
assumptions [18]:
(A1) E[vᵢ(t)|𝓕ₜ₋₁] = 0, a.s.
(A2) E[vᵢ²(t)|𝓕ₜ₋₁] ≤ σ² < ∞, a.s. (a.s.: almost surely)
Lemma 1 (martingale convergence theorem, Lemma D.5.3 in [18]): If Tₜ, aₜ and bₜ are non-negative random variables, measurable with respect to a non-decreasing sequence of σ-algebras 𝓕ₜ₋₁, and satisfy
E[Tₜ|𝓕ₜ₋₁] ≤ Tₜ₋₁ + aₜ − bₜ, a.s.,
then when Σ_{t=1}^{∞} aₜ < ∞, a.s., we have Σ_{t=1}^{∞} bₜ < ∞, a.s., and Tₜ converges, a.s., to a finite random variable T.
Lemma 2: For the C-LS algorithm in (21)–(26), the following inequalities hold for any c > 1:
1. Σ_{t=1}^{∞} Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/[ln |Pᵢ⁻¹(t)|]^c < ∞, a.s.
2. Σ_{t=1}^{∞} Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/(ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c) < ∞, a.s.
3. Σ_{t=1}^{∞} Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/(ln |Pᵢ⁻¹(t)| ln ln |Pᵢ⁻¹(t)| [ln ln ln |Pᵢ⁻¹(t)|]^c) < ∞, a.s.
4. Σ_{t=1}^{∞} Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/(ln |Pᵢ⁻¹(t)| ln ln |Pᵢ⁻¹(t)| ln ln ln |Pᵢ⁻¹(t)| [ln ln ln ln |Pᵢ⁻¹(t)|]^c) < ∞, a.s.
Proof: According to (23), we have
Pᵢ₋₁⁻¹(t) = Pᵢ⁻¹(t) − φᵢ(t)φᵢᵀ(t) = Pᵢ⁻¹(t)[I − Pᵢ(t)φᵢ(t)φᵢᵀ(t)]
Taking the determinant of both sides of the above equation and using the formula det[Iₘ + DE] = det[Iₙ + ED] give
|Pᵢ₋₁⁻¹(t)| = |Pᵢ⁻¹(t)| |I − Pᵢ(t)φᵢ(t)φᵢᵀ(t)| = |Pᵢ⁻¹(t)|[1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t)]
Solving for φᵢᵀ(t)Pᵢ(t)φᵢ(t) gives
φᵢᵀ(t)Pᵢ(t)φᵢ(t) = (|Pᵢ⁻¹(t)| − |Pᵢ₋₁⁻¹(t)|)/|Pᵢ⁻¹(t)|, i = 2, 3, …, m   (48)
Similarly, according to (26), we have
φ₁ᵀ(t)P₁(t)φ₁(t) = (|P₁⁻¹(t)| − |Pₘ⁻¹(t − 1)|)/|P₁⁻¹(t)|   (49)
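The determinant identity behind (48) is easy to verify numerically. The following sketch is my own, using a random positive-definite matrix in place of Pᵢ₋₁⁻¹(t):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
phi = rng.standard_normal(n)                 # phi_i(t)

A = rng.standard_normal((n, n))
P_prev_inv = A @ A.T + n * np.eye(n)         # P_{i-1}^{-1}(t), positive definite
P_inv = P_prev_inv + np.outer(phi, phi)      # P_i^{-1}(t), as in (23)
P = np.linalg.inv(P_inv)                     # P_i(t)

quad = phi @ P @ phi                         # phi_i^T(t) P_i(t) phi_i(t)
d_prev = np.linalg.det(P_prev_inv)           # |P_{i-1}^{-1}(t)|
d_cur = np.linalg.det(P_inv)                 # |P_i^{-1}(t)|
# Identity from the proof: d_prev = d_cur * (1 - quad), hence (48):
# quad = (d_cur - d_prev) / d_cur, and 0 < quad < 1.
```

The same computation with Pₘ⁻¹(t − 1) in place of Pᵢ₋₁⁻¹(t) checks (49).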
† For part 1, dividing both sides of (48) by [ln |Pᵢ⁻¹(t)|]^c and summing for i from i = 2 to m give
Σ_{i=2}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/[ln |Pᵢ⁻¹(t)|]^c = Σ_{i=2}^{m} (|Pᵢ⁻¹(t)| − |Pᵢ₋₁⁻¹(t)|)/(|Pᵢ⁻¹(t)| [ln |Pᵢ⁻¹(t)|]^c)   (50)
Dividing both sides of (49) by [ln |P₁⁻¹(t)|]^c gives
φ₁ᵀ(t)P₁(t)φ₁(t)/[ln |P₁⁻¹(t)|]^c = (|P₁⁻¹(t)| − |Pₘ⁻¹(t − 1)|)/(|P₁⁻¹(t)| [ln |P₁⁻¹(t)|]^c)   (51)
Adding (51) to (50), we have (see the first equation at the bottom of the page).
Summing for t from t = 1 to ∞ gives (see the second equation at the bottom of the page).
This proves part 1 of Lemma 2.
† For part 2, dividing both sides of (48) by ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c and summing for i from i = 2 to m give
Σ_{i=2}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/(ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c) = Σ_{i=2}^{m} (|Pᵢ⁻¹(t)| − |Pᵢ₋₁⁻¹(t)|)/(|Pᵢ⁻¹(t)| ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c)   (52)
Dividing both sides of (49) by ln |P₁⁻¹(t)| [ln ln |P₁⁻¹(t)|]^c gives
φ₁ᵀ(t)P₁(t)φ₁(t)/(ln |P₁⁻¹(t)| [ln ln |P₁⁻¹(t)|]^c) = (|P₁⁻¹(t)| − |Pₘ⁻¹(t − 1)|)/(|P₁⁻¹(t)| ln |P₁⁻¹(t)| [ln ln |P₁⁻¹(t)|]^c)   (53)
Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/[ln |Pᵢ⁻¹(t)|]^c
 = (|P₁⁻¹(t)| − |Pₘ⁻¹(t − 1)|)/(|P₁⁻¹(t)| [ln |P₁⁻¹(t)|]^c) + Σ_{i=2}^{m} (|Pᵢ⁻¹(t)| − |Pᵢ₋₁⁻¹(t)|)/(|Pᵢ⁻¹(t)| [ln |Pᵢ⁻¹(t)|]^c)
 = ∫_{|Pₘ⁻¹(t−1)|}^{|P₁⁻¹(t)|} dx/(|P₁⁻¹(t)| [ln |P₁⁻¹(t)|]^c) + Σ_{i=2}^{m} ∫_{|Pᵢ₋₁⁻¹(t)|}^{|Pᵢ⁻¹(t)|} dx/(|Pᵢ⁻¹(t)| [ln |Pᵢ⁻¹(t)|]^c)
 ≤ ∫_{|Pₘ⁻¹(t−1)|}^{|P₁⁻¹(t)|} dx/(x [ln x]^c) + Σ_{i=2}^{m} ∫_{|Pᵢ₋₁⁻¹(t)|}^{|Pᵢ⁻¹(t)|} dx/(x [ln x]^c) = ∫_{|Pₘ⁻¹(t−1)|}^{|Pₘ⁻¹(t)|} dx/(x [ln x]^c)
Σ_{t=1}^{∞} Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/[ln |Pᵢ⁻¹(t)|]^c = Σ_{t=1}^{∞} ∫_{|Pₘ⁻¹(t−1)|}^{|Pₘ⁻¹(t)|} dx/(x [ln x]^c) = ∫_{|Pₘ⁻¹(0)|}^{|Pₘ⁻¹(∞)|} dx/(x [ln x]^c)
 = −(1/(c − 1)) [ln x]^{−(c−1)} |_{x=|Pₘ⁻¹(0)|}^{x=|Pₘ⁻¹(∞)|} = (1/(c − 1))(1/[ln |Pₘ⁻¹(0)|]^{c−1} − 1/[ln |Pₘ⁻¹(∞)|]^{c−1}) < ∞, a.s.
Adding (53) to (52), we have
Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/(ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c)
 = (|P₁⁻¹(t)| − |Pₘ⁻¹(t − 1)|)/(|P₁⁻¹(t)| ln |P₁⁻¹(t)| [ln ln |P₁⁻¹(t)|]^c) + Σ_{i=2}^{m} (|Pᵢ⁻¹(t)| − |Pᵢ₋₁⁻¹(t)|)/(|Pᵢ⁻¹(t)| ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c)
 = ∫_{|Pₘ⁻¹(t−1)|}^{|P₁⁻¹(t)|} dx/(|P₁⁻¹(t)| ln |P₁⁻¹(t)| [ln ln |P₁⁻¹(t)|]^c) + Σ_{i=2}^{m} ∫_{|Pᵢ₋₁⁻¹(t)|}^{|Pᵢ⁻¹(t)|} dx/(|Pᵢ⁻¹(t)| ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c)
 ≤ ∫_{|Pₘ⁻¹(t−1)|}^{|P₁⁻¹(t)|} dx/(x ln x [ln ln x]^c) + Σ_{i=2}^{m} ∫_{|Pᵢ₋₁⁻¹(t)|}^{|Pᵢ⁻¹(t)|} dx/(x ln x [ln ln x]^c)
 = ∫_{|Pₘ⁻¹(t−1)|}^{|Pₘ⁻¹(t)|} dx/(x ln x [ln ln x]^c)
Summing for t from t = 1 to ∞ gives
Σ_{t=1}^{∞} Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)/(ln |Pᵢ⁻¹(t)| [ln ln |Pᵢ⁻¹(t)|]^c)
 = Σ_{t=1}^{∞} ∫_{|Pₘ⁻¹(t−1)|}^{|Pₘ⁻¹(t)|} dx/(x ln x [ln ln x]^c) = ∫_{|Pₘ⁻¹(0)|}^{|Pₘ⁻¹(∞)|} dx/(x ln x [ln ln x]^c)
 = −(1/(c − 1)) [ln ln x]^{−(c−1)} |_{x=|Pₘ⁻¹(0)|}^{x=|Pₘ⁻¹(∞)|} = (1/(c − 1))(1/[ln ln |Pₘ⁻¹(0)|]^{c−1} − 1/[ln ln |Pₘ⁻¹(∞)|]^{c−1}) < ∞, a.s.
This proves part 2 of Lemma 2.
† Similarly, we can prove parts 3 and 4 of Lemma 2. This completes the proof of Lemma 2.
Theorem 2: For the identification model in (16) and the C-LS algorithm in (21)–(26), suppose that (A1) and (A2) hold. Then for any c > 1, the parameter estimate ϑ̂(t) = ϑ̂ₘ(t) satisfies, a.s.:
1. ‖ϑ̂(t) − ϑ‖² = O({ln tr[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)])
2. ‖ϑ̂(t) − ϑ‖² = O(ln tr[Pₘ⁻¹(t)] {ln ln tr[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)])
3. ‖ϑ̂(t) − ϑ‖² = O(ln tr[Pₘ⁻¹(t)] ln ln tr[Pₘ⁻¹(t)] {ln ln ln tr[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)])
4. ‖ϑ̂(t) − ϑ‖² = O(ln tr[Pₘ⁻¹(t)] ln ln tr[Pₘ⁻¹(t)] ln ln ln tr[Pₘ⁻¹(t)] {ln ln ln ln tr[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)])
These give the convergence rates of the parameter estimates.
Proof: Define the parameter estimation error vectors
ϑ̃ᵢ(t) := ϑ̂ᵢ(t) − ϑ, i = 1, 2, …, m
Using (21), (22) and (16), it follows that
ϑ̃ᵢ(t) = ϑ̃ᵢ₋₁(t) + Pᵢ(t)φᵢ(t)[−φᵢᵀ(t)ϑ̃ᵢ₋₁(t) + vᵢ(t)]
 =: ϑ̃ᵢ₋₁(t) + Pᵢ(t)φᵢ(t)[−ξᵢ(t) + vᵢ(t)], i = 2, 3, …, m   (54)
where
ξᵢ(t) := φᵢᵀ(t)ϑ̂ᵢ₋₁(t) − φᵢᵀ(t)ϑ = φᵢᵀ(t)ϑ̃ᵢ₋₁(t), i = 2, 3, …, m   (55)
Using (24) and (16), it follows that
ϑ̃₁(t) = ϑ̃ₘ(t − 1) + P₁(t)φ₁(t)[−φ₁ᵀ(t)ϑ̃ₘ(t − 1) + v₁(t)]
 =: ϑ̃ₘ(t − 1) + P₁(t)φ₁(t)[−ξ₁(t) + v₁(t)]   (56)
where
ξ₁(t) := φ₁ᵀ(t)ϑ̂ₘ(t − 1) − φ₁ᵀ(t)ϑ = φ₁ᵀ(t)ϑ̃ₘ(t − 1)   (57)
Define the non-negative functions
Vᵢ(t) := ϑ̃ᵢᵀ(t)Pᵢ⁻¹(t)ϑ̃ᵢ(t), i = 1, 2, …, m
Using (54), (55) and (23), together with tr[AB] = tr[BA] and tr[Aᵀ] = tr[A], it follows that
Vᵢ(t) = ϑ̃ᵢᵀ(t)Pᵢ⁻¹(t)ϑ̃ᵢ(t)
 = {ϑ̃ᵢ₋₁(t) + Pᵢ(t)φᵢ(t)[−ξᵢ(t) + vᵢ(t)]}ᵀPᵢ⁻¹(t){ϑ̃ᵢ₋₁(t) + Pᵢ(t)φᵢ(t)[−ξᵢ(t) + vᵢ(t)]}
 = ϑ̃ᵢ₋₁ᵀ(t)Pᵢ⁻¹(t)ϑ̃ᵢ₋₁(t) + 2ϑ̃ᵢ₋₁ᵀ(t)φᵢ(t)[−ξᵢ(t) + vᵢ(t)] + φᵢᵀ(t)Pᵢ(t)φᵢ(t)[−ξᵢ(t) + vᵢ(t)]²
 = ϑ̃ᵢ₋₁ᵀ(t)[Pᵢ₋₁⁻¹(t) + φᵢ(t)φᵢᵀ(t)]ϑ̃ᵢ₋₁(t) + 2ξᵢ(t)[−ξᵢ(t) + vᵢ(t)] + φᵢᵀ(t)Pᵢ(t)φᵢ(t)[ξᵢ²(t) + vᵢ²(t) − 2ξᵢ(t)vᵢ(t)]
 = Vᵢ₋₁(t) − [1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t)]ξᵢ²(t) + φᵢᵀ(t)Pᵢ(t)φᵢ(t)vᵢ²(t) + 2[1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t)]ξᵢ(t)vᵢ(t)
 ≤ Vᵢ₋₁(t) + φᵢᵀ(t)Pᵢ(t)φᵢ(t)vᵢ²(t) + 2[1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t)]ξᵢ(t)vᵢ(t), i = 2, 3, …, m   (58)
Here, we have used the inequality
1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t) = [1 + φᵢᵀ(t)Pᵢ₋₁(t)φᵢ(t)]⁻¹ > 0, i = 2, 3, …, m
From (58) with i = m, successive substitution yields
Vₘ(t) ≤ V₁(t) + Σ_{i=2}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)vᵢ²(t) + Σ_{i=2}^{m} 2[1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t)]ξᵢ(t)vᵢ(t)   (59)
Similarly, using (56), (57) and (26), we have
V₁(t) = ϑ̃₁ᵀ(t)P₁⁻¹(t)ϑ̃₁(t)
 = ϑ̃ₘᵀ(t − 1)P₁⁻¹(t)ϑ̃ₘ(t − 1) + 2ϑ̃ₘᵀ(t − 1)φ₁(t)[−ξ₁(t) + v₁(t)] + φ₁ᵀ(t)P₁(t)φ₁(t)[−ξ₁(t) + v₁(t)]²
 = ϑ̃ₘᵀ(t − 1)[Pₘ⁻¹(t − 1) + φ₁(t)φ₁ᵀ(t)]ϑ̃ₘ(t − 1) + 2ξ₁(t)[−ξ₁(t) + v₁(t)] + φ₁ᵀ(t)P₁(t)φ₁(t)[ξ₁²(t) + v₁²(t) − 2ξ₁(t)v₁(t)]
 = Vₘ(t − 1) − [1 − φ₁ᵀ(t)P₁(t)φ₁(t)]ξ₁²(t) + φ₁ᵀ(t)P₁(t)φ₁(t)v₁²(t) + 2[1 − φ₁ᵀ(t)P₁(t)φ₁(t)]ξ₁(t)v₁(t)
 ≤ Vₘ(t − 1) + φ₁ᵀ(t)P₁(t)φ₁(t)v₁²(t) + 2[1 − φ₁ᵀ(t)P₁(t)φ₁(t)]ξ₁(t)v₁(t)   (60)
Here, we have used the inequality
1 − φ₁ᵀ(t)P₁(t)φ₁(t) = [1 + φ₁ᵀ(t)Pₘ(t − 1)φ₁(t)]⁻¹ > 0
Substituting (60) into (59) gives
Vₘ(t) ≤ Vₘ(t − 1) + Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)vᵢ²(t) + Σ_{i=1}^{m} 2[1 − φᵢᵀ(t)Pᵢ(t)φᵢ(t)]ξᵢ(t)vᵢ(t)   (61)
Since ξᵢ(t) and φᵢᵀ(t)Pᵢ(t)φᵢ(t) are uncorrelated with vᵢ(t) and are 𝓕ₜ₋₁-measurable, taking the conditional expectation of both sides of (61) with respect to 𝓕ₜ₋₁ and using (A1) and (A2) give
E[Vₘ(t)|𝓕ₜ₋₁] ≤ Vₘ(t − 1) + Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)σ², a.s.   (62)
Let
Z(t) := Vₘ(t)/[ln |Pₘ⁻¹(t)|]^c, c > 1
Since ln |Pₘ⁻¹(t)| is non-decreasing and
Pₘ⁻¹(t) ≥ Pₘ₋₁⁻¹(t) ≥ ⋯ ≥ P₁⁻¹(t) ≥ Pₘ⁻¹(t − 1)
we have
E[Z(t)|𝓕ₜ₋₁] ≤ Vₘ(t − 1)/[ln |Pₘ⁻¹(t)|]^c + Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)σ²/[ln |Pₘ⁻¹(t)|]^c
 ≤ Vₘ(t − 1)/[ln |Pₘ⁻¹(t − 1)|]^c + Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)σ²/[ln |Pᵢ⁻¹(t)|]^c
 = Z(t − 1) + Σ_{i=1}^{m} φᵢᵀ(t)Pᵢ(t)φᵢ(t)σ²/[ln |Pᵢ⁻¹(t)|]^c, a.s.   (63)
Using Lemma 2, we know that the sum of the second term on the right-hand side for t from t = 1 to ∞ is finite; thus, applying the martingale convergence theorem in Lemma 1 shows that Z(t) converges, a.s., to a finite random variable, say Z₀, that is,
Z(t) = Vₘ(t)/[ln |Pₘ⁻¹(t)|]^c → Z₀ < ∞, a.s.
or
Vₘ(t) = O([ln |Pₘ⁻¹(t)|]^c), a.s.   (64)
According to the definition of Vₘ(t), we have
‖ϑ̃ₘ(t)‖² ≤ tr[ϑ̃ₘᵀ(t)Pₘ⁻¹(t)ϑ̃ₘ(t)]/λmin[Pₘ⁻¹(t)] = Vₘ(t)/λmin[Pₘ⁻¹(t)]   (65)
Using (64) and (65), we can obtain
‖ϑ̂(t) − ϑ‖² = ‖ϑ̃ₘ(t)‖² = O([ln |Pₘ⁻¹(t)|]^c/λmin[Pₘ⁻¹(t)]) = O({ln tr[Pₘ⁻¹(t)]}^c/λmin[Pₘ⁻¹(t)]), a.s., c > 1
This proves Conclusion 1 of Theorem 2. Similarly, letting
Z₂(t) := Vₘ(t)/(ln |Pₘ⁻¹(t)| [ln ln |Pₘ⁻¹(t)|]^c)
Z₃(t) := Vₘ(t)/(ln |Pₘ⁻¹(t)| ln ln |Pₘ⁻¹(t)| [ln ln ln |Pₘ⁻¹(t)|]^c)
Z₄(t) := Vₘ(t)/(ln |Pₘ⁻¹(t)| ln ln |Pₘ⁻¹(t)| ln ln ln |Pₘ⁻¹(t)| [ln ln ln ln |Pₘ⁻¹(t)|]^c)
we can obtain Conclusions 2–4 of Theorem 2.
Since ln λmax[Pₘ⁻¹(t)] = O(ln tr[Pₘ⁻¹(t)]), the conclusions of Theorem 2 can be expressed as
1. ‖ϑ̂(t) − ϑ‖² = O({ln λmax[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)]), a.s.
2. ‖ϑ̂(t) − ϑ‖² = O(ln λmax[Pₘ⁻¹(t)] {ln ln λmax[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)]), a.s.
3. ‖ϑ̂(t) − ϑ‖² = O(ln λmax[Pₘ⁻¹(t)] ln ln λmax[Pₘ⁻¹(t)] {ln ln ln λmax[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)]), a.s.
4. ‖ϑ̂(t) − ϑ‖² = O(ln λmax[Pₘ⁻¹(t)] ln ln λmax[Pₘ⁻¹(t)] ln ln ln λmax[Pₘ⁻¹(t)] {ln ln ln ln λmax[Pₘ⁻¹(t)]}^c / λmin[Pₘ⁻¹(t)]), a.s.
Remark 6: Theorem 2 indicates that the four conclusions converge faster in turn. Under the persistent excitation condition
c₁Iₙ ≤ (1/t) Σ_{j=1}^{t} Σ_{i=1}^{m} φᵢ(j)φᵢᵀ(j) ≤ c₂Iₙ, a.s., c₁, c₂ > 0, for large t
the parameter estimation error converges to zero. In this case, Conclusions 1 and 4 in Theorem 2 can be expressed as
1. ‖ϑ̂(t) − ϑ‖² = O({ln t}^c / t), a.s.
4. ‖ϑ̂(t) − ϑ‖² = O(ln t · ln ln t · ln ln ln t · {ln ln ln ln t}^c / t), a.s.
Since (ln t)^c / t → 0, the estimation error tends to zero; Conclusion 4 converges to zero faster than Conclusion 1 as t increases.
5 Examples
Example 1: Consider the following two-input two-output linear multivariable system from [30]
α(z)y(t) = Q(z)u(t) + v(t)   (68)
α(z) = 1 + a₁z⁻¹ + a₂z⁻² + a₃z⁻³ = 1 − 1.15z⁻¹ + 0.425z⁻² − 0.05z⁻³
Q(z) = Q₁z⁻¹ + Q₂z⁻² + Q₃z⁻³, with
Q₁ = [1.0 1.0; 1.2 1.2], Q₂ = [−0.900 −0.750; −1.080 −0.780], Q₃ = [0.200 0.125; 0.240 0.120]
a = [a₁, a₂, a₃]ᵀ = [−1.15, 0.425, −0.05]ᵀ
θᵀ = [Q₁, Q₂, Q₃] = [1.0 1.0 −0.90 −0.75 0.20 0.125; 1.2 1.2 −1.08 −0.78 0.24 0.120] ∈ ℝ^{2×6}
Here, z⁻¹ is the unit backward shift operator: z⁻¹y(t) = y(t − 1). Referring to [30], this example system in (68) can be written as the multiple linear regression model in (10):
y(t) = Φ(t)ϑ + v(t)
y(t) = [y₁(t), y₂(t)]ᵀ ∈ ℝ² (m = 2), ϑ = [aᵀ, colᵀ[θ]]ᵀ ∈ ℝ¹⁵
Φ(t) = [−y(t − 1), −y(t − 2), −y(t − 3), I₂ ⊗ wᵀ(t)] ∈ ℝ^{2×15}
w(t) = [uᵀ(t − 1), uᵀ(t − 2), uᵀ(t − 3)]ᵀ ∈ ℝ⁶
Here, u(t) = [u1(t), u2(t)]ᵀ is taken as a persistently exciting vector sequence with zero mean and unit variances, and v(t) = [v1(t), v2(t)]ᵀ as a white-noise vector sequence with zero mean and variances σ1² = 0.40² for v1(t) and σ2² = 0.50² for v2(t). Taking the initial values θ̂m(0) = 10⁻⁶·1₁₅ and Pm(0) = 10⁶·I₁₅, we apply the C-LS algorithm to estimate the parameters of this example system. The parameter estimates and their errors for different data lengths t are shown in Table 1, the parameter estimates θ̂i(t) against t are shown in Figs. 3 and 4, and the parameter estimation errors δ := ‖θ̂(t) − θ‖/‖θ‖ × 100% against t are shown in Fig. 5.
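As an illustration added here (not part of the paper), the simulation setup of Example 1 can be sketched in Python. Since the C-LS recursion itself is given earlier in the paper, a standard multivariable recursive least-squares update is used below as a stand-in estimator; the variable names and random seed are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# True parameters of the system (68)
a = np.array([-1.15, 0.425, -0.05])
S = np.hstack([np.array([[1.0, 1.0], [1.2, 1.2]]),
               np.array([[-0.900, -0.750], [-1.080, -0.780]]),
               np.array([[0.200, 0.125], [0.240, 0.120]])])   # [Q1, Q2, Q3], 2x6
theta = np.concatenate([a, S.reshape(-1)])  # theta = [a; col[theta_s]] in R^15

def phi(y_hist, u_hist):
    """Information matrix Phi(t) = [-y(t-1), -y(t-2), -y(t-3), I_2 kron w^T(t)]."""
    w = np.concatenate(u_hist)                          # w(t) in R^6
    return np.hstack([-np.column_stack(y_hist),         # 2x3 block from past outputs
                      np.kron(np.eye(2), w[None, :])])  # 2x12 block from past inputs

T = 3000
y_hist = [np.zeros(2)] * 3           # y(t-1), y(t-2), y(t-3)
u_hist = [np.zeros(2)] * 3           # u(t-1), u(t-2), u(t-3)
theta_hat = np.full(15, 1e-6)        # theta_hat(0) = 1e-6 * 1_15
P = 1e6 * np.eye(15)                 # P(0) = 1e6 * I_15
for t in range(T):
    Phi = phi(y_hist, u_hist)
    y = Phi @ theta + np.array([0.40, 0.50]) * rng.standard_normal(2)
    # Multivariable RLS update (stand-in for the C-LS recursion)
    G = P @ Phi.T @ np.linalg.inv(np.eye(2) + Phi @ P @ Phi.T)
    theta_hat = theta_hat + G @ (y - Phi @ theta_hat)
    P = P - G @ Phi @ P
    u = rng.standard_normal(2)       # persistently exciting input
    y_hist = [y] + y_hist[:2]
    u_hist = [u] + u_hist[:2]

delta = np.linalg.norm(theta_hat - theta) / np.linalg.norm(theta) * 100
print(f"relative error delta after t = {T}: {delta:.3f}%")
```

The error δ falls to a few per cent, matching the trend in Table 1, although the exact values depend on the noise realisation.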
Fig. 3 C-LS parameter estimates θ̂1, θ̂3, θ̂4 against t for Example 1
Table 1 C-LS estimates and errors of Example 1

Parameter (true value)   t = 100     t = 200     t = 500     t = 1000    t = 2000    t = 3000
θ1  = −1.15000           −1.16794    −1.15772    −1.15401    −1.14562    −1.15091    −1.15737
θ2  =  0.42500            0.39626     0.36311     0.41306     0.41037     0.41438     0.42064
θ3  = −0.05000           −0.02040    −0.02497    −0.04809    −0.05522    −0.04926    −0.04926
θ4  =  1.00000            1.02124     1.02646     1.04943     1.03720     1.01908     1.01250
θ5  =  1.00000            1.01556     1.01599     1.02328     1.01286     1.00220     0.99628
θ6  = −0.90000           −1.00549    −0.95900    −0.92778    −0.89941    −0.89944    −0.91008
θ7  = −0.75000           −0.80942    −0.79887    −0.76220    −0.74874    −0.74525    −0.75443
θ8  =  0.20000            0.20365     0.14421     0.19814     0.20167     0.19542     0.19865
θ9  =  0.12500            0.10642     0.07419     0.14422     0.13381     0.12476     0.13014
θ10 =  1.20000            1.13451     1.21313     1.18654     1.20116     1.18895     1.19340
θ11 =  1.20000            1.13329     1.18220     1.20673     1.20385     1.19122     1.19445
θ12 = −1.08000           −1.10567    −1.07436    −1.09769    −1.08397    −1.08090    −1.08941
θ13 = −0.78000           −0.79925    −0.76553    −0.73406    −0.77486    −0.79328    −0.79670
θ14 =  0.24000            0.14586     0.11403     0.18240     0.18985     0.20089     0.21674
θ15 =  0.12000            0.04817     0.06852     0.11613     0.11624     0.12557     0.12492
δ (%)                     6.55609     6.12850     3.28471     2.15333     1.59692     1.20391
Example 2: Consider the multiple linear regression model y(t) = Φ(t)θ + v(t) with

Φ(t) = [ −y1(t−1)   y1(t−2)sin(y2(t−2))   y2(t−1)   y2(t−2)u1(t−2)   u1(t−1)    u1(t−2)u2(t−2)   u2(t−1)cos(t)
         −y1(t−1)   y1(t−2)sin(t/π)       y2(t−1)   y1(t−2)u2(t−2)   u1²(t−1)   sin(u2(t−2))     u1(t−1) + u2(t−2) ] ∈ ℝ^{2×7}
Fig. 4 C-LS parameter estimates θ̂5, θ̂6, θ̂9 against t for Example 1
Fig. 5 C-LS estimation errors δ against t for Example 1
Simulation conditions are similar to those of Example 1, with v(t) = [v1(t), v2(t)]ᵀ taken as a white-noise vector sequence with zero mean and variance σ². The simulation results with σ² = 0.10² and σ² = 0.50² are shown in Tables 2 and 3 and Figs. 6–8.

From Tables 1–3 and Figs. 3–8, we can see that the parameter estimation errors δ become smaller (in general) as t increases, and that a lower noise level leads to more accurate parameter estimates.
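To make the construction concrete, the 2×7 information matrix of Example 2 can be assembled from lagged signals as below. This is an added illustration: the function name and data layout are our own, not the paper's.

```python
import numpy as np

def phi_example2(y1, y2, u1, u2, t):
    """Build the 2x7 information matrix Phi(t) of Example 2 from lagged signals.

    y1, y2, u1, u2 are arrays indexed by time; t must satisfy t >= 2.
    """
    row1 = [-y1[t-1], y1[t-2] * np.sin(y2[t-2]), y2[t-1],
            y2[t-2] * u1[t-2], u1[t-1], u1[t-2] * u2[t-2],
            u2[t-1] * np.cos(t)]
    row2 = [-y1[t-1], y1[t-2] * np.sin(t / np.pi), y2[t-1],
            y1[t-2] * u2[t-2], u1[t-1] ** 2, np.sin(u2[t-2]),
            u1[t-1] + u2[t-2]]
    return np.array([row1, row2])

# Example: with theta in R^7, data would be generated as y(t) = Phi(t) theta + v(t)
rng = np.random.default_rng(0)
y1, y2, u1, u2 = (rng.standard_normal(5) for _ in range(4))
Phi = phi_example2(y1, y2, u1, u2, t=4)
print(Phi.shape)  # prints (2, 7)
```

Note that both rows share the entries −y1(t−1) and y2(t−1), while the remaining entries differ, which is what makes the regressor genuinely multivariable.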
Fig. 6 C-LS parameter estimates θ̂1, θ̂2, θ̂3, θ̂4 against t for Example 2 (σ² = 0.50²)
Fig. 7 C-LS parameter estimates θ̂5, θ̂6, θ̂7 against t for Example 2 (σ² = 0.50²)
Table 2 C-LS estimates and errors of Example 2 (σ² = 0.10²)

t      θ1       θ2       θ3       θ4        θ5       θ6        θ7       δ (%)
100    0.85043  0.45006  0.10043  −0.57774  0.22998  −0.48421  1.35311  0.93964
200    0.84932  0.45065  0.09901  −0.57763  0.22748  −0.49671  1.35853  0.27999
500    0.84926  0.45096  0.09873  −0.57884  0.22882  −0.50169  1.35853  0.17798
1000   0.84968  0.45051  0.09960  −0.58007  0.22854  −0.50247  1.35854  0.17852
2000   0.84990  0.45030  0.10000  −0.57992  0.22952  −0.50110  1.35790  0.13190
3000   0.84991  0.45017  0.10002  −0.57992  0.22890  −0.49989  1.35828  0.11111
true   0.85000  0.45000  0.10000  −0.58000  0.23000  −0.50000  1.36000
Table 3 C-LS estimates and errors of Example 2 (σ² = 0.50²)

t      θ1       θ2       θ3       θ4        θ5       θ6        θ7       δ (%)
100    0.86049  0.44792  0.11783  −0.58436  0.22370  −0.42426  1.32419  4.68368
200    0.85153  0.45229  0.10136  −0.57570  0.21527  −0.47891  1.35129  1.49557
500    0.85049  0.45356  0.09954  −0.57671  0.22210  −0.50540  1.35084  0.76348
1000   0.85139  0.45060  0.09956  −0.57902  0.22220  −0.51147  1.35195  0.87228
2000   0.85062  0.45015  0.09957  −0.57954  0.22747  −0.50501  1.34969  0.63630
3000   0.85074  0.44945  0.10035  −0.57870  0.22457  −0.49919  1.35151  0.55354
true   0.85000  0.45000  0.10000  −0.58000  0.23000  −0.50000  1.36000
6 Conclusions
Referring to [38], a C-LS algorithm is developed for multivariable systems in order to avoid computing the matrix inversion. The convergence of the C-LS algorithm is studied using the martingale convergence theorem. The C-LS algorithm has the following properties.
† The parameter estimates given by the C-LS algorithm converge to their true values as the data length increases.
† The proposed C-LS algorithm requires a lower computational load and achieves highly accurate parameter estimates.
† As the noise-to-signal ratio decreases, the convergence rate of the parameter estimation of the C-LS algorithm becomes faster.
The basic idea of the proposed coupled identification methods can be extended to linear multivariable systems [40–45], non-linear multivariable systems [46, 47], non-uniformly sampled systems [48, 49] and other systems with colored noises [50–52].
7 Acknowledgments
This work was supported by the National Natural Science Foundation of China (grant no. 61273194), the Natural Science Foundation of Jiangsu Province, China (grant no. BK2012549) and the 111 Project (grant no. B12018).
8 References
1 Yan, M., Shi, Y.: 'Robust discrete-time sliding mode control for uncertain systems with time-varying state delay', IET Control Theory Appl., 2008, 2, (8), pp. 662–674
2 Shi, Y., Yu, B.: 'Output feedback stabilization of networked control systems with random delays modeled by Markov chains', IEEE Trans. Autom. Control, 2009, 54, (7), pp. 1668–1674
3 Shi, Y., Fang, H.: 'Kalman filter based identification for systems with randomly missing measurements in a network environment', Int. J. Control, 2010, 83, (3), pp. 538–551
4 Ding, F., Chen, T.: 'Performance bounds of the forgetting factor least squares algorithm for time-varying systems with finite measurement data', IEEE Trans. Circuits Syst. I, Regul. Pap., 2005, 52, (3), pp. 555–566
5 Ding, F., Chen, T.: 'Hierarchical identification of lifted state-space models for general dual-rate systems', IEEE Trans. Circuits Syst. I, Regul. Pap., 2005, 52, (6), pp. 1179–1187
6 Yuz, J.I., Alfaro, J., Agüero, J.C., Goodwin, G.C.: 'Identification of continuous-time state-space models from non-uniform fast-sampled data', IET Control Theory Appl., 2011, 5, (7), pp. 842–855
Fig. 8 C-LS estimation errors δ against t for Example 2 with σ² = 0.10² and σ² = 0.50²
7 Herrera, J., Ibeas, A., Alcántara, S., Sen, M.D.L.: 'Multimodel-based techniques for the identification and adaptive control of delayed multi-input multi-output systems', IET Control Theory Appl., 2011, 5, (1), pp. 188–202
8 Wang, D.Q.: 'Least squares-based recursive and iterative estimation for output error moving average systems using data filtering', IET Control Theory Appl., 2011, 5, (14), pp. 1648–1657
9 Xie, L., Liu, Y.J., Yang, H.Z., et al.: 'Modeling and identification for non-uniformly periodically sampled-data systems', IET Control Theory Appl., 2010, 4, (5), pp. 784–794
10 Bai, E., Cai, Z.: 'How nonlinear parametric Wiener system identification is under Gaussian inputs?', IEEE Trans. Autom. Control, 2012, 57, (3), pp. 738–742
11 Ding, J., Ding, F., Liu, X.P., et al.: 'Hierarchical least squares identification for linear SISO systems with dual-rate sampled-data', IEEE Trans. Autom. Control, 2011, 56, (11), pp. 2677–2683
12 Mercère, G., Bako, L.: 'Parameterization and identification of multivariable state-space systems: a canonical approach', Automatica, 2011, 47, (8), pp. 1547–1555
13 Schön, T., Wills, A., Ninness, B.: 'System identification of nonlinear state-space models', Automatica, 2011, 47, (1), pp. 39–49
14 Lai, T.L., Wei, C.Z.: 'Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems', Ann. Stat., 1982, 10, (1), pp. 154–166
15 Ding, F., Chen, T.: 'Performance analysis of multi-innovation gradient type identification methods', Automatica, 2007, 43, (1), pp. 1–14
16 Ding, F., Liu, X.P., Liu, G.: 'Auxiliary model based multi-innovation extended stochastic gradient parameter estimation with colored measurement noises', Signal Process., 2009, 89, (10), pp. 1883–1890
17 Ding, J., Ding, F.: 'Bias compensation based parameter estimation for output error moving average systems', Int. J. Adapt. Control Signal Process., 2011, 25, (12), pp. 1100–1111
18 Goodwin, G.C., Sin, K.S.: 'Adaptive filtering prediction and control' (Prentice-Hall, Englewood Cliffs, NJ, 1984)
19 Ding, F., Yang, H.Z., Liu, F.: 'Performance analysis of stochastic gradient algorithms under weak conditions', Sci. China Ser. F, Inf. Sci., 2008, 51, (9), pp. 1269–1280
20 Ljung, L.: 'System identification: theory for the user' (Prentice-Hall, Englewood Cliffs, NJ, 1999, 2nd edn.)
21 Ding, F., Liu, Y.J., Bao, B.: 'Gradient based and least squares based iterative estimation algorithms for multi-input multi-output systems', Proc. Inst. Mech. Eng. I, J. Syst. Control Eng., 2012, 226, (1), pp. 43–55
22 Ljung, L.: 'Consistency of the least-squares identification method', IEEE Trans. Autom. Control, 1976, 21, (5), pp. 779–781
23 Solo, V.: 'The convergence of AML', IEEE Trans. Autom. Control, 1979, 24, (6), pp. 958–962
24 Ding, F., Chen, T.: 'Combined parameter and output estimation of dual-rate systems using an auxiliary model', Automatica, 2004, 40, (10), pp. 1739–1748
25 Lai, T.L., Wei, C.Z.: 'Extended least squares and their applications to adaptive control and prediction in linear systems', IEEE Trans. Autom. Control, 1986, 31, (10), pp. 898–906
26 Wei, C.Z.: 'Adaptive prediction by least squares prediction in stochastic regression models', Ann. Stat., 1987, 15, (4), pp. 1667–1682
27 Lai, T.L., Ying, Z.L.: 'Recursive identification and adaptive prediction in linear stochastic systems', SIAM J. Control Optim., 1991, 29, (5), pp. 1061–1090
28 Toussi, K., Ren, W.: 'On the convergence of least squares estimates in white noise', IEEE Trans. Autom. Control, 1994, 39, (2), pp. 364–368
29 Ren, W., Kumar, P.R.: 'Stochastic adaptive prediction and model reference control', IEEE Trans. Autom. Control, 1994, 39, (10), pp. 2047–2060
30 Ding, F., Chen, T.: 'Hierarchical gradient-based identification of multivariable discrete-time systems', Automatica, 2005, 41, (2), pp. 315–325
31 Ding, F., Liu, G., Liu, X.: 'Partially coupled stochastic gradient identification methods for non-uniformly sampled systems', IEEE Trans. Autom. Control, 2010, 55, (8), pp. 1976–1981
32 Ding, F.: 'System identification – new theory and methods' (Science Press, Beijing, 2013)
33 Sen, A., Sinha, N.K.: 'On-line estimation of the parameters of a multivariable system using matrix pseudo-inverse', Int. J. Syst. Sci., 1976, 7, (4), pp. 461–471
34 Liu, Y.J., Sheng, J., Ding, R.F.: 'Convergence of stochastic gradient estimation algorithm for multivariable ARX-like systems', Comput. Math. Appl., 2010, 59, (8), pp. 2615–2627
35 Ding, F.: 'System identification – Part H: coupling identification concept and methods', J. Nanjing Univ. Inf. Sci. Technol. (Natural Science Edn.), 2012, 4, (3), pp. 193–212
36 Ding, F., Liu, G., Liu, X.P.: 'Parameter estimation with scarce measurements', Automatica, 2011, 47, (8), pp. 1646–1655
37 Golub, G.H., Loan, C.F.V.: 'Matrix computations' (Johns Hopkins University Press, Baltimore, MD, 1996, 3rd edn.)
38 Fang, C.Z., Xiao, D.Y.: 'Process identification' (Tsinghua University Press, Beijing, 1988)
39 Xiao, Y.S., Ding, F., Zhou, Y., Li, M., Dai, J.Y.: 'On consistency of recursive least squares identification algorithms for controlled auto-regression models', Appl. Math. Model., 2008, 32, (11), pp. 2207–2215
40 Liu, Y.J., Xiao, Y.S., Zhao, X.L.: 'Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model', Appl. Math. Comput., 2009, 215, (4), pp. 1477–1483
41 Wang, D.Q., Yang, G.W., Ding, R.F.: 'Gradient-based iterative parameter estimation for Box–Jenkins systems', Comput. Math. Appl., 2010, 60, (5), pp. 1200–1208
42 Liu, Y.J., Wang, D.Q., Ding, F.: 'Least-squares based iterative algorithms for identifying Box–Jenkins models with finite measurement data', Digit. Signal Process., 2010, 20, (5), pp. 1458–1467
43 Ding, F., Gu, Y.: 'Performance analysis of the auxiliary model based least squares identification algorithm for one-step state delay systems', Int. J. Comput. Math., 2012, 89, (15), pp. 2019–2028
44 Ding, F.: 'Two-stage least squares based iterative estimation algorithm for CARARMA system modeling', Appl. Math. Model., 2013, 37; http://dx.doi.org/10.1016/j.apm.2012.10.014
45 Ding, F., Gu, Y.: 'Performance analysis of the auxiliary model-based stochastic gradient parameter estimation algorithm for state space systems with one-step state delay', Circuits Syst. Signal Process., 2013, 32; doi: 10.1007/s00034-012-9463-5
46 Ding, F., Liu, X.P., Liu, G.: 'Identification methods for Hammerstein nonlinear systems', Digit. Signal Process., 2011, 21, (2), pp. 215–238
47 Ding, F.: 'Hierarchical multi-innovation stochastic gradient algorithm for Hammerstein nonlinear system modeling', Appl. Math. Model., 2013, 37, (4), pp. 1694–1704
48 Liu, Y.J., Xie, L., et al.: 'An auxiliary model based recursive least squares parameter estimation algorithm for non-uniformly sampled multirate systems', Proc. Inst. Mech. Eng. I, J. Syst. Control Eng., 2009, 223, (4), pp. 445–454
49 Ding, F., Qiu, L., Chen, T.: 'Reconstruction of continuous-time systems from their non-uniformly sampled discrete-time systems', Automatica, 2009, 45, (2), pp. 324–332
50 Wang, W., Ding, F., Dai, J.Y.: 'Maximum likelihood least squares identification for systems with autoregressive moving average noise', Appl. Math. Model., 2012, 36, (5), pp. 1842–1853
51 Li, J.H., Ding, F., Yang, G.W.: 'Maximum likelihood least squares identification method for input nonlinear finite impulse response moving average systems', Math. Comput. Model., 2012, 55, (3–4), pp. 442–450
52 Ding, F.: 'Decomposition based fast least squares algorithm for output error systems', Signal Process., 2013, 93; http://dx.doi.org/10.1016/j.sigpro.2012.12.013