Distributed nonparametric regression in
multi-agent systems
Damiano Varagnolo
(joint work with Gianluigi Pillonetto and Luca Schenato)
Department of Information Engineering - University of Padova
March, 25th 2011
entirely written in LaTeX 2ε using Beamer and TikZ
Damiano Varagnolo (DEI) [email protected] March, 25th 2011 1 / 42
introduction
Multi-agent systems: examples of applications
- Multi-Robot Coordination
- Underwater Exploration
- Smart Grids Deployment
- Wind Power Management
- Wild Life Monitoring
- Smart Highways Deployment
- Smart Surveillance Implementation
Generic and important problem
Assumption
noisy measurements of
f(x, t) : ℝ³ × ℝ → ℝ
that are
- non-uniformly sampled in space x
- non-uniformly sampled in time t
- taken by different agents
Objective
smoothing in space (x) and forecasting in time (t) the quantity f(x, t)
Example 1 - channel gains in geographical areas
source: Dall'Anese et al., 2011
- x ∈ ℝ²: position
- t: time
- f(x, t): channel gain
Example 2 - wave power extraction
source: www.graysharboroceanenergy.com
- x ∈ ℝ²: position
- t: time
- f(x, t): sea level
Example 3 - multi-robot exploration
source: http://www-robotics.jpl.nasa.gov
- x ∈ ℝ²: position
- f(x): ground level
Problems and difficulties
Information-related problems
- non-uniform sampling in both time and space
- unknown dynamics of f
- unknown or extremely complex correlations in time and space
Hardware-related problems
- energy, computation, memory and bandwidth limitations
Framework-related problems
- mobile and time-varying network
State of the art
Proposed distributed solutions:
- Least Squares: Choi et al. 2009, Predd et al. 2006, Boyd et al. 2005
- Maximum Likelihood: Schizas et al. 2008, Barbarossa and Scutari 2007, Boyd et al. 2010
- Kalman Filtering: Cressie and Wikle 2002, Olfati-Saber 2007, Carli et al. 2008
- Kriging: Dall'Anese et al. 2011, Cortés 2009
- Other Learning Techniques: Nguyen et al. 2005, Bazerque et al. 2010
(successive overlays highlighted which of these solutions are nonparametric and which address dynamic rather than static scenarios)
State of the art - Vision
obtain
f(x, t) = Ψ(past measurements, θ)
being
- distributed
- capable of both smoothing and prediction
Our approach
nonparametric: Ψ(·) lives in an infinite dimensional space, so no finite parameter vector θ is needed: f(x, t) = Ψ(past measurements)
Why should we use a nonparametric approach?
Motivations
- it could be difficult or even impossible to define a parametric model (e.g. when only regularity assumptions are available)
- parametric models could involve a large number of parameters (and could require nonlinear optimization techniques)
- nonparametric approaches lead to convex optimization problems
- they are consistent, i.e. the estimate converges to f when # measurements → ∞ (De Nicolao, Ferrari-Trecate, 1999)
State of the art - where we actually contribute
Our small puzzle-piece: a Regularization Network approach (Poggio and Girosi 1990) in which
- the agents estimate the same f
- the scenario is static (f independent of t), e.g. exploration
- there is no need to discard old measurements
Goal of the current presentation
a completely self-organizing multi-agent regression strategy
Benefits of self-organization
- widest applicability
- robustness w.r.t. human error
Requirements
- simple strategy
- quantifiable performance
- automatic tuning of the parameters
framework & assumptions
Framework
Sensors:
- # = S, identical
- sample the same noisy function
- limited computational & communication capabilities
- want to collaborate
- 1 measurement per sensor (for ease of notation)
Measurement model

y_i = f(x_i) + ν_i   (1)

- f : X ⊂ ℝ^d → ℝ unknown (X compact)
- ν_i ⊥ x_i, zero mean and variance σ²
- x_i ∼ µ i.i.d. (the sensors know µ!)
- examples of µ: uniform, jitter, generic
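This measurement model can be sketched numerically; the test function, noise level and sample size below are illustrative choices of mine, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

S = 200        # number of sensors, one measurement per sensor
sigma = 0.25   # standard deviation of the measurement noise nu_i

def f(x):
    # illustrative unknown function on the compact set X = [0, 1]
    return np.sin(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)

x = rng.uniform(0.0, 1.0, S)           # x_i ~ mu = U[0, 1], i.i.d.
y = f(x) + sigma * rng.normal(size=S)  # y_i = f(x_i) + nu_i, as in (1)
```

Each sensor i holds only its own pair (x_i, y_i); everything that follows is about cooperating on this shared dataset without centralizing it.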
distributed estimators
Regularization Networks

Q(f) = ∑_{i=1}^S (y_i − f(x_i))² + γ ‖f‖²_K

where f lives in an infinite dimensional space, γ is the regularization factor, and K : X × X → ℝ is a Mercer kernel.

Centralized optimal solution

f_c = ∑_{i=1}^S c_i K(x_i, ·), with [c_1, …, c_S]ᵀ = (K̄ + γI)⁻¹ [y_1, …, y_S]ᵀ

where K̄ := [K(x_i, x_j)]_{i,j=1}^S is the S × S Gram matrix.
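A minimal numerical sketch of the centralized solution above, with a Gaussian kernel; the data, kernel width and γ are illustrative assumptions of mine:

```python
import numpy as np

rng = np.random.default_rng(1)

S, gamma, sigma = 100, 1.0, 0.25
x = rng.uniform(0.0, 1.0, S)
y = np.sin(2 * np.pi * x) + sigma * rng.normal(size=S)

def K(a, b, width=0.1):
    # Gaussian (RBF) Mercer kernel, an assumed choice
    return np.exp(-((a - b) ** 2) / (2.0 * width ** 2))

# Gram matrix [K(x_i, x_j)] and coefficients c = (Gram + gamma*I)^{-1} y
gram = K(x[:, None], x[None, :])
c = np.linalg.solve(gram + gamma * np.eye(S), y)

def f_c(t):
    # centralized estimate f_c(t) = sum_i c_i K(x_i, t)
    return K(np.atleast_1d(t)[:, None], x[None, :]) @ c
```

Note the two costs the next slides attack: forming c inverts an S × S matrix, and evaluating f_c needs all S pairs (x_i, y_i).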
Example - Cubic spline smoothing: effects of the variation of γ for a given K
[figure: estimates f(x) over x, for high γ (smoother) and low γ (more wiggly)]
Drawbacks

f_c = ∑_{i=1}^S c_i K(x_i, ·), with c = (K̄ + γI)⁻¹ y and K̄ the S × S Gram matrix

- computational cost: O(S³) (inversion of an S × S matrix)
- transmission cost: O(S) (requires knowledge of the whole {x_i, y_i}_{i=1}^S)
⇓
need to find alternative solutions
Alternative centralized optimal solution (1st of 2)

1. consider the Mercer expansion

K(x_1, x_2) = ∑_{e=1}^{+∞} λ_e φ_e(x_1) φ_e(x_2)   (λ_e = eigenvalue, φ_e = eigenfunction)   (2)

2. rewrite the measurement model:

y_i = C_i b + ν_i, with C_i := [φ_1(x_i), φ_2(x_i), …] and b := [b_1, b_2, …]ᵀ   (3)

3. rewrite the cost function:

Q(b) = ∑_{i=1}^S (y_i − C_i b)² + γ ∑_{e=1}^{+∞} b_e²/λ_e   (4)
Alternative centralized optimal solution (2nd of 2)

b_c = ( (1/S) diag(γ/λ_e) + (1/S) ∑_{i=1}^S C_iᵀ C_i )⁻¹ ( (1/S) ∑_{i=1}^S C_iᵀ y_i )   (5)

This expression involves infinite dimensional objects (the matrix to be inverted and the vectors are infinite), so b_c cannot be computed exactly.
Hypothesis space reduction

Connection with PCA
span⟨φ_1, …, φ_E⟩ is the E-dimensional subspace that captures the biggest part of the energy of the estimate (E ≪ S)

Implication: new measurement model

y_i = C_i^E b + ν_i, with C_i^E := [φ_1(x_i), …, φ_E(x_i)] and b := [b_1, …, b_E]ᵀ
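The slides assume the eigenvalues λ_e and eigenfunctions φ_e of K with respect to µ are available. Numerically they can be approximated, for instance with the Nyström method sketched below; this particular construction is my own illustration, not necessarily what the authors use:

```python
import numpy as np

rng = np.random.default_rng(2)

def K(a, b, width=0.1):
    # Gaussian kernel (an assumed choice)
    return np.exp(-((a - b) ** 2) / (2.0 * width ** 2))

# Nystroem method: eigendecompose the Gram matrix built on M i.i.d.
# samples from mu = U[0, 1], then rescale to approximate the Mercer
# eigenpairs of K w.r.t. mu.
M, E = 500, 15
z = rng.uniform(0.0, 1.0, M)
gram = K(z[:, None], z[None, :])
w, V = np.linalg.eigh(gram)
idx = np.argsort(w)[::-1][:E]        # indices of the E largest eigenvalues
lam = w[idx] / M                     # approximate eigenvalues lambda_e

def phi(t):
    # approximate eigenfunctions: column e is phi_e evaluated at t,
    # orthonormal w.r.t. the empirical measure by construction
    Kt = K(np.atleast_1d(t)[:, None], z[None, :])
    return np.sqrt(M) * Kt @ (V[:, idx] / w[idx])
```

C_i^E is then simply phi(x_i), and span⟨φ_1, …, φ_E⟩ is the reduced hypothesis space of the slide.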
Suboptimal finite dimensional solution

New estimator

b_r = ( (1/S) diag(γ/λ_e) + (1/S) ∑_{i=1}^S (C_i^E)ᵀ C_i^E )⁻¹ ( (1/S) ∑_{i=1}^S (C_i^E)ᵀ y_i )

- computable (involves E × E matrices and E-dimensional vectors)
- minimizes Q_E(b) := ∑_{i=1}^S (y_i − C_i^E b)² + γ ∑_{e=1}^E b_e²/λ_e

Drawbacks
1. O(E³) computational effort
2. O(E²) transmission effort
3. must know S
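The reduced estimator b_r can be exercised with any orthonormal basis; below I build a toy Mercer setup from cosine eigenfunctions (orthonormal w.r.t. µ = U[0, 1]) and geometrically decaying eigenvalues. All of these choices, and the data, are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy Mercer setup (my choice): cosine eigenfunctions, orthonormal
# w.r.t. mu = U[0, 1], with geometrically decaying eigenvalues.
E, S, gamma, sigma = 15, 100, 0.1, 0.25
lam = 0.5 ** np.arange(E)                  # eigenvalues lambda_e

def C_E(x):
    # rows are C_i^E = [phi_1(x_i), ..., phi_E(x_i)]
    e = np.arange(E)
    out = np.sqrt(2.0) * np.cos(np.pi * e * np.atleast_1d(x)[:, None])
    out[:, 0] = 1.0                        # phi_1 is the constant function
    return out

x = rng.uniform(0.0, 1.0, S)
y = np.cos(2 * np.pi * x) + sigma * rng.normal(size=S)   # f = phi_3 / sqrt(2)

C = C_E(x)                                 # S x E matrix stacking the C_i^E
A = np.diag(gamma / lam) / S + (C.T @ C) / S
b_r = np.linalg.solve(A, (C.T @ y) / S)    # the reduced estimator b_r

def f_r(t):
    # reduced estimate f_r(t) = sum_e b_e phi_e(t)
    return C_E(t) @ b_r
```

Only E × E quantities appear, but forming (1/S) ∑ (C_i^E)ᵀ C_i^E still requires coordination, and b_r still needs S: exactly what b_d removes next.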
Derivation of the distributed estimator

b_r = ( (1/S) diag(γ/λ_e) + (1/S) ∑_{i=1}^S (C_i^E)ᵀ C_i^E )⁻¹ ( (1/S) ∑_{i=1}^S (C_i^E)ᵀ y_i )

Consider the approximations
- S → S_g (a guess, to be tuned later on)
- (1/S) ∑_{i=1}^S (C_i^E)ᵀ C_i^E → E[(C_i^E)ᵀ C_i^E] = I
Derivation of the distributed estimator

obtain: b_d = ( (1/S_g) diag(γ/λ_e) + I )⁻¹ ( (1/S) ∑_{i=1}^S (C_i^E)ᵀ y_i )

Advantages
1. O(E) computational effort
2. O(E) transmission effort
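b_d only needs the network-wide average (1/S) ∑_i (C_i^E)ᵀ y_i, which every agent can obtain with an average-consensus iteration. A minimal sketch on a ring network follows; the topology, the consensus weights and the toy cosine basis are my illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

E, S, Sg, gamma, sigma = 15, 50, 60, 0.1, 0.25
lam = 0.5 ** np.arange(E)

def C_E(x):
    # toy orthonormal cosine basis w.r.t. mu = U[0, 1] (illustrative)
    e = np.arange(E)
    out = np.sqrt(2.0) * np.cos(np.pi * e * np.atleast_1d(x)[:, None])
    out[:, 0] = 1.0
    return out

x = rng.uniform(0.0, 1.0, S)
y = np.cos(2 * np.pi * x) + sigma * rng.normal(size=S)

# each agent's local message (C_i^E)^T y_i is a single E-vector: O(E) transmission
m = C_E(x) * y[:, None]                    # S x E, row i held by agent i

# average consensus on a ring with doubly stochastic weights W
W = np.zeros((S, S))
for i in range(S):
    W[i, i] = 0.5
    W[i, (i - 1) % S] = 0.25
    W[i, (i + 1) % S] = 0.25
state = m.copy()
for _ in range(4000):                      # iterate until (near) agreement
    state = W @ state

# every agent now holds ~ (1/S) sum_i (C_i^E)^T y_i and computes b_d locally
b_d = np.linalg.solve(np.diag(gamma / lam) / Sg + np.eye(E), state[0])
```

Here Sg is the agents' guess of the (unknown) network size S; tuning it is the subject of the auto-tuning slides.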
Summary of proposed estimation schemes

b_c = ( (1/S) diag(γ/λ_e) + (1/S) ∑_{i=1}^S C_iᵀ C_i )⁻¹ ( (1/S) ∑_{i=1}^S C_iᵀ y_i )

b_r = ( (1/S) diag(γ/λ_e) + (1/S) ∑_{i=1}^S (C_i^E)ᵀ C_i^E )⁻¹ ( (1/S) ∑_{i=1}^S (C_i^E)ᵀ y_i )

b_d = ( (1/S_g) diag(γ/λ_e) + I )⁻¹ ( (1/S) ∑_{i=1}^S (C_i^E)ᵀ y_i )
Summary of proposed estimation schemes
- b_c (original hypothesis space): O(S³) comput., O(S) transm. for f_c
- b_r (reduced hypothesis space): O(E³) comput., O(E²) transm., uses S
- b_d (reduced hypothesis space): O(E) comput., O(E) transm., uses S_g
- ??? : how far b_d is from b_c and b_r remains to be quantified
Open problems
- quantification of performance
- how to tune E and S_g
bounds on errors
Quantification of performance: first results

With b_c, b_r and b_d defined as above,

‖b_c − b_d‖₂ ≤ α ‖b_r − b_d‖₂ + (1/S) ∑_{i=1}^S |r_i|

where α can be computed offline and (1/S) ∑_{i=1}^S |r_i| is the average of the local residuals.
Quantification of performance

‖b_c − b_d‖₂ ≤ α ‖b_r − b_d‖₂ + |r_ave|, but ‖b_r − b_d‖₂ is not distributedly computable. It can in turn be bounded as

‖b_r − b_d‖₂ ≤ ‖U_S b_d‖₂ + ‖U_C b_d‖₂

with U_S ∝ 1/S_min − 1/S_max and U_C ∝ I − (1/S) ∑_{i=1}^S (C_i^E)ᵀ C_i^E.
Distributed Monte Carlo Estimation (assumption: b_d and |r_ave| already computed)

‖b_c − b_d‖₂ ≤ |r_ave| + α ‖U_S b_d‖₂ + α ‖U_C b_d‖₂

1. (locally) generate a sample of ‖U_C b_d‖₂
2. (distributedly) estimate E[‖U_C b_d‖₂] and var(‖U_C b_d‖₂)
3. use Cantelli's inequality: P[⋆ ≤ E[⋆] + k · stdev(⋆)] ≥ 1 − 1/(1 + k²)

and obtain:

‖b_c − b_d‖₂ ≤ |r_ave| + α ‖U_S b_d‖₂ + α (E[‖U_C b_d‖₂] + k · stdev(‖U_C b_d‖₂))
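The Cantelli step can be checked in isolation: with k = 3 the one-sided bound guarantees confidence level 1 − 1/(1 + 3²) = 0.9, matching the "c.l. 0.9" on the later plot. A small sketch, where the sample distribution standing in for ‖U_C b_d‖₂ is an arbitrary choice of mine:

```python
import numpy as np

rng = np.random.default_rng(5)

# Monte Carlo sample standing in for realizations of ||U_C b_d||_2 (illustrative)
sample = rng.gamma(shape=2.0, scale=1.0, size=10_000)

k = 3.0
mean, std = float(sample.mean()), float(sample.std())
threshold = mean + k * std

# Cantelli's one-sided inequality: P[X <= E[X] + k*stdev(X)] >= 1 - 1/(1 + k^2)
guaranteed = 1.0 - 1.0 / (1.0 + k ** 2)        # = 0.9 for k = 3
empirical = float(np.mean(sample <= threshold))
```

The inequality needs only the mean and standard deviation, both of which the agents can estimate distributedly; no distributional assumption on ‖U_C b_d‖₂ is required.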
Description of the simulations
- f : ℝ → ℝ
- µ = U[0, 1]
- composition of f: sum of sinusoids
- K: Gaussian kernel
- measurement noise st. dev.: 0.25 (average SNR ≈ 10)
Regression strategy effectiveness example (S = 1000, E = 40, motivated later)
[figure: the true f and the estimates induced by b_c and b_d, plotted over x ∈ [0, 1]]
Examples of the effectiveness of the bounds (S = 1000, E = 40, S_min = 900, S_max = 1100, 100 Monte Carlo runs)
[figure: computed bound vs. ‖b_c − b_d‖₂ / ‖b_d‖₂, for k = 0 (c.l. 0) and k = 3 (c.l. 0.9)]
Swept under the carpet, i.e. (some of) the things we should have proved
- how to compute the bound
- why the estimates of E[‖U_C b_d‖₂] and stdev(‖U_C b_d‖₂) are good estimates
- why we can use Cantelli's inequality
auto-tuning procedures
The need for automatic tuning procedures

b_d(S_g, E) = ( (1/S_g) diag(γ/λ_e) + I )⁻¹ ( (1/S) ∑_{i=1}^S (C_i^E)ᵀ y_i )

Here S_g and E must be tuned, γ and the λ_e can be considered fixed, and the average on the right is to be distributedly computed.
Tuning under centralized scenarios
General aim
- maximize the generalization capabilities of the estimators
Classical approaches
- cross validation
- maximum (marginal) likelihood
- ...
Peculiarities of our distributed framework
- no test sets available
- the classical criteria involve unknown quantities
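For reference, the centralized baseline is easy to state: for kernel ridge / Regularization Network estimators the leave-one-out cross-validation residuals even have a closed form, sketched below. The data and the γ grid are illustrative, and this is the classical centralized recipe, not the distributed procedure proposed next:

```python
import numpy as np

rng = np.random.default_rng(6)

S, sigma = 100, 0.25
x = rng.uniform(0.0, 1.0, S)
y = np.sin(2 * np.pi * x) + sigma * rng.normal(size=S)
gram = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * 0.1 ** 2))

def loo_score(gamma):
    # closed-form leave-one-out residuals for kernel ridge regression:
    # e_i = (y_i - yhat_i) / (1 - H_ii), with hat matrix H = G (G + gamma I)^{-1}
    H = gram @ np.linalg.inv(gram + gamma * np.eye(S))
    res = (y - H @ y) / (1.0 - np.diag(H))
    return float(np.mean(res ** 2))

grid = [10.0 ** p for p in range(-4, 3)]       # candidate gamma values
best = min(grid, key=loo_score)
```

The slide's point is that this recipe breaks down here: it presumes centralized access to the whole Gram matrix and data that can serve as held-out sets, neither of which the agents have.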
Proposed solution - key ideas
Parameters to be estimated:
- number of eigenfunctions E: chosen to assure that ‖b_c − b_r(E)‖ is sufficiently small
- number of sensors S_g: chosen to minimize the bound on ‖b_c − b_d(S_g, E)‖
results
Description of the simulations
- f : ℝ → ℝ
- µ = U[0, 1]
- composition of f: sum of sinusoids
- K: Gaussian kernel
- measurement noise st. dev.: 0.75 (implies an average SNR of ≈ 2.5)
Regression strategy effectiveness example (S = 100, E = 20, S_min = 90, S_max = 110)
[figure: the estimates induced by b_c and b_d, plotted over x ∈ [0, 1]]
Regression strategy effectiveness example (S = 100, E = 20, S_min = 20, S_max = 2000)
[figure: the estimates induced by b_c and b_d, plotted over x ∈ [0, 1]]
Comparisons with naïve and oracle strategies (S = 100, E = 20, S_min = 20, S_max = 2000); naïve: S_g = S_ave
[figure: scatter plots of ‖b_d^ours − b_c‖₂ against ‖b_d^naïve − b_c‖₂ and against ‖b_d^oracle − b_c‖₂]
conclusions
Conclusions and future work
Conclusions
The strategy is:
- effective and easy to implement
- endowed with a tight performance bound
- endowed with self-tuning capabilities
Future work
- exploit statistical knowledge about S
- incorporate the effects of a finite number of steps in the consensus algorithms
- extend to dynamic scenarios (long term objective)
Distributed nonparametric regression in
multi-agent systems
Damiano Varagnolo
(joint work with Gianluigi Pillonetto and Luca Schenato)
Department of Information Engineering - University of Padova
March, 25th 2011
www.dei.unipd.it/~varagnolo/ (google: damiano varagnolo)
licensed under the Creative Commons BY-NC-SA 2.5 Italy License
References
Barbarossa, S. and Scutari, G. (2007). Decentralized maximum-likelihood estimation for sensor networks composed of nonlinearly coupled dynamical systems. IEEE Transactions on Signal Processing, 55(7):3456–3470.

Bazerque, J., Mateos, G., and Giannakis, G. (2010). Distributed lasso for in-network linear regression. In IEEE International Conference on Acoustics, Speech and Signal Processing.

Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2010). Distributed optimization and statistical learning via the alternating direction method of multipliers. Technical report, Stanford University.

Carli, R., Chiuso, A., Schenato, L., and Zampieri, S. (2008). Distributed Kalman filtering based on consensus strategies. IEEE Journal on Selected Areas in Communications, 26:622–633.

Choi, J., Oh, S., and Horowitz, R. (2009). Distributed learning and cooperative control for multi-agent systems. Automatica, 45(12):2802–2814.

Cortés, J. (2009). Distributed Kriged Kalman filter for spatial estimation. IEEE Transactions on Automatic Control, 54(12):2816–2827.

Cressie, N. and Wikle, C. K. (2002). Space–time Kalman filter. Encyclopedia of Environmetrics, 4:2045–2049.

Dall'Anese, E., Kim, S.-J., and Giannakis, G. B. (2011). Channel gain map tracking via distributed kriging. IEEE Transactions on Vehicular Technology, 2011.

De Nicolao, G. and Ferrari-Trecate, G. (1999). Consistent identification of NARX models via Regularization Networks. IEEE Transactions on Automatic Control, 44(11):2045–2049.

Nguyen, X., Wainwright, M. J., and Jordan, M. I. (2005). Nonparametric decentralized detection using kernel methods. IEEE Transactions on Signal Processing, 53(11):4053–4066.

Olfati-Saber, R. (2007). Distributed Kalman filtering for sensor networks. In IEEE Conference on Decision and Control.

Predd, J. B., Kulkarni, S. R., and Poor, H. V. (2006). Distributed learning in wireless sensor networks. IEEE Signal Processing Magazine, 23(4):56–69.

Schizas, I. D., Ribeiro, A., and Giannakis, G. B. (2008). Consensus in ad hoc WSNs with noisy links - part I: Distributed estimation of deterministic signals. IEEE Transactions on Signal Processing, 56(1):350–364.

Varagnolo, D., Pillonetto, G., and Schenato, L. (2011a). Auto-tuning procedures for distributed nonparametric regression algorithms. IEEE Conference on Decision and Control, Submitted.

Varagnolo, D., Pillonetto, G., and Schenato, L. (2011b). Distributed parametric and nonparametric regression with on-line performance bounds computation. Automatica, Submitted.