Spectral clustering with cortical kernels
Slides: gmvision.lsis.org/slides/barbieri.pdf



Davide Barbieri (1), Giovanna Citti (2), Giacomo Cocci (2), Marta Favali (3), Alessandro Sarti (3)

(1) Universidad Autónoma de Madrid; (2) Università di Bologna; (3) CAMS, EHESS/CNRS Paris

Spatial and spatio-temporal receptive fields

The spatial and spatio-temporal behavior of V1 receptive fields is well modeled by Gabor functions: if $\zeta = (\xi, \eta, \tau)$ are spatio-temporal coordinates on the visual space,

$$ g^{\sigma,\omega}_{z,\theta,v}(\zeta) = e^{-2\pi i \omega \left( (\xi - x)\cos\theta + (\eta - y)\sin\theta - v(\tau - t) \right)} \, e^{-\frac{1}{2} \langle (\zeta - z), \sigma(\zeta - z) \rangle_{\mathbb{R}^3}} $$

where $z = (x, y, t)$ is the receptive field center and $\sigma$ is a covariance matrix.
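As an illustration of the formula above, here is a minimal NumPy sketch that evaluates such a Gabor function. The function name `gabor` and its argument layout are hypothetical, and the sketch assumes a decaying Gaussian envelope, that is, an exponent $-\frac{1}{2}\langle \zeta - z, \sigma(\zeta - z)\rangle$ with $\sigma$ used directly as the matrix of the quadratic form.

```python
import numpy as np

def gabor(zeta, z, theta, v, omega, sigma):
    """Evaluate a spatio-temporal Gabor at points zeta of shape (..., 3).

    Assumption: the Gaussian envelope is exp(-0.5 * <d, sigma d>) with
    d = zeta - z, so the function decays away from the center z.
    """
    d = np.asarray(zeta, dtype=float) - np.asarray(z, dtype=float)
    # Phase of the plane wave: (xi - x) cos(theta) + (eta - y) sin(theta) - v (tau - t)
    phase = d[..., 0] * np.cos(theta) + d[..., 1] * np.sin(theta) - v * d[..., 2]
    # Quadratic form <(zeta - z), sigma (zeta - z)> in R^3
    quad = np.einsum("...i,ij,...j->...", d, sigma, d)
    return np.exp(-2j * np.pi * omega * phase) * np.exp(-0.5 * quad)

z = np.zeros(3)
sigma = np.eye(3)
g_center = gabor(z, z, theta=0.3, v=1.0, omega=2.0, sigma=sigma)
g_shifted = gabor([1.0, 0.0, 0.0], z, theta=0.0, v=0.0, omega=1.0, sigma=sigma)
```

At the center the phase and the quadratic form both vanish, so the value is 1; one unit away along $\xi$ with $\theta = 0$ and $\omega = 1$ the carrier completes a full period and only the envelope factor remains.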

Figure: Gabor fit of spatial RF in macaque [Ringach, 2002] and cat [Jones and Palmer, 1987]

Figure: Gabor fit of spatio-temporal RF in cat [DeAngelis et al., 1995 and Cocci et al., 2012]

Spatial visual stimuli are represented in the 3-dimensional feature space

$$ M_S = \mathbb{R}^2 \times S^1 = \{(x, y, \theta)\} $$

where $\theta$ is the local orientation of the stimulus, while spatio-temporal visual stimuli are represented in the 5-dimensional feature space

$$ M_T = \mathbb{R}^3 \times S^1 \times \mathbb{R} = \{(x, y, t, \theta, v)\} $$

where $v$ is the velocity orthogonal to the spatial wavefront.

The tangent bundle

Smooth level sets of greyscale images are lifted to integral surfaces of the contact structure defined by the 1-form

$$ \cos\theta\, dx + \sin\theta\, dy $$

while smooth level sets of greyscale moving images are lifted to integral surfaces of the contact structure defined by the 1-form

$$ \cos\theta\, dx + \sin\theta\, dy - v\, dt. $$

The associated contact structures are generated by the following vector fields:

$$ \ker(\cos\theta\, dx + \sin\theta\, dy) = \mathrm{span}\{X_1, X_2\} $$
$$ \ker(\cos\theta\, dx + \sin\theta\, dy - v\, dt) = \mathrm{span}\{X_1, X_2, X_4, X_5\} $$

where

$$ X_1 = -\sin\theta\, \partial_x + \cos\theta\, \partial_y, \quad X_2 = \partial_\theta, \quad X_4 = \partial_v, \quad X_5 = \partial_t + v(\cos\theta\, \partial_x + \sin\theta\, \partial_y). $$

The nonintegrability of the structure is expressed by the nonzero commutators

$$ [X_1, X_2] = [X_4, X_5] = X_3 = \cos\theta\, \partial_x + \sin\theta\, \partial_y \quad \text{(Reeb)} $$
$$ [X_2, X_3] = X_1, \quad [X_2, X_5] = v X_1. $$
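The commutator relations above can be checked numerically. The sketch below writes each vector field as its coefficient vector in the coordinates $(x, y, t, \theta, v)$ and evaluates the Lie bracket $[X, Y]^k = X^i \partial_i Y^k - Y^i \partial_i X^k$ with central finite differences. The helper names `jacobian` and `bracket`, the step size and the test point are illustrative choices, not part of the original slides.

```python
import numpy as np

# Coordinates p = (x, y, t, theta, v); each field returns its coefficient vector.
def X1(p):
    return np.array([-np.sin(p[3]), np.cos(p[3]), 0.0, 0.0, 0.0])

def X2(p):
    return np.array([0.0, 0.0, 0.0, 1.0, 0.0])

def X3(p):
    return np.array([np.cos(p[3]), np.sin(p[3]), 0.0, 0.0, 0.0])

def X4(p):
    return np.array([0.0, 0.0, 0.0, 0.0, 1.0])

def X5(p):
    return np.array([p[4] * np.cos(p[3]), p[4] * np.sin(p[3]), 1.0, 0.0, 0.0])

def jacobian(F, p, h=1e-5):
    """Central-difference Jacobian J[k, i] = d F^k / d p_i."""
    J = np.zeros((5, 5))
    for i in range(5):
        e = np.zeros(5)
        e[i] = h
        J[:, i] = (F(p + e) - F(p - e)) / (2 * h)
    return J

def bracket(X, Y, p):
    """Lie bracket [X, Y] at p: (DY) X - (DX) Y."""
    return jacobian(Y, p) @ X(p) - jacobian(X, p) @ Y(p)

p = np.array([0.2, -0.4, 0.1, 0.7, 1.3])
# Contact 1-form cos(theta) dx + sin(theta) dy - v dt as a covector at p
form = np.array([np.cos(p[3]), np.sin(p[3]), -p[4], 0.0, 0.0])
in_kernel = [abs(form @ X(p)) < 1e-12 for X in (X1, X2, X4, X5)]
```

Evaluating `bracket(X1, X2, p)` and `bracket(X4, X5, p)` returns the coefficients of $X_3$ up to discretization error, and `form @ X3(p)` equals 1, as expected for the Reeb field.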

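Diffusion on fiber variables with drift on base variables can be implemented in terms of the Fokker-Planck hypoelliptic operators

$$ [M_S] \quad L_3 = X_1 - \kappa^2 X_2^2 = -\sin\theta\, \partial_x + \cos\theta\, \partial_y - \kappa^2 \partial_\theta^2 $$
$$ [M_V] \quad L_V = X_1 - \kappa^2 X_2^2 - \alpha^2 X_4^2 = -\sin\theta\, \partial_x + \cos\theta\, \partial_y - \left( \kappa^2 \partial_\theta^2 + \alpha^2 \partial_v^2 \right) $$
$$ [M_T] \quad L_T = X_5 - \kappa^2 X_2^2 - \alpha^2 X_4^2 = \partial_t + v(\cos\theta\, \partial_x + \sin\theta\, \partial_y) - \left( \kappa^2 \partial_\theta^2 + \alpha^2 \partial_v^2 \right) $$

These operators can be used to define a distance under which two points are closer the higher their probability of being connected by a smooth contour, or by a motion trajectory that is compatible with their apparent velocity $\vec{v}_\theta = v\, \hat{e}_\theta$.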
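To make the probabilistic reading concrete, the following sketch simulates the stochastic process whose Fokker-Planck operator is $L_3$: deterministic drift along $X_1$ in $(x, y)$ and Brownian motion in $\theta$ with generator $\kappa^2 \partial_\theta^2$, that is, $d\theta = \kappa\sqrt{2}\, dW$. The Euler-Maruyama discretization, the function name `sample_paths` and all parameter values are assumptions made for illustration; the slides do not state how the kernels were computed.

```python
import numpy as np

def sample_paths(theta0, kappa, n_paths=1000, n_steps=200, ds=0.01, seed=0):
    """Euler-Maruyama paths of dx = -sin(theta) ds, dy = cos(theta) ds,
    dtheta = kappa * sqrt(2) dW, all starting at (0, 0, theta0).

    Returns an array of shape (n_steps + 1, n_paths, 3) holding (x, y, theta).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    y = np.zeros(n_paths)
    theta = np.full(n_paths, float(theta0))
    traj = np.empty((n_steps + 1, n_paths, 3))
    traj[0] = np.stack([x, y, theta], axis=1)
    for k in range(n_steps):
        x = x - np.sin(theta) * ds           # drift along X1, x component
        y = y + np.cos(theta) * ds           # drift along X1, y component
        theta = theta + kappa * np.sqrt(2 * ds) * rng.standard_normal(n_paths)
        traj[k + 1] = np.stack([x, y, theta], axis=1)
    return traj

traj = sample_paths(theta0=0.0, kappa=0.5)
# Occupancy histogram of visited states: a crude Monte Carlo estimate
# (up to normalization) of the time-integrated transition density.
samples = traj.reshape(-1, 3).copy()
samples[:, 2] = np.mod(samples[:, 2], 2 * np.pi)   # wrap theta onto the circle
hist, edges = np.histogramdd(samples, bins=(32, 32, 16))
```

With `kappa=0` every path is a straight line in the direction $(-\sin\theta_0, \cos\theta_0)$; increasing `kappa` spreads the paths over contours of increasing curvature, which is what makes the resulting kernel favor smooth continuations.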

Spectral Clustering

Consider a data set $\{f_i\}_{i=1}^n$ in a metric space as the vertices of a weighted graph, where the edge weights $\{a_{ij}\}_{i,j=1}^n$ define an affinity matrix $A$. The normalized affinity matrix $P$ is given by

$$ P = D^{-1} A $$

where $D$ is the diagonal degree matrix, having elements

$$ d_i = \sum_{j=1}^n a_{ij}. $$

When $A$ is symmetric and positive semidefinite with nonnegative entries, the eigenvalues $\{\lambda_j\}_{j=1}^n$ of $P$ are real and satisfy $0 \le \lambda_j \le 1$, and its eigenvectors can be chosen with real components. The closer $P$ is to being block diagonal, the closer its spectrum is to splitting into two groups, with eigenvalues close to 1 or close to 0. In such a case, eigenvectors corresponding to large eigenvalues are close to indicator functions of regions of the graph that are strongly connected internally but weakly connected to the rest of the graph.

An affinity that captures the geometry of the stimulus can then be used to cluster it into its strongly connected components in terms of eigenvectors of $P$.
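The procedure described above can be sketched in a few lines of NumPy. The spectrum of $P = D^{-1}A$ is obtained from the symmetric matrix $D^{-1/2} A D^{-1/2}$, which is similar to $P$. The slides do not say how clusters are read off the eigenvectors, so the grouping step here is a swapped-in choice: rows of the leading eigenvectors are normalized (in the style of Ng, Jordan and Weiss) and grouped greedily by cosine similarity. The function name `spectral_clusters` and both thresholds are hypothetical.

```python
import numpy as np

def spectral_clusters(A, eig_threshold=0.5, cos_threshold=0.5):
    """Cluster graph vertices from a symmetric nonnegative affinity matrix A.

    Returns (eigenvalues of P in decreasing order, integer labels).
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                        # degrees d_i = sum_j a_ij
    s = 1.0 / np.sqrt(d)
    S = A * s[:, None] * s[None, :]          # D^{-1/2} A D^{-1/2}, similar to P
    lam, U = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    k = max(1, int(np.sum(lam > eig_threshold)))   # number of large eigenvalues
    rows = U[:, :k]
    rows = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    labels = np.empty(len(A), dtype=int)
    reps = []                                # one representative row per cluster
    for i, r in enumerate(rows):
        for c, rep in enumerate(reps):
            if r @ rep > cos_threshold:
                labels[i] = c
                break
        else:
            reps.append(r)
            labels[i] = len(reps) - 1
    return lam, labels

# Toy affinity: two groups of three vertices, strong inside, weak across.
A = np.full((6, 6), 0.01)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
lam, labels = spectral_clusters(A)
```

In this toy example $P$ has two eigenvalues close to 1 and four equal to 0, and the two groups of vertices receive different labels.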

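Geometric affinities

We have performed our spectral clusterings with kernels obtained from the fundamental solutions

$$ L_3 \Gamma_3 = \delta_{x,y,\theta} \quad \text{on } M_S $$
$$ L_V \Gamma_V = \delta_{x,y,\theta,v} \quad \text{on } M_V $$
$$ L_T \Gamma_T = \delta_{x,y,t,\theta,v} \quad \text{on } M_T. $$

The considered affinity for spatial stimuli with local orientations is

$$ A_S = \frac{1}{2} \left( \Gamma_3 + \Gamma_3^\dagger \right) $$

where $\Gamma_3^\dagger$ is the fundamental solution of the adjoint operator $L_3^\dagger$.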
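On a finite set of sample points the symmetrization amounts to averaging a kernel matrix with its transpose. The sketch below assumes a matrix `G` with entries `G[i, j]` equal to the kernel from point `j` to point `i`, and uses the standard fact that the fundamental solution of the adjoint operator is obtained by swapping the two arguments; the function name and the toy matrix are illustrative only.

```python
import numpy as np

def symmetrized_affinity(G):
    """Return 0.5 * (G + G^T) for a sampled, generally non-symmetric kernel G."""
    G = np.asarray(G, dtype=float)
    return 0.5 * (G + G.T)

# Toy non-symmetric kernel: propagation only "forward" from j to i > j.
G = np.array([[1.0, 0.0, 0.0],
              [0.6, 1.0, 0.0],
              [0.2, 0.6, 1.0]])
A_S = symmetrized_affinity(G)
```

The result is symmetric by construction, so it can be used directly as the affinity matrix $A$ of a spectral clustering.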

Figure: Clustering with Euclidean affinity (first two rows) and with AS (third row)

An affinity for spatio-temporal stimuli with orientation and apparent motion is

$$ A_{ST} = \Gamma_T + \frac{1}{2} \left( \tilde{\Gamma}_V + \tilde{\Gamma}_V^\dagger \right) $$

where $\tilde{\Gamma}_V$ is the extension of $\Gamma_V$ to $M_T$. This preserves causality and is compatible with psychophysiological evidence that two different mechanisms cooperate in the motion integration performed by the visual cortex.

Identification of inductors on Kanizsa figures

Spectral clustering with sub-Riemannian kernels may provide figure-ground segmentation, and the first eigenvector may be used for modal completion.

Figure: First eigenvector of PS (red) for aligned and not aligned inductors [Favali et al. 2014]

When the inductors are co-circularly aligned, with a limit curvature that depends on the scale of the kernels, their mutual affinity is higher than their affinity with the rest of the closed contour. Otherwise, the affinity of the closed contour is dominant. The appearance of a full six-segment triangle, instead of three separate two-segment lines, is due to the polarity of the contours, which are lifted to $[0, 2\pi]$.

Clustering of moving locally oriented stimuli

Spatio-temporal kernels are able to cluster moving oriented stimuli against a noisy background, and to follow them during their evolution.

Figure: Clustering obtained with PST [Cocci et al. 2014]

References

G. Cocci, D. Barbieri, G. Citti, A. Sarti, Cortical spatio-temporal dimensionality reduction for visual grouping. Preprint arxiv.org/abs/1407.0733 (2014).
M. Favali, G. Citti, A. Sarti, Figure-ground segmentation with sub-Riemannian kernel PCA. Preprint (2014).
D. Barbieri, G. Citti, G. Cocci, A. Sarti, A cortical-inspired geometry for contour perception and motion integration. J. Math. Imag. Vis. 49 (2014).
G. Cocci, D. Barbieri, A. Sarti, Spatio-temporal receptive fields of cells in V1 are optimally shaped for stimulus velocity estimation. JOSA A 29 (2012).

Supported by the E.C. FP7 Marie Curie Program, PEOPLE-2013-IEF Project 626055 “HAViX” (www.uam.es/davide.barbieri) and PEOPLE-2013-ITN Project 607643 “MAnET” (manet.dm.unibo.it)