Eigenvectors and SVD

1

Eigenvectors and SVD

2

Eigenvectors of a square matrix

• Definition

• Intuition: x is unchanged by A (except for scaling)

• Examples: axis of rotation, stationary distribution of a Markov chain

Ax = λx, x �= 0 .

3

Diagonalization

• Stack up evec equation to get

• Where

• If evecs are linearly indep, X is invertible, so

AX = XΛ

X ∈ Rn×n =

| | |x1 x2 · · · xn| | |

, Λ = diag(λ1, . . . , λn) .

A = XΛX−1.

4

Evecs of symmetric matrix

• All evals are real (not complex)

• Evecs are orthonormal

• So U is orthogonal matrix

uTi uj = 0 if i �= j, uTi ui = 1

UTU = UUT = I

5

Diagonalizing a symmetric matrix

• We haveA = UΛUT =

n∑

i=1

λiuiuTi

A = =

| | |u1 u2 · · · un| | |

λ1λ2

. . .

λn

− uT1

−− uT2 −

...− uTn −

= λ1

|u1|

(− uT1

−)+ · · ·+ λn

|un|

(− uTn −)

6

Transformation by an orthogonal matrix

• Consider a vector x transformed by the orthogonal matrix U to give

• The length of the vector is preserved since

• The angle between vectors is preserved

• Thus multiplication by U can be interpreted as a rigid rotation of the coordinate system.

x = Ux

||x||2 = xT x = xTUTUTx = xTx = ||x||2

xT y = xTUUy = xTy

7

Geometry of diagonalization

• Let A be a linear transformation. We can always decompose this into a rotation U, a scaling Λ, and a reverse rotation UT=U-1.

• Hence A = U Λ UT.

• The inverse mapping is given by A-1 = U Λ-1 UT

A =

m∑

i=1

λiuiuTi

A−1 =m∑

i=1

1

λiuiu

Ti

8

Matlab example

• Given

• Diagonalize

• check

A =

1.5 −0.5 0−0.5 1.5 00 0 3

[U,D]=eig(A)

U =

-0.7071 -0.7071 0

-0.7071 0.7071 0

0 0 1.0000

D =

1 0 0

0 2 0

0 0 3

Rot(45)

Scale(1,2,3)

>> U*D*U’

ans =

1.5000 -0.5000 0

-0.5000 1.5000 0

0 0 3.0000

9

Positive definite matrices

• A matrix A is pd if xT A x > 0 for any non-zero vector x.

• Hence all the evecs of a pd matrix are positive

• A matrix is positive semi definite (psd) if λi >= 0.• A matrix of all positive entries is not necessarily pd;

conversely, a pd matrix can have negative entries

Aui = λiui

uTi Aui = λiuTi ui = λi > 0

> [u,v] = eig([1 2; 3 4])

u =

-0.8246 -0.4160

0.5658 -0.9094

v =

-0.3723 0

0 5.3723

[u,v]=eig([2 -1; -1 2])

u =

-0.7071 -0.7071

-0.7071 0.7071

v =

1 0

0 3

10

Multivariate Gaussian

• Multivariate Normal (MVN)

• Exponent is the Mahalanobis distance between x and µ

Σ is the covariance matrix (symmetric positive definite)

N (x|µ,Σ)def=

1

(2π)p/2|Σ|1/2exp[− 1

2(x− µ)TΣ−1(x− µ)]

∆ = (x− µ)TΣ−1(x− µ)

xTΣx > 0 ∀x

11

Bivariate Gaussian

• Covariance matrix is

where the correlation coefficient is

and satisfies -1 ≤ ρ ≤ 1

• Density is

Σ =

(σ2x ρσxσy

ρσxσy σ2y

)

ρ =Cov(X,Y )

√V ar(X)V ar(Y )

p(x, y) =1

2πσxσy√1− ρ2

exp

(−

1

2(1− ρ2)

(x2

σ2x+y2

σ2y−

2ρxy

(σxσy)

))

12

Spherical, diagonal, full covariance

Σ =

(σ2 00 σ2

)

Σ =

(σ2x 00 σ2y

)Σ =

(σ2x ρσxσy

ρσxσy σ2y

)

13

Surface plots

14

Matlab plotting code

stepSize = 0.5;

[x,y] = meshgrid(-5:stepSize:5,-5:stepSize:5);

[r,c]=size(x);

data = [x(:) y(:)];

p = mvnpdf(data, mu’, S);

p = reshape(p, r, c);

surfc(x,y,p); % 3D plot

contour(x,y,p); % Plot contours

15

Visualizing a covariance matrix

• Let Σ = U Λ UT. Hence

• Let y = U(x-µ) be a transformed coordinate system, translated by µ and rotated by U. Then

Σ−1 = U−TΛ−1U−1 = UΛ−1U =

p∑

i=1

1

λiuiu

Ti

(x− µ)TΣ−1(x− µ) = (x− µ)T

(p∑

i=1

1

λiuiu

Ti

)

(x− µ)

=

p∑

i=1

1

λi(x− µ)Tuiu

Ti (x− µ) =

p∑

i=1

y2iλi

16


• From the previous slide

• Recall that the equation for an ellipse in 2D is

• Hence the contours of equiprobability are elliptical, with axes given by the evecs and scales given by the evals of Σ

(x− µ)TΣ−1(x− µ) =

p∑

i=1

y2iλi

y21

λ1+y22

λ2= 1

17


• Let X ~ N(0,I) be points on a 2d circle.

• If

• Then

• So we can map a set of points on a circle to points on an ellipse

Y = UΛ1

2X

Λ1

2 = diag(√Λii)

Cov[y] = UΛ1

2Cov[x]Λ1

2UT = UΛ1

2Λ1

2UT = Σ

18

Implementation in Matlab

Y = UΛ1

2X

Λ1

2 = diag(√Λii)

function h=gaussPlot2d(mu, Sigma, color)

[U, D] = eig(Sigma);

n = 100;

t = linspace(0, 2*pi, n);

xy = [cos(t); sin(t)];

k = 6; %k = sqrt(chi2inv(0.95, 2));

w = (k * U * sqrt(D)) * xy;

z = repmat(mu, [1 n]) + w;

h = plot(z(1, :), z(2, :), color); axis(’equal

19

Standardizing the data

• We can subtract off the mean and divide by the standard deviation of each dimension to get the following (for case i=1:n and dimension j=1:d)

• Then E[Y]=0 and Var[Yj]=1.

• However, Cov[Y] might still be elliptical due to correlation amongst the dimensions.

yij =xij − xjσj

20

Whitening the data

• Let X ~ N(µ,Σ) and Σ = U Λ UT.• To remove any correlation, we can apply the

following linear transformation

• In Matlab

Y = Λ−1

2UTX

Λ−1

2 = diag(1/√Λii)

[U,D] = eig(cov(X));

Y = sqrt(inv(D)) * U’ * X;

21

Whitening: example

22

Whitening: proof

• Let

• Using

we have

and

Y = Λ−1

2UTX

Λ−1

2 = diag(1/√Λii)

Cov[AX] = ACov[X ]AT

Cov[Y ] = Λ−1

2UTΣUΛ−1

2

= Λ−1

2UT (UΛUT )UΛ−1

2

= Λ−1

2ΛΛ−1

2 = I

EY = Λ−1

2UTE[X ]

23

Singular Value Decomposition

A = UΣVT = λ1

|u1|

(− vT1

−)+

· · ·+ λr

|ur|

(− vTr −)

24

Right svectors are evecs of A^T A

• For any matrix A

ATA = VΣTUT UΣVT

= V(ΣTΣ)VT

(ATA)V = V(ΣTΣ) = VD

25

Left svectors are evecs of A A^T

AAT = UΣVT VΣTUT

= U(ΣΣT )UT

(AAT )U = U(ΣΣT ) = UD

26

Truncated SVD

A = U:,1:kΣ1:k,1:kVT1:,1:k = λ1

|u1|

(− vT1

−)+

· · ·+ λk

|uk|

(− vTk −)

Spectrum of singular values

Rank k approximation to matrix

27

SVD on images

• Run demo

load clown

[U,S,V] = svd(X,0);

ranks = [1 2 5 10 20 rank(X)];

for k=ranks(:)’

Xhat = (U(:,1:k)*S(1:k,1:k)*V(:,1:k)’);

image(Xhat);

end

28

Clown example

1 2 5

10 20 200

29

Space savings

A ≈ U:,1:kΣ1:k,1:kVT1:,1:k

m× n ≈ (m× k) (k) (n× k) = (m+ n+ 1)k

200× 320 = 64, 000 → (200 + 320 + 1)20 = 10, 420

Eigenvectors and SVD

Documents

Transcript of Eigenvectors and SVD