Lecture 25: 7

37
Lecture 25: 7.2 Orthogonal Diagonalization Wei-Ta Chu 2011/12/21

Transcript of Lecture 25: 7

Page 1: Lecture 25: 7

Lecture 25: 7.2Orthogonal Diagonalization

Wei-Ta Chu

2011/12/21

Page 2: Lecture 25: 7

Spectral Decomposition

� If A is a symmetric matrix that is orthogonally diagonalized by P=[u1 u2 … un], and if λ1, λ2, …, λn are the eigenvalues of Acorresponding to the unit eigenvectors u1, u2, …, un, then we know that D=PTAP, where D is a diagonal matrix with the eigenvalues in the diagonal positions. eigenvalues in the diagonal positions.

2

Page 3: Lecture 25: 7

Spectral Decomposition

� Multiplying out, we obtain the formula

which is called a spectral decomposition of A.

� Each term of the spectral decomposition of A has the form where u is a unit eigenvector of A in column form, and λ is an where u is a unit eigenvector of A in column form, and λ is an eigenvalue of A corresponding to u.

� It can be proved that uuT is the standard matrix for the orthogonal projection of Rn on the subspace spanned by the vector u.

3

Page 4: Lecture 25: 7

Spectral Decomposition

� The spectral decomposition of A tells that the image of a vector x under multiplication by a symmetric matrix A can be obtained by projecting x orthogonally on the lines determined by the eigenvectors of A, then scaling those projections by the eigenvalues, and then adding the scaled projections. eigenvalues, and then adding the scaled projections.

4

Page 5: Lecture 25: 7

Example: Eigenface

Face database

5

Mean face Eigenvectors

u1 u2 u3 …

Page 6: Lecture 25: 7

Example of Face Reconstruction

= -2181 +627 +389 + …

x Ax

6

Reconstruction procedure

Page 7: Lecture 25: 7

Example

� The matrix has eigenvalues λ1=-3 and λ2=2 with corresponding eigenvectors x1=(1,-2) and x2=(2,1)

� Normalizing these basis vectors yields

� A spectral docomposition of A is

7

The standard matrices for the orthogonal projections onto the eigenspacescorresponding to λ1=-3 and λ2=2

Page 8: Lecture 25: 7

Example

� The image of the vector x=(1,1)

These provide two different ways of viewing the image of the � These provide two different ways of viewing the image of the vector (1,1) under multiplication by A

8

Page 9: Lecture 25: 7

Lecture 25: 7.3Quadratic Forms

Wei-Ta Chu

2011/12/21

Page 10: Lecture 25: 7

Quadratic Form (二次式)

� Up to now, we have been interested in linear equations

� It’s a function of n variables, called a linear form.

� We will be concerned with quadratic forms, which are � We will be concerned with quadratic forms, which are functions of the form

� For example:

10

cross-product terms

Page 11: Lecture 25: 7

Quadratic Form

� Written in matrix form

� They are both of the form xTAx, where x is the column vector of variables, and A is symmetric matrix whose diagonal entries are the coefficients of the squared terms and whose entries off the main diagonal are half the coefficients of the cross-product terms.

11

Page 12: Lecture 25: 7

Example

12

Page 13: Lecture 25: 7

Symmetric Matrix

� Symmetric matrices are useful, but not essential, for representing quadratic forms.

� For example, the quadratic form 2x2+6xy-7y2 can be written as

where the coefficient 6 of the cross-product term has been split as 5+1 rather than 3+3, as in the symmetric representation.

13

Page 14: Lecture 25: 7

Symmetric Matrix

� However, symmetric matrices are usually more convenient to work with, so it will always be understood that A is symmetric when we write a quadratic form as xTAx, even if not stated explicitly.

When convenient, we can use Formula (7) of Section 4.1 � When convenient, we can use Formula (7) of Section 4.1 to express a quadratic form xTAx in terms of the Euclidean inner product as

14

Page 15: Lecture 25: 7

Problems

� Problem 1: if xTAx is a quadratic form on R2 or R3, what kind of curve or surface is represented by the equation xTAx=k?

� Problem 2: if xTAx is a quadratic form on Rn, what conditions must A satisfy for xTAx to have positive values for x ≠ 0?

� Problem 3: if xTAx is a quadratic form on Rn, what are its � Problem 3: if xTAx is a quadratic form on Rn, what are its maximum and minimum values if x is constrained to satisfy ||x|| = 1?

15

Page 16: Lecture 25: 7

Change of Variable

� Simplify the quadratic form xTAx by making a substitution x=Py. That expresses the variable x1, x2, …, xn in terms of new variables y1, y2, …, yn.

� If P is invertible, we call this change of variable. If P is orthogonal, then we call this orthogonal change of variable. orthogonal, then we call this orthogonal change of variable.

� We obtain: xTAx=(Py)TA(Py)=yTPTAPy=yT(PTAP)y� Since the matrix B=PTAP is symmetric, the effect of the

change of variable is to produce a new quadratic form yTBy.

16

Page 17: Lecture 25: 7

Change of Variable

� If we choose P to orthogonally diagonalize A, then the new quadratic form will be yTDy, where D is a diagonal matrix with the eigenvalues of A on the main diagonal.

17

Page 18: Lecture 25: 7

Theorem 7.3.1 The Principal Axes

Theorem� If A is a symmetric n by n matrix, then there is an orthogonal

change of variable that transforms the quadratic form xTAxinto a quadratic form yTDy with no cross product terms. Specifically, if P orthogonally diagonalize A, then making the change of variable x=Py in the quadratic form xTAx yields the change of variable x=Py in the quadratic form x Ax yields the quadratic form xTAx=yTDy= λ1y1

2+λ2y22+…+λnyn

2

in which λ1, λ2,…, λn are the eigenvalues of A corresponding to the eigenvectors that form the successive columns of P.

18

Page 19: Lecture 25: 7

Example

� Find an orthogonal change of variable of the quadratic form Q=x1

2-x32-4x1x2+4x2x3.

The characteristic equation of the matrix A is� The characteristic equation of the matrix A is

� The eigenvalues are 0, -3, 3. The orthonormal bases for the three eigenspace are

19

Page 20: Lecture 25: 7

Example

� A substitution x=Py that eliminates the cross product terms is

This produces the new quadratic form� This produces the new quadratic form

20

Page 21: Lecture 25: 7

Positive Definite (正定)

� Definition: A quadratic form xTAx is called positive definite if xTAx > 0 for all x ≠ 0, negative definite if xTAx < 0 for x ≠ 0 , indefinite if xTAx has both positive and negative values

� It’s called positive semidefinite if xTAx ≧ 0 for all x ≠ 0, and negative semidefinite if xTAx ≦ 0 for all x ≠ 0

21

Page 22: Lecture 25: 7

Theorem 7.3.2

� If A is a symmetric matrix A, then� xTAx is positive definite if and only if all eigenvalues of A are

positive.

� xTAx is negative definite if and only if all eigenvalues of A are negative.negative.

� xTAx is indefinite if and only if A has at least one positive eigenvalue and at least one negative eigenvalue.

22

Page 23: Lecture 25: 7

Example

� We showed that the symmetric matrix has eigenvalues and . Since these are positive, the matrix A is

positive definite, and for all x ≠ 0,

23

Page 24: Lecture 25: 7

Identifying Positive Definite Matrices

� Positive definite matrices are the most important symmetric matrices in applications.

� A method to determine whether a symmetric matrix is positive definite without finding the eigenvalues.

� The kth principal submatrix: � The kth principal submatrix:

24

First principal submatrixSecond principal submatrix

Third principal submatrixFourth principal submatrix

Page 25: Lecture 25: 7

Theorem 7.3.4

� A symmetric matrix A is positive definite if and only if the determinant of every principal submatrix is positive.

� Example:

� We are guaranteed that all eigenvalues of A are positive and xTAx > 0 for x ≠ 0.

25

Page 26: Lecture 25: 7

Exercises

� Sec. 7.1: 2, 6, 17, 22(True-False)

� Sec. 7.2: 5, 16(c), 18(a), 22(True-False)

� Sec. 7.3: 6, 12, 21, 25(b), 36(True-False)

26

Page 27: Lecture 25: 7

Lecture 25: 8.1General Linear Transformations

Wei-Ta Chu

2011/12/21

Page 28: Lecture 25: 7

Definitions and Terminology

� A matrix transformation TA: Rn →Rm is a mapping of the form TA(x) = Ax, in which A is an m by n matrix.

� Matrix transformations are precisely the linear transformations from Rn to Rm. The transformations with linearity properties linearity properties

T(u+v) = T(u) + T(v) and T(ku) = kT(u)

28

Page 29: Lecture 25: 7

Definitions and Terminology

� If T: V→ W is a function from a vector space V to a vector space W, then T is called a linear transformationfrom V to W if the following two properties hold for all vectors u and v in V and all scalars k: � (1) T(ku) = kT(u) [Homogeneity property]� (1) T(ku) = kT(u) [Homogeneity property]

� (2) T(u+v) = T(u) + T(v) [Additivity property]

� In the special case where V = W, the linear transformation is called a linear operator on the vector space V.

29

Page 30: Lecture 25: 7

Definitions and Terminology

� These properties can be used in combination

T(k1v1 + k2v2) = k1T(v1) + k2T(v2)

� More generally,

T(k1v1 + k2v2+ …+krvr) = k1T(v1) + k2T(v2) + … + krT(vr)T(k1v1 + k2v2+ …+krvr) = k1T(v1) + k2T(v2) + … + krT(vr)

� Theorem 8.1.1: If T: V→ W is a linear transformation, then � (a) T(0) = 0

� (b) T(u-v) = T(u) – T(v) for all u and v in V

30

Page 31: Lecture 25: 7

Example

� A matrix transformation TA: Rn → Rm is also a liner transformation in this more general sense with V = Rn and W = Rm

� The zero transformation: The mapping T: V→ W such that T(v) = 0 for every v in V is a linear transformation that T(v) = 0 for every v in V is a linear transformation called the zero transformation. T is linear:

T(u+v) = 0, T(u) = 0, T(v) = 0, and T(ku) = 0Therefore, T(u+v) = T(u) + T(v), and T(ku) = kT(u)

31

Page 32: Lecture 25: 7

Example

� The identity operator: The mapping I: V→ V defined by I(v) = v is called the identity operator on V.

� The mapping T: V→ V given by T(x) = kx is a linear operator on V, for if c is any scalar and if u and v are any vectors in V, then vectors in V, then

T(cu) = k(cu) = c(ku) = cT(u)

T(u+v) = k(u+v) = ku + kv = T(u) + T(v)

� If 0 < k < 1, then T is called the contraction of V with factor k, and if k > 1, it is called the dilation of V with factor k.

32

Page 33: Lecture 25: 7

Example

� Let p = p(x) = c0 + c1x + … + cnxn be a polynomial in Pn, and define the transformation T: Pn → Pn+1 by

T(p) = T(p(x)) = xp(x) = c0x + c1x2 + … + cnxn+1

� This transformation is linear because for any scalar k and any polynomial p and p in P we have any polynomial p1 and p2 in Pn we have

T(kp) = T(kp(x)) = x(kp(x)) = k(xp(x)) = kT(p)

T(p1 + p2) = T(p1(x) + p2(x)) = x(p1(x) + p2(x)) = xp1(x) + xp2(x) = T(p1) + T(p2)

33

Page 34: Lecture 25: 7

Example

� Let V be an inner product space, let v0 be any fixed vector in V, and let T: V→ R be the transformation T(x) = 〈x,v0〉 that maps a vector x into its inner product with v0.

This transformation is linear, for if k is any scalar, and if � This transformation is linear, for if k is any scalar, and if u and v are any vectors in V, then

T(ku) =〈ku,v0〉=k〈u,v0〉=kT(u)

T(u+v) = 〈u+v,v0〉=〈u,v0〉+〈v,v0〉=T(u) + T(v)

34

Page 35: Lecture 25: 7

Example

� Let Mnn be the vector space of n by n matrices. The mapping T1(A) = AT is linear.

T1(kA) = (kA)T = kAT = kT1(A)

T1(A+B) = (A+B)T = AT + BT = T1(A) + T1(B)1 1 1

� The mapping T2(A) = det(A) is not linear.

T2(kA) = det(kA) = kn det(A) = kn T2(A)

35

Page 36: Lecture 25: 7

Example

� A linear transformation maps 0 to 0. This property is useful for identifying transformations that are not linear.

� If x0 is a fixed nonzero vector in R2, then the transformation T(x) = x + x0 has the geometric effect of translating each point x in a direction parallel to xtranslating each point x in a direction parallel to x0

through a distance of ||x0||.

� This cannot be a linear transformation since T(0) = x0 so T does not map 0 to 0.

36

Page 37: Lecture 25: 7

Example

� Let V be a subspace of , let x1, x2, …, xn be distinct real numbers, and let T: V→ Rn be the transformation T(f) = (f(x1), f(x2), …, f(xn)) that associates with f the n-tuple of function values at x1, x2, …, xn.

We call this the evaluation transformation on V at x , x , � We call this the evaluation transformation on V at x1, x2, …, xn.

T(kf) = (kf(x1), kf(x2), …, kf(xn)) = k(f(x1), f(x2), …, f(xn)) = kT(f)

T(f+g) = ((f+g)(x1), (f+g)(x2), …, (f+g)(xn)) = (f(x1) + g(x1), f(x2) + g(x2), …, f(xn) + g(xn)) = T(f) + T(g)

37