
MATH 340: EIGENVECTORS, SYMMETRIC MATRICES, AND ORTHOGONALIZATION

Let A be an n × n real matrix. Recall some basic definitions.

• A is symmetric if Aᵗ = A;
• A vector x ∈ Rⁿ is an eigenvector for A if x ≠ 0, and if there exists a number λ such that Ax = λx. We call λ the eigenvalue corresponding to x;
• We say a set of vectors v₁, . . . , vₖ in Rⁿ is orthogonal if vᵢ·vⱼ = 0 whenever i ≠ j. We say the vectors are orthonormal if in addition each vᵢ is a unit vector;
• We say P is orthogonal if PᵗP = I (thus P is invertible, and P⁻¹ = Pᵗ). Equivalently, the columns (or rows) of P form an orthonormal set.

We want to prove (in a not-too-painful way!) the following very important theorem.

Theorem 1 (Principal axis theorem). The following statements are equivalent:

(i) A is symmetric;
(ii) There exists an orthonormal basis for Rⁿ consisting of eigenvectors of A;
(iii) There exists an orthogonal matrix P such that PᵗAP is diagonal.

We can prove some parts of the theorem right away without much work.

(iii) ⇒ (i): Assume P exists as in (iii), and write PᵗAP = D, where D is diagonal. Note that P⁻¹ = Pᵗ implies that A = PDPᵗ. That A is symmetric now follows from

Aᵗ = (PDPᵗ)ᵗ = (Pᵗ)ᵗDᵗPᵗ = PDPᵗ = A.

(We used D diagonal to justify Dᵗ = D here.)
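If you like checking such identities numerically, here is a minimal sketch in Python with NumPy; the particular orthogonal matrix and diagonal entries are arbitrary choices of mine, not part of the notes:

    import numpy as np

    # Any orthogonal P and diagonal D should give a symmetric A = P D P^t.
    theta = 0.7
    P = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])  # rotation matrices are orthogonal
    D = np.diag([2.0, -1.0])
    A = P @ D @ P.T
    print(np.allclose(A, A.T))  # True: A equals its transpose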

(ii) ⇒ (iii): Suppose v₁, . . . , vₙ form an orthonormal basis of eigenvectors for A. Let P be the matrix whose columns are v₁, . . . , vₙ; in other words, Peᵢ = vᵢ for each i.

Claim: P is orthogonal.

Pf: Let δᵢⱼ denote the Kronecker symbol: δᵢⱼ = 0 if i ≠ j and δᵢᵢ = 1. We have

the vᵢ's are orthonormal ⇔ vᵢ·vⱼ = δᵢⱼ (∀ i, j) ⇔ eᵢᵗPᵗPeⱼ = δᵢⱼ (∀ i, j) ⇔ PᵗP = I

(note that vᵢ·vⱼ = (Peᵢ)ᵗ(Peⱼ) = eᵢᵗPᵗPeⱼ, which is the (i, j) entry of PᵗP), which shows that P is orthogonal, proving the claim.

Next, we claim that PᵗAP is diagonal. For each i, we have Avᵢ = λᵢvᵢ for some scalar λᵢ. Using vᵢ = Peᵢ and Pᵗ = P⁻¹, this gives

APeᵢ = λᵢPeᵢ

and thus

PᵗAPeᵢ = λᵢeᵢ.

This is the same as saying that PᵗAP = diag(λ₁, λ₂, . . . , λₙ), a diagonal matrix with the λᵢ's down the diagonal. This proves the implication (ii) ⇒ (iii).
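Numerically, the same construction can be carried out in Python: np.linalg.eigh (a NumPy library routine, not something we develop in these notes) returns an orthonormal set of eigenvectors of a symmetric matrix, and packing them into the columns of P diagonalizes A exactly as in the argument above. A small sketch, with an arbitrary random test matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((3, 3))
    A = (M + M.T) / 2                       # a random symmetric test matrix
    lam, P = np.linalg.eigh(A)              # columns of P: orthonormal eigenvectors
    print(np.allclose(P.T @ P, np.eye(3)))  # P is orthogonal: P^t P = I
    print(np.round(P.T @ A @ P, 8))         # diag(λ_1, ..., λ_n)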


(iii) ⇒ (ii): This is similar to the above implication. Assume P exists as in (iii), and define vᵢ = Peᵢ. Then the assumption that P is orthogonal implies that the vᵢ's form an orthonormal set. Further, the identity

PᵗAPeᵢ = λᵢeᵢ,

which follows from PᵗAP = diag(λ₁, . . . , λₙ), together with P⁻¹ = Pᵗ, shows that Avᵢ = λᵢvᵢ for each i. Thus, we have found an orthonormal basis of eigenvectors for A.

It remains to prove (i) ⇒ (iii). This is the hardest and most interesting part. I will proceed here in a different manner from what I explained (only partially) in class.

The main ingredient is the following proposition.

Proposition 2. Any symmetric matrix A has an eigenvector.

Remark: In the end, we will see that in fact A will have a lot more than just one eigenvector, but since the proof of (i) ⇒ (iii) is ultimately done by a kind of induction, we need to produce a first eigenvector to "get started". It is not at all the case that an arbitrary matrix has an eigenvector. For example, suppose A is the matrix

Rθ = [ cos(θ)  −sin(θ) ]
     [ sin(θ)   cos(θ) ],

the matrix corresponding to "rotation by θ radians". If θ is not an integer multiple of π, it is geometrically clear that there is no non-zero vector x such that x and Rθ(x) are collinear; thus, Rθ has no (real) eigenvectors at all.
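One quick way to observe this numerically: the eigenvalues of Rθ are e^{±iθ}, a complex-conjugate pair, which is non-real unless θ is a multiple of π. A sketch (the value θ = 1.0 is an arbitrary choice of mine):

    import numpy as np

    theta = 1.0                    # not a multiple of pi
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(np.linalg.eigvals(R))    # e^{±iθ}: complex, so no real eigenvectors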

There are several approaches to this proposition, each important in its own right. We will settle for the one which is closest to the material we have already covered in this class.

Our Method: the Optimization method (cf. Hubbard/Hubbard, pages 372–373).

Let Q_A(x) = xᵗAx, the quadratic form corresponding to A. Explicitly, if A = (aᵢⱼ), then Q_A(x₁, . . . , xₙ) = ∑ aᵢⱼxᵢxⱼ, the sum running over 1 ≤ i, j ≤ n. It is obviously a real-valued differentiable function of n variables. One can show its derivative at a = (a₁, . . . , aₙ)ᵗ is the linear transformation h ↦ aᵗAh + hᵗAa = 2aᵗAh (the last equality holds because the scalar hᵗAa equals its own transpose aᵗAᵗh, and Aᵗ = A), so that

DQ_A(a)h = aᵗAh + hᵗAa = 2aᵗAh.

Indeed, we just need to note that the following holds:

lim_{h→0} [(a + h)ᵗA(a + h) − aᵗAa − (aᵗAh + hᵗAa)] / |h| = 0

(you should think this over carefully; the numerator simplifies to just hᵗAh). Another way to phrase this result is

∇Q_A(a) = 2aᵗA.
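You can test the formula DQ_A(a)h = 2aᵗAh by comparing it against the actual change in Q_A for a small increment h; the two agree to order |h|², since the error term is exactly hᵗAh. A minimal sketch (sizes and random seeds are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 4
    M = rng.standard_normal((n, n))
    A = (M + M.T) / 2                 # symmetric, as assumed in the text
    Q = lambda x: x @ A @ x           # the quadratic form Q_A(x) = x^t A x

    a = rng.standard_normal(n)
    h = 1e-6 * rng.standard_normal(n)
    print(Q(a + h) - Q(a))            # actual change in Q_A
    print(2 * a @ A @ h)              # predicted by the derivative formula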


Now we consider the unit sphere S in Rⁿ: the unit sphere consists of vectors of length 1, i.e.,

S = {x ∈ Rⁿ | |x| = 1}.

This set is closed and bounded. We are going to try to maximize Q_A(x) subject to the constraint that x ∈ S, i.e., subject to g(x) = 1, where

g(x) = |x|² = x₁² + · · · + xₙ².

We have (think of x as a column vector, so xᵗ is the row vector (x₁, . . . , xₙ))

∇g(x) = 2xᵗ.

By general theorems (we skipped these – see sec. 1.6 of your text), any continuous function such as Q_A attains an absolute maximum on a closed and bounded set such as S. Let v be a point of S where Q_A attains its maximum value on the sphere. By the theory of Lagrange multipliers, there is some scalar λ ∈ R such that

∇Q_A(v) = λ∇g(v),

i.e.,

2vᵗA = 2λvᵗ,

which, since Aᵗ = A, yields the following on applying transpose to both sides:

Av = λv.

This shows that v is indeed an eigenvector for A. This proves the proposition.
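The proof also suggests an algorithm: climb Q_A on the sphere until you can climb no more, and you are standing at an eigenvector. Here is a minimal sketch of that idea in Python; the shift by ‖A‖·I and the iteration count are implementation choices of mine, not part of the proof (the shift guarantees each step moves toward the maximizer of Q_A):

    import numpy as np

    rng = np.random.default_rng(3)
    M = rng.standard_normal((5, 5))
    A = (M + M.T) / 2

    # Repeatedly move v toward its image under A + ||A|| I and renormalize;
    # this pushes v toward the point of the sphere maximizing Q_A.
    c = np.linalg.norm(A, 2)          # spectral norm of A
    v = rng.standard_normal(5)
    v /= np.linalg.norm(v)
    for _ in range(5000):
        v = A @ v + c * v
        v /= np.linalg.norm(v)

    lam = v @ A @ v                          # value of Q_A at the maximizer
    print(np.linalg.norm(A @ v - lam * v))   # ≈ 0, so v is an eigenvector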

Here is a sketch of another important method for proving the proposition.

Another method: find a root of the characteristic polynomial for A.

Here, the characteristic polynomial for A is the polynomial det(A − tI), a real polynomial of degree n in the variable t. We can regard A as a matrix over the complex numbers C, and also we can regard, in the usual way, the matrix A as a linear transformation A : Cⁿ → Cⁿ. The definitions of eigenvalue and eigenvector make sense for complex matrices – the definitions are the same as before.

Lemma 0.1. Let A be an n × n matrix over C. Then:

(a) λ ∈ C is an eigenvalue corresponding to an eigenvector x ∈ Cⁿ if and only if λ is a root of the characteristic polynomial det(A − tI);
(b) Every complex matrix has at least one complex eigenvector;
(c) If A is a real symmetric matrix, then all of its eigenvalues are real, and it has a real eigenvector (i.e. one in the subset Rⁿ ⊂ Cⁿ).

The third part of this Lemma gives us another proof of the Proposition above.

Proof. (a) Fix a complex number λ. There is a non-zero complex vector x such that Ax = λx if and only if A − λI has non-zero null space, which holds if and only if the determinant det(A − λI) vanishes. This gives the desired equivalence.

(b) The fundamental theorem of algebra asserts that any complex polynomial has a root in the complex numbers. Applying this to the polynomial det(A − tI), we see it has a root in the complex numbers. By invoking part (a), we see A possesses an eigenvalue, that is, a number λ such that there is some non-zero complex vector x with Ax = λx.

(c) First of all, by part (b), we know A has at least one complex eigenvalue. Once we show this is necessarily real, then the same argument as in part (a) shows that A has a real eigenvector, and we'll have proved the proposition another way. (If λ is real, then A − λI is a real matrix with non-zero null space over C, and hence with non-zero null space over R as well.)

It remains to show that if a + ib is a complex eigenvalue for the real symmetric matrix A, then b = 0, so the eigenvalue is in fact a real number. Suppose v + iw ∈ Cⁿ is a complex eigenvector with eigenvalue a + ib (here v, w ∈ Rⁿ). Note that applying complex conjugation to the identity A(v + iw) = (a + ib)(v + iw) yields A(v − iw) = (a − ib)(v − iw). We will show b = 0 by considering

(v − iw)ᵗA(v + iw).

On the one hand, this is

(v − iw)ᵗ(A(v + iw)) = (a + ib)((v − iw)ᵗ(v + iw)) = (a + ib)(|v|² + |w|²).

On the other hand, a similar calculation shows it is

((v − iw)ᵗA)(v + iw) = (A(v − iw))ᵗ(v + iw) = (a − ib)(|v|² + |w|²).

Putting these together and noting that |v|² + |w|² ≠ 0, we get a + ib = a − ib, hence b = 0. This completes the proof of part (c), and thus the Lemma.
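Part (c) is easy to observe numerically: asking NumPy for the eigenvalues of a random symmetric matrix returns (numerically) real numbers, while a generic non-symmetric matrix typically has complex-conjugate pairs. A small sketch, with arbitrary random matrices:

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((4, 4))
    S = (M + M.T) / 2
    print(np.linalg.eigvals(S))   # all real (up to roundoff), as part (c) predicts
    print(np.linalg.eigvals(M))   # generic matrix: complex pairs may appear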

Finally, we return to the implication (i) ⇒ (iii), having proved the Proposition now in two different ways:

Proof. Suppose A is symmetric. We want to show there is an orthogonal matrix P such that PᵗAP is diagonal. According to the proposition, there is an eigenvector u₁ with eigenvalue λ₁. We may as well assume u₁ is a unit vector (if necessary, divide it by its length).

Let us proceed by induction on n, the size of A (the case n = 1 is trivial). We prove the "bootstrap step": we assume that every symmetric matrix of size n − 1 has an orthogonal Q which diagonalizes it, and we will deduce that A may also be diagonalized by an orthogonal matrix.

Let us consider u₁ as the first element in a basis u₁, u₂, . . . , uₙ of Rⁿ (it is always possible to find the other basis elements u₂, . . . , uₙ). Now apply the Gram-Schmidt orthogonalization process to this basis to produce an orthonormal basis v₁ = u₁, v₂, . . . , vₙ. Let P₁ be the orthogonal matrix whose columns are v₁, . . . , vₙ.

Note that the first column of P₁⁻¹AP₁ = P₁ᵗAP₁ is [λ₁, 0, . . . , 0]ᵗ. (Here's why: P₁⁻¹AP₁e₁ = P₁⁻¹Au₁ = P₁⁻¹λ₁u₁ = λ₁e₁.) Since P₁ᵗAP₁ is symmetric (why?), this means it can be written

[ λ₁  0 ]
[ 0   B ]

where B is an (n − 1) × (n − 1) symmetric matrix, and the 0 denotes a string of n − 1 zeros in the first row (resp. first column).


By our induction hypothesis, there exists an orthogonal matrix Q such that QᵗBQ is diagonal. Then we easily see that if we set


P = P₁ [ 1  0 ]
       [ 0  Q ],

then P is orthogonal and PᵗAP is diagonal. This completes the proof of (i) ⇒ (iii). □
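The induction above is entirely constructive, so it can be transcribed directly into code. The sketch below follows the proof step by step: find one unit eigenvector (here by the shifted iteration from the optimization method; any method producing one eigenvector would do), complete it to an orthonormal basis with a QR factorization (which performs the Gram-Schmidt step), split off the (n − 1) × (n − 1) block B, and recurse. The helper names and the use of np.linalg.qr are my choices, not the notes':

    import numpy as np

    def one_eigvec(A, iters=5000):
        # One unit eigenvector of the symmetric matrix A (Proposition 2),
        # via the "climb Q_A on the sphere" iteration with shift ||A|| I.
        n = A.shape[0]
        c = np.linalg.norm(A, 2)
        v = np.random.default_rng(0).standard_normal(n)
        v /= np.linalg.norm(v)
        for _ in range(iters):
            v = A @ v + c * v
            v /= np.linalg.norm(v)
        return v

    def orthodiag(A):
        # Orthogonal P with P^t A P diagonal, mirroring the induction on n.
        n = A.shape[0]
        if n == 1:
            return np.eye(1)                # base case: 1 x 1 is already diagonal
        u1 = one_eigvec(A)
        # Complete u1 to a basis and orthonormalize: QR performs Gram-Schmidt,
        # and its first column is u1 (up to sign, which is harmless).
        rest = np.random.default_rng(1).standard_normal((n, n - 1))
        P1, _ = np.linalg.qr(np.column_stack([u1, rest]))
        T = P1.T @ A @ P1                   # [[λ1, 0], [0, B]] up to roundoff
        blk = np.eye(n)
        blk[1:, 1:] = orthodiag(T[1:, 1:])  # induction hypothesis applied to B
        return P1 @ blk

    M = np.random.default_rng(4).standard_normal((4, 4))
    A = (M + M.T) / 2
    P = orthodiag(A)
    print(np.round(P.T @ A @ P, 6))         # diagonal, as the theorem promises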