Lanczos Method Seminar for the Eigenvalue Reading Group
Andre Leger, University of Bath
1 Introduction and Notation
• Eigenvalue Problem: Ax = λx, A ∈ C^{N×N}, x ∈ C^N
• We assume A is real symmetric, A = A^T, so every eigenvalue λ ∈ R.
• The vectors qi are orthonormal if
  1. qi^T qi = 1, i.e. ||qi||2 = 1,
  2. qi^T qj = 0 for i ≠ j,
  3. hence Q^T = Q^{−1}, where Q = [q1, . . . , qN].
2 A Reminder of the Power Method
• We recall that the Power Method is used to find the eigenvector associated with the eigenvalue of largest magnitude.
• Simply x^{k+1} = cAx^k, where c is a normalisation constant that prevents x^{k+1} from growing too large.
• As k → ∞, x^{k+1} → v1, the eigenvector associated with the eigenvalue λ1, where |λ1| > |λ2| ≥ |λ3| ≥ . . . ≥ |λN|.
• We obtain the maximum eigenvalue by the Rayleigh Quotient

  R(A, x^k) = (x^k)^T A x^k / ||x^k||2^2
• Why don’t we just use the QR method? Well, if A is sparse, then applying an iteration of the QR approach does not maintain the sparsity of the new matrix. INEFFICIENT.
• Note: We only find ONE eigenvector and eigenvalue. What if we want more?
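The method above can be sketched in a few lines of NumPy; the test matrix, starting vector, and iteration count below are illustrative choices, not part of the original notes:

```python
import numpy as np

def power_method(A, x0, iters=200):
    """Power iteration x^{k+1} = c A x^k with c = 1/||A x^k||_2.

    Returns the Rayleigh-quotient estimate of the dominant eigenvalue
    and the corresponding normalised eigenvector.
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)  # normalisation keeps the iterates bounded
    # Rayleigh quotient R(A, x) = x^T A x / ||x||_2^2 (here ||x||_2 = 1)
    return x @ A @ x, x

# Example: a diagonal matrix whose dominant eigenvalue is 3
A = np.diag([3.0, 1.0, 0.5])
lam, v = power_method(A, np.ones(3))
```

Note that only the dominant eigenpair is recovered, which motivates the Krylov-space idea in the next section.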
3 The Idea Behind Lanczos Method
• Let’s follow the Power Method, but save each iterate, so that we obtain

  v, Av, A^2 v, . . . , A^{k−1} v
• These vectors form the Krylov Space

  K_k(A, v) = span {v, Av, A^2 v, . . . , A^{k−1} v}
• So after n iterations the vectors v, Av, . . . , A^{n−1} v are linearly independent, and any x ∈ C^N can be formed from the space.
• By the Power Method, the n-th iterate tends to an eigenvector, hence the sequence becomes (nearly) linearly dependent, but we want a sequence of linearly independent vectors.
• Hence we orthogonalise the vectors; this is the basis of the Lanczos Method.
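The loss of independence is easy to see numerically: the raw Krylov vectors align ever more closely with the dominant eigenvector, so the matrix holding them as columns becomes severely ill-conditioned. A small sketch, with an arbitrary diagonal test matrix:

```python
import numpy as np

# Raw Krylov vectors align with the dominant eigenvector as k grows,
# so the Krylov matrix [v, Av, ..., A^{k-1}v] becomes ill-conditioned.
rng = np.random.default_rng(1)
A = np.diag(np.arange(1.0, 11.0))  # eigenvalues 1..10, dominant = 10
v = rng.standard_normal(10)
K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(8)])
print(np.linalg.cond(K))           # enormous condition number
```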
4 Lanczos Method
• Assume we have orthonormal vectors
q1, q2, . . . , qN
• Simply let Q = [q1, q2, . . . , qk] hence
Q^T Q = I
• We want to reduce A to a tridiagonal matrix T by applying a similarity transformation:

  Q^T A Q = T or AQ = QT
• So we define T to be the tridiagonal matrix

  T_{k+1,k} =
  [ α1   β1                      ]
  [ β1   α2   β2                 ]
  [      β2   α3   β3            ]
  [           ⋱    ⋱     βk−1    ]
  [                βk−1  αk      ]
  [                      βk      ]

  ∈ C^{(k+1)×k}
• After k steps we have AQk = Qk+1 T_{k+1,k} for A ∈ C^{N×N}, Qk ∈ C^{N×k}, Qk+1 ∈ C^{N×(k+1)}, T_{k+1,k} ∈ C^{(k+1)×k}.
• We observe that AQk = Qk+1 T_{k+1,k} = Qk T_{k,k} + βk qk+1 ek^T
• Now AQ = QT hence
A [q1, q2, . . . , qk] = [q1, q2, . . . , qk]Tk
• The first column of the left hand side matrix is given by
Aq1 = α1q1 + β1q2
• The i-th column by

  Aqi = βi−1 qi−1 + αi qi + βi qi+1,   (†)   i = 2, . . .
• We wish to find the alphas and betas, so multiply † on the left by qi^T, giving

  qi^T A qi = βi−1 qi^T qi−1 + αi qi^T qi + βi qi^T qi+1
            = αi qi^T qi = αi,

  by orthonormality of the qi. Hence αi = qi^T A qi.
• We obtain βi by rearranging † into the recurrence formula

  ri ≡ βi qi+1 = Aqi − αi qi − βi−1 qi−1

• We assume βi ≠ 0 and so βi = ||ri||2.
• We may now determine the next orthonormal vector

  qi+1 = ri / βi.
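A quick numerical check of one step of this recurrence, as a sketch with an arbitrary random symmetric matrix (at the first step there is no qi−1 term, so it is dropped):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # symmetric test matrix
q1 = rng.standard_normal(n)
q1 /= np.linalg.norm(q1)           # first orthonormal vector

alpha1 = q1 @ A @ q1               # alpha_i = q_i^T A q_i
r1 = A @ q1 - alpha1 * q1          # r_i = A q_i - alpha_i q_i (no q_{i-1} term here)
beta1 = np.linalg.norm(r1)         # beta_i = ||r_i||_2
q2 = r1 / beta1                    # next orthonormal vector q_{i+1} = r_i / beta_i

print(abs(q2 @ q1))                # ~0: q2 is orthogonal to q1 by construction
```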
5 A Little Proof - Omit from Seminar
Lemma: All vectors qi+1 generated by the three-term recurrence † are orthogonal to all qk for k < i.
Proof
• We assume qi+1^T qi = 0 = qi+1^T qi−1, and by the induction step qi^T qk = 0 for k < i.
• We prove qi+1^T qk = 0 for k < i.
• Multiply † by qk for k ≤ i − 2; we first show that qk and qi are A-orthogonal. Since A = A^T,

  qk^T A qi = (qk^T A^T) qi = (Aqk)^T qi
            = (βk−1 qk−1 + αk qk + βk qk+1)^T qi
            = βk−1 qk−1^T qi + αk qk^T qi + βk qk+1^T qi
            = 0 + 0 + 0 = 0,

  where each term vanishes by the induction hypothesis (note k + 1 ≤ i − 1 < i).
• Now multiply † by qk^T so that

  qk^T A qi = βi−1 qk^T qi−1 + αi qk^T qi + βi qk^T qi+1

  Rearranging, we obtain

  βi qk^T qi+1 = qk^T A qi − βi−1 qk^T qi−1 − αi qk^T qi = 0 − 0 − 0 = 0,

  and since βi ≠ 0 it follows that qk^T qi+1 = 0.
6 The Lanczos Algorithm
Initialise: choose r = q0 and let β0 = ||q0||2
Begin Loop: for j = 1, 2, . . .
    qj = r / βj−1
    r = A qj
    r = r − qj−1 βj−1
    αj = qj^T r
    r = r − qj αj
    Orthogonalise if necessary
    βj = ||r||2
    Compute approximate eigenvalues of Tj
    Test convergence (see remarks)
End Loop
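The loop above can be sketched in Python/NumPy as follows; full re-orthogonalisation stands in for the "orthogonalise if necessary" step, and the breakdown tolerance is an illustrative choice:

```python
import numpy as np

def lanczos(A, q0, k, reorth=True):
    """k steps of the Lanczos iteration on a symmetric matrix A.

    Returns the orthonormal basis Q (columns q_1..q_k) and the
    diagonals alpha, beta of the tridiagonal matrix T_k.
    """
    n = A.shape[0]
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    r = q0.astype(float)
    b = np.linalg.norm(r)           # beta_0
    q_prev = np.zeros(n)
    for j in range(k):
        q = r / b                   # q_j = r / beta_{j-1}
        Q[:, j] = q
        r = A @ q - b * q_prev      # r = A q_j - beta_{j-1} q_{j-1}
        alpha[j] = q @ r            # alpha_j = q_j^T r
        r -= alpha[j] * q
        if reorth:                  # "orthogonalise if necessary"
            r -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ r)
        b = np.linalg.norm(r)       # beta_j
        beta[j] = b
        q_prev = q
        if b < 1e-12:               # breakdown: invariant subspace found
            return Q[:, :j + 1], alpha[:j + 1], beta[:j + 1]
    return Q, alpha, beta

# Example: n steps on a small symmetric matrix; the eigenvalues of T_n
# then reproduce those of A.
rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
Q, alpha, beta = lanczos(A, rng.standard_normal(n), n)
m = len(alpha)
T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(beta[:m - 1], -1)
```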
7 Remarks 1: Finding the Eigenvalues and Eigenvectors
• So how do we find the eigenvalues and eigenvectors?
• If βk = 0 then
1. We diagonalise the matrix Tk using the simple QR method to find the exact eigenvalues:

  Tk = Sk diag (λ1, . . . , λk) Sk^T

where the matrix Sk is orthogonal, Sk Sk^T = I.
2. The exact eigenvectors are given correspondingly in the columns of the matrix Y where
Sk^T Qk^T A Qk Sk = diag (λ1, . . . , λk)
so that Y = QkSk.
3. We converge to the k largest eigenvalues. The proof is very difficult and is omitted.
• In practice, however, βk is never exactly zero. Hence we only converge approximately to the eigenvalues.
– After k steps we have AQk = Qk T_{k,k} + βk qk+1 ek^T
– For βk small we obtain approximations to the eigenvalues, θi ≈ λi, by

  Tk = Sk diag (θ1, . . . , θk) Sk^T

– We multiply AQk from above by Sk so that

  A Qk Sk = Qk T_{k,k} Sk + βk qk+1 ek^T Sk
          = Qk Sk diag (θ1, . . . , θk) + βk qk+1 ek^T Sk

  With Yk = Qk Sk this reads

  A Yk = Yk diag (θ1, . . . , θk) + βk qk+1 ek^T Sk
  A Yk ej = Yk diag (θ1, . . . , θk) ej + βk qk+1 ek^T Sk ej
  A yj = θj yj + βk qk+1 Skj
  A yj − θj yj = βk qk+1 Skj

  ∴ ||A yj − θj yj||2 = |βk| |Skj|   (1)

  where Skj = ek^T Sk ej denotes the (k, j) entry of Sk.
– So if βk → 0 we see that θj → λj.
– Otherwise |βk| |Skj| needs to be small to have a good approximation, hence the convergence criterion

  |βk| |Skj| < ε
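The identity (1) can be checked numerically. The sketch below runs a few steps of the plain three-term recurrence (no re-orthogonalisation), forms the Ritz pairs (θj, yj) from Tk, and compares the true residual norm with |βk||Skj|; the matrix, sizes, and seed are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 10
M = rng.standard_normal((n, n))
A = (M + M.T) / 2

# k steps of the basic Lanczos recurrence, keeping q_{k+1}
Q = np.zeros((n, k + 1))
alpha, beta = np.zeros(k), np.zeros(k)
r = rng.standard_normal(n)
b = np.linalg.norm(r)
q_prev = np.zeros(n)
for j in range(k):
    q = r / b
    Q[:, j] = q
    r = A @ q - b * q_prev
    alpha[j] = q @ r
    r -= alpha[j] * q
    b = np.linalg.norm(r)
    beta[j] = b                    # beta_{j+1} in the notes' numbering
    q_prev = q
Q[:, k] = r / b                    # q_{k+1}

T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
theta, S = np.linalg.eigh(T)       # Ritz values theta_j, eigenvectors of T_k
Y = Q[:, :k] @ S                   # Ritz vectors y_j = Q_k S e_j
resid = np.linalg.norm(A @ Y[:, 0] - theta[0] * Y[:, 0])
print(resid, abs(beta[k - 1]) * abs(S[k - 1, 0]))  # the two sides of (1)
```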
8 Remarks 2: Difficulties with the Lanczos Method
• In practice, the problem is that the orthogonality is not preserved.
• As soon as one eigenvalue converges, all the basis vectors qi pick up perturbations biased toward the direction of the corresponding eigenvector, and orthogonality is lost.
• A “ghost” copy of the eigenvalue will appear again in the tridiagonal matrix T.
• To counter this we can fully re-orthonormalise the sequence using Gram-Schmidt or even QR.
• However, either approach is expensive if the dimension of the Krylov space is large.
• So instead a selective re-orthonormalisation is pursued. More specifically, the practical approach is to orthonormalise half-way, i.e. to within half machine precision, √εM.
• If the eigenvalues of A are not well separated, then we can use a shift σ and employ the matrix

  (A − σI)^{−1}

following the shifted inverse power method, to generate the appropriate Krylov subspaces.
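The effect of the shift can be illustrated with inverse iteration on (A − σI)^{−1}, a sketch with an arbitrary diagonal test matrix. In practice one would factorise A − σI once and reuse it, and feed these vectors to Lanczos rather than plain power iteration:

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.diag(np.arange(1.0, 11.0))  # eigenvalues 1..10
sigma = 4.2                        # shift: target the eigenvalue nearest 4.2

# Eigenvalues of (A - sigma I)^{-1} are 1/(lambda_i - sigma), so the
# eigenvalue of A closest to sigma becomes dominant and well separated.
B = A - sigma * np.eye(10)
x = rng.standard_normal(10)
for _ in range(100):
    x = np.linalg.solve(B, x)      # one shifted inverse power step
    x /= np.linalg.norm(x)
approx = x @ A @ x                 # Rayleigh quotient w.r.t. A
print(approx)                      # ~4.0, the eigenvalue of A nearest sigma
```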