Lanczos Method Seminar for the Eigenvalue Reading Group
Andre Leger, University of Bath
1 Introduction and Notation
• Eigenvalue Problem: Ax = λx, A ∈ C^{N×N}, x ∈ C^N
• We assume A is real symmetric, A = A^T, so every eigenvalue λ ∈ R.
• The vectors qi are orthonormal if
  1. qi^T qi = 1, i.e. ||qi||2 = 1,
  2. qi^T qj = 0 for i ≠ j,
  3. hence Q^T = Q^{−1}, where Q = [q1, . . . , qN].
2 A Reminder of the Power Method
• We recall that the Power Method is used to find the eigenvector associated with the eigenvalue of largest magnitude.
• Simply x^{k+1} = cAx^k, where c is a normalisation constant that prevents x^{k+1} from growing too large.
• As k → ∞, x^{k+1} → v1, the eigenvector associated with the eigenvalue λ1, where |λ1| > |λ2| ≥ |λ3| ≥ . . . ≥ |λN|.
• We obtain the maximum eigenvalue by the Rayleigh Quotient

  R(A, x^k) = (x^k)^T A x^k / ||x^k||2^2
• Why don’t we just use the QR method? Well, if A is sparse, then applying an iteration of the QR approach does not maintain the sparsity of the new matrix. INEFFICIENT.
• Note: We only find ONE eigenvector and eigenvalue. What if we want more?
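The method above can be sketched in a few lines of NumPy; the test matrix, starting vector, and iteration count below are illustrative choices, not part of the original notes:

```python
import numpy as np

def power_method(A, x0, iters=200):
    """Power iteration x^{k+1} = c A x^k with c = 1/||A x^k||_2.

    Returns the Rayleigh-quotient estimate of the dominant eigenvalue
    and the corresponding normalised eigenvector.
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)  # normalisation keeps the iterates bounded
    # Rayleigh quotient R(A, x) = x^T A x / ||x||_2^2 (here ||x||_2 = 1)
    return x @ A @ x, x

# Example: a diagonal matrix whose dominant eigenvalue is 3
A = np.diag([3.0, 1.0, 0.5])
lam, v = power_method(A, np.ones(3))
```

Note that only the dominant eigenpair is recovered, which motivates the Krylov-space idea in the next section.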
3 The Idea Behind Lanczos Method
• Let’s follow the Power Method, but save each iterate, so that we obtain

  v, Av, A^2 v, . . . , A^{k−1} v
• These vectors form the Krylov Space

  K_k(A, v) = span {v, Av, A^2 v, . . . , A^{k−1} v}
• So after n iterations the vectors v, Av, . . . , A^{n−1} v are linearly independent, and any x ∈ C^N can be formed from the space.
• By the Power Method, the n-th iterate tends to an eigenvector, hence the sequence becomes (nearly) linearly dependent, but we want a sequence of linearly independent vectors.
• Hence we orthogonalise the vectors; this is the basis of the Lanczos Method.
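The loss of independence is easy to see numerically: the raw Krylov vectors align ever more closely with the dominant eigenvector, so the matrix holding them as columns becomes severely ill-conditioned. A small sketch, with an arbitrary diagonal test matrix:

```python
import numpy as np

# Raw Krylov vectors align with the dominant eigenvector as k grows,
# so the Krylov matrix [v, Av, ..., A^{k-1}v] becomes ill-conditioned.
rng = np.random.default_rng(1)
A = np.diag(np.arange(1.0, 11.0))  # eigenvalues 1..10, dominant = 10
v = rng.standard_normal(10)
K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(8)])
print(np.linalg.cond(K))           # enormous condition number
```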
4 Lanczos Method
• Assume we have orthonormal vectors
q1, q2, . . . , qN
• Simply let Q = [q1, q2, . . . , qk] hence
Q^T Q = I
• We want to reduce A to a tridiagonal matrix T by applying a similarity transformation:

  Q^T A Q = T or AQ = QT
• So we define T to be the tridiagonal matrix

  T_{k+1,k} =
  [ α1   β1                      ]
  [ β1   α2   β2                 ]
  [      β2   α3   β3            ]
  [           ⋱    ⋱     βk−1    ]
  [                βk−1  αk      ]
  [                      βk      ]

  ∈ C^{(k+1)×k}
• After k steps we have AQk = Qk+1 T_{k+1,k} for A ∈ C^{N×N}, Qk ∈ C^{N×k}, Qk+1 ∈ C^{N×(k+1)}, T_{k+1,k} ∈ C^{(k+1)×k}.
• We observe that AQk = Qk+1 T_{k+1,k} = Qk T_{k,k} + βk qk+1 ek^T
• Now AQ = QT hence
A [q1, q2, . . . , qk] = [q1, q2, . . . , qk]Tk
• The first column of the left hand side matrix is given by
Aq1 = α1q1 + β1q2
• The i-th column by

  Aqi = βi−1 qi−1 + αi qi + βi qi+1,   (†)   i = 2, . . .
• We wish to find the alphas and betas, so multiply † on the left by qi^T, giving

  qi^T A qi = βi−1 qi^T qi−1 + αi qi^T qi + βi qi^T qi+1
            = αi qi^T qi = αi,

  by orthonormality of the qi. Hence αi = qi^T A qi.
• We obtain βi by rearranging † into the recurrence formula

  ri ≡ βi qi+1 = Aqi − αi qi − βi−1 qi−1

• We assume βi ≠ 0 and so βi = ||ri||2.
• We may now determine the next orthonormal vector

  qi+1 = ri / βi.
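A quick numerical check of one step of this recurrence, as a sketch with an arbitrary random symmetric matrix (at the first step there is no qi−1 term, so it is dropped):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # symmetric test matrix
q1 = rng.standard_normal(n)
q1 /= np.linalg.norm(q1)           # first orthonormal vector

alpha1 = q1 @ A @ q1               # alpha_i = q_i^T A q_i
r1 = A @ q1 - alpha1 * q1          # r_i = A q_i - alpha_i q_i (no q_{i-1} term here)
beta1 = np.linalg.norm(r1)         # beta_i = ||r_i||_2
q2 = r1 / beta1                    # next orthonormal vector q_{i+1} = r_i / beta_i

print(abs(q2 @ q1))                # ~0: q2 is orthogonal to q1 by construction
```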
5 A Little Proof - Omit from Seminar
Lemma: All vectors qi+1 generated by the three-term recurrence † are orthogonal to all qk for k < i.
Proof
• We assume qi+1^T qi = 0 = qi+1^T qi−1, and by the induction step qi^T qk = 0 for k < i.
• We prove qi+1^T qk = 0 for k < i.
• Multiply † by qk for k ≤ i − 2; we first show that qk and qi are A-orthogonal. Since A = A^T,

  qk^T A qi = (qk^T A^T) qi = (Aqk)^T qi
            = (βk−1 qk−1 + αk qk + βk qk+1)^T qi
            = βk−1 qk−1^T qi + αk qk^T qi + βk qk+1^T qi
            = 0 + 0 + 0 = 0,

  where each term vanishes by the induction hypothesis (note k + 1 ≤ i − 1 < i).
• Now multiply † by qk^T so that

  qk^T A qi = βi−1 qk^T qi−1 + αi qk^T qi + βi qk^T qi+1

  Rearranging, we obtain

  βi qk^T qi+1 = qk^T A qi − βi−1 qk^T qi−1 − αi qk^T qi = 0 − 0 − 0 = 0,

  and since βi ≠ 0 it follows that qk^T qi+1 = 0.
6 The Lanczos Algorithm
Initialise: choose r = q0 and let β0 = ||q0||2
Begin Loop: for j = 1, 2, . . .
    qj = r / βj−1
    r = A qj
    r = r − qj−1 βj−1
    αj = qj^T r
    r = r − qj αj
    Orthogonalise if necessary
    βj = ||r||2
    Compute approximate eigenvalues of Tj
    Test convergence (see remarks)
End Loop
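The loop above can be sketched in Python/NumPy as follows; full re-orthogonalisation stands in for the "orthogonalise if necessary" step, and the breakdown tolerance is an illustrative choice:

```python
import numpy as np

def lanczos(A, q0, k, reorth=True):
    """k steps of the Lanczos iteration on a symmetric matrix A.

    Returns the orthonormal basis Q (columns q_1..q_k) and the
    diagonals alpha, beta of the tridiagonal matrix T_k.
    """
    n = A.shape[0]
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    r = q0.astype(float)
    b = np.linalg.norm(r)           # beta_0
    q_prev = np.zeros(n)
    for j in range(k):
        q = r / b                   # q_j = r / beta_{j-1}
        Q[:, j] = q
        r = A @ q - b * q_prev      # r = A q_j - beta_{j-1} q_{j-1}
        alpha[j] = q @ r            # alpha_j = q_j^T r
        r -= alpha[j] * q
        if reorth:                  # "orthogonalise if necessary"
            r -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ r)
        b = np.linalg.norm(r)       # beta_j
        beta[j] = b
        q_prev = q
        if b < 1e-12:               # breakdown: invariant subspace found
            return Q[:, :j + 1], alpha[:j + 1], beta[:j + 1]
    return Q, alpha, beta

# Example: n steps on a small symmetric matrix; the eigenvalues of T_n
# then reproduce those of A.
rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
Q, alpha, beta = lanczos(A, rng.standard_normal(n), n)
m = len(alpha)
T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(beta[:m - 1], -1)
```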
7 Remarks 1: Finding the Eigenvalues and Eigenvectors
• So how do we find the eigenvalues and eigenvectors?
• If βk = 0 then
1. We diagonalise the matrix Tk using the simple QR method to find the exact eigenvalues:

  Tk = Sk diag (λ1, . . . , λk) Sk^T

where the matrix Sk is orthogonal, Sk Sk^T = I.
2. The exact eigenvectors are given correspondingly in the columns of the matrix Y where
Sk^T Qk^T A Qk Sk = diag (λ1, . . . , λk)
so that Y = QkSk.
3. We converge to the k largest eigenvalues. The proof is very difficult and is omitted.
• In practice, however, βk is never exactly zero. Hence we only converge approximately to the eigenvalues.
– After k steps we have AQk = Qk T_{k,k} + βk qk+1 ek^T
– For βk small we obtain approximations to the eigenvalues, θi ≈ λi, by

  Tk = Sk diag (θ1, . . . , θk) Sk^T

– We multiply AQk from above by Sk so that

  A Qk Sk = Qk T_{k,k} Sk + βk qk+1 ek^T Sk
          = Qk Sk diag (θ1, . . . , θk) + βk qk+1 ek^T Sk

  With Yk = Qk Sk this reads

  A Yk = Yk diag (θ1, . . . , θk) + βk qk+1 ek^T Sk
  A Yk ej = Yk diag (θ1, . . . , θk) ej + βk qk+1 ek^T Sk ej
  A yj = θj yj + βk qk+1 Skj
  A yj − θj yj = βk qk+1 Skj

  ∴ ||A yj − θj yj||2 = |βk| |Skj|   (1)

  where Skj = ek^T Sk ej denotes the (k, j) entry of Sk.
– So if βk → 0 we see that θj → λj.
– Otherwise |βk| |Skj| needs to be small to have a good approximation, hence the convergence criterion

  |βk| |Skj| < ε
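The identity (1) can be checked numerically. The sketch below runs a few steps of the plain three-term recurrence (no re-orthogonalisation), forms the Ritz pairs (θj, yj) from Tk, and compares the true residual norm with |βk||Skj|; the matrix, sizes, and seed are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 10
M = rng.standard_normal((n, n))
A = (M + M.T) / 2

# k steps of the basic Lanczos recurrence, keeping q_{k+1}
Q = np.zeros((n, k + 1))
alpha, beta = np.zeros(k), np.zeros(k)
r = rng.standard_normal(n)
b = np.linalg.norm(r)
q_prev = np.zeros(n)
for j in range(k):
    q = r / b
    Q[:, j] = q
    r = A @ q - b * q_prev
    alpha[j] = q @ r
    r -= alpha[j] * q
    b = np.linalg.norm(r)
    beta[j] = b                    # beta_{j+1} in the notes' numbering
    q_prev = q
Q[:, k] = r / b                    # q_{k+1}

T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
theta, S = np.linalg.eigh(T)       # Ritz values theta_j, eigenvectors of T_k
Y = Q[:, :k] @ S                   # Ritz vectors y_j = Q_k S e_j
resid = np.linalg.norm(A @ Y[:, 0] - theta[0] * Y[:, 0])
print(resid, abs(beta[k - 1]) * abs(S[k - 1, 0]))  # the two sides of (1)
```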
8 Remarks 2: Difficulties with the Lanczos Method
• In practice, the problem is that the orthogonality is not preserved.
• As soon as one eigenvalue converges, all the basis vectors qi pick up perturbations biased toward the direction of the corresponding eigenvector, and orthogonality is lost.
• A “ghost” copy of the eigenvalue will appear again in the tridiagonal matrix T.
• To counter this we can fully re-orthonormalise the sequence using Gram-Schmidt or even QR.
• However, either approach is expensive if the dimension of the Krylov space is large.
• So instead a selective re-orthonormalisation is pursued. More specifically, the practical approach is to orthonormalise half-way, i.e. to within half machine precision, √εM.
• If the eigenvalues of A are not well separated, then we can use a shift σ and employ the matrix

  (A − σI)^{−1}

following the shifted inverse power method, to generate the appropriate Krylov subspaces.
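The effect of the shift can be illustrated with inverse iteration on (A − σI)^{−1}, a sketch with an arbitrary diagonal test matrix. In practice one would factorise A − σI once and reuse it, and feed these vectors to Lanczos rather than plain power iteration:

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.diag(np.arange(1.0, 11.0))  # eigenvalues 1..10
sigma = 4.2                        # shift: target the eigenvalue nearest 4.2

# Eigenvalues of (A - sigma I)^{-1} are 1/(lambda_i - sigma), so the
# eigenvalue of A closest to sigma becomes dominant and well separated.
B = A - sigma * np.eye(10)
x = rng.standard_normal(10)
for _ in range(100):
    x = np.linalg.solve(B, x)      # one shifted inverse power step
    x /= np.linalg.norm(x)
approx = x @ A @ x                 # Rayleigh quotient w.r.t. A
print(approx)                      # ~4.0, the eigenvalue of A nearest sigma
```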