Iterative Methods for Linear Systems

Eran Treister

Computer Science Department, Ben-Gurion University of the Negev, Israel.

March 31, 2019



Iterative methods for linear systems

Definition

An iterative method is defined as

$$x^{(k+1)} = \varphi(x^{(k)}),$$

which "looks" only at the previous iterate, or more generally

$$x^{(k+1)} = \varphi(x^{(k)}, \dots, x^{(0)}),$$

and is designed to solve $Ax = b$.

Initialize with an arbitrary guess $x^{(0)}$.

Iteratively improve this guess until the solution of the linear system is achieved up to some accuracy.

Usually applied when direct methods are too expensive or impossible to use.


Requirements for Iterative methods

The first requirement on $\varphi$ is that it converges:

$$\lim_{k\to\infty} x^{(k)} = x^*,$$

where $Ax^* = b$.

The second requirement is that the method converges as fast as possible. We define the convergence rate to be

$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} = C,$$

where $p$ is the order of convergence and $C$ is called the convergence factor.
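As a brief illustration (not from the slides): with $p = 1$ (linear convergence) and $C < 1$, the error shrinks by roughly a constant factor per iteration, so after $k$ iterations

$$\|x^{(k)} - x^*\| \approx C^k\, \|x^{(0)} - x^*\|,$$

while $p = 2$ (quadratic convergence) roughly doubles the number of correct digits at every iteration.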


Error and Residual

Definition (Error vector)

The vector $e^{(k)} = x^* - x^{(k)}$

is called the error vector at iteration $k$.

For convergence, it should hold that $\lim_{k\to\infty} e^{(k)} = 0$.

Definition (Residual vector)

The vector $r^{(k)} = b - Ax^{(k)} = Ae^{(k)}$

is called the residual vector.



The key difference between the two is that we cannot measure the error without knowing the solution, but we can measure the residual. Note that convergence means

$$\lim_{k\to\infty} e^{(k)} = \lim_{k\to\infty} r^{(k)} = 0.$$
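As a small illustration (a NumPy sketch, using the $3\times 3$ example system that appears in the Jacobi slides below), the residual is computable from $A$, $b$ and the current iterate alone, while the error requires the exact solution:

```python
import numpy as np

A = np.array([[4., -1., 1.],
              [4., -8., 1.],
              [-2., 1., 5.]])
b = np.array([7., -21., 15.])

x_star = np.linalg.solve(A, b)    # exact solution (unknown in practice)
x_k = np.array([1., 2., 2.])      # some current iterate

r_k = b - A @ x_k                 # residual: no knowledge of x_star needed
e_k = x_star - x_k                # error: requires x_star
print(np.linalg.norm(r_k), np.linalg.norm(e_k))
print(np.allclose(A @ e_k, r_k))  # confirms r = A e
```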



Simple iterative methods

Split the matrix $A$ into two: $A = M + N$. Then the linear system is written as

$$Mx + Nx = b.$$

The iteration is defined as:

$$x^{(k+1)} = M^{-1}(b - Nx^{(k)}) = x^{(k)} + M^{-1}(b - Ax^{(k)}).$$
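The two forms are indeed equal, since $N = A - M$:

$$M^{-1}(b - Nx^{(k)}) = M^{-1}\big(b - (A - M)x^{(k)}\big) = x^{(k)} + M^{-1}(b - Ax^{(k)}).$$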

Remark

$M$ (called the preconditioner) is “inverted” in every iteration.

The cost of the solution is essentially the number of iterations times the work needed to “invert” $M$.


Practical stopping conditions

We usually stop iterating if one of the following is satisfied for some tolerance $\varepsilon$:

$$\frac{\|Ax^{(k)} - b\|}{\|b\|} < \varepsilon \qquad \text{or} \qquad \frac{\|x^{(k)} - x^{(k-1)}\|}{\|x^{(k)}\|} < \varepsilon.$$

The first criterion indicates that the residual is low enough compared to that of the zero vector ($\|b\|$).

The second criterion indicates that the relative change between consecutive iterates is small enough.
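Both tests are cheap to implement; a minimal sketch (the function name and default tolerance are illustrative, not from the slides):

```python
import numpy as np

def converged(A, b, x_new, x_old, eps=1e-8):
    """Return True if either practical stopping criterion is satisfied."""
    rel_residual = np.linalg.norm(A @ x_new - b) / np.linalg.norm(b)
    rel_change = np.linalg.norm(x_new - x_old) / np.linalg.norm(x_new)
    return rel_residual < eps or rel_change < eps
```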


General Iterative Method

Input: $A \in \mathbb{R}^{n\times n}$, $b \in \mathbb{R}^n$, $x^{(0)} \in \mathbb{R}^n$, $M, N \in \mathbb{R}^{n\times n}$, maxIter, $\varepsilon$, convergence criterion.
Output: $x$ s.t. $Ax \approx b$.

For $k = 1, \dots,$ maxIter:

Apply iteration: $x^{(k)} = M^{-1}(b - Nx^{(k-1)})$, or equivalently $x^{(k)} = x^{(k-1)} + M^{-1}(b - Ax^{(k-1)})$.

If $\frac{\|Ax^{(k)} - b\|}{\|b\|} < \varepsilon$ or, alternatively, $\frac{\|x^{(k)} - x^{(k-1)}\|}{\|x^{(k)}\|} < \varepsilon$: convergence is reached, stop the iterations.

Return $x^{(k)}$ as the solution.
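A direct NumPy translation of this loop might look as follows (a sketch: solve_M stands for whatever routine applies $M^{-1}$, e.g. a diagonal or triangular solve, and the function names are illustrative):

```python
import numpy as np

def general_iteration(A, b, x0, solve_M, max_iter=100, eps=1e-8):
    """Generic splitting iteration x_{k+1} = x_k + M^{-1} (b - A x_k).

    solve_M(r) should return M^{-1} r (exactly or approximately).
    """
    x = x0.copy()
    for k in range(1, max_iter + 1):
        x_prev = x
        r = b - A @ x_prev              # residual of the previous iterate
        x = x_prev + solve_M(r)         # apply the preconditioner M
        if (np.linalg.norm(A @ x - b) / np.linalg.norm(b) < eps or
                np.linalg.norm(x - x_prev) / np.linalg.norm(x) < eps):
            break
    return x
```

Choosing solve_M = lambda r: r / np.diag(A) recovers the Jacobi method of the next slides.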


The Jacobi method

Example

Assume that we need to solve $Ax = b$:

$$\begin{aligned} 4x_1 - x_2 + x_3 &= 7 \\ 4x_1 - 8x_2 + x_3 &= -21 \\ -2x_1 + x_2 + 5x_3 &= 15 \end{aligned} \qquad (1)$$

Rewrite:

$$\begin{bmatrix} 4 & -1 & 1 \\ 4 & -8 & 1 \\ -2 & 1 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 7 \\ -21 \\ 15 \end{bmatrix} \qquad (2)$$



$A$ is diagonally dominant, so it can be approximated well by a diagonal matrix. Let us split the matrix:

$$A = D + L + U = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -8 & 0 \\ 0 & 0 & 5 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 4 & 0 & 0 \\ -2 & 1 & 0 \end{bmatrix} + \begin{bmatrix} 0 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},$$

where $D$ is the diagonal, $L$ the strictly lower triangular part, and $U$ the strictly upper triangular part.



Choosing $M = D$, the method then becomes (in matrix form):

$$x^{(k+1)} = D^{-1}(b - (L + U)x^{(k)}) = x^{(k)} + D^{-1}(b - Ax^{(k)}). \qquad (4)$$

In our example this will be:

$$\begin{bmatrix} x^{(k+1)}_1 \\ x^{(k+1)}_2 \\ x^{(k+1)}_3 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{4}\,(7 + x^{(k)}_2 - x^{(k)}_3) \\ \tfrac{1}{8}\,(21 + 4x^{(k)}_1 + x^{(k)}_3) \\ \tfrac{1}{5}\,(15 + 2x^{(k)}_1 - x^{(k)}_2) \end{bmatrix}$$


The Jacobi method

Running the iterations from a guess $x^{(0)} = [1, 2, 2]^\top$ yields:

Iter: 0: [1.0, 2.0, 2.0]

Iter: 1: [1.75, 3.375, 3.0]

Iter: 2: [1.84375, 3.875, 3.025]

Iter: 3: [1.9625, 3.925, 2.9625]

Iter: 4: [1.99063, 3.97656, 3.0]

Iter: 5: [1.99414, 3.99531, 3.00094]

Iter: 6: [1.99859, 3.99719, 2.99859]

Iter: 7: [1.99965, 3.99912, 3.0]

Iter: 8: [1.99978, 3.99982, 3.00004]

Iter: 9: [1.99995, 3.99989, 2.99995]
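The table above can be reproduced with a few lines of NumPy (a sketch; the rounding in the printout is for display only):

```python
import numpy as np

A = np.array([[4., -1., 1.],
              [4., -8., 1.],
              [-2., 1., 5.]])
b = np.array([7., -21., 15.])

x = np.array([1., 2., 2.])            # initial guess x^(0)
print("Iter: 0:", x)
for k in range(1, 10):
    x = x + (b - A @ x) / np.diag(A)  # Jacobi step: M = D
    print(f"Iter: {k}:", np.round(x, 5))
```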


The Jacobi method

Figure: The residual and error norm history for the Jacobi iterations. Note the logarithmic scale of the y-axis when plotting convergence history.


The Gauss-Seidel method

The GS method is achieved by the split:

$$(L + D)x = b - Ux \;\Rightarrow\; (L + D)x^{(k+1)} = b - Ux^{(k)}.$$

Choosing $M = L + D$, each iteration reads

$$x^{(k+1)} = (L + D)^{-1}(b - Ux^{(k)}) = x^{(k)} + (L + D)^{-1}(b - Ax^{(k)}).$$

In scalar form, the method is given by

$$x^{(k+1)}_i = \frac{1}{a_{ii}}\Big(b_i - \sum_{j<i} a_{ij}\, x^{(k+1)}_j - \sum_{j>i} a_{ij}\, x^{(k)}_j\Big), \qquad i = 1, \dots, n.$$


The Gauss-Seidel method


Example

$$\begin{bmatrix} x^{(k+1)}_1 \\ x^{(k+1)}_2 \\ x^{(k+1)}_3 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{4}\,(7 + x^{(k)}_2 - x^{(k)}_3) \\ \tfrac{1}{8}\,(21 + 4x^{(k+1)}_1 + x^{(k)}_3) \\ \tfrac{1}{5}\,(15 + 2x^{(k+1)}_1 - x^{(k+1)}_2) \end{bmatrix}$$

Convergence is much faster than the Jacobi method:

Iter: 0: [1.0, 2.0, 2.0]

Iter: 1: [1.75, 3.75, 2.95]

Iter: 2: [1.95, 3.96875, 2.98625]

Iter: 3: [1.99562, 3.99609, 2.99903]

Iter: 4: [1.99927, 3.99951, 2.9998]

Iter: 5: [1.99993, 3.99994, 2.99998]

Iter: 6: [1.99999, 3.99999, 3.0]

Iter: 7: [2.0, 4.0, 3.0]
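A matching NumPy sketch of the scalar Gauss-Seidel sweep (updating the components in place):

```python
import numpy as np

A = np.array([[4., -1., 1.],
              [4., -8., 1.],
              [-2., 1., 5.]])
b = np.array([7., -21., 15.])
n = len(b)

x = np.array([1., 2., 2.])
print("Iter: 0:", x)
for k in range(1, 8):
    for i in range(n):
        # new values x[:i] are already in place; x[i+1:] are still old
        s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
        x[i] = (b[i] - s) / A[i, i]
    print(f"Iter: {k}:", np.round(x, 5))
```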

Convergence of Iterative methods

We saw the general iteration:

$$x^{(k+1)} = x^{(k)} + M^{-1}(b - Ax^{(k)}).$$

The error at the $(k+1)$-th iteration (using $b = Ax^*$):

$$e^{(k+1)} = x^* - x^{(k+1)} = x^* - x^{(k)} - M^{-1}(Ax^* - Ax^{(k)}).$$

The iteration matrix for the error is given by

$$e^{(k+1)} = \underbrace{(I - M^{-1}A)}_{T}\, e^{(k)}.$$


Convergence of Iterative methods

Assuming $T$ is diagonalizable with eigenpairs $(\lambda_i, v_i)$, and $e^{(0)} = \sum_{i=1}^n \alpha_i v_i$:

$$e^{(k+1)} = T^{k+1} e^{(0)} = T^{k+1} \sum_{i=1}^n \alpha_i v_i = \sum_{i=1}^n \alpha_i \lambda_i^{k+1} v_i.$$

The error $e^{(k+1)}$ will go to $0$ (as $k \to \infty$) for an arbitrary $e^{(0)}$ only if the largest eigenvalue in magnitude is smaller than $1$.

Recall: the largest eigenvalue in magnitude is defined as the spectral radius.


Convergence of Iterative methods

Theorem

Given $Ax = b$ where $A$ is invertible, the general iteration

$$x^{(k+1)} = x^{(k)} + M^{-1}(b - Ax^{(k)})$$

converges for any starting vector $x^{(0)}$ if and only if

$$\rho(I - M^{-1}A) < 1.$$

This spectral radius is also the convergence factor of the iteration. That is, for every vector norm,

$$\lim_{k\to\infty} \frac{\|e^{(k+1)}\|}{\|e^{(k)}\|} = \rho(I - M^{-1}A).$$
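For the $3\times 3$ example above, the spectral radii of the Jacobi and Gauss-Seidel iteration matrices can be checked numerically (a NumPy sketch; both values come out below 1, consistent with the observed convergence):

```python
import numpy as np

A = np.array([[4., -1., 1.],
              [4., -8., 1.],
              [-2., 1., 5.]])

M_jacobi = np.diag(np.diag(A))                # M = D
M_gauss_seidel = np.tril(A)                   # M = L + D

for name, M in [("Jacobi", M_jacobi), ("Gauss-Seidel", M_gauss_seidel)]:
    T = np.eye(3) - np.linalg.solve(M, A)     # iteration matrix I - M^{-1} A
    rho = np.max(np.abs(np.linalg.eigvals(T)))
    print(name, "spectral radius:", rho)
```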


Checking Convergence

Remark

The spectral radius is hard to compute. Therefore, we often use matrix norms to check convergence, since any induced matrix norm upper-bounds the spectral radius. That is,

$$\|I - M^{-1}A\| < 1 \;\Rightarrow\; \rho(I - M^{-1}A) < 1,$$

and if we find a norm for which $\|I - M^{-1}A\| < 1$, then our method converges.

Example

In the previous examples, the error iteration matrix of the Jacobi method is:

$$T = I - D^{-1}A = \begin{bmatrix} 0 & \tfrac{1}{4} & -\tfrac{1}{4} \\ \tfrac{4}{8} & 0 & \tfrac{1}{8} \\ \tfrac{2}{5} & -\tfrac{1}{5} & 0 \end{bmatrix} \;\Rightarrow\; \|T\|_\infty = \frac{5}{8} < 1.$$


Practical Convergence test

Definition (Strictly diagonally dominant matrices (SDD))

A matrix $A$ is strictly diagonally dominant in rows if for every row $i$

$$|a_{ii}| > \sum_{j \neq i} |a_{ij}|.$$

Theorem

If the matrix $A$ is strictly diagonally dominant in rows, then both the Jacobi and Gauss-Seidel methods converge.
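The row-dominance condition is easy to verify programmatically (a sketch; the function name is illustrative). For the $3\times 3$ example above it returns True:

```python
import numpy as np

def is_sdd_rows(A):
    """True if |a_ii| > sum of |a_ij| over j != i, for every row i."""
    diag = np.abs(np.diag(A))
    off_diag = np.sum(np.abs(A), axis=1) - diag
    return bool(np.all(diag > off_diag))
```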


The variational meaning of GS

Consider the following problem:

$$f(x) = \frac{1}{2}\|x - x^*\|_A^2 = \frac{1}{2}x^\top A x - x^\top b + \frac{1}{2}(x^*)^\top b,$$

where $A$ is symmetric positive definite.

To minimize $f(x)$, we require $\nabla f(x) = 0$ and get the linear system $Ax = b$.

In GS, we zero the residual $r_i$ for each $i$, given the other entries of $x$.

The residuals are basically the entries of the gradient, since $\nabla f(x) = Ax - b = -r$. Thus, for each $i$ we require $\frac{\partial f}{\partial x_i} = 0$, i.e. the $i$-th entry of $\nabla f(x^{(k)})$ is zeroed.
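Indeed, fixing all other variables and solving $\frac{\partial f}{\partial x_i} = 0$ for $x_i$ gives

$$\frac{\partial f}{\partial x_i} = \sum_{j} a_{ij} x_j - b_i = 0 \;\Rightarrow\; x_i = \frac{1}{a_{ii}}\Big(b_i - \sum_{j \neq i} a_{ij} x_j\Big),$$

which is exactly the Gauss-Seidel update of $x_i$.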

Corollary

The updates of Gauss-Seidel for each $x_i$ are equivalent to minimizing $f(x)$ with respect to $x_i$.


Variational GS

Example (Variational property of Gauss-Seidel)

Consider the following linear system:

$$A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}, \qquad b = \begin{bmatrix} 3 \\ 4 \end{bmatrix}.$$

It is easy to show that $f(x) = x_1^2 + x_1 x_2 + 1.5 x_2^2 - 3x_1 - 4x_2$ (up to a constant). The condition $\nabla f = 0$ in this case is

$$\frac{\partial f}{\partial x_1} = 2x_1 + x_2 - 3 = 0, \qquad (5)$$

$$\frac{\partial f}{\partial x_2} = x_1 + 3x_2 - 4 = 0. \qquad (6)$$
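A quick numerical check of this variational property (a NumPy sketch; the starting point is an arbitrary choice): each single-coordinate Gauss-Seidel update cannot increase $f$, and the iterates approach the minimizer, which solves $Ax = b$.

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([3., 4.])

def f(x):
    return 0.5 * x @ A @ x - x @ b    # f(x) up to the constant term

x = np.array([0., 0.])                # arbitrary starting guess
for sweep in range(3):
    for i in range(2):
        # Gauss-Seidel update of x_i = exact minimization of f over x_i
        x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
        print(f"sweep {sweep}, i={i}: x = {x}, f(x) = {f(x):.6f}")
```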
