Convergence Rates on Root Finding
Com S 477/577, Iowa State University
Oct 5, 2004
A sequence $x_i \in \mathbb{R}$ converges to $\xi$ if for each $\epsilon > 0$ there exists an integer $N(\epsilon)$ such that $|x_l - \xi| < \epsilon$ for all $l \geq N(\epsilon)$. The Cauchy convergence criterion states that a sequence $x_i \in \mathbb{R}$ is convergent if and only if for each $\epsilon > 0$ there exists an $N(\epsilon)$ such that $|x_l - x_m| < \epsilon$ for all $l, m \geq N(\epsilon)$.
Let a sequence $x_i \in \mathbb{R}$ be generated by an iteration function $\Phi$, that is,
\[
x_{i+1} = \Phi(x_i), \qquad i = 0, 1, 2, \ldots
\]
Let $\xi$ be a fixed point of $\Phi$, that is, $\xi = \Phi(\xi)$. Suppose that the sequence $\{x_i\}$ is generated in the neighborhood of $\xi$. The corresponding iteration method is said to be of at least $p$th order if there exists a neighborhood $N(\xi)$ of $\xi$ such that for all $x_0 \in N(\xi)$ the generated sequence $x_{i+1} = \Phi(x_i)$, $i = 0, 1, \ldots$, satisfies
\[
|x_{i+1} - \xi| \leq C |x_i - \xi|^p,
\]
where $C < 1$ if $p = 1$. In the case of first-order convergence, for instance, we have
\[
|x_i - \xi| \leq C |x_{i-1} - \xi| \leq C^2 |x_{i-2} - \xi| \leq \cdots \leq C^i |x_0 - \xi|.
\]
Since $C < 1$, it follows that
\[
\lim_{i \to \infty} |x_i - \xi| = \lim_{i \to \infty} C^i |x_0 - \xi| = 0,
\]
namely, the sequence $\{x_i\}$ will converge to $\xi$.

Now suppose $\Phi$ is sufficiently often differentiable in $N(\xi)$. If $x_i \in N(\xi)$ and if $\Phi^{(k)}(\xi) = 0$ for $k = 1, 2, \ldots, p-1$ but $\Phi^{(p)}(\xi) \neq 0$, that is, the first $p-1$ derivatives of $\Phi$ vanish at $\xi$, then it follows from the Taylor expansion that
\[
x_{i+1} = \Phi(x_i) = \Phi(\xi) + \frac{\Phi^{(p)}(\xi)}{p!} (x_i - \xi)^p + O\!\left((x_i - \xi)^{p+1}\right).
\]
Because $\Phi(\xi) = \xi$, we obtain
\[
\lim_{i \to \infty} \frac{x_{i+1} - \xi}{(x_i - \xi)^p} = \frac{\Phi^{(p)}(\xi)}{p!}.
\]
For $p = 2, 3, \ldots$, the method is of (precisely) $p$th order. The method is of first order if $p = 1$ and $|\Phi'(\xi)| < 1$. When $0 < \Phi'(\xi) < 1$, the sequence $\{x_i\}$ will converge monotonically to $\xi$, as shown in the left figure below. When $-1 < \Phi'(\xi) < 0$, the sequence will alternate about $\xi$ during convergence, as shown in the right figure.
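These two behaviors can be seen in a small sketch (not part of the notes) that iterates two hypothetical affine maps sharing the fixed point $\xi = 2$, one with $\Phi'(\xi) = 0.5$ and one with $\Phi'(\xi) = -0.5$:

```python
# Sketch (hypothetical examples): iterating x_{i+1} = Phi(x_i) near a fixed point.

def iterate(phi, x0, n):
    """Return the iterates x_0, x_1 = phi(x_0), ..., x_n."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs

# Phi(x) = 0.5*x + 1 has fixed point xi = 2 and Phi'(xi) = 0.5 > 0:
# the errors x_i - 2 shrink by half each step without changing sign.
mono = iterate(lambda x: 0.5 * x + 1, 0.0, 8)

# Phi(x) = -0.5*x + 3 has fixed point xi = 2 and Phi'(xi) = -0.5 < 0:
# the errors shrink in magnitude while alternating in sign about xi.
alt = iterate(lambda x: -0.5 * x + 3, 0.0, 8)
```

Both sequences converge linearly with $C = |\Phi'(\xi)| = 0.5$; only the sign pattern of the errors differs.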
[Figure: two cobweb diagrams of the iteration $x_{i+1} = \Phi(x_i)$, plotting $\Phi(x)$ against $x$ with the iterates $x_i, x_{i+1}, x_{i+2}$ and the fixed point $\xi$ marked. Left: monotone convergence; right: alternating convergence.]
Below we study the convergence rates of several root-finding methods introduced earlier.
1 Quadratic Convergence of Newton’s Method
Newton's method has the iteration function
\[
\Phi(x) = x - \frac{f(x)}{f'(x)}, \qquad \text{with } f(\xi) = 0.
\]
Suppose $f$ is sufficiently continuously differentiable in some neighborhood $N(\xi)$. In the non-degenerate case, $f'(\xi) \neq 0$. So we have
\begin{align*}
\Phi(\xi) &= \xi,\\
\Phi'(\xi) &= 1 - \left.\frac{\bigl(f'(x)\bigr)^2 - f(x) f''(x)}{\bigl(f'(x)\bigr)^2}\right|_{x=\xi} = 0, \quad \text{since } f(\xi) = 0,\\
\Phi''(\xi) &= -\left.\frac{\bigl(2f'(x)f''(x) - f'(x)f''(x) - f(x)f'''(x)\bigr)\bigl(f'(x)\bigr)^2 - 2f'(x)f''(x)\Bigl(\bigl(f'(x)\bigr)^2 - f(x)f''(x)\Bigr)}{\bigl(f'(x)\bigr)^4}\right|_{x=\xi}\\
&= -\frac{-\bigl(f'(\xi)\bigr)^3 f''(\xi) - f(\xi)\bigl(f'(\xi)\bigr)^2 f'''(\xi) + 2 f(\xi) f'(\xi) \bigl(f''(\xi)\bigr)^2}{\bigl(f'(\xi)\bigr)^4}\\
&= -\frac{-\bigl(f'(\xi)\bigr)^3 f''(\xi)}{\bigl(f'(\xi)\bigr)^4}, \quad \text{since } f(\xi) = 0\\
&= \frac{f''(\xi)}{f'(\xi)},
\end{align*}
which is nonzero provided $f''(\xi) \neq 0$.
So Newton's method is quadratically convergent. In the degenerate case, $\xi$ is an $m$-fold zero of $f$ for some $m > 1$, that is, $f^{(i)}(\xi) = 0$ for $i = 0, 1, \ldots, m-1$. We leave it to the students to determine the order of convergence in this case.
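The quadratic rate is easy to observe numerically. The sketch below (not part of the notes) runs Newton's iteration on the hypothetical example $f(x) = x^2 - 2$, whose simple zero is $\sqrt{2}$; each error is roughly a constant times the square of the previous one.

```python
# Sketch: Newton's method Phi(x) = x - f(x)/f'(x) on f(x) = x**2 - 2.
# The zero xi = sqrt(2) is simple (f'(xi) != 0), so convergence is quadratic.

def newton_errors(f, fprime, x0, steps, root):
    """Run Newton's iteration and record |x_i - root| after each step."""
    x, errors = x0, []
    for _ in range(steps):
        x = x - f(x) / fprime(x)
        errors.append(abs(x - root))
    return errors

errs = newton_errors(lambda x: x * x - 2, lambda x: 2 * x,
                     x0=1.5, steps=4, root=2 ** 0.5)
# Quadratic convergence: each error is below the square of the previous one,
# consistent with the limit Phi''(xi)/2 = f''(xi)/(2 f'(xi)) derived above.
```

After only four steps the error is already at the level of machine precision.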
2 Linear Convergence of Regula Falsi
For clarity of analysis we let $x_i = b_i$ for all $i$. We make some simplifying assumptions for the discussion of the convergence behavior: $f''$ exists, and for some $k$ the following conditions hold: (a) $x_k < a_k$; (b) $f(x_k) < 0$ and $f(a_k) > 0$; (c) $f''(x) \geq 0$ for all $x \in [x_k, a_k]$.
[Figure: panels (a) and (b) illustrating regula falsi iterates $x_k$, $x_{k+1}$, $a_{k+1}$ under conditions (a)-(c).]
Under these assumptions, either $f(x_{k+1}) = 0$ or $f(x_{k+1}) f(x_k) > 0$, and consequently
\[
x_k < x_{k+1} < a_{k+1} = a_k.
\]
To see this, use the remainder formula for linear interpolation of $f$ at $x_k$ and $a_k$:
\[
f(x) - p(x) = (x - x_k)(x - a_k) \frac{f''(\eta)}{2}
\]
for $x \in [x_k, a_k]$ and a suitable $\eta \in [x_k, a_k]$, where $p$ is the interpolating line. Since $(x - x_k)(x - a_k) \leq 0$ on $[x_k, a_k]$, condition (c) implies that $f(x) - p(x) \leq 0$. In particular, $f(x_{k+1}) - p(x_{k+1}) \leq 0$, which in turn implies that $f(x_{k+1}) \leq 0$ since $p(x_{k+1}) = 0$.
Unless $f(x_{k+1}) = 0$, in which case the iteration stops at $x_{k+1}$, we can see that conditions (a), (b), and (c) hold for all $i \geq k$. Therefore $a_i = a_k = a$ and
\[
x_{i+1} = \frac{a f(x_i) - x_i f(a)}{f(x_i) - f(a)}
\]
for all $i \geq k$. Furthermore, the $x_i$ for $i \geq k$ form a monotone increasing sequence bounded above by $a$, so $\lim_{i \to \infty} x_i = \xi$ exists. Consequently,
\[
f(\xi) \leq 0, \qquad f(a) > 0, \qquad \text{and} \qquad \xi = \frac{a f(\xi) - \xi f(a)}{f(\xi) - f(a)},
\]
which gives
\[
(\xi - a) f(\xi) = 0.
\]
But $\xi < a$ since $f(\xi) \leq 0 < f(a)$. Hence $f(\xi) = 0$ and $\{x_i\}$ converges to a zero of $f$.

The above discussion enables us to look at the order of convergence through the iteration function
\[
x_{i+1} = \Phi(x_i), \qquad \text{where } \Phi(x) = \frac{a f(x) - x f(a)}{f(x) - f(a)}.
\]
Since $f(\xi) = 0$, we obtain
\[
\Phi'(\xi) = \frac{-\bigl(a f'(\xi) - f(a)\bigr) f(a) + \xi f(a) f'(\xi)}{f(a)^2} = 1 - f'(\xi) \, \frac{\xi - a}{f(\xi) - f(a)}.
\]
By the mean value theorem, there exist $\eta_1, \eta_2$ such that
\begin{align}
\frac{f(\xi) - f(a)}{\xi - a} &= \frac{-f(a)}{\xi - a} = f'(\eta_1), & \xi < \eta_1 < a; \tag{1}\\
\frac{f(x_i) - f(\xi)}{x_i - \xi} &= \frac{f(x_i)}{x_i - \xi} = f'(\eta_2), & x_i < \eta_2 < \xi. \tag{2}
\end{align}
Since $f''(x) \geq 0$, $f'(x)$ increases monotonically on $[x_i, a]$, so $f'(\eta_2) \leq f'(\xi) \leq f'(\eta_1)$. Meanwhile, equation (2), $x_i < \xi$, and $f(x_i) < 0$ together imply that $0 < f'(\eta_2)$. Therefore $0 < f'(\xi) \leq f'(\eta_1)$. We have thus shown that
\[
0 \leq \Phi'(\xi) = 1 - \frac{f'(\xi)}{f'(\eta_1)} < 1.
\]
So the regula falsi method converges linearly. From the previous discussion we see that the method of regula falsi will almost always end up with the one-sided convergence demonstrated before.
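The linear rate can be checked numerically. The sketch below (not part of the notes) applies the fixed-endpoint iteration derived above to the hypothetical convex function $f(x) = x^2 - 2$ on $[1, 2]$, where conditions (a)-(c) hold with $a = 2$; the successive error ratios approach the constant $\Phi'(\xi) = 1 - f'(\xi)/f'(\eta_1)$, which for this example equals $3 - 2\sqrt{2} \approx 0.17$.

```python
# Sketch: one-sided regula falsi on f(x) = x**2 - 2 with fixed endpoint a = 2.
# Under conditions (a)-(c), the endpoint a never moves and convergence is linear.

def regula_falsi_errors(f, x, a, steps, root):
    """Iterate x <- (a f(x) - x f(a)) / (f(x) - f(a)) and record |x - root|."""
    errors = []
    for _ in range(steps):
        x = (a * f(x) - x * f(a)) / (f(x) - f(a))
        errors.append(abs(x - root))
    return errors

errs = regula_falsi_errors(lambda t: t * t - 2, 1.0, 2.0, steps=12, root=2 ** 0.5)
# Linear convergence: the ratios err_{i+1}/err_i settle at Phi'(xi) < 1.
ratios = [errs[i + 1] / errs[i] for i in range(len(errs) - 1)]
```

All iterates approach $\sqrt{2}$ from the left, matching the one-sided convergence noted above.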
3 Superlinear Convergence of Secant Method
In the secant method, the iteration has the form
\[
x_{i+1} = x_i - \frac{f(x_i)}{f[x_{i-1}, x_i]}, \qquad i = 0, 1, \ldots \tag{3}
\]
where $f[x_{i-1}, x_i] = \bigl(f(x_i) - f(x_{i-1})\bigr)/(x_i - x_{i-1})$ is the first divided difference.
To determine the local convergence rate, without loss of generality we assume that the sequence $\{x_i\}$ is in a small enough neighborhood of the zero $\xi$ and that $f$ is twice differentiable. Subtract $\xi$ from both sides of (3):
\begin{align}
x_{i+1} - \xi &= (x_i - \xi) - \frac{f(x_i)}{f[x_{i-1}, x_i]} \nonumber\\
&= (x_i - \xi) \left( 1 - \frac{f[x_i, \xi]}{f[x_{i-1}, x_i]} \right), \quad \text{since } f[x_i, \xi] = \frac{f(x_i) - f(\xi)}{x_i - \xi} = \frac{f(x_i)}{x_i - \xi} \nonumber\\
&= (x_i - \xi)(x_{i-1} - \xi) \cdot \frac{f[x_{i-1}, x_i] - f[x_i, \xi]}{(x_{i-1} - \xi) f[x_{i-1}, x_i]} \nonumber\\
&= (x_i - \xi)(x_{i-1} - \xi) \, \frac{f[x_{i-1}, x_i, \xi]}{f[x_{i-1}, x_i]}. \tag{4}
\end{align}
From the error estimation of polynomial interpolation, we learned that
\begin{align*}
f[x_{i-1}, x_i] &= f'(\eta_1), & \eta_1 \in I[x_{i-1}, x_i];\\
f[x_{i-1}, x_i, \xi] &= \tfrac{1}{2} f''(\eta_2), & \eta_2 \in I[x_{i-1}, x_i, \xi],
\end{align*}
where $I[x_{i-1}, x_i]$ is the smallest interval containing $x_{i-1}$ and $x_i$, and $I[x_{i-1}, x_i, \xi]$ the smallest interval containing $x_{i-1}$, $x_i$, $\xi$.
If $\xi$ is a simple zero, that is, $f'(\xi) \neq 0$, there exist a bound $M$ and an interval $J = \{x : |x - \xi| \leq \epsilon\}$ for some $\epsilon > 0$ such that
\[
\left| \frac{1}{2} \, \frac{f''(\eta_2)}{f'(\eta_1)} \right| \leq M \tag{5}
\]
for any $\eta_1, \eta_2 \in J$.

Let $e_i = M |x_i - \xi|$ and suppose $e_0, e_1 < \min\{1, \epsilon M\}$. By induction, using (4) and (5), we can easily show that
\[
e_{i+1} = M |x_{i+1} - \xi| \leq M \cdot \frac{e_i}{M} \cdot \frac{e_{i-1}}{M} \cdot M = e_i e_{i-1}
\]
and
\[
e_i \leq \min\{1, \epsilon M\},
\]
for $i = 1, 2, \ldots$.

Let $q = (1 + \sqrt{5})/2$ be the positive root of the equation $z^2 - z - 1 = 0$. Then we have
\[
e_i \leq K^{q^i}, \qquad i = 0, 1, 2, \ldots
\]
where $K = \max\{e_0, \sqrt[q]{e_1}\} < 1$. This is because (by induction)
\[
e_{i+1} \leq e_i e_{i-1} \leq K^{q^i} K^{q^{i-1}} = K^{q^{i-1}(q+1)} = K^{q^{i-1} q^2} = K^{q^{i+1}}.
\]
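The bound $e_i \leq K^{q^i}$ can be checked numerically. The sketch below (not part of the notes) picks arbitrary hypothetical starting values $e_0, e_1 < 1$, runs the recursion $e_{i+1} = e_i e_{i-1}$, and verifies the bound term by term:

```python
# Numerical check of e_i <= K**(q**i) for the recursion e_{i+1} = e_i * e_{i-1}.
# The starting values e0, e1 are arbitrary choices below 1.
e0, e1 = 0.5, 0.3
q = (1 + 5 ** 0.5) / 2          # golden ratio, the positive root of z^2 - z - 1 = 0
K = max(e0, e1 ** (1 / q))      # K = max{e_0, e_1^(1/q)} < 1

es = [e0, e1]
for _ in range(10):
    es.append(es[-1] * es[-2])  # e_{i+1} = e_i * e_{i-1}

# Every term satisfies e_i <= K**(q**i) (small slack for floating-point roundoff).
checks = [es[i] <= K ** (q ** i) + 1e-12 for i in range(len(es))]
```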
Thus the secant method converges at least as well as a method of order $p = \frac{1 + \sqrt{5}}{2} = 1.618\ldots$.

One secant step requires one additional function evaluation, whereas one Newton step requires two ($f$ and $f'$). Therefore two secant steps are about as expensive as a single Newton step, but two secant steps have a convergence order of $(1.618\ldots)^2 \approx 2.618$. This explains why in practice the secant method typically dominates Newton's method with numerical derivatives.
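The superlinear rate can be observed directly. The sketch below (not part of the notes) runs the secant iteration (3) on the hypothetical example $f(x) = x^2 - 2$ and estimates the observed order from consecutive errors; the estimate drifts toward $q \approx 1.618$.

```python
import math

# Sketch: secant iteration x_{i+1} = x_i - f(x_i)/f[x_{i-1}, x_i] on f(x) = x**2 - 2.

def secant_errors(f, x0, x1, steps, root):
    """Run the secant method and record |x_i - root| after each step."""
    errors = []
    for _ in range(steps):
        # f[x0, x1] = (f(x1) - f(x0)) / (x1 - x0), the first divided difference
        x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        errors.append(abs(x1 - root))
    return errors

errs = secant_errors(lambda t: t * t - 2, 1.0, 2.0, steps=5, root=2 ** 0.5)
# Observed order: log e_{i+1} / log e_i tends toward q = (1 + sqrt(5))/2.
orders = [math.log(errs[i + 1]) / math.log(errs[i]) for i in range(len(errs) - 1)]
```

Since the constants in $e_i \leq K^{q^i}$ are not exactly 1, the observed order approaches $q$ only gradually, but it clearly exceeds 1 well before the iterates hit machine precision.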