Convergence Rates on Root Finding
Com S 477/577, Iowa State University
Oct 5, 2004
A sequence $x_i \in \mathbb{R}$ converges to $\xi$ if for each $\epsilon > 0$ there exists an integer $N(\epsilon)$ such that $|x_l - \xi| < \epsilon$ for all $l \geq N(\epsilon)$. The Cauchy convergence criterion states that a sequence $x_i \in \mathbb{R}$ is convergent if and only if for each $\epsilon > 0$ there exists an $N(\epsilon)$ such that $|x_l - x_m| < \epsilon$ for all $l, m \geq N(\epsilon)$.
Let a sequence $x_i \in \mathbb{R}$ be generated by an iteration function $\Phi$, that is,
\[
x_{i+1} = \Phi(x_i), \qquad i = 0, 1, 2, \ldots
\]
Let $\xi$ be a fixed point of $\Phi$, that is, $\xi = \Phi(\xi)$. Suppose that the sequence $\{x_i\}$ is generated in the neighborhood of $\xi$. The corresponding iteration method is said to be of at least $p$th order if there exists a neighborhood $N(\xi)$ of $\xi$ such that for all $x_0 \in N(\xi)$ the generated sequence $x_{i+1} = \Phi(x_i)$, $i = 0, 1, \ldots$, satisfies
\[
|x_{i+1} - \xi| \leq C |x_i - \xi|^p,
\]
where $C < 1$ if $p = 1$. In the case of first-order convergence, for instance, we have
\[
|x_i - \xi| \leq C |x_{i-1} - \xi| \leq C^2 |x_{i-2} - \xi| \leq \cdots \leq C^i |x_0 - \xi|.
\]
Since $C < 1$, it follows that
\[
\lim_{i \to \infty} |x_i - \xi| = \lim_{i \to \infty} C^i |x_0 - \xi| = 0,
\]
namely, the sequence $\{x_i\}$ will converge to $\xi$.

Now suppose $\Phi$ is sufficiently often differentiable in $N(\xi)$. If $x_i \in N(\xi)$ and if $\Phi^{(k)}(\xi) = 0$ for $k = 1, 2, \ldots, p-1$ but $\Phi^{(p)}(\xi) \neq 0$, that is, the first $p-1$ derivatives of $\Phi$ vanish at $\xi$, then it follows from the Taylor expansion that
\[
x_{i+1} = \Phi(x_i) = \Phi(\xi) + \frac{\Phi^{(p)}(\xi)}{p!} (x_i - \xi)^p + O\!\left((x_i - \xi)^{p+1}\right).
\]
Because $\Phi(\xi) = \xi$, we obtain
\[
\lim_{i \to \infty} \frac{x_{i+1} - \xi}{(x_i - \xi)^p} = \frac{\Phi^{(p)}(\xi)}{p!}.
\]
For $p = 2, 3, \ldots$, the method is of (precisely) $p$th order. The method is of first order if $p = 1$ and $|\Phi'(\xi)| < 1$. When $0 < \Phi'(\xi) < 1$, the sequence $\{x_i\}$ will converge monotonically to $\xi$, as shown in the left figure below. When $-1 < \Phi'(\xi) < 0$, the sequence will alternate about $\xi$ during convergence, as shown in the right figure.
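These two behaviors can be seen in a small sketch (not part of the notes) that iterates two hypothetical affine maps sharing the fixed point $\xi = 2$, one with $\Phi'(\xi) = 0.5$ and one with $\Phi'(\xi) = -0.5$:

```python
# Sketch (hypothetical examples): iterating x_{i+1} = Phi(x_i) near a fixed point.

def iterate(phi, x0, n):
    """Return the iterates x_0, x_1 = phi(x_0), ..., x_n."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs

# Phi(x) = 0.5*x + 1 has fixed point xi = 2 and Phi'(xi) = 0.5 > 0:
# the errors x_i - 2 shrink by half each step without changing sign.
mono = iterate(lambda x: 0.5 * x + 1, 0.0, 8)

# Phi(x) = -0.5*x + 3 has fixed point xi = 2 and Phi'(xi) = -0.5 < 0:
# the errors shrink in magnitude while alternating in sign about xi.
alt = iterate(lambda x: -0.5 * x + 3, 0.0, 8)
```

Both sequences converge linearly with $C = |\Phi'(\xi)| = 0.5$; only the sign pattern of the errors differs.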
[Figure: two cobweb diagrams of the iteration $x_{i+1} = \Phi(x_i)$, plotting $\Phi(x)$ against $x$ with the iterates $x_i, x_{i+1}, x_{i+2}$ and the fixed point $\xi$ marked. Left: monotone convergence; right: alternating convergence.]
Below we study the convergence rates of several root-finding methods introduced earlier.
1 Quadratic Convergence of Newton’s Method
Newton's method has the iteration function
\[
\Phi(x) = x - \frac{f(x)}{f'(x)}, \qquad \text{with } f(\xi) = 0.
\]
Suppose $f$ is sufficiently continuously differentiable in some neighborhood $N(\xi)$. In the non-degenerate case, $f'(\xi) \neq 0$. So we have
\begin{align*}
\Phi(\xi) &= \xi,\\
\Phi'(\xi) &= 1 - \left.\frac{\bigl(f'(x)\bigr)^2 - f(x) f''(x)}{\bigl(f'(x)\bigr)^2}\right|_{x=\xi} = 0, \quad \text{since } f(\xi) = 0,\\
\Phi''(\xi) &= -\left.\frac{\bigl(2f'(x)f''(x) - f'(x)f''(x) - f(x)f'''(x)\bigr)\bigl(f'(x)\bigr)^2 - 2f'(x)f''(x)\Bigl(\bigl(f'(x)\bigr)^2 - f(x)f''(x)\Bigr)}{\bigl(f'(x)\bigr)^4}\right|_{x=\xi}\\
&= -\frac{-\bigl(f'(\xi)\bigr)^3 f''(\xi) - f(\xi)\bigl(f'(\xi)\bigr)^2 f'''(\xi) + 2 f(\xi) f'(\xi) \bigl(f''(\xi)\bigr)^2}{\bigl(f'(\xi)\bigr)^4}\\
&= -\frac{-\bigl(f'(\xi)\bigr)^3 f''(\xi)}{\bigl(f'(\xi)\bigr)^4}, \quad \text{since } f(\xi) = 0\\
&= \frac{f''(\xi)}{f'(\xi)},
\end{align*}
which is nonzero provided $f''(\xi) \neq 0$.
So Newton's method is quadratically convergent. In the degenerate case, $\xi$ is an $m$-fold zero of $f$ for some $m > 1$, that is, $f^{(i)}(\xi) = 0$ for $i = 0, 1, \ldots, m-1$. We leave it to the students to determine the order of convergence in this case.
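The quadratic rate is easy to observe numerically. The sketch below (not part of the notes) runs Newton's iteration on the hypothetical example $f(x) = x^2 - 2$, whose simple zero is $\sqrt{2}$; each error is roughly a constant times the square of the previous one.

```python
# Sketch: Newton's method Phi(x) = x - f(x)/f'(x) on f(x) = x**2 - 2.
# The zero xi = sqrt(2) is simple (f'(xi) != 0), so convergence is quadratic.

def newton_errors(f, fprime, x0, steps, root):
    """Run Newton's iteration and record |x_i - root| after each step."""
    x, errors = x0, []
    for _ in range(steps):
        x = x - f(x) / fprime(x)
        errors.append(abs(x - root))
    return errors

errs = newton_errors(lambda x: x * x - 2, lambda x: 2 * x,
                     x0=1.5, steps=4, root=2 ** 0.5)
# Quadratic convergence: each error is below the square of the previous one,
# consistent with the limit Phi''(xi)/2 = f''(xi)/(2 f'(xi)) derived above.
```

After only four steps the error is already at the level of machine precision.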
2 Linear Convergence of Regula Falsi
For clarity of analysis we let $x_i = b_i$ for all $i$. We make some simplifying assumptions for the discussion of the convergence behavior: $f''$ exists, and for some $k$ the following conditions hold: (a) $x_k < a_k$; (b) $f(x_k) < 0$ and $f(a_k) > 0$; (c) $f''(x) \geq 0$ for all $x \in [x_k, a_k]$.
[Figure: panels (a) and (b) illustrating regula falsi iterates $x_k$, $x_{k+1}$, $a_{k+1}$ under conditions (a)-(c).]
Under these assumptions, either $f(x_{k+1}) = 0$ or $f(x_{k+1}) f(x_k) > 0$, and consequently
\[
x_k < x_{k+1} < a_{k+1} = a_k.
\]
To see this, use the remainder formula for linear interpolation of $f$ at $x_k$ and $a_k$:
\[
f(x) - p(x) = (x - x_k)(x - a_k) \frac{f''(\eta)}{2}
\]
for $x \in [x_k, a_k]$ and a suitable $\eta \in [x_k, a_k]$, where $p$ is the interpolating line. Since $(x - x_k)(x - a_k) \leq 0$ on $[x_k, a_k]$, condition (c) implies that $f(x) - p(x) \leq 0$. In particular, $f(x_{k+1}) - p(x_{k+1}) \leq 0$, which in turn implies that $f(x_{k+1}) \leq 0$ since $p(x_{k+1}) = 0$.
Unless $f(x_{k+1}) = 0$, in which case the iteration stops at $x_{k+1}$, we can see that conditions (a), (b), and (c) hold for all $i \geq k$. Therefore $a_i = a_k = a$ and
\[
x_{i+1} = \frac{a f(x_i) - x_i f(a)}{f(x_i) - f(a)}
\]
for all $i \geq k$. Furthermore, the $x_i$ for $i \geq k$ form a monotone increasing sequence bounded above by $a$, so $\lim_{i \to \infty} x_i = \xi$ exists. Consequently,
\[
f(\xi) \leq 0, \qquad f(a) > 0, \qquad \text{and} \qquad \xi = \frac{a f(\xi) - \xi f(a)}{f(\xi) - f(a)},
\]
which gives
\[
(\xi - a) f(\xi) = 0.
\]
But $\xi < a$ since $f(\xi) \leq 0 < f(a)$. Hence $f(\xi) = 0$ and $\{x_i\}$ converges to a zero of $f$.

The above discussion enables us to look at the order of convergence through the iteration function
\[
x_{i+1} = \Phi(x_i), \qquad \text{where } \Phi(x) = \frac{a f(x) - x f(a)}{f(x) - f(a)}.
\]
Since $f(\xi) = 0$, we obtain
\[
\Phi'(\xi) = \frac{-\bigl(a f'(\xi) - f(a)\bigr) f(a) + \xi f(a) f'(\xi)}{f(a)^2} = 1 - f'(\xi) \, \frac{\xi - a}{f(\xi) - f(a)}.
\]
By the mean value theorem, there exist $\eta_1, \eta_2$ such that
\begin{align}
\frac{f(\xi) - f(a)}{\xi - a} &= \frac{-f(a)}{\xi - a} = f'(\eta_1), & \xi < \eta_1 < a; \tag{1}\\
\frac{f(x_i) - f(\xi)}{x_i - \xi} &= \frac{f(x_i)}{x_i - \xi} = f'(\eta_2), & x_i < \eta_2 < \xi. \tag{2}
\end{align}
Since $f''(x) \geq 0$, $f'(x)$ increases monotonically on $[x_i, a]$, so $f'(\eta_2) \leq f'(\xi) \leq f'(\eta_1)$. Meanwhile, equation (2), $x_i < \xi$, and $f(x_i) < 0$ together imply that $0 < f'(\eta_2)$. Therefore $0 < f'(\xi) \leq f'(\eta_1)$. We have thus shown that
\[
0 \leq \Phi'(\xi) = 1 - \frac{f'(\xi)}{f'(\eta_1)} < 1.
\]
So the regula falsi method converges linearly. From the previous discussion we see that the method of regula falsi will almost always end up with the one-sided convergence demonstrated before.
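The linear rate can be checked numerically. The sketch below (not part of the notes) applies the fixed-endpoint iteration derived above to the hypothetical convex function $f(x) = x^2 - 2$ on $[1, 2]$, where conditions (a)-(c) hold with $a = 2$; the successive error ratios approach the constant $\Phi'(\xi) = 1 - f'(\xi)/f'(\eta_1)$, which for this example equals $3 - 2\sqrt{2} \approx 0.17$.

```python
# Sketch: one-sided regula falsi on f(x) = x**2 - 2 with fixed endpoint a = 2.
# Under conditions (a)-(c), the endpoint a never moves and convergence is linear.

def regula_falsi_errors(f, x, a, steps, root):
    """Iterate x <- (a f(x) - x f(a)) / (f(x) - f(a)) and record |x - root|."""
    errors = []
    for _ in range(steps):
        x = (a * f(x) - x * f(a)) / (f(x) - f(a))
        errors.append(abs(x - root))
    return errors

errs = regula_falsi_errors(lambda t: t * t - 2, 1.0, 2.0, steps=12, root=2 ** 0.5)
# Linear convergence: the ratios err_{i+1}/err_i settle at Phi'(xi) < 1.
ratios = [errs[i + 1] / errs[i] for i in range(len(errs) - 1)]
```

All iterates approach $\sqrt{2}$ from the left, matching the one-sided convergence noted above.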
3 Superlinear Convergence of Secant Method
In the secant method, the iteration has the form
\[
x_{i+1} = x_i - \frac{f(x_i)}{f[x_{i-1}, x_i]}, \qquad i = 0, 1, \ldots \tag{3}
\]
where $f[x_{i-1}, x_i] = \bigl(f(x_i) - f(x_{i-1})\bigr)/(x_i - x_{i-1})$ is the first divided difference.
To determine the local convergence rate, without loss of generality we assume that the sequence $\{x_i\}$ is in a small enough neighborhood of the zero $\xi$ and that $f$ is twice differentiable. Subtract $\xi$ from both sides of (3):
\begin{align}
x_{i+1} - \xi &= (x_i - \xi) - \frac{f(x_i)}{f[x_{i-1}, x_i]} \nonumber\\
&= (x_i - \xi) \left( 1 - \frac{f[x_i, \xi]}{f[x_{i-1}, x_i]} \right), \quad \text{since } f[x_i, \xi] = \frac{f(x_i) - f(\xi)}{x_i - \xi} = \frac{f(x_i)}{x_i - \xi} \nonumber\\
&= (x_i - \xi)(x_{i-1} - \xi) \cdot \frac{f[x_{i-1}, x_i] - f[x_i, \xi]}{(x_{i-1} - \xi) f[x_{i-1}, x_i]} \nonumber\\
&= (x_i - \xi)(x_{i-1} - \xi) \, \frac{f[x_{i-1}, x_i, \xi]}{f[x_{i-1}, x_i]}. \tag{4}
\end{align}
From the error estimation of polynomial interpolation, we learned that
\begin{align*}
f[x_{i-1}, x_i] &= f'(\eta_1), & \eta_1 \in I[x_{i-1}, x_i];\\
f[x_{i-1}, x_i, \xi] &= \tfrac{1}{2} f''(\eta_2), & \eta_2 \in I[x_{i-1}, x_i, \xi],
\end{align*}
where $I[x_{i-1}, x_i]$ is the smallest interval containing $x_{i-1}$ and $x_i$, and $I[x_{i-1}, x_i, \xi]$ the smallest interval containing $x_{i-1}$, $x_i$, $\xi$.
If $\xi$ is a simple zero, that is, $f'(\xi) \neq 0$, there exist a bound $M$ and an interval $J = \{x : |x - \xi| \leq \epsilon\}$ for some $\epsilon > 0$ such that
\[
\left| \frac{1}{2} \, \frac{f''(\eta_2)}{f'(\eta_1)} \right| \leq M \tag{5}
\]
for any $\eta_1, \eta_2 \in J$.

Let $e_i = M |x_i - \xi|$ and suppose $e_0, e_1 < \min\{1, \epsilon M\}$. By induction, using (4) and (5), we can easily show that
\[
e_{i+1} = M |x_{i+1} - \xi| \leq M \cdot \frac{e_i}{M} \cdot \frac{e_{i-1}}{M} \cdot M = e_i e_{i-1}
\]
and
\[
e_i \leq \min\{1, \epsilon M\},
\]
for $i = 1, 2, \ldots$.

Let $q = (1 + \sqrt{5})/2$ be the positive root of the equation $z^2 - z - 1 = 0$. Then we have
\[
e_i \leq K^{q^i}, \qquad i = 0, 1, 2, \ldots
\]
where $K = \max\{e_0, \sqrt[q]{e_1}\} < 1$. This is because (by induction)
\[
e_{i+1} \leq e_i e_{i-1} \leq K^{q^i} K^{q^{i-1}} = K^{q^{i-1}(q+1)} = K^{q^{i-1} q^2} = K^{q^{i+1}}.
\]
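The bound $e_i \leq K^{q^i}$ can be checked numerically. The sketch below (not part of the notes) picks arbitrary hypothetical starting values $e_0, e_1 < 1$, runs the recursion $e_{i+1} = e_i e_{i-1}$, and verifies the bound term by term:

```python
# Numerical check of e_i <= K**(q**i) for the recursion e_{i+1} = e_i * e_{i-1}.
# The starting values e0, e1 are arbitrary choices below 1.
e0, e1 = 0.5, 0.3
q = (1 + 5 ** 0.5) / 2          # golden ratio, the positive root of z^2 - z - 1 = 0
K = max(e0, e1 ** (1 / q))      # K = max{e_0, e_1^(1/q)} < 1

es = [e0, e1]
for _ in range(10):
    es.append(es[-1] * es[-2])  # e_{i+1} = e_i * e_{i-1}

# Every term satisfies e_i <= K**(q**i) (small slack for floating-point roundoff).
checks = [es[i] <= K ** (q ** i) + 1e-12 for i in range(len(es))]
```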
Thus the secant method converges at least as well as a method of order $p = \frac{1 + \sqrt{5}}{2} = 1.618\ldots$.

One secant step requires one additional function evaluation, whereas one Newton step requires two ($f$ and $f'$). Therefore two secant steps are about as expensive as a single Newton step, but two secant steps have a convergence order of $(1.618\ldots)^2 \approx 2.618$. This explains why in practice the secant method typically dominates Newton's method with numerical derivatives.
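The superlinear rate can be observed directly. The sketch below (not part of the notes) runs the secant iteration (3) on the hypothetical example $f(x) = x^2 - 2$ and estimates the observed order from consecutive errors; the estimate drifts toward $q \approx 1.618$.

```python
import math

# Sketch: secant iteration x_{i+1} = x_i - f(x_i)/f[x_{i-1}, x_i] on f(x) = x**2 - 2.

def secant_errors(f, x0, x1, steps, root):
    """Run the secant method and record |x_i - root| after each step."""
    errors = []
    for _ in range(steps):
        # f[x0, x1] = (f(x1) - f(x0)) / (x1 - x0), the first divided difference
        x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        errors.append(abs(x1 - root))
    return errors

errs = secant_errors(lambda t: t * t - 2, 1.0, 2.0, steps=5, root=2 ** 0.5)
# Observed order: log e_{i+1} / log e_i tends toward q = (1 + sqrt(5))/2.
orders = [math.log(errs[i + 1]) / math.log(errs[i]) for i in range(len(errs) - 1)]
```

Since the constants in $e_i \leq K^{q^i}$ are not exactly 1, the observed order approaches $q$ only gradually, but it clearly exceeds 1 well before the iterates hit machine precision.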