
Instructor: Prof. Dr. Sahand Daneshvar
Presented by: Seyed Iman Taheri
Student number: 136023

Nonlinear Optimization
Spring 2014-15

EASTERN MEDITERRANEAN UNIVERSITY
Department of Industrial Engineering

Line Search Using Derivatives
Bisection search method
Newton's method

Bisection Search Method

Suppose that we wish to minimize a function θ over a closed and bounded interval. Furthermore, suppose that θ is pseudoconvex and hence differentiable. At iteration k, let the interval of uncertainty be [ak, bk]. Suppose that the derivative θ'(λk) is known, and consider the following three possible cases:

1. If θ'(λk) = 0, then, by the pseudoconvexity of θ, λk is a minimizing point.

2. If θ'(λk) > 0, then, for λ > λk, we have θ'(λk)(λ - λk) > 0; and by the pseudoconvexity of θ it follows that θ(λ) ≥ θ(λk). In other words, the minimum occurs to the left of λk, so that the new interval of uncertainty [ak+1, bk+1] is given by [ak, λk].

3. If θ'(λk) < 0, then, for λ < λk, θ'(λk)(λ - λk) > 0, so that θ(λ) ≥ θ(λk). Thus, the minimum occurs to the right of λk, so that the new interval of uncertainty [ak+1, bk+1] is given by [λk, bk].

The position of λk in the interval [ak, bk] must be chosen so that the maximum possible length of the new interval of uncertainty is minimized. That is, λk must be chosen so as to minimize the maximum of λk - ak and bk - λk. Obviously, the optimal location of λk is the midpoint (1/2)(ak + bk). To summarize, at any iteration k, θ' is evaluated at the midpoint of the interval of uncertainty. Based on the value of θ', we either stop or construct a new interval of uncertainty whose length is half that at the previous iteration. Note that this procedure is very similar to the dichotomous search method, except that at each iteration only one derivative evaluation is required, as opposed to two functional evaluations for the dichotomous search method. However, the latter is akin to a finite difference derivative evaluation.
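To spell out why the midpoint is optimal: max(λk - ak, bk - λk) decreases in λk as long as λk - ak < bk - λk and increases once λk - ak > bk - λk, so the maximum is smallest where the two terms are equal, that is, where λk - ak = bk - λk. This gives λk = (1/2)(ak + bk), and the new interval has length (1/2)(bk - ak) in either case.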

Convergence of the Bisection Search Method

Note that the length of the interval of uncertainty after n observations is equal to (1/2)^n (b1 - a1), so that the method converges to a minimum point within any desired degree of accuracy. In particular, if the length of the final interval of uncertainty is required to be at most ℓ, then n must be chosen to be the smallest integer such that (1/2)^n ≤ ℓ/(b1 - a1).
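As a quick check of this bound, the required n can be computed directly. The following is a minimal sketch in Python; the function name smallest_n is my own choice, not from the slides.

import math

def smallest_n(initial_length, final_length):
    # Smallest integer n with (1/2)**n <= final_length / initial_length
    return math.ceil(math.log2(initial_length / final_length))

print(smallest_n(9.0, 0.2))  # 6, which matches the worked example below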

Summary of the Bisection Search Method

We now summarize the bisection search procedure for minimizing a pseudoconvex function θ over a closed and bounded interval.

Initialization Step. Let [a1, b1] be the initial interval of uncertainty, and let ℓ be the allowable length of the final interval of uncertainty. Let n be the smallest positive integer such that (1/2)^n ≤ ℓ/(b1 - a1). Let k = 1 and go to the Main Step.

Main Step

1. Let λk = (1/2)(ak + bk) and evaluate θ'(λk). If θ'(λk) = 0, stop; λk is an optimal solution. Otherwise, go to Step 2 if θ'(λk) > 0, and go to Step 3 if θ'(λk) < 0.

2. Let ak+1 = ak and bk+1 = λk. Go to Step 4.

3. Let ak+1 = λk and bk+1 = bk. Go to Step 4.

4. If k = n, stop; the minimum lies in the interval [an+1, bn+1]. Otherwise, replace k by k + 1 and repeat Step 1.
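Putting the initialization and main steps together, a compact implementation might look as follows. This is only a sketch in Python; the names bisection_search, dtheta, and final_length are mine rather than the slides'.

import math

def bisection_search(dtheta, a, b, final_length):
    # Initialization Step: smallest n with (1/2)**n * (b - a) <= final_length
    n = math.ceil(math.log2((b - a) / final_length))
    for _ in range(n):
        lam = 0.5 * (a + b)           # Step 1: evaluate theta' at the midpoint
        d = dtheta(lam)
        if d == 0.0:                  # theta'(lam) = 0: lam minimizes the pseudoconvex theta
            return lam, (lam, lam)
        if d > 0.0:                   # Step 2: the minimum lies to the left of lam
            b = lam
        else:                         # Step 3: the minimum lies to the right of lam
            a = lam
    return 0.5 * (a + b), (a, b)      # Step 4: midpoint and final interval of uncertainty

The function returns both a point estimate (the midpoint of the final interval) and the final interval itself.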

Example

Consider the following problem:

Minimize λ^2 + 2λ subject to -3 ≤ λ ≤ 6.

Suppose that we want to reduce the interval of uncertainty to an interval whose length ℓ is less than or equal to 0.2. Hence, the number of observations n satisfying (1/2)^n ≤ ℓ/(b1 - a1) = 0.2/9 = 0.0222 is given by n = 6. A summary of the computations using the bisection search method is given in the table on the next slide.

Summary of Computations for the Bisection Search Method

Note that the final interval of uncertainty is [-1.0313, -0.8907], so that the minimum could be taken as the midpoint, -0.961.
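Running the bisection_search sketch given earlier on this problem, with θ'(λ) = 2λ + 2, reproduces these figures:

lam, (a, b) = bisection_search(lambda x: 2.0 * x + 2.0, -3.0, 6.0, 0.2)
print(a, b, lam)  # -1.03125  -0.890625  -0.9609375: the interval and midpoint quoted above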

Newton's Method

Newton's method is based on exploiting the quadratic approximation of the function θ at a given point λk. This quadratic approximation q is given by

q(λ) = θ(λk) + θ'(λk)(λ - λk) + (1/2)θ''(λk)(λ - λk)^2.

The point λk+1 is taken to be the point where the derivative of q is equal to zero. This yields θ'(λk) + θ''(λk)(λk+1 - λk) = 0, so that

λk+1 = λk - θ'(λk)/θ''(λk).

The procedure is terminated when |λk+1 - λk| < ε, or when |θ'(λk)| < ε, where ε is a pre-specified termination scalar.

Note that the above procedure can be applied only to twice differentiable functions. Furthermore, the procedure is well defined only if θ''(λk) ≠ 0 for each k.
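A minimal sketch of this iteration in Python; the name newtons_method, the iteration cap, and the explicit guard on θ'' are my own choices, not part of the slides.

def newtons_method(dtheta, d2theta, lam, eps=1e-6, max_iter=100):
    for _ in range(max_iter):
        d1 = dtheta(lam)
        if abs(d1) < eps:                 # stop when |theta'(lam_k)| < eps
            return lam
        d2 = d2theta(lam)
        if d2 == 0.0:                     # the iteration is undefined when theta''(lam_k) = 0
            raise ZeroDivisionError("theta''(lam) = 0")
        lam_next = lam - d1 / d2          # lam_{k+1} = lam_k - theta'(lam_k) / theta''(lam_k)
        if abs(lam_next - lam) < eps:     # stop when |lam_{k+1} - lam_k| < eps
            return lam_next
        lam = lam_next
    return lam                            # last iterate if neither test was met

On the quadratic example used earlier (θ'(λ) = 2λ + 2, θ''(λ) = 2), the iteration reaches λ = -1 in a single step, as expected for a quadratic function.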

Example

Consider the function θ:

Note that θ is twice differentiable everywhere. We apply Newton's method, starting from two different points. In the first case, λ1 = 0.40, and as shown in the first table on the next slide, the procedure produces the point 0.002807 after six iterations. The reader can verify that the procedure indeed converges to the stationary point λ = 0. In the second case, λ1 = 0.60, and the procedure oscillates between the points 0.60 and -0.60, as shown in the second table on the next slide.

Summary of Computations for Newton's Method Starting from λ1 = 0.4

Summary of Computations for Newton's Method Starting from λ1 = 0.6

Convergence of Newton's Method

The method of Newton, in general, does not converge to a stationary point when started from an arbitrary initial point. Observe that, in general, Theorem 7.2.3 cannot be applied because a descent function is not available. However, as shown in the next theorem, if the starting point is sufficiently close to a stationary point, then a suitable descent function can be devised so that the method converges.

Theorem

Proof

Thanks for your attention