Analysis of Algorithm
Instructor: Faisal Anwer, DCS, AMU
SOURCE:
Introduction to Algorithm (3rd Edition) by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein + Freely Accessible Web Resources
A QUICK MATH REVIEW
Logarithms and Exponents
Properties of logarithms:
log_b(xy) = log_b x + log_b y
log_b(x/y) = log_b x − log_b y
log_b(x^α) = α log_b x
log_b a = log_c a / log_c b
Properties of exponentials:
a^(b+c) = a^b · a^c
a^(bc) = (a^b)^c
a^b / a^c = a^(b−c)
b = a^(log_a b)
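These identities can be spot-checked numerically with Python's math module; the sample values below are arbitrary illustrative choices (any positive x, y and bases a, b, c > 1 work):

```python
import math

# Arbitrary sample values for spot-checking the identities above.
x, y, a, b, c, alpha = 8.0, 32.0, 3.0, 2.0, 10.0, 2.5

# Properties of logarithms
assert math.isclose(math.log(x * y, b), math.log(x, b) + math.log(y, b))
assert math.isclose(math.log(x / y, b), math.log(x, b) - math.log(y, b))
assert math.isclose(math.log(x ** alpha, b), alpha * math.log(x, b))
assert math.isclose(math.log(a, b), math.log(a, c) / math.log(b, c))  # base change

# Properties of exponentials
assert math.isclose(a ** (b + c), a ** b * a ** c)
assert math.isclose(a ** (b * c), (a ** b) ** c)
assert math.isclose(a ** b / a ** c, a ** (b - c))
assert math.isclose(b, a ** math.log(b, a))
print("all identities hold")
```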
PROBLEM SOLVING: MAIN STEPS
1. Problem definition
2. Algorithm design / Algorithm specification
3. Algorithm analysis
4. Implementation
5. Testing
6. [Maintenance]
1: PROBLEM DEFINITION
What is the task to be accomplished?
Calculate the average marks of a given student
Find the shortest path between two cities
What are the time / space / speed / performance requirements?
2: ALGORITHM DESIGN / SPECIFICATIONS
Algorithm: A finite set of instructions that, if followed, accomplishes a particular task.
Describe: in natural language / pseudo-code / diagrams / etc.
Criteria to follow:
Input: One or more quantities (externally produced)
Output: One or more quantities
Unambiguous: Clarity, precision of each instruction
Finiteness: The algorithm has to stop after a finite (may be very large) number of steps
Effectiveness: Each instruction has to be basic enough and feasible
The computational complexity and efficient implementation of the algorithm are important in computing, and this depends on efficient logic and suitable data structures.
4, 5, 6: IMPLEMENTATION, TESTING, MAINTENANCE
Implementation
Decide on the programming language to use
C, C++, Lisp, Java, Perl, Prolog, assembly, etc.
Write clean, well documented code
Test:
Ensure the implementation is as per the
requirements.
Maintenance
Integrate feedback from users, fix bugs, ensure
compatibility across different versions.
3: ALGORITHM ANALYSIS
When is the running time noticeable/important?
web search
database search
real-time systems with time constraints
Space complexity
How much space is required
Time complexity
How much time does it take to run the algorithm
Often, we deal with estimates!
ANALYZING ALGORITHMS
Analyzing an algorithm helps in predicting the resources such
as memory, processing time that the algorithm requires.
Most often it is computational time that we want to
measure.
A generic one-processor RAM model is considered as the
implementation technology for most algorithms.
The RAM model contains instructions commonly found in
real computers:
arithmetic (such as add, subtract, multiply, divide,
remainder, floor, ceiling)
data movement (load, store, copy) and
control (conditional and unconditional branch, subroutine
call and return).
Each such instruction takes a constant amount of time
SPACE COMPLEXITY
Space complexity = the amount of memory required by an algorithm to run to completion
[Core dumps = the most often encountered cause is "dangling pointers"]
Some algorithms may be more efficient if data is completely loaded into memory
Need to look also at system limitations
Fixed part: The size required to store certain data/variables, that is
independent of the size of the problem:
- e.g. name of the data collection
Variable part: Space needed by variables, whose size is dependent
on the size of the problem:
- e.g. actual text
- load 2GB of text VS. load 1MB of text
TIME COMPLEXITY
Often more important than space complexity: space available (for computer programs!) tends to grow
larger and larger,
while time is still a problem for all of us,
even with 3-4 GHz processors on the market.
An algorithm's running time is an important issue.
TIME COMPLEXITY…
[Chart: running times (1 ms to 5 ms) for inputs A through G, marking the worst-case (longest bar), the best-case (shortest bar), and the average-case in between]
• Suppose an algorithm includes a conditional statement
that may execute or not: variable running time
• Typically algorithms are measured by their worst case
TIME COMPLEXITY…
The running time of an algorithm varies with the inputs, and typically grows with the size of the inputs.
To evaluate an algorithm or to compare two algorithms, we focus on their relative rates of growth with respect to the increase of the input size.
The average running time is difficult to determine.
We focus on the worst case running time
Easier to analyze
Crucial to applications such as finance, robotics, and games
FACTORS THAT DETERMINE
RUNNING TIME OF A PROGRAM
Problem size: n
Basic algorithm / actual processing
memory access speed
CPU/processor speed
# of processors?
compiler/linker optimization?
ESTIMATION OF TIME COMPLEXITY
Experimental Approach
Theoretical Approach
EXPERIMENTAL APPROACH
Write a program to implement the algorithm.
Run this program with inputs of varying size and composition.
Get an accurate measure of the actual running time (e.g. via a system call such as date).
Limitations of Experimental Studies
The algorithm has to be implemented, which may take a long time and could be very difficult.
In order to compare two algorithms, the same hardware and software must be used.
Results may not be indicative of the running time on other inputs not included in the experiments.
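The experimental approach can be sketched in Python with time.perf_counter; linear_search below is a hypothetical stand-in for "the algorithm", and the printed numbers will vary from machine to machine, which is exactly the limitation noted above:

```python
import time

def linear_search(a, target):
    """A simple algorithm to time experimentally (hypothetical example)."""
    for i, v in enumerate(a):
        if v == target:
            return i
    return -1

# Run the program with inputs of varying size and measure actual running time.
for n in (1_000, 10_000, 100_000):
    data = list(range(n))
    start = time.perf_counter()
    linear_search(data, n - 1)          # worst case: target is the last element
    elapsed = time.perf_counter() - start
    print(f"n = {n:>7}: {elapsed:.6f} s")
```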
THEORETICAL APPROACH
Based on high-level description of the algorithms
(Pseudocode), rather than language dependent
implementations.
Makes possible an evaluation of the algorithms that is
independent of the hardware and software environments.
PRIMITIVE OPERATIONS
The basic computations performed by an algorithm
Identifiable in pseudocode
Largely independent from the programming language
Exact definition not important
Instructions have to be basic enough and feasible!
Examples:
Evaluating an expression
Assigning a value to a variable
Calling a method
Returning from a method
LOW LEVEL ALGORITHM ANALYSIS
Based on primitive operations (low-level computations independent from the programming language)
E.g.: Make an addition = 1 operation
Calling a method or returning from a method = 1 operation
Index in an array = 1 operation
Comparison = 1 operation etc.
Method: Inspect the pseudo-code and count the number of primitive operations executed by the algorithm,
or count the number of times a statement is executed
ANALYSIS OF INSERTION SORT
The running time of an algorithm on a particular input is
the number of primitive operations or “steps” executed.
It is convenient to define the notion of step so that it is
as machine-independent as possible.
Let us adopt the following view:
A constant amount of time is required to execute each line of
our pseudocode.
The running time of the algorithm is the sum of running
times for each statement executed.
Worst-case
longest running time for an input of size n.
Best-case
shortest running time for an input of size n.
INSERTION SORT ALGORITHM
INSERTION-SORT(A)
1 for j = 2 to A.length
2 key = A[j]
3 // Insert A[j] into the sorted sequence A[1…j-1]
4 i = j - 1
5 while i > 0 and A[i] > key
6 A[i+1] = A[i]
7 i = i - 1
8 A[i + 1] = key
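The pseudocode above translates directly to Python; since Python arrays are 0-based, the outer loop starts at index 1 rather than 2, and the while-condition becomes i >= 0:

```python
def insertion_sort(a):
    """In-place insertion sort, following the pseudocode (0-based indexing)."""
    for j in range(1, len(a)):          # line 1: for j = 2 to A.length
        key = a[j]                      # line 2
        i = j - 1                       # line 4
        while i >= 0 and a[i] > key:    # line 5 (i > 0 becomes i >= 0)
            a[i + 1] = a[i]             # line 6: shift larger element right
            i -= 1                      # line 7
        a[i + 1] = key                  # line 8: insert key into its place
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # → [1, 2, 3, 4, 5, 6]
```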
GROWTH RATE OF RUNNING TIME
(OR ORDER OF GROWTH)
We consider only the leading term of a formula (e.g., an^2)
lower-order terms are relatively insignificant for large values
of n.
We also ignore the leading term’s constant coefficient, since
constant factors are less significant than the rate of growth in
determining computational efficiency for large inputs.
We write that insertion sort has a worst-case running time of
Θ(n^2) (pronounced "theta of n-squared").
Changing the hardware/ software environment
Affects T(n) by a constant factor, but
Does not alter the growth rate of T(n)
GROWTH RATE OF RUNNING TIME
(OR ORDER OF GROWTH)
The linear growth rate in the best case and quadratic growth
rate in the worst case of the running time T(n) are intrinsic
properties of the insertion sort algorithm.
When we want to compare two algorithms for large input
sizes, a useful measure is the order of growth.
The order of growth of the running time of an algorithm
gives a simple characterization of
the algorithm's efficiency and
the relative performance of alternative algorithms
ASYMPTOTIC ANALYSIS
Asymptotic analysis of an algorithm talks about the order of growth of
the running time when the input size is large enough.
In mathematical analysis, asymptotic analysis, also known as
asymptotics, is a method of describing limiting behavior.
Suppose f(n) = n^2 + 3n; then as n becomes very large, the term 3n becomes
insignificant compared to n^2. The function f(n) is said to be "asymptotically
equivalent to n^2" as n → ∞. This is often written symbolically as f(n) ~ n^2,
which is read as "f(n) is asymptotic to n^2".
In Asymptotic Analysis, we evaluate the performance of an algorithm in
terms of input size (we don’t measure the actual running time).
Asymptotic analysis is useful to
estimate how long a program will run or/and how much space it would take.
compare the efficiency of different algorithms.
choose an algorithm for an application.
The main idea of asymptotic analysis is to have a
measure of efficiency of algorithms that doesn’t
depend on machine specific constants, and doesn’t
require algorithms to be implemented.
Asymptotic notations are mathematical tools to
represent time complexity of algorithms for
asymptotic analysis.
The following asymptotic notations are used to
represent time complexity of algorithms:
Big O Notation
Big Omega - Ω Notation
Big Θ Notation
Little o Notation
Little –Omega: ω Notation
BIG O NOTATION
The Big O notation defines an upper bound of an
algorithm. It bounds a function only from above.
Suppose for a given function f(n), g(n) is the Big-O
order when there exist positive constants c and n0
such that 0 <= f(n) <= c·g(n) for all n >= n0.
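The definition can be illustrated by checking a concrete witness pair numerically; the function f and the constants c and n0 below are illustrative choices, not from the source:

```python
# Check the Big-O definition for f(n) = 3n^2 + 20n + 5 and g(n) = n^2.
# Claim: f(n) is O(n^2), with hypothetical witnesses c = 4 and n0 = 21
# (any c, n0 satisfying the inequality for all n >= n0 would do).
f = lambda n: 3 * n**2 + 20 * n + 5
g = lambda n: n**2
c, n0 = 4, 21

# Spot-check 0 <= f(n) <= c*g(n) over a finite range of n >= n0.
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 10_000))
print("0 <= f(n) <= c*g(n) holds for all tested n >= n0")
```

Note that the inequality fails just below n0 (at n = 20, f(20) = 1605 > 4·400 = 1600), which is why n0 = 21 is chosen.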
BIG Ω NOTATION
Just as Big O notation provides an asymptotic upper
bound on a function, Ω notation provides an asymptotic
lower bound. It can be useful when we have lower bound
on time complexity of an algorithm.
Since the best-case performance of an algorithm is generally
not useful, the Omega notation is the least used notation
among all.
Suppose for a given function f(n), g(n) is the Big-Ω
order when there exist positive constants c and n0 such
that 0 <= c·g(n) <= f(n) for all n >= n0.
BIG Θ NOTATION
The Theta notation bounds a function from above
and below, so it defines exact asymptotic behavior.
Suppose for a given function f(n), g(n) is the Big-Θ
order when there exist positive constants c1, c2,
and n0 such that 0 <= c1·g(n) <= f(n) <= c2·g(n) for all
n >= n0.
When we use big-Θ notation, we're saying that we
have an asymptotically tight bound on the running
time.
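As a sketch, the Θ definition can be checked for n(n−1)/2 (insertion sort's worst-case comparison count) against g(n) = n^2; the witnesses c1 = 1/4, c2 = 1/2, n0 = 2 are illustrative choices, not from the source:

```python
# Check the Big-Theta definition for f(n) = n(n-1)/2 and g(n) = n^2.
# Hypothetical witnesses: c1 = 1/4, c2 = 1/2, n0 = 2.
f = lambda n: n * (n - 1) / 2
g = lambda n: n**2

c1, c2, n0 = 0.25, 0.5, 2

# Spot-check 0 <= c1*g(n) <= f(n) <= c2*g(n) over a finite range of n >= n0.
assert all(0 <= c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, 10_000))
print("c1*g(n) <= f(n) <= c2*g(n) holds for all tested n >= n0")
```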
LITTLE O NOTATION
Suppose for a given function f(n), g(n) is the Little-o order when for any positive constant c there exists a positive constant n0 such that 0 <= f(n) < c·g(n) for all n >= n0.
Little-o is an upper bound, but is not an asymptotically tight bound.
The main difference between O-notation and o-notation is that in f(n) = O(g(n)), the bound 0 <= f(n) <= c·g(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound 0 <= f(n) < c·g(n) holds for all constants c > 0.
For example, 2n = o(n^2), but 2n^2 ≠ o(n^2).
Intuitively, in o-notation, the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is, lim (n→∞) f(n)/g(n) = 0.
Little-Omega: ω Notation
ω-notation is used to denote a lower bound that is not asymptotically tight.
Suppose for a given function f(n), g(n) is the Little-omega (ω) order when for any positive constant c there exists a positive constant n0 > 0 such that 0 <= c·g(n) < f(n) for all n >= n0.
The relation f(n) = ω(g(n)) implies lim (n→∞) f(n)/g(n) = ∞:
the function g(n) becomes insignificant relative to f(n) as n approaches infinity.
POINTS TO BE NOTED
Big-O notation gives only an asymptotic upper bound,
and not an asymptotically tight bound.
For example, it is absolutely correct to say that binary
search runs in O(n) time.
Other, imprecise, upper bounds on binary search would be
O(n^2), O(n^3), and O(2^n). But none of Θ(n), Θ(n^2), Θ(n^3), and
Θ(2^n) would be correct to describe the running time of
binary search in any case.
Also big-Ω notation gives only an asymptotic lower
bound, and not an asymptotically tight bound
you can also say that the running time of insertion sort is
Ω(1).
Which of the following statements is/are valid?
1. Time Complexity of QuickSort is Θ(n^2)
2. Time Complexity of QuickSort is O(n^2)
3. For any two functions f(n) and g(n), we have f(n) =
Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
4. Time complexity of all computer algorithms can be
written as Ω(1)
For the functions n^k and c^n, what is the
asymptotic relationship between them? Assume
that k >= 1 and c > 1 are constants. Choose all
answers that apply:
For the functions lg n and log_8 n, what is the
asymptotic relationship between these functions?
lg n is O(log_8 n)
lg n is Ω(log_8 n)
lg n is Θ(log_8 n)
List the following from slowest to fastest growing
1. Θ(1)
2. Θ(n^2)
3. Θ(2^n)
4. Θ(lg n)
5. Θ(n)
6. Θ(n^2 lg n)
7. Θ(n lg n)
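As a sanity check (a numeric snapshot, not a proof), each growth rate can be evaluated at one moderately large n; the choice n = 64 below is arbitrary:

```python
import math

# Evaluate each growth rate at n = 64 and print them in increasing order.
n = 64
rates = {
    "1": 1,
    "lg n": math.log2(n),
    "n": n,
    "n lg n": n * math.log2(n),
    "n^2": n**2,
    "n^2 lg n": n**2 * math.log2(n),
    "2^n": 2**n,
}
for name, value in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name:>8}: {value:,.0f}")
```

The printed order (1, lg n, n, n lg n, n^2, n^2 lg n, 2^n) matches the asymptotic ordering from slowest to fastest growing.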
BIG-OH RULES
If f(n) is a polynomial of degree d, then f(n) is O(n^d),
i.e.:
1. Drop lower-order terms
2. Drop constant factors
Use the smallest possible class of functions
Say "2n is O(n)" instead of "2n is O(n^2)"
Use the simplest expression of the class
Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”
COMPUTING PREFIX AVERAGES
We further illustrate asymptotic analysis with two algorithms for prefix averages
The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
A[i] = (X[0] + X[1] + … + X[i]) / (i + 1)
PREFIX AVERAGES (QUADRATIC)
The following algorithm computes prefix averages in
quadratic time by applying the definition
The running time of prefixAverages1 is
O(1 + 2 + …+ n)
Thus, algorithm prefixAverages1 runs in O(n2) time
Algorithm prefixAverages1(X, n)
    Input array X of n integers
    Output array A of prefix averages of X      # operations
    A ← new array of n integers                 n
    for i ← 0 to n − 1 do                       2n
        s ← X[0]                                2n
        for j ← 1 to i do                       2(1 + 2 + … + (n − 1))
            s ← s + X[j]                        3(1 + 2 + … + (n − 1))
        A[i] ← s / (i + 1)                      4n
    return A                                    1
PREFIX AVERAGES (LINEAR)
The following algorithm computes prefix
averages in linear time by keeping a running
sum
Algorithm prefixAverages2 runs in O(n) time
Algorithm prefixAverages2(X, n)
    Input array X of n integers
    Output array A of prefix averages of X      # operations
    A ← new array of n integers                 n
    s ← 0                                       1
    for i ← 0 to n − 1 do                       2n
        s ← s + X[i]                            3n
        A[i] ← s / (i + 1)                      3n
    return A                                    1
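Both pseudocode algorithms transcribe directly to Python, and the two must agree on every input:

```python
def prefix_averages_1(x):
    """Quadratic-time prefix averages, directly from the definition."""
    n = len(x)
    a = [0.0] * n
    for i in range(n):
        s = x[0]                    # restart the sum for each prefix
        for j in range(1, i + 1):
            s += x[j]
        a[i] = s / (i + 1)
    return a

def prefix_averages_2(x):
    """Linear-time prefix averages, keeping a running sum."""
    a = [0.0] * len(x)
    s = 0
    for i, v in enumerate(x):
        s += v                      # reuse the previous prefix sum
        a[i] = s / (i + 1)
    return a

data = [1, 2, 3, 4]
print(prefix_averages_1(data))  # → [1.0, 1.5, 2.0, 2.5]
print(prefix_averages_2(data))  # → [1.0, 1.5, 2.0, 2.5]
```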
Exercise: Give a big-Oh characterization
Algorithm Ex1(A, n)
    Input an array A of n integers
    Output the sum of the elements in A
    s ← A[0]
    for i ← 1 to n − 1 do
        s ← s + A[i]
    return s
Exercise: Give a big-Oh characterization
Algorithm Ex2(A, n)
    Input an array A of n integers
    Output the sum of the elements at even cells in A
    s ← A[0]
    for i ← 2 to n − 1 by increments of 2 do
        s ← s + A[i]
    return s
Exercise: Give a big-Oh characterization
Algorithm Ex3(A, n)
    Input an array A of n integers
    Output the sum of the prefix sums of A
    s ← 0
    for i ← 0 to n − 1 do
        s ← s + A[0]
        for j ← 1 to i do
            s ← s + A[j]
    return s
SUMMARY
Time complexity is a measure of algorithm
efficiency.
The choice of an efficient algorithm plays the major role
in determining the running time.
Minor tweaks in the code can also cut down the
running time by a constant factor.
Other items like CPU speed, memory speed,
device I/O speed can help as well.
For certain problems, it is possible to allocate
additional space & improve time complexity.
BIG-OH EXAMPLE
Example: 2n + 10 is O(n)
2n + 10 <= cn
(c − 2)n >= 10
n >= 10/(c − 2)
Pick c = 3 and n0 = 10
Example: the function n^2 is not O(n)
n^2 <= cn
n <= c
The above inequality cannot be satisfied since c must be a constant
Example: 3n^3 + 20n^2 + 5 is O(n^3)
need c > 0 and n0 >= 1 such that 3n^3 + 20n^2 + 5 <= c·n^3 for n >= n0
this is true for c = 4 and n0 = 21
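The witness pairs chosen in these examples (c = 3, n0 = 10 and c = 4, n0 = 21) can be confirmed numerically over a finite range; this is a spot check, not a proof:

```python
# Confirm 2n + 10 <= 3n for all tested n >= 10 (c = 3, n0 = 10).
assert all(2 * n + 10 <= 3 * n for n in range(10, 10_000))

# Confirm 3n^3 + 20n^2 + 5 <= 4n^3 for all tested n >= 21 (c = 4, n0 = 21).
assert all(3 * n**3 + 20 * n**2 + 5 <= 4 * n**3 for n in range(21, 1_000))

print("both witness pairs verified on the tested ranges")
```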