Analysis of Algorithm
Instructor: Faisal Anwer, DCS, AMU
SOURCE:
Introduction to Algorithm (3rd Edition) by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein + Freely Accessible Web Resources
A QUICK MATH REVIEW
Logarithms and Exponents
Properties of logarithms:
log_b(xy) = log_b x + log_b y
log_b(x/y) = log_b x − log_b y
log_b(x^α) = α log_b x
log_b a = log_c a / log_c b
Properties of exponentials:
a^(b+c) = a^b · a^c
a^(bc) = (a^b)^c
a^b / a^c = a^(b−c)
b = a^(log_a b)
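These identities can be spot-checked numerically with Python's math module; the sample values below are arbitrary illustrative choices (any positive x, y and bases a, b, c > 1 work):

```python
import math

# Arbitrary sample values for spot-checking the identities above.
x, y, a, b, c, alpha = 8.0, 32.0, 3.0, 2.0, 10.0, 2.5

# Properties of logarithms
assert math.isclose(math.log(x * y, b), math.log(x, b) + math.log(y, b))
assert math.isclose(math.log(x / y, b), math.log(x, b) - math.log(y, b))
assert math.isclose(math.log(x ** alpha, b), alpha * math.log(x, b))
assert math.isclose(math.log(a, b), math.log(a, c) / math.log(b, c))  # base change

# Properties of exponentials
assert math.isclose(a ** (b + c), a ** b * a ** c)
assert math.isclose(a ** (b * c), (a ** b) ** c)
assert math.isclose(a ** b / a ** c, a ** (b - c))
assert math.isclose(b, a ** math.log(b, a))
print("all identities hold")
```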
PROBLEM SOLVING: MAIN STEPS
1. Problem definition
2. Algorithm design / Algorithm specification
3. Algorithm analysis
4. Implementation
5. Testing
6. [Maintenance]
1: PROBLEM DEFINITION
What is the task to be accomplished?
Calculate the average marks of a given student
Find the shortest path between two cities
What are the time / space / speed / performance requirements?
2: ALGORITHM DESIGN / SPECIFICATIONS
Algorithm: A finite set of instructions that, if followed, accomplishes a particular task.
Describe: in natural language / pseudo-code / diagrams / etc.
Criteria to follow:
Input: One or more quantities (externally produced)
Output: One or more quantities
Unambiguous: Clarity, precision of each instruction
Finiteness: The algorithm has to stop after a finite (may be very large) number of steps
Effectiveness: Each instruction has to be basic enough and feasible
The computational complexity and efficient implementation of the algorithm are important in computing, and this depends on efficient logic and suitable data structures.
4, 5, 6: IMPLEMENTATION, TESTING, MAINTENANCE
Implementation
Decide on the programming language to use
C, C++, Lisp, Java, Perl, Prolog, assembly, etc.
Write clean, well documented code
Test:
Ensure the implementation is as per the
requirements.
Maintenance
Integrate feedback from users, fix bugs, ensure
compatibility across different versions.
3: ALGORITHM ANALYSIS
When is the running time noticeable/important?
web search
database search
real-time systems with time constraints
Space complexity
How much space is required
Time complexity
How much time does it take to run the algorithm
Often, we deal with estimates!
ANALYZING ALGORITHMS
Analyzing an algorithm helps in predicting the resources such
as memory, processing time that the algorithm requires.
Most often it is computational time that we want to
measure.
A generic one-processor RAM model is considered as the
implementation technology for most algorithms.
The RAM model contains instructions commonly found in
real computers:
arithmetic (such as add, subtract, multiply, divide,
remainder, floor, ceiling)
data movement (load, store, copy) and
control (conditional and unconditional branch, subroutine
call and return).
Each such instruction takes a constant amount of time
SPACE COMPLEXITY
Space complexity = the amount of memory required by an algorithm to run to completion
[Core dumps = the most often encountered cause is "dangling pointers"]
Some algorithms may be more efficient if data is completely loaded into memory
Need to look also at system limitations
Fixed part: The size required to store certain data/variables, that is
independent of the size of the problem:
- e.g. name of the data collection
Variable part: Space needed by variables, whose size is dependent
on the size of the problem:
- e.g. actual text
- load 2GB of text VS. load 1MB of text
TIME COMPLEXITY
Often more important than space complexity: space available (for computer programs!) tends to grow
larger and larger,
while time is still a problem for all of us,
even with 3-4 GHz processors on the market.
An algorithm's running time is an important issue.
TIME COMPLEXITY…
[Chart: running times (1 ms to 5 ms) for inputs A through G, marking the worst-case (longest bar), the best-case (shortest bar), and the average-case in between]
• Suppose an algorithm includes a conditional statement
that may execute or not: variable running time
• Typically algorithms are measured by their worst case
TIME COMPLEXITY…
The running time of an algorithm varies with the inputs, and typically grows with the size of the inputs.
To evaluate an algorithm or to compare two algorithms, we focus on their relative rates of growth with respect to the increase of the input size.
The average running time is difficult to determine.
We focus on the worst case running time
Easier to analyze
Crucial to applications such as finance, robotics, and games
FACTORS THAT DETERMINE
RUNNING TIME OF A PROGRAM
Problem size: n
Basic algorithm / actual processing
memory access speed
CPU/processor speed
# of processors?
compiler/linker optimization?
ESTIMATION OF TIME COMPLEXITY
Experimental Approach
Theoretical Approach
EXPERIMENTAL APPROACH
Write a program to implement the algorithm.
Run this program with inputs of varying size and composition.
Get an accurate measure of the actual running time (e.g. via a system call such as date).
Limitations of Experimental Studies
The algorithm has to be implemented, which may take a long time and could be very difficult.
In order to compare two algorithms, the same hardware and software must be used.
Results may not be indicative of the running time on other inputs not included in the experiments.
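The experimental approach can be sketched in Python with time.perf_counter; linear_search below is a hypothetical stand-in for "the algorithm", and the printed numbers will vary from machine to machine, which is exactly the limitation noted above:

```python
import time

def linear_search(a, target):
    """A simple algorithm to time experimentally (hypothetical example)."""
    for i, v in enumerate(a):
        if v == target:
            return i
    return -1

# Run the program with inputs of varying size and measure actual running time.
for n in (1_000, 10_000, 100_000):
    data = list(range(n))
    start = time.perf_counter()
    linear_search(data, n - 1)          # worst case: target is the last element
    elapsed = time.perf_counter() - start
    print(f"n = {n:>7}: {elapsed:.6f} s")
```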
THEORETICAL APPROACH
Based on high-level description of the algorithms
(Pseudocode), rather than language dependent
implementations.
Makes possible an evaluation of the algorithms that is
independent of the hardware and software environments.
PRIMITIVE OPERATIONS
The basic computations performed by an algorithm
Identifiable in pseudocode
Largely independent from the programming language
Exact definition not important
Instructions have to be basic enough and feasible!
Examples:
Evaluating an expression
Assigning a value to a variable
Calling a method
Returning from a method
LOW LEVEL ALGORITHM ANALYSIS
Based on primitive operations (low-level computations independent from the programming language)
E.g.: Make an addition = 1 operation
Calling a method or returning from a method = 1 operation
Index in an array = 1 operation
Comparison = 1 operation etc.
Method: Inspect the pseudo-code and count the number of primitive operations executed by the algorithm,
or count the number of times a statement is executed
ANALYSIS OF INSERTION SORT
The running time of an algorithm on a particular input is
the number of primitive operations or “steps” executed.
It is convenient to define the notion of step so that it is
as machine-independent as possible.
Let us adopt the following view:
A constant amount of time is required to execute each line of
our pseudocode.
The running time of the algorithm is the sum of running
times for each statement executed.
Worst-case
longest running time for an input of size n.
Best-case
shortest running time for an input of size n.
INSERTION SORT ALGORITHM
INSERTION-SORT(A)
1 for j = 2 to A.length
2 key = A[j]
3 // Insert A[j] into the sorted sequence A[1…j-1]
4 i = j - 1
5 while i > 0 and A[i] > key
6 A[i+1] = A[i]
7 i = i - 1
8 A[i + 1] = key
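The pseudocode above translates directly to Python; since Python arrays are 0-based, the outer loop starts at index 1 rather than 2, and the while-condition becomes i >= 0:

```python
def insertion_sort(a):
    """In-place insertion sort, following the pseudocode (0-based indexing)."""
    for j in range(1, len(a)):          # line 1: for j = 2 to A.length
        key = a[j]                      # line 2
        i = j - 1                       # line 4
        while i >= 0 and a[i] > key:    # line 5 (i > 0 becomes i >= 0)
            a[i + 1] = a[i]             # line 6: shift larger element right
            i -= 1                      # line 7
        a[i + 1] = key                  # line 8: insert key into its place
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # → [1, 2, 3, 4, 5, 6]
```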
GROWTH RATE OF RUNNING TIME
(OR ORDER OF GROWTH)
We consider only the leading term of a formula (e.g., an^2)
lower-order terms are relatively insignificant for large values
of n.
We also ignore the leading term’s constant coefficient, since
constant factors are less significant than the rate of growth in
determining computational efficiency for large inputs.
We write that insertion sort has a worst-case running time of
Θ(n^2) (pronounced "theta of n-squared").
Changing the hardware/ software environment
Affects T(n) by a constant factor, but
Does not alter the growth rate of T(n)
GROWTH RATE OF RUNNING TIME
(OR ORDER OF GROWTH)
The linear growth rate in the best case and quadratic growth
rate in the worst case of the running time T(n) are intrinsic
properties of the insertion sort algorithm.
When we want to compare two algorithms for large input
sizes, a useful measure is the order of growth.
The order of growth of the running time of an algorithm
gives a simple characterization of
the algorithm's efficiency and
the relative performance of alternative algorithms
ASYMPTOTIC ANALYSIS
Asymptotic analysis of an algorithm talks about the order of growth of
the running time when the input size is large enough.
In mathematical analysis, asymptotic analysis, also known as
asymptotics, is a method of describing limiting behavior.
Suppose f(n) = n^2 + 3n; then as n becomes very large, the term 3n becomes
insignificant compared to n^2. The function f(n) is said to be "asymptotically
equivalent to n^2" as n → ∞. This is often written symbolically as f(n) ~ n^2,
which is read as "f(n) is asymptotic to n^2".
In Asymptotic Analysis, we evaluate the performance of an algorithm in
terms of input size (we don’t measure the actual running time).
Asymptotic analysis is useful to
estimate how long a program will run or/and how much space it would take.
compare the efficiency of different algorithms.
choose an algorithm for an application.
The main idea of asymptotic analysis is to have a
measure of efficiency of algorithms that doesn’t
depend on machine specific constants, and doesn’t
require algorithms to be implemented.
Asymptotic notations are mathematical tools to
represent time complexity of algorithms for
asymptotic analysis.
The following asymptotic notations are used to
represent time complexity of algorithms:
Big O Notation
Big Omega - Ω Notation
Big Θ Notation
Little o Notation
Little –Omega: ω Notation
BIG O NOTATION
The Big O notation defines an upper bound of an
algorithm. It bounds a function only from above.
Suppose for a given function f(n), g(n) is the Big-O
order when there exist positive constants c and n0
such that 0 <= f(n) <= c·g(n) for all n >= n0.
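The definition can be illustrated by checking a concrete witness pair numerically; the function f and the constants c and n0 below are illustrative choices, not from the source:

```python
# Check the Big-O definition for f(n) = 3n^2 + 20n + 5 and g(n) = n^2.
# Claim: f(n) is O(n^2), with hypothetical witnesses c = 4 and n0 = 21
# (any c, n0 satisfying the inequality for all n >= n0 would do).
f = lambda n: 3 * n**2 + 20 * n + 5
g = lambda n: n**2
c, n0 = 4, 21

# Spot-check 0 <= f(n) <= c*g(n) over a finite range of n >= n0.
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 10_000))
print("0 <= f(n) <= c*g(n) holds for all tested n >= n0")
```

Note that the inequality fails just below n0 (at n = 20, f(20) = 1605 > 4·400 = 1600), which is why n0 = 21 is chosen.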
BIG Ω NOTATION
Just as Big O notation provides an asymptotic upper
bound on a function, Ω notation provides an asymptotic
lower bound. It can be useful when we have lower bound
on time complexity of an algorithm.
Since the best-case performance of an algorithm is generally
not useful, the Omega notation is the least used notation
among all.
Suppose for a given function f(n), g(n) is the Big-Ω
order when there exist positive constants c and n0 such
that 0 <= c·g(n) <= f(n) for all n >= n0.
BIG Θ NOTATION
The Theta notation bounds a function from above
and below, so it defines exact asymptotic behavior.
Suppose for a given function f(n), g(n) is the Big-Θ
order when there exist positive constants c1, c2,
and n0 such that 0 <= c1·g(n) <= f(n) <= c2·g(n) for all
n >= n0.
When we use big-Θ notation, we're saying that we
have an asymptotically tight bound on the running
time.
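As a sketch, the Θ definition can be checked for n(n−1)/2 (insertion sort's worst-case comparison count) against g(n) = n^2; the witnesses c1 = 1/4, c2 = 1/2, n0 = 2 are illustrative choices, not from the source:

```python
# Check the Big-Theta definition for f(n) = n(n-1)/2 and g(n) = n^2.
# Hypothetical witnesses: c1 = 1/4, c2 = 1/2, n0 = 2.
f = lambda n: n * (n - 1) / 2
g = lambda n: n**2

c1, c2, n0 = 0.25, 0.5, 2

# Spot-check 0 <= c1*g(n) <= f(n) <= c2*g(n) over a finite range of n >= n0.
assert all(0 <= c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, 10_000))
print("c1*g(n) <= f(n) <= c2*g(n) holds for all tested n >= n0")
```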
LITTLE O NOTATION
Suppose for a given function f(n), g(n) is the Little-o order when for any positive constant c there exists a positive constant n0 such that 0 <= f(n) < c·g(n) for all n >= n0.
Little-o is an upper bound, but is not an asymptotically tight bound.
The main difference between O-notation and o-notation is that in f(n) = O(g(n)), the bound 0 <= f(n) <= c·g(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound 0 <= f(n) < c·g(n) holds for all constants c > 0.
For example, 2n = o(n^2), but 2n^2 ≠ o(n^2).
Intuitively, in o-notation, the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is, lim (n→∞) f(n)/g(n) = 0.
Little-Omega: ω Notation
ω-notation is used to denote a lower bound that is not asymptotically tight.
Suppose for a given function f(n), g(n) is the Little-omega (ω) order when for any positive constant c there exists a positive constant n0 > 0 such that 0 <= c·g(n) < f(n) for all n >= n0.
The relation f(n) = ω(g(n)) implies lim (n→∞) f(n)/g(n) = ∞:
the function g(n) becomes insignificant relative to f(n) as n approaches infinity.
POINTS TO BE NOTED
Big-O notation gives only an asymptotic upper bound,
and not an asymptotically tight bound.
For example, it is absolutely correct to say that binary
search runs in O(n) time.
Other, imprecise, upper bounds on binary search would be
O(n^2), O(n^3), and O(2^n). But none of Θ(n), Θ(n^2), Θ(n^3), and
Θ(2^n) would be correct to describe the running time of
binary search in any case.
Also big-Ω notation gives only an asymptotic lower
bound, and not an asymptotically tight bound
you can also say that the running time of insertion sort is
Ω(1).
Which of the following statements is/are valid?
1. Time Complexity of QuickSort is Θ(n^2)
2. Time Complexity of QuickSort is O(n^2)
3. For any two functions f(n) and g(n), we have f(n) =
Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
4. Time complexity of all computer algorithms can be
written as Ω(1)
For the functions n^k and c^n, what is the
asymptotic relationship between them? Assume
that k >= 1 and c > 1 are constants. Choose all
answers that apply:
For the functions lg n and log_8 n, what is the
asymptotic relationship between these functions?
lg n is O(log_8 n)
lg n is Ω(log_8 n)
lg n is Θ(log_8 n)
List the following from slowest to fastest growing
1. Θ(1)
2. Θ(n^2)
3. Θ(2^n)
4. Θ(lg n)
5. Θ(n)
6. Θ(n^2 lg n)
7. Θ(n lg n)
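As a sanity check (a numeric snapshot, not a proof), each growth rate can be evaluated at one moderately large n; the choice n = 64 below is arbitrary:

```python
import math

# Evaluate each growth rate at n = 64 and print them in increasing order.
n = 64
rates = {
    "1": 1,
    "lg n": math.log2(n),
    "n": n,
    "n lg n": n * math.log2(n),
    "n^2": n**2,
    "n^2 lg n": n**2 * math.log2(n),
    "2^n": 2**n,
}
for name, value in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name:>8}: {value:,.0f}")
```

The printed order (1, lg n, n, n lg n, n^2, n^2 lg n, 2^n) matches the asymptotic ordering from slowest to fastest growing.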
BIG-OH RULES
If f(n) is a polynomial of degree d, then f(n) is O(n^d),
i.e.:
1. Drop lower-order terms
2. Drop constant factors
Use the smallest possible class of functions
Say "2n is O(n)" instead of "2n is O(n^2)"
Use the simplest expression of the class
Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”
COMPUTING PREFIX AVERAGES
We further illustrate asymptotic analysis with two algorithms for prefix averages
The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
A[i] = (X[0] + X[1] + … + X[i]) / (i + 1)
PREFIX AVERAGES (QUADRATIC)
The following algorithm computes prefix averages in
quadratic time by applying the definition
The running time of prefixAverages1 is
O(1 + 2 + …+ n)
Thus, algorithm prefixAverages1 runs in O(n2) time
Algorithm prefixAverages1(X, n)
    Input array X of n integers
    Output array A of prefix averages of X      # operations
    A ← new array of n integers                 n
    for i ← 0 to n − 1 do                       2n
        s ← X[0]                                2n
        for j ← 1 to i do                       2(1 + 2 + … + (n − 1))
            s ← s + X[j]                        3(1 + 2 + … + (n − 1))
        A[i] ← s / (i + 1)                      4n
    return A                                    1
PREFIX AVERAGES (LINEAR)
The following algorithm computes prefix
averages in linear time by keeping a running
sum
Algorithm prefixAverages2 runs in O(n) time
Algorithm prefixAverages2(X, n)
    Input array X of n integers
    Output array A of prefix averages of X      # operations
    A ← new array of n integers                 n
    s ← 0                                       1
    for i ← 0 to n − 1 do                       2n
        s ← s + X[i]                            3n
        A[i] ← s / (i + 1)                      3n
    return A                                    1
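Both pseudocode algorithms transcribe directly to Python, and the two must agree on every input:

```python
def prefix_averages_1(x):
    """Quadratic-time prefix averages, directly from the definition."""
    n = len(x)
    a = [0.0] * n
    for i in range(n):
        s = x[0]                    # restart the sum for each prefix
        for j in range(1, i + 1):
            s += x[j]
        a[i] = s / (i + 1)
    return a

def prefix_averages_2(x):
    """Linear-time prefix averages, keeping a running sum."""
    a = [0.0] * len(x)
    s = 0
    for i, v in enumerate(x):
        s += v                      # reuse the previous prefix sum
        a[i] = s / (i + 1)
    return a

data = [1, 2, 3, 4]
print(prefix_averages_1(data))  # → [1.0, 1.5, 2.0, 2.5]
print(prefix_averages_2(data))  # → [1.0, 1.5, 2.0, 2.5]
```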
Exercise: Give a big-Oh characterization
Algorithm Ex1(A, n)
    Input an array A of n integers
    Output the sum of the elements in A
    s ← A[0]
    for i ← 1 to n − 1 do
        s ← s + A[i]
    return s
Exercise: Give a big-Oh characterization
Algorithm Ex2(A, n)
    Input an array A of n integers
    Output the sum of the elements at even cells in A
    s ← A[0]
    for i ← 2 to n − 1 by increments of 2 do
        s ← s + A[i]
    return s
Exercise: Give a big-Oh characterization
Algorithm Ex3(A, n)
    Input an array A of n integers
    Output the sum of the prefix sums of A
    s ← 0
    for i ← 0 to n − 1 do
        s ← s + A[0]
        for j ← 1 to i do
            s ← s + A[j]
    return s
SUMMARY
Time complexity is a measure of algorithm
efficiency.
The choice of an efficient algorithm plays the major role
in determining the running time.
Minor tweaks in the code can also cut down the
running time by a constant factor.
Other items like CPU speed, memory speed,
device I/O speed can help as well.
For certain problems, it is possible to allocate
additional space & improve time complexity.
BIG-OH EXAMPLE
Example: 2n + 10 is O(n)
2n + 10 <= cn
(c − 2)n >= 10
n >= 10/(c − 2)
Pick c = 3 and n0 = 10
Example: the function n^2 is not O(n)
n^2 <= cn
n <= c
The above inequality cannot be satisfied since c must be a constant
Example: 3n^3 + 20n^2 + 5 is O(n^3)
need c > 0 and n0 >= 1 such that 3n^3 + 20n^2 + 5 <= c·n^3 for n >= n0
this is true for c = 4 and n0 = 21
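The witness pairs chosen in these examples (c = 3, n0 = 10 and c = 4, n0 = 21) can be confirmed numerically over a finite range; this is a spot check, not a proof:

```python
# Confirm 2n + 10 <= 3n for all tested n >= 10 (c = 3, n0 = 10).
assert all(2 * n + 10 <= 3 * n for n in range(10, 10_000))

# Confirm 3n^3 + 20n^2 + 5 <= 4n^3 for all tested n >= 21 (c = 4, n0 = 21).
assert all(3 * n**3 + 20 * n**2 + 5 <= 4 * n**3 for n in range(21, 1_000))

print("both witness pairs verified on the tested ranges")
```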