computer science - C++ -- Lesson 21:  Order of Algorithms

INTRODUCTION:            The two criteria used for selecting a data structure and algorithm are the amount of memory required and the speed of execution.  The analysis of the speed of an algorithm leads to a summary statement called the order of an algorithm.

The key topics for this lesson are:

A.      Order of Algorithms

B.      Constant Algorithms, O(1)

C.      Log2N Algorithms, O(log2N)

D.      Linear Algorithms, O(N)

E.      N * Log2N Algorithms, O(N * log2N)

G.      Other Orders

H.      Comparison of Orders of Algorithms

VOCABULARY:            ORDER OF ALGORITHM            CONSTANT

LOG2 N            LINEAR                    N LOG2 N            QUADRATIC

CUBIC            BIG O NOTATION

DISCUSSION:      A.      Order of Algorithms

1.      The order of an algorithm is based on the number of steps that it takes to complete a task.  Time is not a valid measuring stick because computers have different processing speeds.  We want a method of comparing algorithms that is independent of computing environment and microprocessor speeds.

2.      Most algorithms solve problems involving an amount of data, N.  The order of algorithms will be expressed as a function of N, the size of the data set.

3.      The following chart summarizes the numerical relationships of common functions of N.

A                B                        C                D                            E

N            O(log2N)            O(N)            O(N*log2N)            O(N2)

 1 0 1 0 1 2 1 2 2 4 4 2 4 8 16 8 3 8 24 64 16 4 16 64 256 32 5 32 160 1024 64 6 64 384 4096 128 7 128 896 16384 256 8 256 2048 65536 512 9 512 4608 262144 1024 10 1024 10240 1.04858e+06 2048 11 2048 22528 4.1943e+06 4096 12 4096 49152 1.67772e+07 8192 13 8192 106496 6.71089e+07 16384 14 16384 229376 2.68435e+08 32768 15 32768 491520 1.07374e+09 65536 16 65536 1048576 4.29497e+09 131072 17 131072 2228224 1.71799e+10 262144 18 262144 4718592 6.87195e+10 524288 19 524288 9961472 2.74878e+11 1048576 20 1048576 20971520 1.09951e+12 2097152 21 2097152 44040192 4.39805e+12

a.            The first column, N, is the number of items in a data set.

b.            The other four columns are mathematical functions based on the size of N.  In computer science, we write this with a capital O (order) instead of the traditional F (function) of mathematics.  This type of notation is the order of an algorithm, or Big O notation.

c.            You have already seen the last column in an empirical sense when you counted the number of steps in the quadratic sorting algorithms.  The relationship between columns A and E is quadratic - as the value of N increases, the other column increases as a function of N2.

d.            As we work through the rest of the student outline, assume the following array declaration of list applies:

apvector<int> list (4001);

Here are the specifications of array list:

1.      Index position 0 keeps track of how many integers are stored as data.

2.      Integers are stored from positions list ... list[list].

B.      Constant Algorithms, O(1)

1.      This relationship was not included in the chart.  Here, the size of the data set does not affect the number of steps this type of algorithm takes.  For example:

void  howBig (apvector<int> list)

{

return list;

}

2.      The number of data in the array could vary from 0..4000, but this does not affect the algorithm of howBig.  It will take one step regardless of how big the data set is.

3.      A constant time algorithm could have more than just one step, as long as the number of steps is independent of the size (N) of the data set.

C.      Log2N Algorithms, O(log2N)

1.      A logarithm is the exponent to which a base number must be raised to equal a given number.

2.      A log2N algorithm is one where the number of steps increases as a function of log2N.  If the number of data was 1024, the number of steps equals log2 1024, or 10 steps.

3.      Algorithms in this category involve splitting the data set in half repeatedly.  Several examples will be encountered later in the course.

4.      Algorithms which fit in this category are classed as O(log N), regardless of the numerical base used in the analysis.

D.      Linear Algorithms, O(N)

1.      This is an algorithm where the number of steps is directly proportional to the size of the data set.  As N increases, the number of steps also increases.

long int  sumData (apvector<int> list)

//  sums all the values in the array

{

long int  total = 0;

for (int  loop=1; loop<=list; loop++)

{

total += list[loop];

}

return  total;

}

2.      In the above example, as the size of the array increases, so does the number of steps in the function.

3.      A non-recursive linear algorithm, O(N), always has a loop involved.

4.      Recursive algorithms are usually linear where the looping concept is developed through recursive calls.  The recursive factorial function is a linear function.

long int  fact (int  n)

//  precondition:  n > 0

{

if (1 == n)  return 1;

else  return  n * fact(n - 1);

}

The number of calls of fact will be n.  Inside of the function is one basic step, an if/then/else.  So we are executing one statement n times.

E.      N * Log2N Algorithms, O(N * log2N)

1.      Algorithms of this type have a log N concept that must be applied N times.

2.      When recursive MergeSort and Quicksort are covered, we will discover that they are O(N * log2N) algorithms.

3.      These algorithms are markedly more efficient than our next category, quadratics.

1.      This is an algorithm where the number of steps required to solve a problem increases as a function of N2.  For example, here is bubbleSort.

void bubbleSort (apvector<int> &list)

{

for (int outer=1; outer <= list-1; outer++)

{

for (int inner=1; inner <= list-outer; inner++)

{

if (list[inner] > list[inner+1])

swap(list[inner], list[inner+1]);

}

}

}

2.      The if statement is buried inside nested loops, each of which is tied to the size of the data set, N.  The if statement is going to happen approximately N2 times.

3.      The efficiency of this bubble sort was slightly improved by having the inner loop decrease.  But we still categorize this as a quadratic algorithm.

4.      For example, the number of times the inner loop happens varies from 1 to (N-1).  On average, the inner loop occurs (N/2) times.

5.      The outer loop happens (N-1) times, or rounded off N times.

6.      The number of times the if statement is executed is equal to this expression:

# if statements = (Outer loop) * (Inner loop)

# if statements = (N) * ( )

# if statements =

7.      Ignoring the coefficient of , we have an algorithm that is quadratic in nature.

8.      When determining the order of an algorithm, we are only concerned with its category, not a detailed analysis of the number of steps.

G.      Other Orders

1.      A cubic algorithm is one where the number of steps increases as a cube of N, or N3.

2.      An exponential algorithm is one where the number of steps increases as the power of a base, like 2N.

3.      Both of these categories are astronomical in the number of steps required.  Such algorithms cannot be implemented on small personal computers.

H.      Comparison of Orders of Algorithms

1.      We obviously want to use the most efficient algorithm in our programs.  Whenever possible, choose an algorithm that requires the fewest number of steps to process data.

SUMMARY/REVIEW:            When designing solutions to programming problems, we are concerned with the most efficient solutions regarding time and space.  We will consider memory requirements at a later time.  Speed issues are resolved based on the number of steps required by algorithms.

ASSIGNMENT:            Worksheet W.A.21.1, Order of Algorithms