Analysis of Algorithms

Recall, that an algorithm is a method for solving a problem or accomplishing some task expressed as a sequence of steps that are suitable for execution by a computer.

We are interested in designing "good" algorithms -- that is to say, algorithms that don't take forever to run, and don't completely exhaust the resources of our machines, specifically with regard to memory.

With regard to the first consideration, the running time of an algorithm gives us a means to compare two algorithms to decide which is faster. However, things are more involved than just running the algorithm once or twice and timing how long it takes to finish. First, we note that running time typically increases with the problem size (typically, either the size of the input or value of some required argument). Second, running time is affected by the hardware and software environment in which the algorithm is executed.

Let us first focus on the relationship between the running time and the input size. In particular, we are interested in how fast the running time is growing as the input size is increasing -- as a faster growth rate means that we will run into long running times more quickly and -- if these running times become long enough -- may not be able to deal at all with problems past a certain size. More generally, understanding this relationship enables one to make reasonable predictions for how long a program will run before it finishes.

A similar type of analysis will be useful with regard to the amount of memory an algorithm requires as it is run. However, we'll stick to the context of running time in what follows. Once we understand how to analyze what we call the time complexity of an algorithm, understanding how to do the same regarding an algorithm's demand on memory (called space complexity) will be relatively easy.

One way to analyze an algorithm is to do so experimentally, using the scientific method. Recall, with the scientific method one formulates some hypothesis consistent with initial observations, uses that hypothesis to make falsifiable predictions, and then verifies those predictions with experiments.

To apply this to running times, we:

Write a program that implements the algorithm.
Run the program with inputs of various sizes and compositions, using a method like System.currentTimeMillis() to get an accurate measure of the actual running times.
Plot the results, and formulate a hypothesis for how quickly the running time increases as a function of the problem size.
Make predictions for the running times associated with other inputs, and verify these by again running and timing the program.

Analyzing algorithms experimentally, of course, has its own set of problems:

We have to implement the algorithm, which may be difficult
In order to compare two algorithms, the same hardware and software environments must be used
Results may not be indicative of the running time on other inputs not included in the experiment

A better way to analyze an algorithm is to realize that the running time of a program is determined primarily by two factors:

The cost (in time) of executing each statement (which is a property of the computer, the Java compiler, and the operating system).
The frequency of execution of each statement (which is a property of the program and the input).

To help with quantifying the above, we first make a definition: A primitive operation is defined to be an operation that corresponds to a relatively low-level, basic computation with a constant (or near constant) execution time. Examples of primitive operations include declaring a variable, assigning a value to a variable, deciding if one integer is larger than another, accessing the element of an array at some index $i$, finding the length of an array, etc.

Then, to analyze an algorithm, we simply determine the frequency of the primitive operations (or the most impactful ones), and characterize this as a function of the input size, $n$. This function is referred to with a variety of names -- the cost function, cost model, or complexity, and is denoted by $C(n)$. In finding $C(n)$ in different scenarios, we have the benefit of taking into account all possible inputs, and our result is independent of any hardware/software environment, but provides an estimate for any given hardware/software setup of the time it will take, presuming we can well-approximate the execution time of the primitive operation in question.

Given that an algorithm may run faster on some inputs than it does on others (even with the same input size), we have different ways of describing the frequency of whichever primitive operations we seek to analyze. Specifically, we can look at:

The average case, where we find the average/mean cost over all possible inputs of the same size, which then suggests an expected cost for a random input of that size -- although this depends on the distribution of inputs (i.e., are some inputs more likely to occur than others? ..if so, in what way?), which may complicate things. Additionally, we must be wary of misleading averages, as typified by the old joke: "A statistician who put his head in an oven and his feet in a freezer was heard to say "On average, I feel just fine!"
The best case scenario, where the minimum cost is found. While folks are more likely interested in where the algorithm is going to create a problem (by running too long), this does establish a lower bound on the cost and suggests a goal for the cost of all other inputs that might in some cases be obtainable.
The worst case scenario, which establishes an upper bound on the cost of any input. This scenario is generally easier to analyze and reducing its cost typically leads to better algorithms.

Let's take a look at some examples of finding running-time cost functions for some primitive operations of interest...

Example 1

sum = 0.0;
for (int i = 0; i < n; i++) {
   sum += array[i];  // <----- primitive operation of interest
}

Ignoring the update to the loop variable $i$, we note the marked statement above is executed $n$ times, so the cost function is $C(n) = n$.

Example 2

sum = 0.0;
for (int i = 0; i < n; i += 2) {
    sum += array[i];    // <----- primitive operation of interest
}

Here however, since the counter variable $i$ is going up by $2$ with each pass through the loop, the number of times the marked statement above gets executed this time is halved (or close to it when $n$ is odd). More precisely, $C(n) = n/2$ when $n$ is even, and $C(n) = (n+1)/2$ when $n$ is odd.

Example 3

for (int i = 0; i < n; i++) {
   for (int j = 0; j < n; j++) {
      int x = i*j;    // <----- primitive operation of interest            
      sum += x;       // <----- primitive operation of interest
   }
}

When nested loops are encountered, notice that the two primitive operations each get executed $n^2$ times, yielding $C(n) = 2n^2$.

Example 4

for (int i = 0; i < n; i++) {
   for (int j = i; j < n; j++) {
      sum += i*j;      // <----- primitive operation of interest           
   }
}

Here, the cost function is a little more complicated to calculate. Note that the primitive operation is executed $$n + (n-1) + (n-2) + \cdots + 3 + 2 + 1 \quad \textrm{times}$$ However, an identity from algebra will help consolidate this into $\displaystyle{C(n) = \frac{n(n+1)}{2}}$.

The cost function $C(n)$ can often be a complicated and lengthy mathematical expression. However, when the problem-size is large some terms of the cost function will be relatively negligible when compared to others. There are two ways we can take this factor into consideration to simplify things.

One way is to focus on how the cost function for an algorithm can be "bounded" by a simpler function or functions. There are three primary notations that can be used to describe this type of bounding:

Big "Oh" Notation

For a given algorithm with cost function $C(n)$, a function $f(n)$ serves as a type of upper bound on the magnitude of $C(n)$ if for sufficiently large $n$, we have $|C(n)| \le k \cdot |f(n)|$ for some constant $k$. We abbreviate this using the "asymptotic notation" known as Big O, saying: "The (time or space) complexity of algorithm $A$ is $O(f(n))$."

As an example, suppose three algorithms $A_1$, $A_2$, and $A_3$ have cost functions associated with their running times of $C_1(n) = 10n^2$, $C_2(n) = 100n$, and $C_3(n) = 22n\,\textrm{log } n + 3n$, respectively. All three of these algorithms could be described as having time complexities that are $O(n^2)$.

One important thing to know about Big-O involves combinations of algorithms. If algorithm $A$ is formed by sequentially chaining together algorithms $A'$ and $A''$ (i.e., doing one, then the other) then $A$ has complexity equal to the larger of that for $A'$ and $A''$.

If instead, $A$ is formed by nesting $A'$ and $A''$ (i.e., for every primitive operation previously in $A'$, we now perform $A''$ -- this frequently is seen in nested loops), then $A$ has complexity equal to the product of that for $A'$ and $A''$.

One should take a minute and convince oneself of the truth of the two preceding statements -- they are incredibly useful!
Big Omega Notation

On the flip side, we say that a function $f(n)$ serves as a type of lower bound on the magnitude of $C(n)$ if for sufficiently large $n$, we have $|C(n)| \ge k \cdot |f(n)|$ for some constant $k$. Similar to Big-O notation, we can say this more compactly using Big Omega notation, saying: "The (time or space) complexity of algorithm $A$ is $\Omega(f(n))$."

Again, as an example, let's say three algorithms $A_1$, $A_2$, and $A_3$ have cost functions associated with their running times of $C_1(n) = n^2/2$, $C_2(n) = n^5$, and $C_3(n) = n^3+17n\,\textrm{log } n + 3n$, respectively. All three of these algorithms could be described as having time complexities that are $\Omega(n^2)$.
Big Theta Notation

Sometimes a single function can serve in this way as both a lower bound and an upper bound for the magnitude of $C(n)$. Specifically, for an algorithm $A$ with cost function $C(n)$, when there exists a function $f(n)$ and constants $k_1$ and $k_2$ such that for sufficiently large $n$ one has $k_1 \cdot |f(n)| \le |C(n)| \le k_2 \cdot |f(n)|$, we say the complexity of $A$ is $\Theta(f(n))$.

So, for example, if algorithms $A_1$, $A_2$, and $A_3$ had cost functions (in time) of $C_1(n) = n^2/2$, $C_2(n) = 10n^2$, and $C_3(n) = 5n^2 + 9\textrm{log } n + 7n$, then all of these algorithms could be described as having time complexities that are $\Theta(n^2)$.

The other way to deal with the realization that when the problem size is large some of the terms of the cost function will be relatively negligible -- at least when compared with some other, typically simpler function in a limiting sense -- is to say these two functions have essentially the same cost. This leads to yet another notation defined below.

Tilde Notation

We write $\displaystyle{f(n) \sim g(n) \textrm{ to signify that } \lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} = 1}$

As examples:
- If $C_1(n) = 2n+10$, then $C_1(n) \sim 2n$
- If $C_2(n) = 3n^3 + 20n^2 + 5$, then $C_2(n) \sim 3n^3$

Applying tilde notation to example 4 discussed above, we see that $\displaystyle{C(n) = \frac{n(n+1)}{2} = \frac{n^2}{2}+\frac{n}{2} \sim \frac{n^2}{2}}$

Big O, Big Omega, Big Theta, and Tilde are all examples of asymptotic notations. Regardless of which asymptotic notation one uses, it is important to understand that the common functions one encounters in such analyses can be ranked according to their growth rate (as shown in the table below -- with higher growth rates appearing closer to the bottom of the table):

Description	Function
constant	$1$
logarithmic	$\textrm{log } n$
linear	$n$
linearithmic	$n\,\textrm{log } n$
quadratic	$n^2$
cubic	$n^3$
exponential	$2^n$

Below are some additional examples of running-time analysis for different primitive operations of interest seen in various code snippets -- including two involving logarithmic or linearithmic growth:

Example 5

double product = 1.0;
for (int i = 1; i <= n; i *= 2) { 
   product *= i;     // <----- primitive operation of interest                 
}

Here, if we suppose that $n = 2^m$, then we have the primitive operation executed when $i = 1, 2, 4, 8, ..., 2^m$, and thus $m+1$ times. Since $m = \log_2 n$, we have a logarithmic cost function $C(n) = \log_2 n + 1 \sim \log_2 n$.

Note that we can also say that $C(n)$ is $O(\textrm{log } n)$. If the switch to $\log n$ from $\log_2 n$ was confusing, note that changing the base requires only a multiplication by a constant, per the logarithm rule from high school algebra shown below. $$\log_b a = \frac{\log_x a}{\log_x b}$$

Example 6

for (int i=1; i<=N; ++i){
   for (int j=1; j<N; j*=2){
      sum++;     // <----- primitive operation of interest                
   }
}

In this example, the inner loop contributes a logarithmic cost, while the other loop requires this be done $n$ times. Thus, we have a linearithmic cost function, $C(n) \sim n \log_2 n$.

Example 7

int n = a.length;
int count = 0;
for (int i = 0; i < n; i++)
   for (int j = i+1; j < n; j++)
      for (int k = j+1; k < n; k++)
         if (a[i] + a[j] + a[k] == 0) // <----- primitive operation of interest
            count++;

Note that the ordered triple (i,j,k) eventually works its way through every possible increasing sequence of three values taken from the $n$ integers $0$ to $n-1$. As such, the condition in the if statement gets evaluated ${}_n C_3 = n(n-1)(n-2)/6$ times. Thus, the related cost function is $\sim n^3/6$.