Recall, that an algorithm is a method for solving a problem or accomplishing some task expressed as a sequence of steps that are suitable for execution by a computer.
We are interested in designing "good" algorithms -- that is to say, algorithms that don't take forever to run, and don't completely exhaust the resources of our machines, specifically with regard to memory.
With regard to the first consideration, the running time of an algorithm gives us a means to compare two algorithms to decide which is faster. However, things are more involved than just running the algorithm once or twice and timing how long it takes to finish. First, we note that running time typically increases with the problem size (typically, either the size of the input or value of some required argument). Second, running time is affected by the hardware and software environment in which the algorithm is executed.
As such, we focus on the relationship between the running time and the input size. In particular, we are interested in how fast the running time is growing as the input size is increasing -- as a faster growth rate means that we will run into long running times more quickly and -- if these running times become long enough -- may not be able to deal at all with problems past a certain size. More generally, understanding this relationship enables one to make reasonable predictions for how long a program will run before it finishes.
One way to analyze an algorithm is to do so experimentally, using the scientific method. Recall, with the scientific method one formulates some hypothesis consistent with initial observations, uses that hypothesis to make falsifiable predictions, and then verifies those predictions with experiments.
To apply this to running times, we:
System.currentTimeMillis()
to get an accurate measure of the actual running times.Analyzing algorithms experimentally, of course, has its own set of problems:
A better way to analyze an algorithm is to realize that the running time of a program is determined primarily by two factors:
To help with the quantification of the above, we first make a definition: A primitive operation is defined to be an operation that corresponds to a low-level, basic computation with a constant execution time. Examples of primitive operations include declaring a variable, assigning a value to a variable, deciding if one integer is larger than another, accessing the element of an array at some index $i$, finding the length of an array, etc.
Then, to analyze an algorithm, we simply determine the frequency of primitive operations, and characterize this as a function of the input size, $n$. This function is referred to as the Cost Function, and is denoted by $C(n)$. In finding $C(n)$, we have the benefit of taking into account all possible inputs, and our result is independent of any hardware/software environment.
Given that an algorithm may run faster on some inputs than it does on others (even with the same input size), we have different ways of counting the frequency of whichever primitive operations we seek to analyze. Specifically, we can look at:
The average case, where we take the average frequency over all possible inputs of the same size, which then suggests an expected cost for a random input of that size -- although this depends on the distribution of inputs (i.e., are some inputs more likely to occur than others? ..if so, how?), which may complicate things. Additionally, we must be wary of misleading averages, as typified by the old joke: "A statistician who put his head in the oven and his feet in the refrigerator was heard to say "On average, I feel just fine!"
The best case scenario, where the frequency count is minimized. While folks are more likely interested in where the algorithm is going to create a problem (by running too long), this does establish a lower bound on the cost and suggests a goal for the cost of all other inputs that might in some cases be obtainable.
The worst case scenario, which establishes an upper bound on the cost of any input. This scenario is generally easier to analyze and reducing its cost typically leads to better algorithms.
Let's take a look at some examples of finding running-time cost functions for some primitive operations of interest...
sum = 0.0; for (int i = 0; i < n; i++) { sum += array[i]; // <----- primitive operation of interest }
Ignoring the update to the loop variable $i$, we note the marked statement above is executed $n$ times, so the cost function is $C(n) = n$.
sum = 0.0; for (int i = 0; i < n; i += 2) { sum += array[i]; // <----- primitive operation of interest }
Here however, since the counter variable $i$ is going up by $2$ with each pass through the loop, the number of times the marked statement above gets executed this time is halved -- leading to $C(n) = n/2$.
for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { int x = i*j; // <----- primitive operation of interest sum += x; // <----- primitive operation of interest } }
When nested loops are encountered, notice that the two primitive operations each get executed $n^2$ times, yielding $C(n) = 2n^2$.
for (int i = 0; i < n; i++) { for (int j = i; j < n; j++) { sum += i*j; // <----- primitive operation of interest } }
Here, the cost function is a little more complicated to calculate. Note that the primitive operation is executed $$n + (n-1) + (n-2) + \cdots + 3 + 2 + 1 \quad \textrm{times}$$ However, an identity from algebra will help consolidate this into $\displaystyle{C(n) = \frac{n(n+1)}{2}}$.
The cost function $C(n)$ can often be a complicated and lengthy mathematical expression. However, when the problem-size is large some terms of the cost function will be relatively negligible when compared to others. There are two ways we can take this factor into consideration to simplify things.
One way is to focus on how the cost function for an algorithm can be "bounded" by a simpler function or functions. There are three primary notations that can be used to describe this type of bounding:
For a given algorithm with cost function $C(n)$, a function $f(n)$ serves as a type of upper bound on the magnitude of $C(n)$ if for sufficiently large $n$, we have $|C(n)| \le k \cdot |f(n)|$ for some constant $k$. We abbreviate this using the "asymptotic notation" known as Big Oh, saying: "The running time of $A$ is $O(f(n))$."
As an example, suppose three algorithms $A_1$, $A_2$, and $A_3$ have cost functions $C_1(n) = 10n^2$, $C_2(n) = 100n$, and $C_3(n) = 22n\,\textrm{log } n + 3n$, respectively. All three of these algorithms could be described as having running times that are $O(n^2)$.
On the flip side, we say that a function $f(n)$ serves as a type of lower bound on the magnitude of $C(n)$ if for sufficiently large $n$, we have $|C(n)| \ge k \cdot |f(n)|$ for some constant $k$. Similar to Big Oh notation, we can say this more compactly using Big Omega notation, saying: "The running time of $A$ is $\Omega(f(n))$."
Again, as an example, let's say three algorithms $A_1$, $A_2$, and $A_3$ have cost functions $C_1(n) = n^2/2$, $C_2(n) = n^5$, and $C_3(n) = n^3+17n\,\textrm{log } n + 3n$, respectively. All three of these algorithms could be described as having running times that are $\Omega(n^2)$.
Sometimes a single function can serve in this way as both a lower bound and an upper bound for the magnitude of $C(n)$. Specifically, for an algorithm $A$ with cost function $C(n)$, when there exists a function $f(n)$ and constants $k_1$ and $k_2$ such that for sufficiently large $n$ one has $k_1 \cdot |f(n)| \le |C(n)| \le k_2 \cdot |f(n)|$, we say the running time of $A$ is $\Theta(f(n))$.
So, for example, if algorithms $A_1$, $A_2$, and $A_3$ had cost functions of $C_1(n) = n^2/2$, $C_2(n) = 10n^2$, and $C_3(n) = 5n^2 + 9\textrm{log } n + 7n$, then all of these algorithms could be described as having running times that are $\Theta(n^2)$.
The other way to simplify things upon realizing that when the problem size is large some of the terms of the cost function will be relatively negligible when compared to others is to simply "throw away" these negligible terms! This leads to yet another notation defined below.
We write $\displaystyle{f(n) \sim g(n) \textrm{ to signify that } \lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} = 1}$
As examples:
Applying tilde notation to example 4 discussed above, we see that $\displaystyle{C(n) = \frac{n(n+1)}{2} = \frac{n^2}{2}+\frac{n}{2} \sim \frac{n^2}{2}}$
Big Oh, Big Omega, Big Theta, and Tilde are all examples of asymptotic notations. Regardless of which asymptotic notation one uses, it is important to understand that the common functions one encounters in such analyses can be ranked according to their growth rate (as shown in the table below -- with higher growth rates appearing closer to the bottom of the table):
Description | Function |
---|---|
constant | $1$ |
logarithmic | $\textrm{log } n$ |
linear | $n$ |
linearithmic | $n\,\textrm{log } n$ |
quadratic | $n^2$ |
cubic | $n^3$ |
exponential | $2^n$ |
Below are some additional examples of running-time analysis for different primitive operations of interest seen in various code snippets -- including two involving logarithmic or linearithmic growth:
double product = 1.0; for (int i = 1; i <= n; i *= 2) { product *= i; // <----- primitive operation of interest }
Here, if we suppose that $n = 2^m$, then we have the primitive operation executed when $i = 1, 2, 4, 8, ..., 2^m$, and thus $m+1$ times. Since $m = \log_2 n$, we have a logarithmic cost function $C(n) = \log_2 n + 1 \sim \log_2 n$.
Note that we can also say that $C(n)$ is $O(\textrm{log } n)$. If the switch to $\log n$ from $\log_2 n$ was confusing, note that changing the base requires only a multiplication by a constant, per the logarithm rule from high school algebra shown below. $$\log_b a = \frac{\log_x a}{\log_x b}$$
for (int i=1; i<=N; ++i){ for (int j=1; j<N; j*=2){ sum++; // <----- primitive operation of interest } }
In this example, the inner loop contributes a logarithmic cost, while the other loop requires this be done $n$ times. Thus, we have a linearithmic cost function, $C(n) \sim n \log n$.
int n = a.length; int count = 0; for (int i = 0; i < n; i++) for (int j = i+1; j < n; j++) for (int k = j+1; k < n; k++) if (a[i] + a[j] + a[k] == 0) // <----- primitive operation of interest count++;
Note that the ordered triple (i,j,k)
eventually works its way through every possible increasing sequence of three values taken from the $n$ integers $0$ to $n-1$. As such, the condition in the if
statement gets evaluated ${}_n C_3 = n(n-1)(n-2)/6$ times. Thus, the related cost function is $\sim n^3/6$.