Merge Sort

In a merge sort, the basic idea is to:

  1. Divide an array into two halves
  2. Sort each half
  3. Merge the two sorted halves into a sorted array

```
Input            M E R G E S O R T E X A M P L E
Sort Left Half   E E G M O R R S
Sort Right Half                  A E E L M P T X
Merge Results    A E E E E G L M M O P R R S T X
```
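As a quick sanity check, the trace above can be reproduced in Python: `sorted()` handles each half, and the standard library's `heapq.merge` (which merges already-sorted sequences) stands in for the merge step described later in this section.

```python
from heapq import merge  # merges two already-sorted sequences

s = list("MERGESORTEXAMPLE")
mid = len(s) // 2
left = sorted(s[:mid])    # E E G M O R R S
right = sorted(s[mid:])   # A E E L M P T X
result = list(merge(left, right))
print("".join(result))    # prints AEEEEGLMMOPRRSTX
```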

The sorting of the left and right halves is easily done recursively. That leaves the key step in merge sort -- the "merge". The diagram below shows the recursive breakdown (up to the center row) and the merging of the smaller lists back together into the final sorted array.

The merging of two sections comes at a cost -- we must allocate enough memory space to create an auxiliary array the same size as the array we are sorting.

Then we first copy all of the elements of our array into the auxiliary array, so that we can merge them back into the original array in the proper order.

To see how this is accomplished, let us first denote the original array by a[] and the auxiliary array by aux[].

Also, let us suppose we are trying to merge two sections of the array that together span from index lo to index hi, and that these two sections (one from index lo to index mid, the other from index mid+1 to index hi, where mid = (lo + hi) / 2) are already sorted.

Then we do the following:

  1. Start by examining the first element (i.e., the smallest) of each half. We keep track of our position in each half by storing an index into it; initially, these are i = lo and j = mid+1.
  2. We compare aux[i] and aux[j] and copy the smaller of the two to a[n], where n = lo initially.
  3. We increment the index (i or j) associated with the half from which we copied, and also increment n.
  4. If we reach the end of either half, we copy the remaining elements from the other half, incrementing the indices as necessary.
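The four steps above can be sketched in Python (a minimal sketch; the names `merge` and `merge_sort` are illustrative, and sections are inclusive of both endpoints, as in the text):

```python
def merge(a, aux, lo, mid, hi):
    """Merge the sorted sections a[lo..mid] and a[mid+1..hi] (inclusive)."""
    aux[lo:hi + 1] = a[lo:hi + 1]   # copy both sections to the auxiliary array
    i, j = lo, mid + 1              # step 1: indices into each half
    for n in range(lo, hi + 1):
        if i > mid:                 # step 4: left half exhausted
            a[n] = aux[j]; j += 1
        elif j > hi:                # step 4: right half exhausted
            a[n] = aux[i]; i += 1
        elif aux[j] < aux[i]:       # step 2: copy the smaller element
            a[n] = aux[j]; j += 1   # step 3: advance j (n advances via the loop)
        else:
            a[n] = aux[i]; i += 1   # step 3: advance i

def merge_sort(a):
    """Sort a[] in place, using one auxiliary array the same size as a[]."""
    aux = list(a)                   # allocate aux[] once, up front
    def sort(lo, hi):
        if hi <= lo:                # a section of size 1 is already sorted
            return
        mid = (lo + hi) // 2
        sort(lo, mid)               # sort the left half recursively
        sort(mid + 1, hi)           # sort the right half recursively
        merge(a, aux, lo, mid, hi)  # merge the two sorted halves
    sort(0, len(a) - 1)
```

Running `merge_sort` on `list("MERGESORTEXAMPLE")` produces the final row of the trace shown earlier.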

Cost Analysis

Note, the cost of a merge sort can be broken into two parts:

  1. The cost of recursively sorting each of the two halves
  2. The cost of the merge itself, which is proportional to the number of elements $n$ being merged

With this in mind, if we suppose that the cost function for sorting $n$ items is $C(n)$, we can say:

$$C(n) = 2 C(\frac{n}{2}) + n \quad ; \quad C(1) = 0$$

Let us attempt to solve this recurrence relation. First, let us make the simplifying assumption that $n = 2^m$ for some integer $m$.

With this, we can rewrite $C(n)$ as

$$\begin{array}{rclc}
C(n) &=& 2C(2^{m-1}) + 2^m & \quad \scriptsize{\textrm{using the recursive relationship, let's rewrite $C(2^{m-1})$...}}\\\\
&=& 2[ 2 C(2^{m-2}) + 2^{m-1}] + 2^m & \quad \scriptsize{\textrm{we can of course do this multiple times...}}\\\\
&=& 2^2 C(2^{m-2}) + 2 \cdot 2^{m} &\\\\
&=& 2^2[ 2 C(2^{m-3}) + 2^{m-2}] + 2 \cdot 2^{m} &\\\\
&=& 2^3 C(2^{m-3}) + 3 \cdot 2^{m} & \quad \scriptsize{\textrm{by now the pattern should be clear...}}\\\\
&=& 2^k C(2^{m-k}) + k \cdot 2^{m} & \quad \scriptsize{\textrm{this holds for any integer $k$ up to $m$, thus...}}\\\\
&=& 2^m C(2^{m-m}) + m \cdot 2^{m} & \quad \scriptsize{\textrm{this holds when we consider $k=m$}}\\\\
&=& 2^m C(1) + m \cdot 2^{m} & \quad \scriptsize{\textrm{but recall the base case -- $C(1)=0$...}}\\\\
&=& m \cdot 2^m & \quad \scriptsize{\textrm{now note that $n = 2^m$ implies $m = \log_2 n$...}}\\\\
&=& (\log_2 n) \cdot n\\\\
\end{array}$$
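As a sanity check, the closed form $C(2^m) = m \cdot 2^m$ can be verified numerically against the recurrence (a small Python check; the function name `C` simply mirrors the notation in the text):

```python
from math import log2

def C(n):
    """The cost recurrence C(n) = 2*C(n/2) + n, with C(1) = 0 (n a power of two)."""
    return 0 if n == 1 else 2 * C(n // 2) + n

# The closed form says C(2^m) = m * 2^m, i.e. C(n) = n * log2(n).
for m in range(1, 11):
    n = 2 ** m
    assert C(n) == m * n == n * log2(n)
```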

Thus, the cost function $C(n)$ is $O(n \log n)$. (The base of the logarithm does not matter inside the big-O, since logarithms of different bases differ only by a constant factor.)