The MergeSort Algorithm Tutorial

MergeSort is a sorting algorithm which is more on the advanced end. It is very fast, but unfortunately uses up a lot of memory due to the recursions. It works by recursively splitting the array into two parts (at the midpoint if the number of elements is even, otherwise the right half has one extra) and recursing onto each of the two halves. Afterwards, the two halves are merged together (merging explained below). Here is a visual example of how MergeSort recursively splits the array in half for sorting. I'm using a diagram of a binary tree to represent the steps in our recursion process.

Once the array is recursively split into single elements (3, 5, 4, 9, and 2 above), according to the recursive definition we know that those elements are in a sorted order, right? An array {3} is sorted, because it has only one element. When the recursion is rolling backwards (after expanding), it merges the two branched columns together. Here's a graphical example of how the last merge works (merging [3][5] with [2][4][9]):

The first comparison is made:

You may be wondering why it's [2][4][9] and not [4][9][2]. It's [2][4][9], because the array we're merging is already sorted as it was merged the same way in all of the recursions below the current one (below, like in the tree).

It appears that 2 is less than 3, so 2 becomes the first element in our array and is removed from the sub-array. To remove, you can either dequeue it if you choose to implement Queues or simply increment the 'element being compared' index variable like I did if you use arrays.

The second comparison is made:

The merging continues as explained while both arrays have elements. As soon as one of the arrays becomes empty, all the elements of the other array are blindly (without any comparisons) appended onto our final array. Those elements are already in ascending (sorted) order and are higher than all of the elements presently in the final array. By 'final array', I'm referring to the array we're building while merging the two sub-arrays together... it is the array that our MergeSort method returns.


A possible implementation of the MergeSort algorithm for integer arrays (for simplicity) in Java follows. Forgive me for the complexity, I didn't initially expect it to turn out this bad, but I've made the comments as detailed as possible.

public int[] mergeSort(int array[])
// pre: array is full, all elements are valid integers (not null)
// post: array is sorted in ascending order (lowest to highest)
{
	// if the array has more than 1 element, we need to split it and merge the sorted halves
	if(array.length > 1)
	{
		// number of elements in sub-array 1
		// if odd, sub-array 1 has the smaller half of the elements
		// e.g. if 7 elements total, sub-array 1 will have 3, and sub-array 2 will have 4
		int elementsInA1 = array.length/2;
		// since we want an even split, we initialize the length of sub-array 2 to
		// equal the length of sub-array 1
		int elementsInA2 = elementsInA1;
		// if the array has an odd number of elements, let the second half take the extra one
		// see note (1)
		if((array.length % 2) == 1)
			elementsInA2 += 1;
		// declare and initialize the two arrays once we've determined their sizes
		int arr1[] = new int[elementsInA1];
		int arr2[] = new int[elementsInA2];
		// copy the first part of 'array' into 'arr1', causing arr1 to become full
		for(int i = 0; i < elementsInA1; i++)
			arr1[i] = array[i];
		// copy the remaining elements of 'array' into 'arr2', causing arr2 to become full
		for(int i = elementsInA1; i < elementsInA1 + elementsInA2; i++)
			arr2[i - elementsInA1] = array[i];
		// recursively call mergeSort on each of the two sub-arrays that we've just created
		// note: when mergeSort returns, arr1 and arr2 will both be sorted!
		// it's not magic, the merging is done below, that's how mergesort works :)
		arr1 = mergeSort(arr1);
		arr2 = mergeSort(arr2);
		
		// the three variables below are indexes that we'll need for merging
		// [i] stores the index of the main array. it will be used to let us
		// know where to place the smallest element from the two sub-arrays.
		// [j] stores the index of which element from arr1 is currently being compared
		// [k] stores the index of which element from arr2 is currently being compared
		int i = 0, j = 0, k = 0;
		// the below loop will run until one of the sub-arrays becomes empty
		// in my implementation, it means until the index equals the length of the sub-array
		while(arr1.length != j && arr2.length != k)
		{
			// if the current element of arr1 is less than current element of arr2
			if(arr1[j] < arr2[k])
			{
				// copy the current element of arr1 into the final array
				array[i] = arr1[j];
				// increase the index of the final array to avoid replacing the element
				// which we've just added
				i++;
				// increase the index of arr1 to avoid comparing the element
				// which we've just added
				j++;
			}
			// if the current element of arr2 is less than current element of arr1
			else
			{
				// copy the current element of arr1 into the final array
				array[i] = arr2[k];
				// increase the index of the final array to avoid replacing the element
				// which we've just added
				i++;
				// increase the index of arr2 to avoid comparing the element
				// which we've just added
				k++;
			}
		}
		// at this point, one of the sub-arrays has been exhausted and there are no more
		// elements in it to compare. this means that all the elements in the remaining
		// array are the highest (and sorted), so it's safe to copy them all into the
		// final array.
		while(arr1.length != j)
		{
			array[i] = arr1[j];
			i++;
			j++;
		}
		while(arr2.length != k)
		{
			array[i] = arr2[k];
			i++;
			k++;
		}
	}
	// return the sorted array to the caller of the function
	return array;
}

Note (1): When you divide an integer 7 by 2 in Java (or any other similar programming language), the result will be 3, not 3.5 or 4! Java does not round integer divisions, it simply cuts off the decimal points. So to split an array of size 7 into 2 halves, we divide 7 by 2 and get 3, which we record as the length of the first array. We then set the length of the second array to be 3 as well. After that, we check if 7 mod 2 == 1 (the array has an odd number of elements) and if so, we increase the length of the second array from 3 to 4. That way, 7 is split into 3 and 4.

Written by Pavel

Copyright © 2005-2016 MyCSTutorials.com