XYZ CODE: Merge Sort

Back>

Merge Sort is a Divide and Conquer algorithm. It divides input array in two halves, calls itself for the two halves and then merges the two sorted halves.

MergeSort.java

Merge sort (imagine it as a robot No1) takes an array and its 2 boundaries as the initial input.
if the array size is smaller than the threshold, it starts to sort the array with any dumb method exists on planet earth.
if the array size is still larger than threshold, it breaks the array into 2 by finding the middle point, roughly (lo + hi)/2 from the two boundaries.

It then passes the array and the boundaries of the left half into another merge sort robot No2 (or itself if no other thread is around). It then passes the array and the boundaries of the right half into another merge sort robot No3. Once the two merge sort robots have done with their arrays, the robot No2 get left half sorted, the robot No3 get the right half sorted. The robot No1 then merge the two sorted halves into one full sorted array.

The rule of merge is the hardest part -- even there are steps, they need to be put together in order to comprehend. Once you do, you will regret reading these steps.
step1. copy the array to a tmp array. The tmp array now contains the two sorted halves.
step2. Take the first elements from left half and right half, compare them, put the smaller one into the first slot of the original array.
step3. The remaining larger element then compares with the second element from the other half, the smaller one is then put to the second slot of the original array.
step4. The remaining larger one then compare with the next element in the other half.
...
The comparison continues until either the left half or the right half is exausted.
step final. Then all the elements from the remaining half are copied into the original array without comparison. This bulk copy step is essentially where the merge sort algorithm wins.
The total memory complexity is O(n) -- divide don't take extra memory to store array elements, it just generate one extra layer of program call stack (a few variables). When n is large, the dominant factor for memory occupation is the array storage.

The performance is O(nlogn). Eventually, any algorithm has to process each element at least once in order to move it to the place, you might be regret if skipping any one of them. Nothing can beat O(n) for sorting so far. Larger n means more stuff to exam, that's just the nature. logn comes from the depth of binary dividing and merging. As you see from the curve, nlogn curve looks like straight line, it fits for problems small as well as large.

Feel the algorithm

stack content:

function merge(a, lo, mid, hi) {  
  // Merge a[lo..mid] with a[mid+1..hi].
  var i = lo; 
  var j = mid+1;
  for (var k = lo; k <= hi; k++) { 
    // Copy a[lo..hi] to aux[lo..hi].
    aux[k] = a[k];
  }
  for (var k = lo; k <= hi; k++) { 
    // Merge back to a[lo..hi].
    if      (i > mid)              a[k] = aux[j++];
    else if (j > hi )              a[k] = aux[i++];
    else if (aux[j] < aux[i])      a[k] = aux[j++];
    else                           a[k] = aux[i++];
  }
}

function sort_m(a) {
  mergeSort(a, 0, a.length - 1);
}

function mergeSort(a, lo, hi) {  
  // Sort a[lo..hi].
  //alert('lo = ' + lo + ', hi = ' + hi);
  if (hi <= lo) {
    return;
  }
  var mid = lo + Math.floor((hi - lo)/2);
  mergeSort(a, lo, mid);       // Sort left half.
  mergeSort(a, mid+1, hi);     // Sort right half.
  displayArray(a, lo, mid);
  displayArray(a, mid+1, hi);  
  merge(a, lo, mid, hi); 
  displayArray(a, lo, hi);
}

XYZ CODE

Merge Sort

No comments:

Post a Comment