We have studied a few in-place sorts that turn a randomly ordered array into its lowest entropy state -- a totally aligned array -- at the cost of time and space.
These in-place sorts -- insertion sort, selection sort and shell sort -- are interesting because they have a small memory footprint: they don't need an extra stack to hold pending states during a recursive divide and conquer, and they don't need an extra array to hold intermediate results. All they need is the array they are trying to sort and one variable for swapping.
Insertion sort and selection sort have O(n^2) time complexity, while shell sort is an improved insertion sort. With shell sort, elements sometimes move in larger steps and bubble into place quicker, achieving O(n log n) performance in the best case. Within the same memory limitation, can we do the sorting faster?
Maybe, if we spend some time building a tool to facilitate the sort. If the time spent building the tool is not outrageous, the tool might change the rules of the game, and we save time over the long journey of sorting that follows.
We can build a binary heap, a data structure where the keys are stored in an array such that each key is guaranteed to be larger than or equal to the keys at two other specific positions.
In a heap, the parent of the node in position k is in position floor(k/2), and the two children of the node in position k are in positions 2k and 2k + 1. We can travel up and down the heap by array index arithmetic: k/2 moves us up the heap, while 2k or 2k + 1 moves us down the heap. For the math to work, the heap's array index starts from 1 instead of 0.
[Figure: a heap represented as an array]
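Here is a minimal sketch of that index arithmetic in JavaScript (the helper names parent, leftChild and rightChild are just for illustration and are not used in the sort code below):

function parent(k)     { return Math.floor(k / 2); } // k/2 moves us up the heap
function leftChild(k)  { return 2 * k; }             // 2k moves us down the heap
function rightChild(k) { return 2 * k + 1; }         // 2k + 1 moves us down the heap

// with a 1-indexed heap, node 5's parent is node 2, and its children are nodes 10 and 11
console.log(parent(5), leftChild(5), rightChild(5)); // 2 10 11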
For insert, we add the new key at the end of the array, and then swim up through the heap with that key to restore the heap condition.
For delete, we take the root node off the top, put the item from the end of the heap at the top, and then sink down through the heap with that key to restore the heap condition.
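As a rough sketch of those two operations, assuming a max-heap stored 1-indexed in an array pq with its key count in n (the names pq, insert, delMax and swim are just illustrative; sink here is a standalone variant of the sink used in the sort code below):

var pq = [null];   // pq[0] is unused; heap keys live in pq[1..n]
var n = 0;         // current number of keys in the heap

function swap(i, j) { var t = pq[i]; pq[i] = pq[j]; pq[j] = t; }

// swim: move pq[k] up while it is larger than its parent
function swim(k) {
    while (k > 1 && pq[Math.floor(k / 2)] < pq[k]) {
        swap(Math.floor(k / 2), k);
        k = Math.floor(k / 2);
    }
}

// sink: move pq[k] down while it is smaller than its larger child
function sink(k) {
    while (2 * k <= n) {
        var j = 2 * k;
        if (j < n && pq[j] < pq[j + 1]) j++;  // pick the larger child
        if (pq[k] >= pq[j]) break;
        swap(k, j);
        k = j;
    }
}

// insert: add the new key at the end, then swim it up
function insert(key) {
    pq[++n] = key;
    swim(n);
}

// delete: take the root off the top, put the last key at the top, sink it down
function delMax() {
    var max = pq[1];
    swap(1, n--);
    pq.pop();       // drop the old root from the end of the array
    sink(1);
    return max;
}

[3, 9, 1, 7].forEach(insert);
console.log(delMax(), delMax()); // 9 7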
Feel the algorithm
Here are some examples to help you get a feel for the algorithm and "get into" it to see how it works internally. If they are boring, just skip directly to the code.
Total alignment is a rare thing in nature or society; it is just too hard to keep it that way all the time. If we look around, most of the time, elements are aligned in a tree structure. It is a very efficient way of distributing information. A node in the tree only has to know its connections to its parent node and children nodes in order to pass information up and down. The root node is nothing special: it just doesn't have a parent, and it even has fewer connections than other nodes, but collectively, this node connects to everything. If you are blind and have a question -- which is the largest node -- which node do you think we should seek for the answer? Go ask the leaves and branches, and they will all tell you to keep seeking downward -- and sooner or later you will know the answer: the root node is the largest.
If we examine a tree, the largest part is the root, and the root branches out. Each branch is smaller than the root. If we examine any branch, it either connects to a leaf or connects to even smaller branches. The overall impression of a tree is that the branches get smaller upward and bigger downward; the root is the biggest branch. At the bottom, the root can branch out into one huge branch and a few much smaller branches, so you only roughly find smaller branches as you go upward -- not perfect, just a rule of thumb.
We want this semi-ordered structure. Once we have it, the problem of sorting n elements is as simple as asking the tree n times: give me the largest element.
What's the cost of building and maintaining a tree? A natural tree is built from the root, but it costs the tree its whole life to maintain it; once the root is gone, the tree is dead. So we have to introduce mobility into the natural tree.
That is how organizations are structured. They are constructed hierarchically, with the root at the organization's president. It is not because humans like it that way; it has just been naturally adopted by many cultures, maybe because it is a good way of distributing information and maintaining stability through change. Most organizations only have upward mobility; downward mobility is rare. That is just human nature. However, if we imagine ourselves in the movie Band of Brothers, we might rethink downward mobility. A wrong position will get you or your friends killed quickly. In that situation, upward and downward mobility are equally common. An army structure works this way: if the leader of a branch gets killed or promoted, a new member (the demoted one) is added to maintain the branch, and the best one of them becomes the new leader. Soon, one of the branch members is not comfortable in this dangerous position, and if he is lucky there is a better member under his leadership who can swap duties with him; a willing or unwilling demotion will bring a better-fit member into the branch. The essence is: imagine you are one of those branch members; the time it takes to move up (and get killed quicker) depends on how many members are fitter than you in the same branch, and the time it takes to move down (and get safer) depends on whether any member under your leadership is better than you.
Heap sort is a simplified model of the above structure. It makes two assumptions: each parent node is larger than its child nodes, and each node has at most 2 children.
The number 2 is just a convenient number: it gives the best upward mobility for members, and it is easy to calculate the parent node and child nodes mathematically. As we observe in life, a bigger branch size such as 3, 4 or 5 might be more efficient. It is mathematically understandable: for 2, the tree height is log N; if the branch size is 3, 4 or 5, the tree will be shorter, but the math of finding the children or the parent will be a few if and mod calculations harder.
For example, assuming a 1-indexed ternary heap where the children of node k sit at 3k - 1, 3k and 3k + 1, classifying a node N would look like this:

function ternaryPosition(N) {
    if (N % 3 === 0)            return "middle member"; // N = 3k
    else if ((N + 1) % 3 === 0) return "left member";   // N = 3k - 1
    else                        return "right member";  // N = 3k + 1
}
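And a sketch of the corresponding parent/child index arithmetic under the same 3k - 1 / 3k / 3k + 1 convention (the helper names are just for illustration):

// children of node k are at 3k - 1, 3k and 3k + 1
function ternaryChildren(k) { return [3 * k - 1, 3 * k, 3 * k + 1]; }
// the parent of node m is floor((m + 1) / 3) for all three child positions
function ternaryParent(m)   { return Math.floor((m + 1) / 3); }

console.log(ternaryChildren(1)); // [2, 3, 4]
console.log(ternaryParent(4));   // 1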
function exch(a, i, j) { var t = a[i]; a[i] = a[j]; a[j] = t; }

//heap: parent node is bigger than any of its child nodes
//sink a[k] down within the heap a[1..N], where N is the last heap index
function sink(a, k, N) {
    while (2*k <= N) {
        var j = 2*k;                        // left child
        if (j < N && a[j] < a[j+1]) j++;    // pick the larger child
        if (a[k] >= a[j]) break;            // heap condition restored
        exch(a, k, j);
        k = j;
    }
}

function sort(a) {
    var N = a.length - 1;  // heap occupies a[1..N]; a[0] is fixed at the end
    //build a heap, root at a[1], children of a[i] are at a[i*2] and a[i*2 + 1]
    for (var k = Math.floor(N/2); k >= 1; k--) {
        sink(a, k, N);
    }
    //sortdown
    while (N > 1) {
        exch(a, 1, N--);  //remove heap head node (put at array end)
        sink(a, 1, N);    //fix heap
    }
    //fixing a[0]: one insertion pass moves it into place in the sorted a[1..]
    for (var j = 1; j < a.length && a[j] < a[j-1]; j++) {
        exch(a, j, j-1);
    }
}
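A quick usage check (the sample array is arbitrary):

var a = [7, 3, 9, 1, 5, 8, 2];
sort(a);
console.log(a); // [1, 2, 3, 5, 7, 8, 9]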
Heapsort has O(N log N) time complexity, and it is an in-place sorting algorithm: the extra space complexity is O(1).