Heap sort algorithm in C
Heap Sort
The heap sort algorithm, as the name suggests, is based on the concept of heaps. It begins by constructing a special type of binary tree, called a heap, out of the data to be sorted. Note:
A heap, by definition, is a complete binary tree in which each node is greater than or equal to every one of its descendants. (This is a max-heap, the kind used here.)
A semi-heap is a binary tree in which all the nodes except the root possess the heap property.
If N is the number of a node, counting from 1 at the root, then its left child is node 2*N and its right child is node 2*N+1.
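The definition above can be checked mechanically. The following is a minimal sketch rather than part of the sorting routines: the helper name isMaxHeap is hypothetical, and it assumes the tree is stored in a C array with 0-based indices, so the parent of index i sits at (i - 1) / 2 (the 1-based rule above shifted by one). It accepts a parent equal to its child, which is needed when the data contains duplicates.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if a[0..n-1] satisfies the max-heap
   property (no child is larger than its parent), 0 otherwise. */
int isMaxHeap(const int a[], int n){
    int i;
    assert(n >= 0);
    /* Every index from 1 onwards has a parent at (i - 1) / 2. */
    for (i = 1; i < n; ++i){
        if (a[(i - 1) / 2] < a[i])
            return 0;
    }
    return 1;
}
```

For example, it returns 1 for the array 10 6 8 4 1 2 5 3 and 0 for 8 6 10 3 1 2 5 4, where the 10 at index 2 is larger than the root.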
The root node of a heap is, by definition, the maximum of all the elements in the tree. Hence the sorting process consists of extracting the root node and reheaping the remaining elements to obtain the next largest element, until there are no more elements left to heap. Elementary implementations usually employ two arrays, one for the heap and the other to store the sorted data. But it is possible to use the same array both to heap the unordered list and to compile the sorted list. This is usually done by swapping the root of the heap with the last element of the heap's portion of the array and then excluding that element from any subsequent reheaping.
Significance of a semi-heap - A semi-heap, as mentioned above, is a heap except that the root does not possess the property of a heap node. This type of tree is significant in the discussion of heap sorting because, after each "heaping" of the data, the root is extracted and replaced by an element from the list. This leaves us with a semi-heap. Reheaping a semi-heap is particularly easy, since all the other nodes have already been heaped and only the root node has to be shifted downwards to its right position. The following C function takes care of reheaping a set of data or a part of it.
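For comparison, the two-array approach mentioned above might look like the following. This is only a sketch under stated assumptions: the names siftDown and heapSortTwoArrays are hypothetical, the scratch heap is allocated with malloc, and the input array is left untouched while the sorted data is written to a separate output array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: move heap[root] down until neither child is larger.
   n is the number of elements currently in the heap. */
static void siftDown(int heap[], int root, int n){
    int child, temp;
    while (2 * root + 1 < n){
        child = 2 * root + 1;                     /* left child */
        if (child + 1 < n && heap[child + 1] > heap[child])
            child = child + 1;                    /* right child is larger */
        if (heap[root] >= heap[child])
            return;
        temp = heap[root];
        heap[root] = heap[child];
        heap[child] = temp;
        root = child;
    }
}

/* Sorts in[0..n-1] into out[0..n-1] in ascending order using a separate
   heap array. Returns 0 on success, -1 if the scratch allocation fails. */
int heapSortTwoArrays(const int in[], int out[], int n){
    int i, size;
    int *heap;
    assert(n >= 0);
    if (n == 0)
        return 0;
    heap = malloc((size_t)n * sizeof *heap);
    if (heap == NULL)
        return -1;
    memcpy(heap, in, (size_t)n * sizeof *heap);
    for (i = n / 2 - 1; i >= 0; --i)   /* heap the copy */
        siftDown(heap, i, n);
    for (size = n; size > 0; --size){
        out[size - 1] = heap[0];       /* current maximum goes to the back */
        heap[0] = heap[size - 1];      /* last element becomes the new root */
        siftDown(heap, 0, size - 1);   /* reheap the remaining elements */
    }
    free(heap);
    return 0;
}
```

The extra array costs memory proportional to the input, which is the overhead the single-array version avoids.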
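void downHeap(int a[], int root, int bottom){
    int maxchild, temp, child;
    /* Loop while root has at least a left child within a[0..bottom]. */
    while (root*2 < bottom){
        child = root * 2 + 1;              /* left child */
        if (child == bottom){
            maxchild = child;              /* no right child */
        }else{
            /* pick the larger of the two children */
            if (a[child] > a[child + 1])
                maxchild = child;
            else
                maxchild = child + 1;
        }
        if (a[root] < a[maxchild]){
            temp = a[root];
            a[root] = a[maxchild];
            a[maxchild] = temp;
        }
        else return;                       /* heap property holds */
        root = maxchild;                   /* continue from the child */
    }
}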
In the above function, both root and bottom are indices into the array; bottom is the index of the last element that still belongs to the heap. Note that, theoretically speaking, we generally number the nodes from 1 through the size of the array. But in C array indexing begins at 0, and so the left child is
child = root * 2 + 1
/* so, e.g., if root = 0, child = 1 (not 0); the right child is child + 1 */
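The 0-based index arithmetic can be collected in three small helpers. The names leftChild, rightChild and parent are hypothetical, introduced only for illustration; downHeap computes the same values inline.

```c
#include <assert.h>

/* Hypothetical helpers for a heap stored in a 0-based C array. */
int leftChild(int i)  { return 2 * i + 1; }
int rightChild(int i) { return 2 * i + 2; }

/* Only meaningful for i > 0: the root (index 0) has no parent. */
int parent(int i){
    assert(i > 0);
    return (i - 1) / 2;
}
```

With these, the children of the root are at indices 1 and 2, and both of them report index 0 as their parent.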
In the function, starting from root, each iteration of the loop checks whether the current node satisfies the heap property and, if it does not, swaps it with its larger child and continues from that child's position. If the node already conforms, the function returns to the caller. Note that the function assumes that the tree constituted by root and all its descendants is a semi-heap.
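To watch the root element travel downwards, the same loop can be instrumented with printf calls. This is a sketch, not a replacement for downHeap: the name downHeapTrace and the message wording are assumptions made for illustration, and the child selection is folded into one condition without changing which child is chosen.

```c
#include <assert.h>
#include <stdio.h>

/* Same logic as downHeap, with a printf at each step so the path of the
   root element can be followed. bottom is the last index in the heap. */
void downHeapTrace(int a[], int root, int bottom){
    int maxchild, temp, child;
    assert(root >= 0);
    while (root * 2 < bottom){
        child = root * 2 + 1;
        /* The right child is only examined when it exists. */
        if (child == bottom || a[child] > a[child + 1])
            maxchild = child;
        else
            maxchild = child + 1;
        if (a[root] >= a[maxchild]){
            printf("a[%d]=%d needs no swap, stop\n", root, a[root]);
            return;
        }
        printf("swap a[%d]=%d with a[%d]=%d\n",
               root, a[root], maxchild, a[maxchild]);
        temp = a[root];
        a[root] = a[maxchild];
        a[maxchild] = temp;
        root = maxchild;
    }
}
```

Called as downHeapTrace(a, 0, 6) on the semi-heap 3 6 8 4 1 2 5, it reports two swaps (first with a[2]=8, then with a[6]=5) and leaves the array as 8 6 5 4 1 2 3.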
Now that we have a downheaper, what we need is the actual sorting routine.
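void heapsort(int a[], int array_size){
    int i;
    /* Phase 1: heap the whole array, from the last parent up to the root. */
    for (i = (array_size/2 -1); i >= 0; --i){
        downHeap(a, i, array_size-1);
    }
    /* Phase 2: move the current maximum to position i, then reheap a[0..i-1]. */
    for (i = array_size-1; i >= 0; --i){
        int temp;
        temp = a[i];
        a[i] = a[0];
        a[0] = temp;
        downHeap(a, 0, i-1);
    }
}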
Note that before the actual sorting takes place, the list is heaped by the first for loop, which starts from the middle element of the list (index array_size/2 - 1, the parent of the last leaf of the tree) and works back to the root. Elements beyond the middle are leaves and are therefore trivially heaps already.
for (i = (array_size/2 -1); i >= 0; --i){
    downHeap(a, i, array_size-1);
}
Following this is the loop that actually extracts the root and builds the sorted list. Notice the swap of the ith element with the root, followed by a reheaping of the remaining elements 0 through i-1.
for (i = array_size-1; i >= 0; --i){
    int temp;
    temp = a[i];
    a[i] = a[0];
    a[0] = temp;
    downHeap(a, 0, i-1);
}
The following are some snapshots of the array during the sorting process. The unordered list -
8 6 10 3 1 2 5 4
After the initial heaping done by the first for loop.
10 6 8 4 1 2 5 3
Second loop which extracts root and re-heaps.
8 6 5 4 1 2 3 10 } pass 1
6 4 5 3 1 2 8 10 } pass 2
5 4 2 3 1 6 8 10 } pass 3
4 3 2 1 5 6 8 10 } pass 4
3 1 2 4 5 6 8 10 } pass 5
2 1 3 4 5 6 8 10 } pass 6
1 2 3 4 5 6 8 10 } pass 7
1 2 3 4 5 6 8 10 } pass 8
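Snapshots like these can be reproduced by printing the array after the heaping phase and after every pass. The following is a sketch under stated assumptions: heapsortTrace and printArray are hypothetical names, the output format only approximates the listing above, and downHeap is restated in compact form (same behaviour, fewer lines) so that the block compiles on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Compact restatement of downHeap; bottom is the last index that still
   belongs to the heap. On a tie the right child is chosen, as before. */
static void downHeap(int a[], int root, int bottom){
    int maxchild, temp;
    while (root * 2 < bottom){
        maxchild = root * 2 + 1;
        if (maxchild < bottom && a[maxchild + 1] >= a[maxchild])
            ++maxchild;
        if (a[root] >= a[maxchild])
            return;
        temp = a[root];
        a[root] = a[maxchild];
        a[maxchild] = temp;
        root = maxchild;
    }
}

/* Hypothetical helper: print a[0..n-1] on one line, space separated. */
static void printArray(const int a[], int n){
    int k;
    for (k = 0; k < n; ++k)
        printf("%d ", a[k]);
}

/* heapsort with a snapshot printed after the heaping phase and after
   every pass of the extraction loop. */
void heapsortTrace(int a[], int n){
    int i, temp;
    assert(n >= 0);
    for (i = n / 2 - 1; i >= 0; --i)
        downHeap(a, i, n - 1);
    printArray(a, n);
    printf("} heaped\n");
    for (i = n - 1; i >= 0; --i){
        temp = a[i];
        a[i] = a[0];
        a[0] = temp;
        downHeap(a, 0, i - 1);
        printArray(a, n);
        printf("} pass %d\n", n - i);
    }
}
```

Calling heapsortTrace(a, 8) on the unordered list 8 6 10 3 1 2 5 4 should print the heaped array followed by the eight passes listed above. The last pass changes nothing, because by then only one element remains in the heap.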
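Heap sort is one of the preferred sorting algorithms when the number of data items is large: it runs in O(n log n) time even in the worst case and, as shown above, can sort within a single array. In practice, however, it is generally considered slower than quick sort and merge sort.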