We now consider another way of implementing a selection sorting algorithm using a more

efficient data structure we have already studied. The underlying idea here is that it would

help if we could pre-arrange the data so that selecting the smallest/biggest entry becomes

easier.

For that, remember the idea of a priority queue discussed earlier. We can take the

value of each item to be its priority and then queue the items accordingly.

Then, if we remove the item with the highest priority at each step we can fill an array in order ‘from the rear’, starting with the biggest item.

An obvious way of implementing such a priority queue would be to use a sorted array, so that the entry

with the highest priority appears in a[n]. Removing this item would be very simple, but

inserting a new item would always involve finding the right position and shifting a number of items to the right to make room for it.

For example, inserting a 3 into the queue [1, 2, 4] requires shifting the 4 one place to the right to make room, giving [1, 2, 3, 4].
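The cost of this shifting can be made concrete with a small sketch (a minimal Python illustration; the function name `insert_sorted` is ours, not from the text):

```python
def insert_sorted(a, x):
    """Insert x into the already-sorted list a, keeping it sorted.

    Every item larger than x must first be shifted one place to
    the right, which is O(n) work in the worst case.
    """
    a.append(None)               # make room at the rear
    i = len(a) - 2
    while i >= 0 and a[i] > x:   # shift larger items right
        a[i + 1] = a[i]
        i -= 1
    a[i + 1] = x                 # drop x into the gap
    return a
```

For instance, `insert_sorted([1, 2, 4], 3)` shifts the 4 and returns `[1, 2, 3, 4]`.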

That kind of item insertion is effectively insertion sort and clearly inefficient in general, of

O(n) complexity rather than **O(log2 n)** with a binary heap tree. Another approach would be to use an unsorted array.

In this case, a new item would be inserted by just putting it into **a[n+1]**, but to delete the entry with the highest priority would involve having to find it first.

Then, after that, the last item would have to be swapped into the gap, or all items with a higher index ‘shifted down’. Again, that kind of item deletion is clearly inefficient in general, of **O(n)** complexity rather than O(log2 n) with a heap tree.
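The deletion from an unsorted array can be sketched as follows (again our own illustrative helper, using the cheaper swap-the-last-item-into-the-gap variant):

```python
def delete_max(a):
    """Remove and return the largest item of the unsorted list a.

    Finding the maximum needs an O(n) scan; the last item is then
    swapped into the gap so that no shifting is required.
    """
    m = 0
    for i in range(1, len(a)):   # O(n) scan for the maximum's index
        if a[i] > a[m]:
            m = i
    a[m], a[-1] = a[-1], a[m]    # fill the gap with the last item
    return a.pop()
```

For example, `delete_max([3, 7, 1])` returns 7 and leaves `[3, 1]` behind: the scan, not the removal, is what costs O(n).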

Thus, of those three representations, only one is of use in carrying out the above idea

efficiently. An unsorted array is what we started from, so that is not any help, and ordering

the array is what we are trying to achieve, so heaps are the way forward.

To make use of binary heap trees, we first have to take the unsorted array and re-arrange

it so that it satisfies the heap tree priority ordering. We have already studied the heapify

algorithm which can do that with O(n) time complexity. Then we need to extract the sorted array from it.

In the heap tree, the item with the highest priority, that is the largest item, is always in **a[1]**. In the sorted array, it should be in the last position **a[n]**.

If we simply swap the two, we will have that item at the right position of the array, and also have begun the standard procedure of removing the root of the heap-tree, since **a[n]** is precisely the item that would be moved into the root position at the next step. Since **a[n]** now contains the correct item, we will never have to look at it again.

Instead, we just take the items **a[1],…,a[n-1]** and bring them back into a heap-tree form using the bubble down procedure on the new root, which we know to have complexity **O(log2 n)**.

Now the second largest item is in position **a[1]**, and its final position should be **a[n-1]**,

so we now swap these two items. Then we rearrange **a[1],…,a[n-2]** back into a heap tree

using the bubble down procedure on the new root. And so on.

When the i-th step has been completed, the items **a[n-i+1],…,a[n]** will have the correct

entries, and there will be a heap tree for the items **a[1],…,a[n-i]**. Note that the size,

and therefore the height, of the heap tree decreases at each step.

As a part of the next step, we have to bubble down the new root. This will take at most twice as many comparisons as the height of the original heap tree, which is **log2 n**. So overall there are **n − 1** steps, with at most **2 log2 n** comparisons each, totalling **2(n − 1) log2 n**.

The number of comparisons will actually be less than that, because the number of bubble down steps will usually be less than the full height of the tree, but usually not much less, so the time complexity is still O(nlog2 n).

The full **Heapsort** algorithm can thus be written in a very simple form. First heapify converts the array into a binary heap tree, and then the for loop moves each successive root one item at a time into the correct position in the sorted array:

```
heapSort(array a, int n) {
   heapify(a,n)
   for( j = n ; j > 1 ; j-- ) {
      swap a[1] and a[j]
      bubbleDown(1,a,j-1)
   }
}
```
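A runnable version of the same loop can be sketched in Python, keeping the text's 1-based convention (a[0] unused; helper names are ours):

```python
def bubble_down(i, a, n):
    """Restore the max-heap property of a[1..n] below position i."""
    while 2 * i <= n:
        c = 2 * i
        if c + 1 <= n and a[c + 1] > a[c]:
            c += 1                  # larger of the two children
        if a[i] >= a[c]:
            break
        a[i], a[c] = a[c], a[i]
        i = c

def heapify(a, n):
    """Turn a[1..n] into a max-heap, bottom up, in O(n) time."""
    for i in range(n // 2, 0, -1):
        bubble_down(i, a, n)

def heap_sort(a, n):
    """Sort a[1..n] in place: heapify, then repeatedly swap the
    root to the rear and shrink the heap by one."""
    heapify(a, n)
    for j in range(n, 1, -1):
        a[1], a[j] = a[j], a[1]     # current maximum to its final place
        bubble_down(1, a, j - 1)    # restore the heap on a[1..j-1]
```

For example, `heap_sort(a, 6)` on `a = [None, 5, 3, 8, 1, 9, 2]` leaves `a[1:]` equal to `[1, 2, 3, 5, 8, 9]`.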

It is clear from the swap step that the order of identical items can easily be reversed, so there is no way to render the **Heapsort** algorithm stable.
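A small sketch makes the instability concrete (our own example: the items are (key, tag) pairs and only the keys are compared):

```python
def bubble_down(i, a, n):
    """As before, but items are (key, tag) pairs and only keys compare."""
    while 2 * i <= n:
        c = 2 * i
        if c + 1 <= n and a[c + 1][0] > a[c][0]:
            c += 1
        if a[i][0] >= a[c][0]:
            break
        a[i], a[c] = a[c], a[i]
        i = c

def heap_sort(a, n):
    for i in range(n // 2, 0, -1):
        bubble_down(i, a, n)
    for j in range(n, 1, -1):
        a[1], a[j] = a[j], a[1]     # this swap is what breaks stability
        bubble_down(1, a, j - 1)

a = [None, (2, 'a'), (2, 'b'), (1, 'c')]
heap_sort(a, 3)
# the two items with key 2 now come out as 'b' before 'a':
# their original relative order has been reversed
```

The result is `[(1, 'c'), (2, 'b'), (2, 'a')]`: correctly sorted by key, but the equal-keyed pair has swapped places.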

The average and worst-case time complexities of the entire **Heapsort algorithm** are given

by the sum of two complexity functions: first that of heapify, which rearranges the original unsorted array into a heap tree in O(n), and then that of making the sorted array out of the heap tree, which is **O(n log2 n)**, coming from the O(n) bubble-downs, each of which has **O(log2 n)** complexity.

Thus the overall average and worst-case complexities are both **O(nlog2 n)**, and

we now have a sorting algorithm that achieves the theoretical best worst-case time complexity for comparison-based sorting.

Using more sophisticated priority queues, such as Binomial or Fibonacci heaps, cannot

improve on this because they have the same delete time complexity.

A useful feature of **Heapsort** is that if only the largest m ≪ n items need to be found and sorted, rather than all n, the complexity of the second stage is only **O(m log2 n)**, which can easily be less than **O(n)** and thus render the whole algorithm only **O(n)**.
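This partial sort is just the main loop stopped after m extractions; a sketch under the same 1-based convention (helper names are ours, not from the text):

```python
def bubble_down(i, a, n):
    """Restore the max-heap property of a[1..n] below position i."""
    while 2 * i <= n:
        c = 2 * i
        if c + 1 <= n and a[c + 1] > a[c]:
            c += 1
        if a[i] >= a[c]:
            break
        a[i], a[c] = a[c], a[i]
        i = c

def heapify(a, n):
    """Turn a[1..n] into a max-heap, bottom up, in O(n) time."""
    for i in range(n // 2, 0, -1):
        bubble_down(i, a, n)

def largest_m(a, n, m):
    """Return the m largest items of a[1..n] in increasing order.

    One O(n) heapify plus m root extractions at O(log n) each:
    O(n + m log n) overall, as described in the text.
    """
    heapify(a, n)
    for j in range(n, n - m, -1):   # only m trips round the loop
        a[1], a[j] = a[j], a[1]
        bubble_down(1, a, j - 1)
    return a[n - m + 1 : n + 1]
```

For instance, `largest_m(a, 6, 2)` on `a = [None, 5, 3, 8, 1, 9, 2]` returns `[8, 9]` without ever fully sorting the array.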
