Introduction
In this guide, we will explore Heap Sort - the theory behind it and how to implement Heap Sort in JavaScript.
We will start off with what data structure it's based on (massive foreshadow here: it's a heap!), how to perform operations on that data structure, and how that data structure can be used as means of an efficient sorting algorithm.
Data structures and sorting algorithms are core concepts in programming. A computer program consistently deals with large datasets, retrieving and injecting data ad nauseam. The way we organize these datasets and operate on them is of great importance as it directly impacts the ease and speed with which the user interacts with our applications.
A sorting algorithm is evaluated based on two characteristics: the time and the space the algorithm uses as a function of the dataset's size. These are known as the Time Complexity and Space Complexity respectively, and allow us to "pit" algorithms against each other in average and best-case scenarios.
Heap Sort is regarded as an efficient algorithm, with average time complexity of θ(n log(n)).
Though there exist other algorithms outperforming Heap Sort in the average scenario, its significance relies on its power to perform with the same efficacy in the worst-case scenario as it does in the best, giving it a stable runtime over varying datasets, while some algorithms may suffer from large or small ones - depending on their underlying mechanism.
Heap Sort in JavaScript
Heap Sort is an in-place, non-stable, comparison-based sorting algorithm.
It does not require auxiliary data structures - it sorts the data in place and affects the original data (in-place). It doesn't preserve the relative order or equal elements. If you have two elements with the same value in an unsorted collection, their relative order might be changed (or stay the same) in the sorted collection (non-stable). Finally, the elements are compared to each other to find their order (comparison-based).
Although Heap Sort is in-place (doesn't require an auxiliary data structure), to make the implementation a bit clear, we will recruit an additional array during sorting.
The mechanism underlying Heap Sort is fairly simple and some even call it "Improved Selection Sort".
If you'd like to read more about Selection Sort, read our Selection Sort in JavaScript!
It starts by converting the unsorted array into a heap - either a max-heap or min-heap. In the case of a max-heap, each parent holds a greater value than its descendants, making the root element the largest among the heap and vice versa.
Heap Sort relies on this heap condition.
At each iteration, the algorithm removes the root of the heap and pushes it into an empty array. After each removal, the heap restores itself, bubbling its second-largest (or second-smallest) element up to the root to preserve its heap condition. This process is also known as heapifying and you'll oftentimes see people refer to methods doing this as heapify.
Heap Sort continues shifting the newly located root elements into the sorted array until there is none left.
Using a max-heap in this manner will result in an array with elements in descending order. For the array to be in ascending order, one has to opt for a min-heap.
This sort of self-sorting and selective removal is reminiscent of Selection Sort (sans the self-sorting part) hence the parallel people draw.
What Is a Heap?
A heap is a tree-like data structure. The type of heap we will use for our purposes will be a binary tree (a data structure that resembles a tree branch and is bound to start with one node and if were to branch out, is allowed a maximum of two successors extending from each node). While there exist few types of heaps, there are two distinctive features of a heap:
- A heap must be complete, meaning that each level of the tree should be filled from left to right, and one is not allowed to create another level of the tree without filling all the possible nodes remaining on the last level.
- Each node has to hold a value that is greater than or equal to (in the case of a min-heap, smaller than or equal to) the value of every one of its descendants. This is called the "heap condition".
Mapping a Heap to an Array
What we have defined and depicted as a heap up until this point is merely a diagram, a collection of circles and lines. To use this structure in a JavaScript-based computer program, we need to rework it into an array or a list.
Luckily, this is a fairly straightforward operation that mimics the way we build the heap in the first place. We read and shift the elements off of the heap into an array in the same order we have placed them into the heap: from left to right and level by level.
An example of a heap and its array counterpart, after this shift:
This way, not only can we manage to express a heap in code, but we also gain a compass with which to navigate inside that heap. We can deduct three equations that, given each node's index, will point us to the location of its parent and its right and left children inside the array:
Creating a Heap In JavaScript
Now that a detailed definition of a heap is in place, we can go ahead and implement it as a JavaScript class.
Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!
In this guide, we will create and employ a max-heap. Since the difference between a max-heap and a min-heap is trivial and does not affect the general logic behind the Heap Sort algorithm, the implementation of the min-heap and, therefore, creation of an ascending order via heap sort is a matter of changing the comparison operators.
Let's go ahead and define a MaxHeap
class:
class MaxHeap{
constructor(){
this.heap = [];
}
parentIndex(index){
return Math.floor((index-1)/2);
}
leftChildIndex(index){
return (2*index + 1);
}
rightChildIndex(index){
return (2*index + 2);
}
}
In the MaxHeap
class, we have defined a constructor that initializes an empty array. Later on, we will create additional functions to populate a heap inside this array.
For the time being, however, we have only created helper functions that will return the index of the parent and children of a given node.
Inserting Elements to a Heap
Whenever a new element is inserted into a heap, it is placed next to the rightmost node on the bottom level (the last empty space in the array representation) or, if the bottom level is already full, at the leftmost node of a new level. In this scenario, the heap's first requirement: completeness of the tree, is ensured.
Moving forward, the heap property, which has likely been disturbed, needs to be reestablished. To move the new element to its proper place on the heap it is compared to its parent, and if the new element is larger than its parent, the elements are swapped.
The new element is bubbled up in the heap, whilst being compared to its parent at each level until finally the heap property is restored:
Let's add this functionality to the MaxHeap class we have previously created:
swap(a, b) {
let temp = this.heap[a];
this.heap[a] = this.heap[b];
this.heap[b] = temp;
}
insert(item) {
this.heap.push(item);
var index = this.heap.length - 1;
var parent = this.parentIndex(index);
while(this.heap[parent] && this.heap[parent] < this.heap[index]) {
this.swap(parent, index);
index = this.parentIndex(index);
parent = this.parentIndex(index);
}
}
swap()
is added as a helper method to save us some redundancy in the code since while inserting the new element, we may have to perform this action several times - a number between zero and log(n) (in the case where the new element is larger than the root of the heap, and we have to make it climb the entire tree which has a height of log(the-total-number-of-its-elements) - which in other words, is a lot.
insert()
operates as follows:
- Appends the given element to the
heap
using the built-in JavaScript method:push()
. - Marks the last element of the
heap
asindex
and its parent asparent
. - While there exists an element of the heap at the index
parent
(this.heap[parent]
), and that element happens to be smaller than the one atindex
(this.heap[parent] < this.heap[index
), theinsert()
method goes on to swap the two (this.swap(parent, index)
) and moves its cursor one level up.
Removing Elements From the Heap
A heap only allows the deletion of the root element, which afterward leaves us with a completely distorted heap. Thereon, we first have to reinstate the complete binary tree property by moving the last node of the heap to the root. Then we need to bubble this misplaced value down until the heap property is back in place:
delete() {
var item = this.heap.shift();
this.heap.unshift(this.heap.pop());
var index = 0;
var leftChild = this.leftChildIndex(index);
var rightChild = this.rightChildIndex(index);
while(this.heap[leftChild] && this.heap[leftChild] > this.heap[index] || this.heap[rightChild] > this.heap[index]){
var max = leftChild;
if(this.heap[rightChild] && this.heap[rightChild] > this.heap[max]){
max = rightChild
}
this.swap(max, index);
index = max;
leftChild = this.leftChildIndex(max);
rightChild = this.rightChildIndex(max);
}
return item;
}
The delete()
method, which we create inside the MaxHeap
class, operates in the following manner:
- The method starts by harvesting the largest element -therefore, the first element in the array representation of the heap. The built-in
shift()
method removes the first element of the array and returns the removed element, which we then store in theitem
variable. - The last element of the
heap
gets removed viapop()
and gets placed to the recently emptied first space ofheap
viaunshift()
.unshift()
is a built-in JavaScript method that works as the counterpart toshift()
. Whileshift()
removes the first element of the array and shifts the rest of the elements one space back,unshift()
pushes an element to the beginning of the array and shifts the rest of the elements one space forward. - To be able to bubble the new root downward, pointers to the location of it, which is initially 0, and its two children (
index
,rightChild
,leftChild
) gets created. - The
while()
loop checks whether there exists a left child to theindex
node to ensure the existence of another level below (does not check for a right child yet) and if any of the children in this level is bigger than the node at [index
]. - If the condition inside the while loop is met, a
max
variable is created to declare that the left node is the maximum value the method has encountered yet. Then inside the loop, in anif
clause, we check whether a right child exists, and if it does, whether it is bigger than the left child we first checked. If the value of the right child is indeed bigger, its index replaces the value inmax
. - Whichever child holding the bigger value gets swapped with its parent via
this.swap(max, index)
. - The method moves its imaginary cursor one level down at the end of the while loop and goes on to execute the code inside the while loop over and over until its condition no longer holds.
Implementing Heap Sort in JavaScript
Finally, to achieve what this guide has promised, we create a heapSort()
function (this time outside the MaxHeap
class), and supply it with an array we'd like to sort:
function heapSort(arr){
var sorted = [];
var heap1 = new MaxHeap();
for(let i=0; i<arr.length; i++){
heap1.insert(arr[i]);
}
for(let i=0; i<arr.length; i++){
sorted.push(heap1.delete());
}
return sorted;
}
The heapSort()
takes the array to be sorted as its argument. Then, it creates an empty array to place the sorted version, as well as an empty heap via which to perform the sort.
Then, heap1
is populated with the elements of arr
and are deleted one by one, pushing the removed elements into the sorted array. The heap1
self-organizes with each removal, so just pushing the elements off of it into the sorted array nets us with a sorted array.
Let's create an array and test this out:
let arr = [1, 6, 2, 3, 7, 3, 4, 6, 9];
arr = heapSort(arr);
console.log(arr);
Conclusion
In this guide, we've learned about heap data structure and how Heap Sort operates.
While not being the fastest possible algorithm, Heap Sort can be advantageous when data is partially sorted or when there is a need for a stable algorithm.
Even though we have implemented it using an additional data structure, Heap Sort is essentially an in-place sorting algorithm and, for that reason, can also be used at times when memory usage is a concern.