Introduction
Sorting is one of the fundamental techniques used in solving problems, especially in those related to writing and implementing efficient algorithms.
Usually, sorting is paired with searching - meaning we first sort elements in the given collection, then search for something within it, as it is generally easier to search for something in a sorted, rather than an unsorted collection, as we can make educated guesses and impose assumptions on the data.
There are many algorithms that can efficiently sort elements, but in this guide we'll be taking a look at how to implement Counting Sort in Java.
Counting Sort in Java
Counting Sort is a stable, non-comparative sorting algorithm, and its main use is for sorting arrays of non-negative integers.
Counting Sort counts the number of objects that have distinct key values, and then applying a prefix sum on those counts to determine the position of each key in the output. Like all other non-comparative sorting algorithms, Counting Sort also performs the sort without any comparisons between the elements to be sorted. Also, being a stable sorting algorithm, Counting Sort preserves the order of the elements with equal keys sorted in the output array as they were in the original array.
This operation results in, essentially, a list of integer occurrences, which we typically name the count array. Counting Sort uses the auxiliary count array to determine the positions of elements:
Each index in the count array represents an element in the input array. The value associated with this index is the number of occurrences (the count) of the element in the input array.
The best way to get a feel of how Counting Sort works is by going through an example. Consider we have an array:
int[] arr = {0, 8, 4, 7, 9, 1, 1, 7};
For simplicity's sake, the elements in the array will only be single digits, that is numbers from 0
through 9
. Since the largest value we can have is 9
, let's label the maximum value as max = 9
.
This is important because we'll need to designate a new count array, consisting of max + 1
elements. This array will be used for counting the number of occurrences of every digit within our original array we're given to sort, so we need to initialize the whole count array to 0
, that is:
int[] countArray = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
Since there are 10 possible elements our array can have, there are ten zero's for every single digit.
Since we've defined the array we'll be working on, and we have also defined our count array to keep count of each occurrence of a digit, we need to go through the following step to make Counting Sort work:
Step 1:
By going through our whole array arr
in a single for
loop, for every i
from 0
to n-1
, where n
is the number of elements in arr
, we'll count the occurrence of each digit by incrementing the value on the position arr[i]
in our countArray
. Let's see that in code:
for (int i = 0; i < arr.length; i++)
countArray[arr[i]]++;
After the first step, our countArray
looks like this: [1, 2, 0, 0, 1, 0, 0, 2, 1, 1].
Step 2:
Since we now have our countArray
filled with values, we go on to the next step - applying prefix sums to the countArray
. Prefix sums are basically formed when we add each of the previous numbers in the array onto the next cumulatively, forming a sum of all yet seen prefixes:
for (int i=1; i < countArray.length; i++)
countArray[i] += countArray[i-1];
And after applying this step we get the following countArray
: [1, 3, 3, 3, 4, 4, 4, 6, 7, 8].
Step 3:
The third and last step is calculating the element positions in the sorted output based off the values in the countArray
. For this purpose, we'll need a new array that we'll call outputArray
. The size of the outputArray
is the same as our original arr
, and we once again initialize this array to all zeros:
Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!
int[] outputArray = {0, 0, 0, 0, 0, 0, 0, 0};
As we've mentioned earlier, Counting Sort is a stable sort. If we iterate through our arr
array from 0
to n-1
we may end up switching the elements around and ruin the stability of this sorting algorithm, so we iterate the array in the reverse order.
We'll find the index in our countArray
that is equal to the value of the current element arr[i]
. Then, at the position countArray[arr[i]] - 1
we'll place the element arr[i]
. This guarantees we keep the stability of this sort. Afterwards, we decrement the value countArray[i]
by one, and continue on doing the same until i >= 0
:
for (int i = arr.length-1; i >= 0; i--) {
outputArray[countArray[arr[i]] - 1] = arr[i];
countArray[arr[i]]--;
}
At the end of the algorithm, we can just copy the values from outputArr
into our starting array arr
and print the sorted array out:
for (int i = 0; i < arr.length; i++) {
arr[i] = outputArray[i];
System.out.print(arr[i] + " ");
}
Running of course gives us the sorted array with guaranteed stability (relative order) of equal elements:
0 1 1 4 7 7 8 9
Complexity of Counting Sort
Let's discuss both the time and space complexity of Counting Sort.
Let's say that n
is the number of elements in the arr
array, and k
is the range of allowed values for those n
elements from 1...n
. As we're working only with simple for
loops, without any recursive calls, we can analyze the time complexity in a following manner:
- Counting the occurrence of each element in our input range takes
O(n)
time, - Calculating the prefix sums takes up
O(k)
time, - And calculating the
outputArray
based on the previous two takesO(n)
time.
Accounting for all the complexities of these individual steps, the time complexity of Counting Sort is O(n+k)
, making Counting Sort's average case linear, which is better than most comparison based sorting algorithms. However, if the range of k
is 1...n²
, the worst-case of Counting Sorts quickly deteriorates to O(n²)
which is really bad.
Thankfully, this doesn't happen often, and there is a way to ensure that it never happens. This is how Radix Sort came to be - which typically uses Counting Sort as its main subroutine while sorting.
By employing Counting Sort on multiple bounded subarrays, the time complexity never deteriorates to O(n²)
. Additionally, Radix Sort may use any stable, non-comparative algorithm instead of Counting Sort, but it's the most commonly used one.
If you'd like to read more about Radix Sort, read our Radix Sort in Java!
On the other hand, the space complexity problem is much easier. Since our countArray
of size k
is bigger than our starting array of n
elements, the dominating complexity there will be O(k)
. Important thing to note is that, the larger the range of elements in the given array, the larger is the space complexity of Counting Sort.
Conclusion
In this article we've described what Counting Sort is, how it works, and how to implement it in Java.
Even though Counting Sort falls short in comparison to many other sorting algorithms (sorting only integers, having a potential bigger space complexity, etc.), it has some advantages - the main one being that Counting Sort is used as a subroutine for other, more powerful sorting algorithms, such as Radix Sort, and getting the hang of it is crucial to implementing Radix Sort.