Introduction
Sorting is one of the fundamental techniques used in solving problems, especially in those related to writing and implementing efficient algorithms.
Usually, sorting is paired with searching - meaning we first sort elements in the given collection, then search for something within it, as it is generally easier to search for something in a sorted, rather than an unsorted collection, as we can make educated guesses and impose assumptions on the data.
There are many algorithms that can efficiently sort elements, but in this guide we'll be taking a look at how to implement Radix Sort in Java.
Radix Sort in Java
Radix Sort is a non-comparative sorting algorithm, meaning it doesn't sort a collection by comparing each of the elements within it, but instead relies on something called the radix to sort the collection.
The radix (often called the base) is the number of unique digits in a positional numeral system, used to represent numbers.
For the well-known binary system, the radix is 2 (it uses only two digits - 0 and 1). For the arguably even more well-known decimal system, the radix is 10 (it uses ten digits to represent all numbers - from 0 to 9).
How does Radix Sort use this to its advantage?
Radix Sort doesn't sort by itself, really. It uses any stable, non-comparative sorting algorithm as its subroutine - and in most cases, the subroutine is Counting Sort.
If n
represents the number of elements we are to sort, and k
is the range of allowed values for those elements, Counting Sort's time complexity is O(n+k)
when k
is in range from 1...n
, which is significantly faster than the typical comparative sorting algorithm with a time complexity of O(nlogn)
.
But the problem here is - if the range is
1...n²
, the time complexity drastically deteriorates toO(n²)
very quickly.
The general idea of Radix Sort is to sort digit by digit from the least significant ones to the most significant (LSD Radix Sort) and you can also go the other way around (MSD Radix Sort). It allows Counting Sort to do its best by partitioning the input and running Counting Sort multiple times on sets that don't let k
approach n²
.
Because it's not comparison based, it's not bounded by O(nlogn)
- it can even perform in linear time.
Since the heavy lifting is done by Counting Sort, let's first go ahead and take a look at how it works and implement it, before diving into Radix Sort itself!
Counting Sort in Java - Theory and Implementation
Counting Sort is a non-comparative, stable sorting algorithm, and it's main use is for sorting arrays of integers.
The way it works is, it counts the number of objects having distinct key values, and then applies a prefix sum on those same counts to determine the position of each key value in the output. Being stable, the order of records with equal keys is preserved when the collection is sorted.
This operation results in, essentially, a list of integer occurrences, which we typically name the count array. Counting Sort uses the auxiliary count array to determine the positions of elements:
Each index in the output array represents an element in the input array. The value associated with this index is the number of occurrences (the count) of the element in the input array.
The best way to show how Counting Sort works is through an example. Consider we have the following array:
int[] arr = {3, 0, 1, 1, 8, 7, 5, 5};
For the sake of simplicity, we'll be using digits from 0 through 9. The maximum value of a digit we can take into consideration is obviously 9, so we'll set a max = 9
.
This is important because we need an additional, auxiliary array consisting of max + 1
elements. This array will be used to count the number of appearances of every digit within our array arr
, so we need to initialize the whole counting array countingArray
to 0
.
int[] countingArray = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
// there are 10 digits, so one zero for every element
Now that we've both defined the array we'll be working with and initialized the counting array, we need to do the following steps to implement Counting Sort:
1. Traversing through our arr
array, and counting the occurrence of every single element while incrementing the element on the position arr[i]
in our countingArray
array:
for(int i = 0; i < arr.length; i++)
countingArray[arr[i]]++;
After this step, countingArray
has the following elements: [1, 2, 0, 1, 0, 2, 0, 1, 1, 0]
.
2. The next step is applying prefix sums on the countingArray
, and we get the following:
for(int i=1; i < countingArray.length; i++)
countingArray[i] += countingArray[i-1];
After the modification of the counting array it now consists of countingArray = {1, 3, 3, 4, 4, 6, 6, 7, 8, 8}
.
3. The third and last step is to calculate element positions in the sorted output based on the values in countingArray
. For that we'll need a new array that we'll call outputArray
, and we'll initialize it to m
zeros, where m
is the number of elements in our original array arr
:
int[] outputArray = {0, 0, 0, 0, 0, 0, 0, 0};
// there are 8 elements in the arr array
Since Counting Sort is a stable sorting algorithm, we'll be iterating through the
arr
array in reverse order, lest we end up switching the elements.
We'll find the index in our countingArray
that is equal to the value of the current element arr[i]
. Then, at the position countingArray[arr[i]] - 1
we'll place the element arr[i]
.
This guarantees the stability of this sort, as well as placing every element in it's right position in the sorted order. Afterwards, we'll decrement the value of countingArray[i]
by 1.
Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!
At the end, we'll be copying the outputArray
to arr
so that the sorted elements are contained within arr
now.
Let's unify all of these snippets and completely implement Counting Sort:
int[] arr = {3, 0, 1, 1, 8, 7, 5, 5};
int[] countingArray = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
for(int i = 0; i < arr.length; i++)
countingArray[arr[i]]++;
for(int i=1; i < countingArray.length; i++)
countingArray[i] += countingArray[i-1];
int[] outputArray = {0, 0, 0, 0, 0, 0, 0, 0};
for(int i = arr.length-1; i >= 0; i--){
outputArray[countingArray[arr[i]] - 1] = arr[i];
countingArray[arr[i]]--;
}
for(int i = 0; i < arr.length; i++){
arr[i] = outputArray[i];
System.out.print(arr[i] + " ");
}
Running this will give us a sorted array:
0, 1, 1, 3, 5, 5, 7, 8
As mentioned earlier, the time complexity of this algorithm is O(n+k)
where n
is the number of elements in arr
, and k
is the value of the max
element in the array. However, as k
approaches n²
this algorithm deteriorates towards O(n²)
, which is a major downside of the algorithm.
Since we've briefly explained how Counting Sort works, let's move on to the main topic of this article - Radix Sort.
Radix Sort in Java - Theory and Implementation
Again, Radix Sort typically uses Counting Sort as a subroutine, so Radix Sort itself is a stable sorting algorithm as well.
The keys used by the Counting Sort will be the digits of the integers within the array we're sorting.
There are two variants of Radix Sort - one that sorts from the Least Significant Digit (LSD), and the second one that sorts from the Most Significant Digit (MSD) - we'll be focusing on the LSD approach.
Radix Sort by itself isn't very complicated to understand once we understand how Counting Sort works, so the steps taken to implement it are fairly simple:
- Find the
max
element in the input array. - Determine the number of digits,
d
, themax
element has. The numberd
represents how many times we'll go through the array using Counting Sort to sort it. - Initialize the number
s
to 1 at the beginning, representing the least significant place and working its value up by multiplying it by 10 every time.
For example, let's say we have the following input array
arr = {73, 481, 57, 23, 332, 800, 754, 125}
. The number of times we'll loop through the array is 3, since themax
element in ourarr
array is 800, which has 3 digits.
Let's go through a visual example of an array being sorted this way, step by step, to see how Radix Sort sorts the elements in each iteration:
The input array is broken down into the digits that make up its original elements. Then - either using by the most significant digit and working our way down, or the least significant digit and working our way up, the sequence is sorted via Counting Sort:
In the first pass, only the right-hand side is used to sort, and this is why stability in Radix Sort/Counting Sort is key. If there was no stability, there would be no point in sorting this way. In the second pass, we use the middle row, and finally - the left-hand row is used, and the array is fully sorted.
Finally, let's implement Radix Sort:
static void radixSort(int[] arr) {
int max = arr[0];
for (int i = 1; i < arr.length; i++) {
if (max < arr[i])
max = arr[i];
}
for (int s = 1; max / s > 0; s *= 10)
countingSortForRadix(arr, s);
}
We'll also want to slightly modify Counting Sort.
This modification of Counting Sort does the exact same thing as the previous implementation, only it focuses on digits in different places of the integers at a time:
static void countingSortForRadix(int[] arr, int s) {
int[] countingArray = {0,0,0,0,0,0,0,0,0,0};
for (int i = 0; i < arr.length; i++)
countingArray[(arr[i] / s) % 10]++;
for (int i = 1; i < 10; i++)
countingArray[i] += countingArray[i - 1];
int[] outputArray = {0,0,0,0,0,0,0,0};
for (int i = arr.length - 1; i >= 0; i--)
outputArray[--countingArray[(arr[i] / s) % 10]] = arr[i];
for (int i = 0; i < arr.length; i++)
arr[i] = outputArray[i];
}
Let's create an array and try sorting it now:
public static void main(String[] args) {
int[] arr = {73,481,57,23,332,800,754,125};
radixSort(arr);
for (int i = 0; i < arr.length; i++)
System.out.print(arr[i] + " ");
}
This results in:
23, 57, 73, 125, 332, 481, 754, 800
Since we are using Counting Sort as the main subroutine, for an array containing n
elements, that has the max
element with d
digits, in a system with a b
base, we have the time complexity of O(d(n+b))
.
That's because we're repeating the Counting Sort process d
times, which has O(n+b)
complexity.
Conclusion
Although Radix Sort can run very efficiently and wonderfully, it requires some specific cases to do so. Because it requires that you represent the items to be sorted as integers, it's easy to see why some other comparison-based sorting algorithms can prove to be a better choice in many cases.
The additional memory requirements of Radix Sort compared to some other comparison based algorithms is also one of the reasons that this sorting algorithm is used more rarely than not.
On the other hand, this algorithm performs superbly when the input array has shorter keys, or the range of elements is smaller.