Radix Sort in Python

Introduction

The radix (or base) is the number of digits used to represent numbers in a positional numeral system. For the binary system, the radix is 2 (it uses only two digits - 0 and 1). For the decimal system, the radix is 10 (it uses ten digits to represent all numbers - from 0 to 9).

A positional numeral system is, in simple terms, a number writing system, where the weight (or the value) of a digit is determined by its position. For example, in the number 123, 1 has more value than 3 because it's in a position that denotes hundreds, and the 2 is in the tens.
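As a small illustration of positional weights, the sketch below splits a number into its digits for a given radix and rebuilds it from digit times weight. The helper name `to_digits` is made up for this example, and it assumes non-negative integers and a radix of at least 2.

```python
def to_digits(number, radix=10):
    """Return the digits of a non-negative integer, most significant first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        digits.append(number % radix)  # digit at the current lowest position
        number //= radix               # drop that position
    return digits[::-1]


digits = to_digits(123)                 # [1, 2, 3]
weights = [10 ** p for p in (2, 1, 0)]  # hundreds, tens, ones
rebuilt = sum(d * w for d, w in zip(digits, weights))

print(digits, rebuilt)        # [1, 2, 3] 123
print(to_digits(5, radix=2))  # [1, 0, 1]
```

The same number has a different digit sequence in every radix, which is why the radix has to be fixed before a digit-by-digit sort can start.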

Radix Sort can lexicographically sort many types of data, such as integers, words, and emails, but it is mainly used to sort collections of integers and strings (which are mapped to appropriate integer keys).

Radix Sort is a non-comparative sorting algorithm: it doesn't sort a collection by comparing its elements with each other, but instead uses the structure of the keys themselves, processing them position by position according to their radix.
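To show what "non-comparative" can look like for strings, here is a minimal sketch that sorts words by reading one character position at a time, from the last position to the first. It never compares two words directly. The function name `lsd_sort_words` is hypothetical, and the sketch assumes all words have the same length and contain only lowercase letters a-z.

```python
def lsd_sort_words(words):
    """Sort equal-length lowercase ASCII words, one character position per pass."""
    if not words:
        return []
    width = len(words[0])
    result = list(words)
    # Start at the last (least significant) character and move left.
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in range(26)]  # one bucket per letter
        for word in result:
            buckets[ord(word[pos]) - ord("a")].append(word)
        # Reading the buckets in order keeps earlier ordering for ties.
        result = [word for bucket in buckets for word in bucket]
    return result


print(lsd_sort_words(["dab", "cab", "add", "bad", "ace"]))
# ['ace', 'add', 'bad', 'cab', 'dab']
```

Each letter is used as an index into the bucket list, which plays the role of the integer key mentioned above.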

Comparative sorting algorithms cannot do better than O(nlogn) in the worst case, which is worse than the linear execution time (O(n+k)) that non-comparative algorithms can achieve.

Here, n is the number of elements to be sorted, and k is the range of allowed element values.

Counting Sort (a popular non-comparative algorithm) has the complexity of O(n+k), which is linear when k is in the range 1..n. But if elements range from 1..n², the complexity rises to O(n²), which is worse than efficient comparative sorting algorithms that run in O(nlogn).

Counting Sort can therefore be significantly faster than popular comparative algorithms, but only if that condition is fulfilled: the range of values must not greatly exceed the number of elements.
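To make the effect of the value range concrete, the sketch below plugs numbers into the n + k cost model. The function name `counting_sort_work` is hypothetical, and the result is a rough count of array cells touched, not a measured running time.

```python
import math


def counting_sort_work(n, max_value):
    """Rough n + k cost model: one pass over the input plus one over the counts."""
    k = max_value + 1  # size of the count array
    return n + k


n = 1_000
print(counting_sort_work(n, n))      # 2001    -> grows like n
print(counting_sort_work(n, n * n))  # 1001001 -> grows like n^2
print(round(n * math.log2(n)))       # 9966    -> n log n, for comparison
```

With values up to n², the count array alone dwarfs the n log n figure, even though the input still has only 1,000 elements.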

The idea of the Radix Sort is to upgrade Counting Sort so that it maintains the linear time complexity even if the range of elements' values drastically exceeds the number of elements.

In fact, Radix Sort inherently uses Counting Sort as the main subroutine, with a few tweaks to overcome the issues that arise with an increased range of elements' values.

Counting Sort Algorithm

To get a grasp of Radix Sort, we'll first delve into Counting Sort, implement it, and observe how it degrades as the range of element values increases.

Why Use Counting Sort in the Radix Sort?

Counting Sort is a stable, non-comparative sorting algorithm, and it is mainly used to sort integer arrays. All of these characteristics are important for its use in Radix Sort. You can use other algorithms as the subroutine, as long as they have these characteristics, but Counting Sort is the most natural match.

Radix Sort needs to maintain the relative order of elements with the same key value while sorting by a given place value. Therefore, the main subroutine needs to be a stable sorting algorithm.

Non-comparative sorting algorithms generally have linear complexity, so they will have less impact on the complexity of the Radix Sort.
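To see why stability matters, consider sorting two-digit numbers by the ones digit first and the tens digit second. The sketch below uses Python's built-in `sorted()` (which is stable) as a stand-in for the subroutine, and simulates an unstable sort by reversing the order of ties. The helper names `ones` and `tens` are made up for this example.

```python
def ones(x):
    return x % 10


def tens(x):
    return (x // 10) % 10


numbers = [32, 31, 12]

# Pass 1: order by the ones digit.
by_ones = sorted(numbers, key=ones)             # [31, 32, 12]

# Pass 2a: a stable sort by the tens digit keeps 31 ahead of 32.
stable = sorted(by_ones, key=tens)              # [12, 31, 32]

# Pass 2b: simulate an unstable sort by reversing the order of ties.
unstable = sorted(reversed(by_ones), key=tens)  # [12, 32, 31]

print(stable)
print(unstable)
```

The unstable variant throws away the work done by the first pass: 31 and 32 share the tens digit, so only the preserved order from the ones pass can place them correctly.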

How Does the Counting Sort Work?

Let's take a look at an unsorted integer array, which we'll sort using Counting Sort:

I = [2, 2, 0, 6, 1, 9, 9, 7]

Counting Sort works by counting the number of elements that match each distinct key value, and then calculating the position of each key from those counts.

First of all, we'll find the maximum element in the input array - max = 9.

Then, we'll create an auxiliary array with max+1 elements. This is the count array (C), which will be used to store the number of occurrences of each element in the input array.

Initially, all counts are initialized to 0:

C = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] # Count array     
#    0  1  2  3  4  5  6  7  8  9 (indices)

Now, we need to go through the following steps:

1. Traverse the input array and increase the corresponding count for every element by 1

For example, if we come across an element with the value of 2 in the input array (I), we add 1 to the element with the index 2 in the count array:

I = [2, 2, 0, 6, 1, 9, 9, 7] # The first element is 2
     ^
        
C = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] # We increase the count at index 2 by 1
#    0  1  2  3  4  5  6  7  8  9 (indices)

After this step, the count array will store the number of occurrences of each element in the input array:

C = [1, 1, 2, 0, 0, 0, 1, 1, 0, 2] 
#    0  1  2  3  4  5  6  7  8  9 (indices)
   
# Element 0 has 1 occurrence
# Element 1 has 1 occurrence
# Element 2 has 2 occurrences 
# Element 3 has no occurrences...
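Step 1 on its own can be sketched in a few lines. The variable names here are illustrative, and the input is assumed to contain only non-negative integers.

```python
input_array = [2, 2, 0, 6, 1, 9, 9, 7]

# One slot per possible value, from 0 up to the maximum element.
count = [0] * (max(input_array) + 1)
for value in input_array:
    count[value] += 1  # the value itself is the index

print(count)  # [1, 1, 2, 0, 0, 0, 1, 1, 0, 2]
```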

2. For each element in the count array, sum up its value with the value of all its previous elements, and then store that value as the value of the current element:

C = [1, 2, 4, 4, 4, 4, 5, 6, 6, 8] 
#    0  1  2  3  4  5  6  7  8  9 (indices)
# Element  0 = 1
# Element  1 = 1 + 1
# Element  2 = 1 + 1 + 2
# Element  3 = 1 + 1 + 2 + 0
#...

This way, we are storing the cumulative sum of the elements of the count array, on each step.
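The running total in step 2 is a prefix sum. As a sketch, the standard library's `itertools.accumulate` produces it directly from the counts of step 1:

```python
from itertools import accumulate

count = [1, 1, 2, 0, 0, 0, 1, 1, 0, 2]  # occurrences from step 1

cumulative = list(accumulate(count))
print(cumulative)  # [1, 2, 4, 4, 4, 4, 5, 6, 6, 8]
```

After this step, `cumulative[v]` is the number of elements less than or equal to `v`, so the last occurrence of `v` belongs at index `cumulative[v] - 1` in the sorted output.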

3. Calculate element position based on the count array values

To store this sorted sequence, we'll need to create a new array. Let's call it the output array (O), and initialize it with n zeros, where n is the number of elements in the input array:

O = [0, 0, 0, 0, 0, 0, 0, 0] # Initialized output array
#    0  1  2  3  4  5  6  7 (indices)

For each element I[i] (starting from the end) in the input array:

  1. Find the index in the count array that is equal to the value of the current element I[i]
    • That's the element C[j] where j=I[i]
  2. Subtract 1 from the value of C[j]
    • Now we have newValue = C[j]-1
  3. Store I[i] in O[newValue]
  4. Update C[j] with newValue

In the end, the output array contains the sorted elements of the input array!
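The placement steps above can be traced with a short sketch. It starts from the cumulative count array of step 2 and prints the output array after each placement. The variable names are illustrative.

```python
input_array = [2, 2, 0, 6, 1, 9, 9, 7]
count = [1, 2, 4, 4, 4, 4, 5, 6, 6, 8]  # cumulative counts from step 2
output = [0] * len(input_array)

# Walk the input from the end so equal elements keep their relative order.
for value in reversed(input_array):
    count[value] -= 1              # next free slot for this value
    output[count[value]] = value
    print(f"place {value} at index {count[value]} -> {output}")

print(output)  # [0, 1, 2, 2, 6, 7, 9, 9]
```

Traversing from the end is what makes the algorithm stable: the last 2 in the input lands at index 3, and the first 2 lands just before it at index 2.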

Implementing Counting Sort in Python

Now, with all that out of the way - let's go ahead and implement Counting Sort in Python:

def countingSort(inputArray):
    # Find the maximum element in the inputArray
    maxEl = max(inputArray)

    countArrayLength = maxEl+1

    # Initialize the countArray with (max+1) zeros
    countArray = [0] * countArrayLength

    # Step 1 -> Traverse the inputArray and increase 
    # the corresponding count for every element by 1
    for el in inputArray: 
        countArray[el] += 1

    # Step 2 -> For each element in the countArray, 
    # sum up its value with the value of the previous 
    # element, and then store that value 
    # as the value of the current element
    for i in range(1, countArrayLength):
        countArray[i] += countArray[i-1] 

    # Step 3 -> Calculate element position
    # based on the countArray values
    outputArray = [0] * len(inputArray)
    i = len(inputArray) - 1
    while i >= 0:
        currentEl = inputArray[i]
        countArray[currentEl] -= 1
        newPosition = countArray[currentEl]
        outputArray[newPosition] = currentEl
        i -= 1

    return outputArray

inputArray = [2,2,0,6,1,9,9,7]
print("Input array = ", inputArray)

sortedArray = countingSort(inputArray)
print("Counting sort result = ", sortedArray)

Running the code above will yield the following output:

Input array =  [2, 2, 0, 6, 1, 9, 9, 7]
Counting sort result =  [0, 1, 2, 2, 6, 7, 9, 9]

Counting Sort Complexity

The time complexity of the counting sort is O(n+k), where n is the number of elements in the input array, and k is the value of the max element in the array.

The problem occurs when the value of the largest element drastically exceeds the number of elements in the array. As k approaches n², the time complexity gets closer to O(n²), which is a horrible time complexity for a sorting algorithm.

This is where Radix Sort kicks in.

Radix Sort Algorithm

Instead of counting the elements by their distinct key value, Radix Sort groups elements by the digit at a given position and performs Counting Sort on each position in turn. The starting position can vary: LSD (Least Significant Digit) and MSD (Most Significant Digit) are the two common choices, and accordingly, these variations are called LSD Radix Sort and MSD Radix Sort.

We'll focus on LSD Radix Sort.

Let I = [2, 20, 61, 997, 1, 619] be the input array that we want to sort.

Radix Sort Algorithm

The steps taken by Radix Sort are fairly straightforward:

  1. Find the maximum element in the input array - max = 997
  2. Find the number of digits in the max element - D = 3
  3. Initialize the place value to the least significant place - placeVal = 1
  4. For D times do:
    1. Perform the counting sort by the current place value
    2. Move to the next place value by multiplying placeVal by 10
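Before looking at a full implementation, the steps above can be traced pass by pass. This sketch uses Python's stable built-in `sorted()` as a stand-in for the per-digit Counting Sort, purely to show the intermediate arrays. The variable names are illustrative.

```python
numbers = [2, 20, 61, 997, 1, 619]
place_val = 1
passes = []

# Keep going while the largest element still has a digit at this place value.
while max(numbers) // place_val > 0:
    # Stable sort by the digit at the current place value only.
    numbers = sorted(numbers, key=lambda x: (x // place_val) % 10)
    passes.append(numbers)
    print(f"after place value {place_val}: {numbers}")
    place_val *= 10

# after place value 1: [20, 61, 1, 2, 997, 619]
# after place value 10: [1, 2, 619, 20, 61, 997]
# after place value 100: [1, 2, 20, 61, 619, 997]
```

Numbers with fewer digits than the current place value contribute a digit of 0, which is why 1, 2, 20 and 61 move to the front in the last pass while keeping their earlier order.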

How to Implement Radix Sort in Python

And finally, with that out of the way, let's implement Radix Sort in Python:

def countingSortForRadix(inputArray, placeValue):
    # We can assume that the number of digits used to represent
    # all numbers on the placeValue position is not greater than 10
    countArray = [0] * 10
    inputSize = len(inputArray)

    # placeElement is the value of the current place value
    # of the current element, e.g. if the current element is
    # 123, and the place value is 10, the placeElement is
    # equal to 2
    for i in range(inputSize): 
        placeElement = (inputArray[i] // placeValue) % 10
        countArray[placeElement] += 1

    for i in range(1, 10):
        countArray[i] += countArray[i-1]

    # Reconstructing the output array
    outputArray = [0] * inputSize
    i = inputSize - 1
    while i >= 0:
        currentEl = inputArray[i]
        placeElement = (inputArray[i] // placeValue) % 10
        countArray[placeElement] -= 1
        newPosition = countArray[placeElement]
        outputArray[newPosition] = currentEl
        i -= 1
        
    return outputArray

def radixSort(inputArray):
    # Step 1 -> Find the maximum element in the input array
    maxEl = max(inputArray)

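    # Step 2 -> Find the number of digits in the `max` element
    # (integer division is required: `/=` would turn maxEl into a float
    # and the loop would run many more times than there are digits)
    D = 0
    while maxEl > 0:
        maxEl //= 10
        D += 1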
    
    # Step 3 -> Initialize the place value to the least significant place
    placeVal = 1

    # Step 4
    outputArray = inputArray
    while D > 0:
        outputArray = countingSortForRadix(outputArray, placeVal)
        placeVal *= 10  
        D -= 1

    return outputArray
    
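# Avoid the names `input` and `sorted`, which would shadow Python built-ins
inputArray = [2,20,61,997,1,619]
print(inputArray)
sortedArray = radixSort(inputArray)
print(sortedArray)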

Running the code above will yield the following output:

[2, 20, 61, 997, 1, 619]
[1, 2, 20, 61, 619, 997]

Radix Sort Complexity

Radix Sort runs in linear time as long as the number of digits is bounded. If we use Counting Sort as the main subroutine, the complexity of Radix Sort is O(d(n+k)), where d is the number of digits in the largest element and k is the number of possible digit values (10 for decimal digits). That is because we execute Counting Sort d times, and the complexity of each Counting Sort pass is O(n+k).
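As a rough illustration of the d(n+k) expression, the sketch below counts the digits of the largest element and plugs the result into the formula. Both function names are hypothetical, the figures are a cost model rather than measurements, and the `base` parameter is an assumption added here to show how a larger radix reduces the number of passes.

```python
def digit_count(max_value, base=10):
    """Number of digits of a positive integer in the given base (d)."""
    d = 0
    while max_value > 0:
        max_value //= base
        d += 1
    return d


def radix_sort_work(n, max_value, base=10):
    """Rough d * (n + k) cost model, with k equal to the base."""
    return digit_count(max_value, base) * (n + base)


print(digit_count(997))                # 3
print(radix_sort_work(6, 997))         # 3 * (6 + 10) = 48
print(digit_count(10**9))              # 10
print(digit_count(10**9, base=256))    # 4
```

A larger base means fewer passes but a larger count array per pass, so the choice of radix is a trade-off between d and k.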

Conclusion

Radix sort is a great sorting algorithm to use in some specific cases. Some benchmarks have even shown that radix sort can execute up to 3 times faster than other, more general-purpose sorting algorithms.

It shines when the input array has short keys or a small range of element values. It is less attractive when the range of element values is large and elements have many digits in their representation, since every extra digit adds another pass and the algorithm needs auxiliary arrays on top of the input.

That is the main reason why the radix sort is not as widely used as some other types of sorting algorithms, even if it has linear time complexity.

Last Updated: October 27th, 2023