Sorting in Linear Time | Counting Sort, Radix Sort, Bucket Sort

A sorting algorithm is an algorithm that puts elements of a list into an order. We’ve already discussed common sorting algorithms in a previous post – Common Sorting Algorithms

However, in this post we’ll explore three sorting algorithms namely — Counting Sort, Radix Sort, and Bucket Sort, that have linear time complexity under certain conditions. These sorting algorithms are particularly useful when the range of input elements is relatively small compared to the number of elements being sorted.

Traditional sorting algorithms like Quick Sort, Merge Sort, and Heap Sort have average time complexities of O(n logn), making them efficient for general-purpose sorting. However, when the input elements have a limited range, sorting can be done more efficiently in linear time.
Counting Sort, Radix Sort, and Bucket Sort are linear-time sorting algorithms that exploit specific properties of the input data to achieve efficient sorting. They are particularly effective when the range of input elements is known and relatively small compared to the number of elements.

Counting Sort

Counting Sort is a non-comparison-based sorting algorithm that works efficiently when the range of input elements is small. It counts the occurrences of each distinct element and then uses this information to place each element in its correct sorted position. Also, it’s a stable sorting algorithm.

Pseudocode

// A is the input array, k is the range of elements in array AcountingSort (A, k):    n = length(A)    count = array of size k+1 initialized with zeros    output = array of size n        // Count the occurrences of each element    for i = 1 to n:        count[A[i]] += 1        // Calculate cumulative counts    // count[i] contains number of elements    // less than or equal to i    for i = 1 to k:        count[i] += count[i - 1]        // Place each element in its correct sorted position    for i = n downto 1:        output[count[A[i]]] = A[i]        count[A[i]] -= 1        return output

Code Implementation

//
//  main.cpp
//  Linear Time Sorting (Counting Sort)
//
//  Created by Himanshu on 05/03/24.
//

#include <iostream>
#include <vector>
using namespace std;

void printList (vector<int> arr) {
int n = (int) arr.size();

for (int i=0; i<n; i++) {
cout<<arr[i]<<" ";
}

}

vector<int> countingSort (vector<int> arr, int k) {
int n = (int) arr.size();
vector<int> count(k + 1, 0);
vector<int> output(n);

// Count the occurrences of each element
for (int i = 0; i < n; i++) {
count[arr[i]]++;
}

// Calculate cumulative counts
// count[i] contains number of elements
// less than or equal to i
for (int i = 1; i <= k; i++) {
count[i] += count[i - 1];
}

// Place each element in its correct sorted position
for (int i = n - 1; i >= 0; i--) {
output[count[arr[i]] - 1] = arr[i];
count[arr[i]]--;
}

return output;
}

int main () {
vector<int> arr = {4, 2, 2, 8, 3, 3, 1};
int k = 8;

vector<int> sortedList = countingSort(arr, k);

cout<<"Counting Sort:"<<endl;
printList(sortedList);

return 0;
}


Output

Counting Sort:1 2 2 3 3 4 8

Time complexity: O(n + k), where n is the number of elements in the input array and k is the range of input elements.
Auxiliary Space: O(n + k), for the count array and to store the input elements.

Radix Sort is a non-comparison-based sorting algorithm that sorts elements by their individual digits or bytes. It sorts the elements from the least significant digit (or byte) to the most significant digit (or byte) using a stable counting sort or bucket sort as a subroutine.

A stable sorting algorithm maintains the relative order of records with equal keys. This means that when a collection is sorted with a stable sorting algorithm, items with the same sort keys preserve their order.

Pseudocode

radixSort (A, d):    for i = 1 to d:        use a stable sort to sort array A on digit i

Code Implementation

//
//  main.cpp
//  Linear Time Sorting (Radix Sort)
//
//  Created by Himanshu on 05/03/24.
//

#include <iostream>
#include <vector>
#include <cmath>
using namespace std;

void printList (vector<int> arr) {
int n = (int) arr.size();

for (int i=0; i<n; i++) {
cout<<arr[i]<<" ";
}

cout<<endl;

}

// Assuming a stable 'Counting Sort' as a subroutine
// Here, d is the maximum number of digits in any element
vector<int> radixSort (vector<int> arr, int d) {
int n = (int) arr.size();

// Perform counting sort for each digit from least significant to most significant
for (int i = 1; i <= d; i++) {
// Counting sort on digit i
vector<int> count(10, 0);
vector<int> output(n);

// Count the occurrences of each digit
for (int j = 0; j < n; j++) {
// calculating ith digit of the element arr[j] from the last
int digit = (arr[j] / int(pow(10, i - 1))) % 10;

count[digit]++;
}

// Calculate cumulative counts
// count[j] contains number of elements
// with ith digit (from the last) less than or equal to j
for (int j = 1; j < 10; j++) {
count[j] += count[j - 1];
}

// Place each element in its correct sorted position
for (int j = n - 1; j >= 0; j--) {
int digit = (arr[j] / int(pow(10, i - 1))) % 10;
output[count[digit] - 1] = arr[j];
count[digit]--;
}

// Update the original array
arr = output;

printf("Array after %d iteration: \n", i);
printList(arr);
}

return arr;
}

int main () {
vector<int> arr = {329, 457, 657, 839, 436, 720, 355};
int d = 3; // Maximum number of digits in any element

printList(sortedList);

return 0;
}


Output

Array after 1 iteration: 720 355 436 457 657 329 839 Array after 2 iteration: 720 329 436 839 355 457 657 Array after 3 iteration: 329 355 436 457 657 720 839 Radix Sort:329 355 436 457 657 720 839

Notice how after ith iteration, array elements are sorted by ith digit from the last.

Time complexity: O(d * (n + k)), where d is the maximum number of digits or bytes in any element, n is the number of elements, and k is the base of the number system being used.
Auxiliary Space: O(n + k), for the count array and to store the sorted output array.

Bucket Sort

Bucket Sort is a distribution-based sorting algorithm that divides the input into a finite number of buckets, each containing a subset of the input elements. It then sorts each bucket individually, either recursively or using another sorting algorithm, and concatenates the sorted buckets to obtain the final sorted array.

Pseudocode

bucketSort (A):    n = length(A)    B = array of empty lists    for i = 1 to n:        insert A[i] into B[floor(n * A[i])]    for i = 0 to n-1:        sort list B[i] using any sorting algorithm    concatenate the lists B[0], B[1], ..., B[n-1] together in order

Code Implementation

//
//  main.cpp
//  Linear Time Sorting (Bucket Sort)
//
//  Created by Himanshu on 05/03/24.
//

#include <iostream>
#include <vector>
using namespace std;

void printList (vector<float> arr) {
int n = (int) arr.size();

for (int i=0; i<n; i++) {
cout<<arr[i]<<" ";
}

}

void insertionSort (vector<float>& arr) {
int n = (int) arr.size();

for (int i = 1; i < n; i++) {
float key = arr[i];
int j = i - 1;

while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j = j - 1;
}

arr[j + 1] = key;
}
}

vector<float> bucketSort (vector<float> arr) {
int n = (int) arr.size();
vector<vector<float>> B(n);
// Distribute elements into buckets
for (int i = 0; i < n; i++) {
int bucket_index = n * arr[i];
B[bucket_index].push_back(arr[i]);
}

// Sort each bucket
for (int i = 0; i < n; i++) {
insertionSort(B[i]);
}

// Concatenate the sorted buckets
vector<float> sorted;
for (int i = 0; i < n; i++) {
for (float num : B[i]) {
sorted.push_back(num);
}
}

return sorted;
}

int main () {
vector<float> arr = {0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68};

vector<float> sortedList = bucketSort(arr);

cout<<"Bucket Sort:"<<endl;
printList(sortedList);

return 0;
}


Output

Bucket Sort:0.12 0.17 0.21 0.23 0.26 0.39 0.68 0.72 0.78 0.94

Time complexity: O(n + k), where n is the number of elements in the input array and k is the number of buckets. However, in the worst case, when all the elements are in the same bucket, the time complexity of bucket sort is O(n2), as all the elements need to be compared to each other.
Auxiliary Space: O(n + k), for the buckets.

Time complexity analysis

Here, worst case time complexity could be reduced to O(n logn), if we use an O(n logn) sorting algorithm, such as Merge Sort or Heap Sort, to sort each bucket.

Bucket Sort divides the input array into a number of buckets based on the range of input values and distributes elements into these buckets.
If the elements are uniformly distributed across the buckets, each bucket will contain approximately n/k elements, where n is the total number of elements in the input array, and k is the number of buckets.

• Sorting each bucket individually using an efficient algorithm typically takes:
O(n/k * log(n/k)) time.
• Considering there are k buckets, the total time spent on sorting all the buckets is approximately:
k * (n/k * log(n/k)) , which simplifies to O(n * log(n/k)) .
• If the number of buckets ( k ) is chosen to be proportional to n (i.e., k ≈ n ), the time complexity becomes O(n * log(1)) = O(n) . Here, you could avoid log(1), since time complexity of sorting a single element is O(1).

Therefore, in the average case scenario with uniform distribution of elements among the buckets, the average time complexity of Bucket Sort approaches O(n).