6. Sorting Without Comparing#
- Lecture:
Lecture 3.6
(slides)
- Objectives:
Understand the limit of sorting by comparing.
- Concepts:
Sorting without comparing, counting sort, radix sort
We depart from recursion a bit to wrap up our discussion of sorting algorithms. We have already looked at five: Selection sort, Insertion sort, Bubble sort, Quick sort, and Merge sort. What is the common trait among them? They all rely on comparisons, that is, they compare one item to another to decide where it should go. This type of sorting is called comparison-based.
In this lecture, we look at the problem of sorting and its inherent limit when one uses comparisons: No comparison-based sorting algorithm can run faster than \(O(n \log n)\)! To illustrate a different approach, we shall look at Counting sort and Radix sort, which offer, under specific conditions, a linear runtime.
6.1. The Limit of Sorting by Comparing#
What is sorting? From a theoretical perspective, sorting a given sequence means finding a permutation of its items such that every item is smaller than (or equal to) all its successors.
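More formally, given a sequence \(s = (s_1, s_2, \ldots, s_n)\), we look for a permutation \(\pi\) such that \(s_{\pi(1)} \leq s_{\pi(2)} \leq \cdots \leq s_{\pi(n)}\) (the notation here is ours, spelled out for precision).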
Consider the sequence \(s=(a, b, c)\) for instance. How many permutations exist? There are \(3! = 3 \times 2 \times 1 = 6\). In this case, these six permutations are:
\(s_1=(a, b, c)\)
\(s_2=(a, c, b)\)
\(s_3=(b, a, c)\)
\(s_4=(b, c, a)\)
\(s_5=(c, a, b)\)
\(s_6=(c, b, a)\)
Given a random permutation, how can we find one that is sorted? (There can be multiple sorted permutations if several items have the same value.) We need to compare items to one another to decide which one is the largest, and so on. Fig. 6.11 shows the decision tree that finds the sorted sequence for any given sequence \(s=(a,b,c)\). For instance, if we know that \(a < b\) and \(a < c\), we still need to check whether \(c < b\) to conclude that the sorted permutation is \(s_2=(a, c, b)\).
Fig. 6.11 Comparisons needed to sort a sequence of three items \(s=(a,b,c)\). Each node shows a decision based on comparing two items.#
The decision tree shown in Fig. 6.11 has a specific shape. For a sequence of \(n\) items sitting at the top of this decision tree, the bottom contains the \(n!\) possible permutations. Each decision is binary (yes or no), so the tree has a height of at least \(\log_2 n!\). This function is bounded below by \(n \log n\), that is, \(\log_2 n! \in \Omega(n \log_2 n)\). In other words, the worst case cannot get any better than \(n \log n\): this is a lower bound on the worst case, and no comparison-based sort will ever beat it.
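To see why \(\log_2 n!\) grows at least as fast as \(n \log n\), we can keep only the largest half of the factors of \(n!\) (a standard argument, spelled out here for completeness):
\[
\log_2 n! = \sum_{i=1}^{n} \log_2 i \;\geq\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 \frac{n}{2} \;\geq\; \frac{n}{2} \log_2 \frac{n}{2} \;\in\; \Omega(n \log_2 n)
\]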
Important
When sorting using comparisons, no algorithm can run faster than \(O(n \log n)\) in the worst case.
If we do not use comparisons, however, other options open up. We shall look at counting sort and radix sort, two algorithms that do not rely on comparisons.
6.2. Counting Sort#
The idea of counting sort is to count items per category. Consider for instance the sequence \(s=(0,1,1,2,0,1,1,2,0,1,1,2,0)\). It contains only three “categories” of items: 0, 1, and 2. Provided we know the order of these categories, we can traverse the sequence once, counting how many items fall into each category, and finally rebuild the sorted sequence from these counts.
In summary, counting sort proceeds as follows (a worked example follows this list):
1. Traverse the given sequence and build a frequency table, which records how many items fall into each category.
2. Compute the cumulative frequencies.
3. Rebuild the given sequence according to the cumulative frequency table.
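For the example sequence \(s=(0,1,1,2,0,1,1,2,0,1,1,2,0)\), the frequency table is \((4, 6, 3)\): four 0s, six 1s, and three 2s. The cumulative frequencies are \((4, 10, 13)\), which tell us that the 0s occupy positions 0 to 3, the 1s positions 4 to 9, and the 2s positions 10 to 12, yielding the sorted sequence \((0,0,0,0,1,1,1,1,1,1,2,2,2)\).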
Listing 6.1 illustrates this in Java. The countingSort procedure accepts a sequence (as an array) and its largest key. The largest key captures the set of categories: the given sequence may only contain values from 0 to that largest key. The trickiest part is how we rebuild the sorted sequence. The cumulative frequency of a category gives the position just past its last item in the sorted sequence. We therefore proceed backwards through the given sequence and write each item at the last free position of its category, decrementing the count as we go (see lines 14-18).
 1  int[] countingSort(int[] sequence, int largestItem) {
 2      int[] sorted = new int[sequence.length];
 3      int[] frequencies = new int[largestItem + 1];
 4
 5      for (int i = 0; i < sequence.length; i++) {
 6          int key = sequence[i];
 7          frequencies[key] += 1;
 8      }
 9
10      for (int key = 1; key < largestItem + 1; key++) {
11          frequencies[key] += frequencies[key - 1];
12      }
13
14      for (int i = sequence.length - 1; i >= 0; i--) {
15          int key = sequence[i];
16          sorted[frequencies[key] - 1] = key;
17          frequencies[key]--;
18      }
19
20      return sorted;
21  }
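As a quick sanity check, calling this procedure on the example sequence from above looks as follows (a usage sketch of ours, assuming countingSort from Listing 6.1 is in scope):

int[] sequence = {0, 1, 1, 2, 0, 1, 1, 2, 0, 1, 1, 2, 0};
int[] sorted = countingSort(sequence, 2);  // the largest key is 2
// sorted now contains {0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2}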
- Runtime Efficiency
We see in Listing 6.1 that each of the steps is implemented using a loop: we iterate either through the given sequence or through the set of categories. As a result, counting sort runs in \(O(n+k)\), where \(n\) is the length of the given sequence and \(k\) is the number of categories.
- Memory Efficiency
Listing 6.1 provides an out-of-place implementation, where we provision a separate array to hold the result. Besides this, we also need some memory for the frequency table. Counting sort thus requires \(O(n+k)\) memory.
Counting sort offers a first way to sort without comparing, but it has its limitations. Counting sort works best when we have a few categories and many items. For instance, sorting a sequence of 10 items between 0 and 1 000 000 would be penalized by the very large number of categories (i.e., 1 000 000). Besides, in many practical situations, we do not know the categories beforehand. Let us see how radix sort builds on counting sort to work around these limitations.
6.3. Radix Sort#
The idea of radix sort is to apply counting sort repeatedly. Consider numbers such as 123, for instance. If we work on separate digits, then the number of categories required by counting sort is known: it is 10, one for each digit. Since counting sort is a stable sorting algorithm, we can apply it for each digit, from the least significant digit (LSD) to the most significant, without disrupting what was sorted before. Fig. 6.12 illustrates this idea.
Fig. 6.12 The Radix sort (LSD): Applying counting sort repeatedly on each digit, from the least significant to the most significant.#
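As a small worked example (ours, not from the slides), consider sorting \((170, 45, 75, 90)\):
- Pass 1, ones digit (0, 5, 5, 0): \((170, 90, 45, 75)\)
- Pass 2, tens digit (7, 9, 4, 7): \((45, 170, 75, 90)\)
- Pass 3, hundreds digit (0, 1, 0, 0): \((45, 75, 90, 170)\)
Note how stability matters in Pass 2: 170 and 75 share the tens digit 7, and keeping their Pass-1 order means they remain correctly ordered by their last two digits (70 < 75).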
Listing 6.2 shows a possible implementation of radix sort in Java. It uses a modified version of counting sort where the categories are “hard-coded” (i.e., the ten digits) and which accepts the significant digit to consider. First we search for the maximum of the given sequence and compute its number of digits. We can then apply counting sort as many times as there are digits in this largest item.
int[] radixSort(int[] array) {
    int[] sorted = array;
    int maximum = findMaximum(array);
    int digitCount = countDigits(maximum);
    // One counting-sort pass per digit, starting from the least significant
    for (int digit = 0; digit < digitCount; digit++) {
        sorted = countingSort(sorted, digit);
    }
    return sorted;
}
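Listing 6.2 relies on a digit-aware variant of countingSort, as well as findMaximum and countDigits, none of which are shown in the lecture. Below is one possible sketch of these helpers (our assumption of their behavior, not the official solution). The variant differs from Listing 6.1 in that it extracts the key from the requested digit and copies whole items, not just keys:

int[] countingSort(int[] sequence, int digit) {
    int[] sorted = new int[sequence.length];
    int[] frequencies = new int[10];  // one counter per digit, 0 to 9

    // Compute 10^digit to extract the requested digit
    int divisor = 1;
    for (int i = 0; i < digit; i++) {
        divisor *= 10;
    }

    // Count how many items carry each value at the given digit
    for (int i = 0; i < sequence.length; i++) {
        int key = (sequence[i] / divisor) % 10;
        frequencies[key] += 1;
    }

    // Compute the cumulative frequencies
    for (int key = 1; key < 10; key++) {
        frequencies[key] += frequencies[key - 1];
    }

    // Rebuild the sequence backwards; copy the whole item, not just its key
    for (int i = sequence.length - 1; i >= 0; i--) {
        int key = (sequence[i] / divisor) % 10;
        sorted[frequencies[key] - 1] = sequence[i];
        frequencies[key]--;
    }
    return sorted;
}

// Largest value in the array, needed to know how many passes to run
int findMaximum(int[] array) {
    int maximum = array[0];
    for (int i = 1; i < array.length; i++) {
        if (array[i] > maximum) {
            maximum = array[i];
        }
    }
    return maximum;
}

// Number of decimal digits of a non-negative number
int countDigits(int number) {
    int count = 1;
    while (number >= 10) {
        number /= 10;
        count += 1;
    }
    return count;
}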
- Runtime Efficiency
Radix sort basically runs counting sort as many times as there are digits. The runtime therefore depends on the number of items in the sequence, but also on the number of digits of the largest item. Radix sort thus runs in \(O(d \cdot (k+n))\), where \(n\) is the number of items, \(k\) the number of categories, and \(d\) the number of digits.
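For instance, sorting a sequence of numbers below 1 000 000 takes \(d = 6\) passes in base 10 (\(k = 10\)), so the runtime is \(O(6 \cdot (n + 10)) = O(n)\). This is how radix sort sidesteps the very large number of categories that penalized counting sort in the earlier example.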
- Memory Efficiency
Radix sort does not require more memory than counting sort does, so the space complexity of radix sort is the same, \(O(n+k)\).
Important
When applicable, non-comparison sorting algorithms offer a fast (linear runtime) alternative to the more “classical” quick sort and merge sort.
6.4. Sorting Algorithms, Overview#
That concludes our dive into sorting algorithms, though there are many more, such as heap sort or shell sort, to name a few. Table 6.3 summarizes the efficiency of the different sorting algorithms we have studied so far.
| Algorithm | Worst time | Average time | Space | Stable |
|---|---|---|---|---|
| Insertion Sort | \(O(n^2)\) | \(O(n^2)\) | \(O(1)\) | Yes |
| Selection Sort | \(O(n^2)\) | \(O(n^2)\) | \(O(1)\) | No |
| Bubble Sort | \(O(n^2)\) | \(O(n^2)\) | \(O(1)\) | Yes |
| Quick Sort | \(O(n^2)\) | \(O(n \log n)\) | \(O(\log n)\) | No |
| Merge Sort | \(O(n \log n)\) | \(O(n \log n)\) | \(O(n)\) | Yes |
| Counting Sort | \(O(n+k)\) | \(O(n+k)\) | \(O(n+k)\) | Yes |
| Radix Sort | \(O(d (n+k))\) | \(O(d (n+k))\) | \(O(n+k)\) | Yes |