Version: 0.3.38
5. Simple Sorting#
- Lectures:
Lecture 2.5
(slides)
- Objectives:
Understand what a sorted sequence is and a few basic algorithms to sort one
- Concepts:
Sequences (ADT), sort stability, insertion sort (algorithm), selection sort (algorithm), bubble sort (algorithm)
Sorting is, I believe, the classical problem in Computer Science, especially in algorithms and data structures. There are probably more than a dozen sorting algorithms, and here we look at three simple ones, namely selection sort, insertion sort, and bubble sort. Table 5.3 summarizes their runtimes.
Algorithm | Best | Average | Worst
---|---|---|---
Selection Sort | \(O(n^2)\) | \(O(n^2)\) | \(O(n^2)\)
Insertion Sort | \(O(n)\) | \(O(n^2)\) | \(O(n^2)\)
Bubble Sort | \(O(n)\) | \(O(n^2)\) | \(O(n^2)\)
5.1. What Is Sorting?#
Intuitively, sorting is putting things in order. There are many daily-life situations where we sort things. Here are a few examples:
When we play cards and arrange our hand from the lowest card to the highest
When we shop online and arrange products from the cheapest to the most expensive
When we prioritize our todo-list so that we work on the most important item first
Important
What does “bigger than” mean? This depends on the problem at hand. Sometimes it is obvious, as when we compare numbers (flight prices), names, etc. Sometimes it is less obvious, for instance when we compare cards: is the 8 of clubs “bigger” than the 2 of hearts? It depends on the game.
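To make this concrete, the “bigger than” relation can be supplied explicitly when sorting. The sketch below uses Python's built-in `sorted` with a key function; the two rank orders are illustrative assumptions, not taken from any particular game.

```python
# Two possible orderings of card ranks (illustrative assumptions):
RANKS_ACE_HIGH = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
RANKS_ACE_LOW = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

def sort_hand(hand, rank_order):
    # The key maps each card to its position in the chosen rank order,
    # so "bigger than" is defined by the game, not by the card labels.
    return sorted(hand, key=rank_order.index)

print(sort_hand(["K", "A", "3", "10"], RANKS_ACE_HIGH))  # ['3', '10', 'K', 'A']
print(sort_hand(["K", "A", "3", "10"], RANKS_ACE_LOW))   # ['A', '3', '10', 'K']
```

The same hand sorts differently under the two orders, which is exactly the point: the comparison relation is part of the problem statement.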
For us, sorting is just another procedure of our sequence ADT. It takes a sequence as input and yields another sequence where items are sorted. Here is a more formal definition:
- seq.sort(s: Sequence) → Sequence #
Returns a new sequence \(s'\) that contains all the items from the given sequence \(s\), but ordered such that every item is smaller than or equal to its direct follower.
- Pre-conditions:
None
- Post-conditions:
(sort-1) No item is added or removed from the given sequence \(s\)
\[length(s) = length(s')\]
(sort-2) Every item in \(s'\) comes from the given sequence \(s\)
\[\forall \, i \in [1, length(s)], \, search(s', get(s, i)) \neq 0\]
(sort-3) Every item in the resulting sequence \(s'\) is smaller than or equal to its direct follower
\[\forall \, i \in [1, length(s')-1], \, get(s', i) \leq get(s', i+1)\]
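These post-conditions can be checked mechanically. Below is a small sketch (not part of the ADT, and using plain Python lists rather than our sequence type) that tests the three conditions on a candidate result:

```python
# Check the three post-conditions of seq.sort() on plain lists;
# `s` is the input and `s_prime` the candidate sorted result.
def satisfies_postconditions(s, s_prime):
    same_length = len(s) == len(s_prime)                  # sort-1
    same_items = all(item in s_prime for item in s)       # sort-2
    ordered = all(s_prime[i] <= s_prime[i + 1]
                  for i in range(len(s_prime) - 1))       # sort-3
    return same_length and same_items and ordered

print(satisfies_postconditions([3, 1, 2], [1, 2, 3]))  # True
print(satisfies_postconditions([3, 1, 2], [1, 3]))     # False: wrong length
```

Note that, like the formal conditions above, this check uses membership (the counterpart of `search`) rather than counting duplicates.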
5.1.1. Sort Stability#
To illustrate sorting algorithms, I use (as many do) simple sequences of integers. That is convenient, but in real life we often sort more complicated data types, especially records. Consider Table 5.4 shown below, which shows a sequence of employee records.
ID | Name | Job Title | Hourly Rate
---|---|---|---
00001 | Hugo First | UX Designer | 1650.00
00002 | Pat E. Thick | Software Engineer | 1250.00
00003 | Perry Scope | Senior Software Engineer | 1500.00
00004 | Liz Erd | Project Manager | 1500.00
00005 | Teri Dactyl | Software Engineer | 1000.00
How can we sort this table? In such cases, we have to choose one field, called the key, by which to sort. In Table 5.4, we used the ID field as the key.
Say we need to sort this table by hourly rate. Note that Perry and Liz share the same rate, 1500. There are therefore two valid orderings:
Teri, Pat, Perry, Liz, Hugo. This results from a stable sort, because for items that are equal, it preserves their original ordering. Perry comes before Liz, as in Table 5.4.
Teri, Pat, Liz, Perry, Hugo. This results from an unstable sort, because items that are equal have been shuffled. Liz comes before Perry, by contrast with Table 5.4.
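We can observe stability directly in Python, whose built-in sort is guaranteed stable. The records below mirror Table 5.4 (name and hourly rate only):

```python
# Python's built-in sort is stable: records with equal keys keep
# their original relative order.
employees = [
    ("Hugo First", 1650.00),
    ("Pat E. Thick", 1250.00),
    ("Perry Scope", 1500.00),
    ("Liz Erd", 1500.00),
    ("Teri Dactyl", 1000.00),
]
by_rate = sorted(employees, key=lambda record: record[1])
print([name for name, rate in by_rate])
# ['Teri Dactyl', 'Pat E. Thick', 'Perry Scope', 'Liz Erd', 'Hugo First']
```

Perry still comes before Liz, as in Table 5.4: this is the stable ordering from the first bullet above.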
5.1.2. In-place Sorting#
Another property of sorting algorithms is whether we modify the given sequence or whether we output a new one, without touching the original. When sorting modifies the given sequence, we call this in-place sorting.
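Python exposes both flavors, which makes the distinction easy to see:

```python
data = [3, 1, 2]

# Out-of-place: `sorted` returns a new list and leaves `data` untouched.
fresh = sorted(data)
print(data, fresh)   # [3, 1, 2] [1, 2, 3]

# In-place: `list.sort` reorders `data` itself (and returns None).
data.sort()
print(data)          # [1, 2, 3]
```

The three algorithms in this chapter are all written in the in-place style: they rearrange the given sequence with `swap`.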
5.2. Selection Sort#
See also
Goodrich, M. T., Tamassia, R., & Goldwasser, M. H. (2014). Data Structures and Algorithms in Java. 6th edition. John Wiley & Sons. Section 9.4.1, p. 386
Skiena, S. S. (2020). The Algorithm Design Manual. 3rd edition. Springer International Publishing. Section 4.3, p. 115 – 116
The idea of the selection sort is to repeatedly extract the minimum of what remains to be sorted and to swap it with the first unsorted item. We can summarize the steps as follows:
Mark the first item as our current position.
Find the position of the minimum from our current position (included) to the end of the sequence.
Swap this minimum with the current position.
Set our current position to the next item and return to Step 2.
Fig. 5.5 portrays this process. Imagine that the given sequence is partitioned into two segments: “Sorted” and “Not yet sorted”. The “Sorted” segment contains what we have sorted so far. It is empty when we start, but fills in as we proceed. By contrast, the “Not yet sorted” segment contains what we still have to sort. Initially, it contains the whole given sequence, but gradually empties as we proceed.
Fig. 5.5 Selection Sort: Repeatedly swap items with the minimum of what remains to be sorted.#
Selection Sort in Python
1def selection_sort(sequence: Sequence) -> Sequence:
2 current = 1
3 while current <= sequence.length:
4 minimum = find_minimum(sequence, current)
5 sequence.swap(current, minimum)
6 current += 1
7 return sequence
8
9def find_minimum(sequence, start) -> int:
10 """
11 Find the position of the minimum in the given sequence, from
12 the given start position (included).
13 """
14 minimum = start
15 current = start + 1
16 while current <= sequence.length:
17 if sequence.get(minimum) > sequence.get(current):
18 minimum = current
19 current += 1
20 return minimum
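The listings in this chapter assume a sequence ADT with 1-based positions and the operations `length`, `get`, and `swap`. The document does not show that ADT's implementation, so here is a minimal list-backed sketch of it (the interface is inferred from the listings, so treat it as an assumption):

```python
# A minimal, list-backed stand-in for the sequence ADT used in the
# listings: positions run from 1 to length.
class Sequence:
    def __init__(self, items):
        self._items = list(items)

    @property
    def length(self):
        return len(self._items)

    def get(self, position):
        # 1-indexed access, as in the listings
        return self._items[position - 1]

    def swap(self, left, right):
        # exchange the items at two 1-indexed positions
        items = self._items
        items[left - 1], items[right - 1] = items[right - 1], items[left - 1]

    def to_list(self):
        return list(self._items)
```

With this class, the `selection_sort` listing above runs as written: `selection_sort(Sequence([3, 1, 2]))` rearranges the items into `[1, 2, 3]`.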
5.2.1. Why Does It Work?#
This selection sort works if it guarantees the post-conditions of its specification. In seq.sort(), we defined the three following ones:
(sort-1) The resulting sequence has the same length. Selection sort neither adds nor removes items; it simply moves them around.
(sort-2) Every item comes from the given sequence. Again, our selection sort does not add or remove any item, so this holds by construction.
(sort-3) Every item is smaller than or equal to its direct follower. Let's see how we can establish this.
To show that this is true after the loop (cf. Fig. 5.5), we need a loop invariant. Here we state that in the “sorted” segment (only), every item is smaller than or equal to its direct follower. This is true when we start, as the “sorted” segment is initially empty. Besides, if it is true after an iteration, it remains true after the next one, because the minimum of the “not yet sorted” segment is appended to the “sorted” items, and this minimum is necessarily greater than or equal to the last sorted item (otherwise, it would have been picked by a previous iteration). This, of course, requires our find_minimum procedure to be correct as well.
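The invariant can also be checked mechanically. The sketch below re-implements selection sort over a plain 0-indexed Python list (an adaptation of the listing above, not the ADT version) and asserts the invariant at each iteration:

```python
# Selection sort over a plain list, asserting the loop invariant each
# iteration: the prefix items[:current] is sorted, and no sorted item
# exceeds any item still to be sorted.
def checked_selection_sort(items):
    n = len(items)
    for current in range(n):
        prefix = items[:current]
        assert all(prefix[i] <= prefix[i + 1] for i in range(len(prefix) - 1))
        assert all(p <= u for p in prefix for u in items[current:])
        # find the position of the minimum of the unsorted part, swap it in
        minimum = min(range(current, n), key=items.__getitem__)
        items[current], items[minimum] = items[minimum], items[current]
    return items

print(checked_selection_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```

If the invariant ever broke, one of the assertions would fail; running this on any input exercises the argument made above.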
5.2.2. How Fast Is It?#
Intuitively, the selection sort repeatedly searches for \(n\) minimums (where \(n\) is the length of the sequence). As searching for the minimum in a sequence runs in \(O(n)\), the whole sorting procedure runs in \(O(n^2)\).
Detailed Calculation
We can use a more exhaustive approach, by counting arithmetic and logic operations. Let's start with the find_minimum procedure. Table 5.5 shows the cost of every fragment and how they add up.
Line | Fragment | Cost | Runs | Total
---|---|---|---|---
14 | `minimum = start` | 1 | 1 | 1
15 | `current = start + 1` | 2 | 1 | 2
16 | `while current <= sequence.length:` | 1 | \(n-s\) | \(n-s\)
17 | `if sequence.get(minimum) > sequence.get(current):` | 1 | \(n-s-1\) | \(n-s-1\)
18 | `minimum = current` | 1 | \(n-s-1\) | \(n-s-1\)
19 | `current += 1` | 2 | \(n-s-1\) | \(2n-2s-2\)
 | | | Total: | \(5(n-s)-1\)
Here I omit the cost of seq.get() and seq.length() for the sake of simplicity, but that does not change the validity of our reasoning. We end up with the function:
\[f(n, s) = 5(n-s) - 1\]
We proceed the same way with the selection_sort procedure.
Line | Fragment | Cost | Runs | Total
---|---|---|---|---
2 | `current = 1` | 1 | 1 | 1
3 | `while current <= sequence.length:` | 1 | \(n+1\) | \(n+1\)
4 | `minimum = find_minimum(sequence, current)` | ? | \(n\) | ?
5 | `sequence.swap(current, minimum)` | 1 | \(n\) | \(n\)
6 | `current += 1` | 2 | \(n\) | \(2n\)
What can we say about Line 4, where we call find_minimum? We know that its cost depends on the parameters, \(f(n,s) = 5(n-s)-1\). We also know that the variable current increases by one at every iteration. We can thus calculate the total for Line 4 as follows:
\[\sum_{s=1}^{n} f(n, s) = \sum_{s=1}^{n} \left( 5(n-s) - 1 \right) = 5 \cdot \frac{n(n-1)}{2} - n = \frac{5n^2 - 7n}{2}\]
If we plug that into Table 5.6, we get a grand total of \(\frac{5n^2 + n + 4}{2}\).
Here we see that indeed the selection sort runs in \(O(n^2)\) in the worst case.
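Notably, the total does not depend on the input's initial order: selection sort always performs \(n(n-1)/2\) comparisons. The sketch below (over plain lists, not the ADT) counts comparisons to confirm this:

```python
# Count the comparisons selection sort performs: n(n-1)/2 regardless
# of input order, which is why even the best case is quadratic.
def count_comparisons(items):
    items = list(items)
    n = len(items)
    comparisons = 0
    for current in range(n):
        minimum = current
        for candidate in range(current + 1, n):
            comparisons += 1
            if items[candidate] < items[minimum]:
                minimum = candidate
        items[current], items[minimum] = items[minimum], items[current]
    return comparisons

for n in (10, 20, 40):
    assert count_comparisons(list(range(n)))        == n * (n - 1) // 2  # sorted
    assert count_comparisons(list(range(n, 0, -1))) == n * (n - 1) // 2  # reversed
print("always n(n-1)/2 comparisons")
```

Already-sorted and reversed inputs cost exactly the same number of comparisons.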
5.3. Insertion Sort#
See also
Goodrich, M. T., Tamassia, R., & Goldwasser, M. H. (2014). Data Structures and Algorithms in Java. 6th edition. John Wiley & Sons. Section 3.1.2, p. 110
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms. 3rd edition. MIT Press. Section 2.1, p. 15 – 19.
Skiena, S. S. (2020). The Algorithm Design Manual. Springer International Publishing. 3rd edition. Section 4.3.5, p. 124.
Like the selection sort, the insertion sort partitions the given sequence into two segments: the first contains the items we have sorted so far, whereas the second contains the items we still have to sort. The insertion sort proceeds as follows:
We partition our sequence into two segments: Sorted and Not Yet Sorted.
Initially, the first segment is empty since we have not yet sorted anything.
Let’s call next the first item in the “not yet sorted” segment.
We insert next into the “sorted” segment at a position that preserves the ordering of the “sorted” segment. To do this, if next is smaller than its predecessor, we swap them. We do so until next lands at the correct position in the sorted segment.
Repeat from Step 3 until there is no more item to sort.
Fig. 5.6 below illustrates this process. We gradually sort the array by inserting each item at the right place. As we progress, the “sorted” segment fills in while the “not yet sorted” one gradually empties.
Fig. 5.6 Insertion sort picks the next item and inserts it at the right place.#
A Simple Python Implementation
Listing 5.2 shows a simple Python
implementation of the insertion sort. I extracted the code that
inserts items back into the sorted segment into a separate function
insert_back
.
1def insertion_sort(sequence: Sequence) -> Sequence:
2 next_unsorted = 1
3 while next_unsorted <= sequence.length:
4 insert_back(sequence, next_unsorted)
5 next_unsorted += 1
6 return sequence
7
8def insert_back(sequence: Sequence, start: int):
9 current = start
10 while current > 1 \
11 and sequence.get(current) < sequence.get(current-1):
12 sequence.swap(current, current-1)
13 current = current - 1
5.3.1. Why Does It Work?#
When thinking about correctness, we have to ensure that the post-conditions of seq.sort() hold.
(sort-1) The resulting sequence has the same length. Insertion sort does not add new items, it simply moves them around.
(sort-2) Every item comes from the given sequence. Again, our insertion sort does not add or remove any item, so this holds by construction.
(sort-3) Every item is smaller than its direct follower. Let's see how we can deduce this.
For it to hold when insertion_sort returns, we use the following loop invariant: items are sorted only in the sorted segment, that is, up to the next_unsorted item, excluded (cf. Listing 5.2). As the “sorted” segment progressively expands, when the loop terminates, the invariant eventually holds for the whole sequence. For this to be true, however, we have to show that insert_back guarantees it.
Now we have to check that the insert_back procedure leaves the “sorted” segment in order. Here, our loop invariant is that the end of the sorted segment always remains sorted, that is, from the current position, excluded, to the start position (cf. Listing 5.2). As we proceed with swapping items, this fraction expands backwards and eventually covers the whole sorted segment.
5.3.2. How Fast Is It?#
In the worst case, insertion sort runs in \(O(n^2)\), where \(n\) is the length of the given sequence. Intuitively, we have to go through every item in the sequence, and for each item we may have to “swap” it all the way back to the beginning, about \(n\) swaps. In total, this gives us \(n \times n = n^2\).
Detailed Calculation
If we want to estimate precisely the work done by the insertion sort implementation, we do not have to count arithmetic and logical operations. We can reason at a higher level: the only thing the insertion sort does is “swap” items, so we will count only swaps.
Let's start with the insert_back operation. Here we consider the worst-case scenario, where we have to move the item all the way back to the first position. This takes start-1 swaps. We can express this as the function \(f(n, k)\), where \(k\) is the start position, such that:
\[f(n, k) = k - 1\]
Now we can move to the insertion_sort operation. How many swaps does it perform? This operation does not call swap directly; it only calls insert_back. In the worst case, it has to move every item back to the beginning. This happens when we are given a sequence that is sorted the other way around, such as \(s=(4,3,2,1)\). So in this worst case, it calls insert_back as follows:
insert_back(sequence, 1)
insert_back(sequence, 2)
…
insert_back(sequence, n)
Since we know the number of swaps each of these calls yields (i.e., \(f(n, k)\)), we can calculate their sum \(t(n)\) as follows:
\[t(n) = \sum_{k=1}^{n} f(n, k) = \sum_{k=1}^{n} (k - 1) = \frac{n(n-1)}{2}\]
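We can confirm this count empirically. The sketch below re-implements insertion sort over a plain 0-indexed list (an adaptation of Listing 5.2, not the ADT version) and counts the swaps:

```python
# Count the swaps insertion sort performs; a reversed input triggers
# the worst case, n(n-1)/2 swaps.
def count_swaps(items):
    items = list(items)
    swaps = 0
    for start in range(1, len(items)):
        current = start
        # swap the item backwards until it lands in place
        while current > 0 and items[current] < items[current - 1]:
            items[current], items[current - 1] = items[current - 1], items[current]
            swaps += 1
            current -= 1
    return swaps

print(count_swaps([4, 3, 2, 1]))  # 6, i.e. 4*3/2
print(count_swaps([1, 2, 3, 4]))  # 0: already sorted, the best case
```

The already-sorted input costs zero swaps, which matches the \(O(n)\) best case in Table 5.3: we still visit every item, but never move any.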
5.4. Bubble Sort#
See also
Unfortunately, in all three textbooks I recommended, bubble sort shows up in the exercises.
Bubble sort uses a simple idea: look repeatedly through all adjacent pairs of items, and swap them if they are not in the right order. We keep swapping until all pairs are properly ordered. We can summarize the steps as follows:
Mark the first item as our current position.
Compare the current item with its direct successor.
If the two are not in order, swap them.
Move the current position to the next item.
Continue at Step 2, until all pairs are ordered.
Sample Python implementation of Bubble sort
I present below a simple implementation of a bubble sort using our sequence ADT.
1def bubble_sort(sequence: Sequence) -> Sequence:
2 swapped = True
3 while swapped:
4 swapped = False
5 index = 1
6 while index < sequence.length:
7 if sequence.get(index) > sequence.get(index+1):
8 sequence.swap(index, index+1)
9 swapped = True
10 index += 1
11 return sequence
5.4.1. Why Does It Work?#
An important aspect of bubble sort is the behavior of its inner loop, that is, the one that iterates over every pair, swapping those that are not ordered. Fig. 5.7 illustrates what happens to the largest number during one such pass.
Fig. 5.7 How bubble sort moves the largest item to the very end in one pass#
What happens during one iteration of the outer loop is that the largest item gets moved to the very end. So the first iteration moves the largest item to the end, the second iteration moves the second-largest item to the next-to-last position, and so on and so forth.
Indirectly, bubble sort also distinguishes between “sorted” and “not yet sorted” items, but it places the “sorted items” at the end.
To prove the correctness of the bubble sort, again, we need to look back at the three post-conditions we have defined:
(sort-1) The resulting sequence has the same length. Bubble sort does not add new items, it simply moves them around.
(sort-2) Every item comes from the given sequence. Again, our bubble sort does not add or remove any item, so this holds by construction.
(sort-3) Every item is smaller than its direct follower. We need to look at these two nested loops.
As for the outer loop, we can use the following loop invariant: items in the sorted segment are always sorted (the sorted segment starts at position last-current-1, excluded). At first this segment is empty, so it is sorted. As we saw above, each iteration brings in another item, so the segment fills in as the iterations go.
For this to hold, we must show that the inner loop moves the largest item of the “not yet sorted” segment into the first position of the “sorted” segment. Here we can use the following invariant: the largest item of the “not yet sorted” segment always remains between the current and the last position (of the not-yet-sorted segment). This is true before the first iteration, since the not-yet-sorted segment then includes the whole sequence. It remains true after we process each item: if the current item is the maximum, it is necessarily larger than its follower and will be swapped forward, so the invariant holds; otherwise the invariant holds by definition. As the maximum is never swapped beyond the beginning of the sorted segment, our invariant holds after the loop.
5.4.2. How Fast Is It?#
Intuitively, in the worst case, every iteration has to move the first item all the way to the end. That takes up to \(n\) swaps. As there are \(n\) items, this gives us an algorithm that runs in \(O(n^2)\).
Detailed Calculation
As we did for the insertion sort, we can count “swaps”, instead of diving into arithmetic and logic operations.
Let us look at the inner loop first. In the worst case, it has to perform \(n-1\) swaps to move the first item all the way to the end. The first pass would thus require \(n-1\) swaps, the second one \(n-2\), the third \(n-3\), etc.
That gives us the following total for \(n\) elements:
\[t(n) = (n-1) + (n-2) + \dots + 1 = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}\]
We can see that indeed, bubble sort runs, in the worst case, in \(O(n^2)\).
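The `swapped` flag in the listing above also explains the \(O(n)\) best case of Table 5.3: on an already-sorted input, the first pass performs no swap and the outer loop stops. The sketch below (over plain 0-indexed lists, not the ADT) counts the passes of the outer loop:

```python
# Count passes of bubble sort's outer loop: one pass on sorted input
# (the O(n) best case), n passes on a reversed input.
def count_passes(items):
    items = list(items)
    passes = 0
    swapped = True
    while swapped:
        passes += 1
        swapped = False
        for index in range(len(items) - 1):
            if items[index] > items[index + 1]:
                items[index], items[index + 1] = items[index + 1], items[index]
                swapped = True
    return passes

print(count_passes([1, 2, 3, 4]))  # 1: no swap occurs, the loop stops
print(count_passes([4, 3, 2, 1]))  # 4: one pass per item
```

Each pass is \(O(n)\), so one pass on sorted input gives the \(O(n)\) best case, while \(n\) passes on reversed input give the \(O(n^2)\) worst case.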