Selection Sort
Insertion Sort
Merge Sort
Quicksort
Heap Sort
Counting Sort
LSD Sort
MSD Sort
Summary
Before walking through all the different sorting algorithms taught in this class, let's define stability. A sort is stable if elements that the sort views as equal keep the same order relative to each other before and after the sort. Let's explore stability with an example.
Name | Favorite Restaurant |
---|---|
Alice | Thai Basil |
Bob | Sweetheart Cafe |
Carol | Poke Parlor |
Dan | Thai Basil |
Eve | Kimchi Garden |
If we sorted the table above by favorite restaurant alphabetically, the sort views (Alice, Thai Basil) and (Dan, Thai Basil) as equal since they have the same favorite restaurant. The sort is then stable if Alice is still listed before Dan and unstable otherwise. Below is the result of a stable sort.
Name | Favorite Restaurant |
---|---|
Eve | Kimchi Garden |
Carol | Poke Parlor |
Bob | Sweetheart Cafe |
Alice | Thai Basil |
Dan | Thai Basil |
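To make the idea concrete, here is a small Java sketch of the table above, assuming a hypothetical Entry record with a name and a favorite restaurant. Java's Arrays.sort on object arrays is stable, so Alice stays ahead of Dan after sorting by restaurant only.

```java
import java.util.Arrays;
import java.util.Comparator;

public class StabilityDemo {
    // Hypothetical record holding one row of the table above.
    record Entry(String name, String restaurant) { }

    public static void main(String[] args) {
        Entry[] entries = {
            new Entry("Alice", "Thai Basil"),
            new Entry("Bob", "Sweetheart Cafe"),
            new Entry("Carol", "Poke Parlor"),
            new Entry("Dan", "Thai Basil"),
            new Entry("Eve", "Kimchi Garden")
        };
        // Sort by restaurant only; names never take part in comparisons.
        Arrays.sort(entries, Comparator.comparing(Entry::restaurant));
        for (Entry e : entries) {
            System.out.println(e.name() + " | " + e.restaurant());
        }
        // A stable sort prints Alice before Dan, matching the table above.
    }
}
```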
Comparison sorts are the sorting algorithms that directly compare the elements being sorted with each other to determine their ordering. As a result, the efficiency of comparison sorts is limited by the number of comparisons necessary to order all the elements.
Selection sort consists of repeatedly selecting the minimum element in the unsorted portion of the array. That minimum element is then placed at the end of the sorted portion of the array. As shown in the example below, the front of the array contains the sorted (gray) elements, and the next element to be placed in order is the minimum element out of the unsorted (blue) portion of the array. The first minimum selected is 1, and it swaps positions with 5 (the first element in the unsorted portion of the array). This process continues until the entire array is sorted.
The entire unsorted portion of the array is scanned each time to find the minimum. Given an array of length N, since the number of unsorted elements decreases by 1 each time, the cost of scanning the elements is N + (N - 1) + (N - 2) + ... + 3 + 2 + 1 = Θ(N²). Swapping two elements in the array takes constant time.
The best case and worst case runtimes for selection sort are both Θ(N²), because scanning the entire unsorted portion of the array each time is necessary for picking out the minimum element.
Stability for selection sort depends on how the minimum element is picked out each time. In the case that multiple elements tie as the minimum element, selection sort is stable if the first minimum is picked to be placed in order. It is unstable otherwise.
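Here is a minimal selection sort sketch on an int array: it repeatedly scans the unsorted portion for the first index holding the minimum and swaps that element to the end of the sorted portion.

```java
public class SelectionSort {
    public static void sort(int[] a) {
        for (int i = 0; i < a.length; i++) {
            // Scan the entire unsorted portion a[i..] for the first minimum.
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            // Swap the minimum into position i, the end of the sorted portion.
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }
}
```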
Insertion sort consists of swapping each element backward with the elements before it until it reaches its sorted position ("inserting" elements in order). While an element a is less than the element b immediately before it, a swaps with b.
The best case runtime is when the array is already sorted, since no swaps need to be made. Since we iterate through the array and perform no swaps, the best case runtime is Θ(N).
The worst case runtime is when the array is reversed. If the goal is to sort the array so that it is [1, 2, 3, 4, 5], the worst case runtime is achieved when the unsorted version of the array is [5, 4, 3, 2, 1]. This requires the maximum number of swaps, as each element must be swapped all the way to the beginning of the array, resulting in the cost 1 + 2 + 3 + ... + (N - 2) + (N - 1) = Θ(N²).
Insertion sort is stable. An element swaps backward only until the element before it is less than or equal to it, so the relative ordering of equal elements remains the same after insertion sort is complete.
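A minimal insertion sort sketch on an int array: each element swaps backward until the element before it is no larger, which is exactly the behavior that makes the sort stable.

```java
public class InsertionSort {
    public static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            // Swap a[j] backward while it is smaller than the element before it.
            for (int j = i; j > 0 && a[j] < a[j - 1]; j--) {
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
            }
        }
    }
}
```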
Merge sort consists of breaking the array down into two halves, recursively sorting each half, then merging the two sorted halves together to produce the overall sorted array.
Below is the recursion tree for merge sort. The work done per node is linear in the size of its subproblem; for example, the root takes Θ(N) to merge together two sorted halves of length N/2.
N level 0 work: N
/ \
N/2 N/2 level 1 work: 2 * N/2 = N
/ \ / \
N/4 N/4 N/4 N/4 level 2 work: 4 * N/4 = N
...
There are log N levels, and the work per level is N, so the overall runtime of merge sort is Θ(N log N). Feel free to review the asymptotics guide if you are confused about these calculations.
Merge sort is stable. When merging elements that are equal, the element in the left half (which appears before the elements in the right half in the original unsorted array) is placed in the sorted array first.
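A minimal merge sort sketch on an int array (using an auxiliary copy of each half rather than sorting fully in place): ties in the merge go to the left half, which keeps the sort stable.

```java
import java.util.Arrays;

public class MergeSort {
    public static void sort(int[] a) {
        if (a.length <= 1) {
            return;
        }
        // Split the array in half and recursively sort each half.
        int mid = a.length / 2;
        int[] left = Arrays.copyOfRange(a, 0, mid);
        int[] right = Arrays.copyOfRange(a, mid, a.length);
        sort(left);
        sort(right);
        merge(a, left, right);
    }

    // Merge two sorted halves back into a; on ties, take from the left half.
    private static void merge(int[] a, int[] left, int[] right) {
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            a[k++] = left[i++];
        }
        while (j < right.length) {
            a[k++] = right[j++];
        }
    }
}
```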
Quick sort involves first selecting an element p as a pivot and then placing the other elements such that everything to the left of p is less than or equal to p and everything to the right of p is greater than or equal to p. Like merge sort, the elements left of p are then recursively sorted using quicksort, as are the elements right of p. In the example below, the first element of each unsorted section of the array is chosen as the pivot, but the pivot can be chosen in other ways (chosen at random, the last element, etc.).
Hoare Partitioning is a way to efficiently perform in-place quicksort so that we do not need to use an additional array to help us move elements to either side of the pivot. It uses two pointers: a left pointer L that starts at the beginning of the array and traverses towards the end of the array and a right pointer R that starts at the end of the array and traverses towards the front of the array. When L finds an element greater than or equal to the pivot and R finds an element less than or equal to the pivot, those two elements are swapped and the pointers move on to the next element. Once the pointers cross each other, we know that the entire array has been pivoted correctly.
The best case runtime of quick sort is when the pivots are chosen such that roughly an equal number of elements are on either side of the pivot. This effectively divides the array in half so that there are log N levels in the recursion tree and N work at each level to pivot the elements. As a result, the best case runtime analysis is the same as that of merge sort, and the best case runtime of quick sort is Θ(N log N).
The worst case runtime occurs when the pivots are chosen poorly so that all the elements are pivoted to one side each time. This results in a runtime of Θ(N²). Although this is a poor runtime, the worst case is extremely rare, so quick sort is still one of the fastest sorting algorithms on average.
The stability of quick sort depends on how elements are partitioned to either side of the pivot. The sort is stable if the elements are kept in the same relative ordering as they are moved to either side of the pivot and unstable otherwise. Quick sort with Tony Hoare Partitioning is unstable.
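A minimal in-place quicksort sketch using Hoare-style partitioning, assuming the first element of each range is chosen as the pivot (other pivot-selection rules plug in the same way).

```java
public class QuickSort {
    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p);       // everything here is <= everything to the right
        sort(a, p + 1, hi);
    }

    // Hoare partitioning: L scans right for an element >= pivot, R scans left
    // for an element <= pivot, and the two are swapped; stop once they cross.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo - 1;
        int r = hi + 1;
        while (true) {
            do { l++; } while (a[l] < pivot);
            do { r--; } while (a[r] > pivot);
            if (l >= r) {
                return r;     // pointers have crossed; r is the split point
            }
            int tmp = a[l];
            a[l] = a[r];
            a[r] = tmp;
        }
    }
}
```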
As the name suggests, heap sort puts all the elements into a heap (this is done by heapifying the array) and then removes them one by one to produce a sorted ordering. When sorting the elements in increasing order, a max heap is used. This is so that when the maximum element is removed, it is moved to the end of the array. As a result, the first part of the array functions as the max heap while the other part is the sorted array.
Heapification of N elements takes Θ(N log N) time (this can be reduced to Θ(N) using bottom-up heapification). A single remove operation takes Θ(log N), so it takes Θ(N log N) time to remove every element from the heap. Therefore, the overall runtime of heap sort is Θ(N log N).
In the best case, where all the elements are the same, heap sort runs in Θ(N): every sink takes constant time because no element ever needs to move down the heap.
Heap sort is unstable. The ordering of elements inside a heap is rather random. The max heap only maintains the invariant that every element is greater than its children, but the ordering of its children does not matter.
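A minimal heap sort sketch: the front of the array is heapified bottom-up into a max heap, then the maximum is repeatedly swapped to the end of the heap region and sunk back down.

```java
public class HeapSort {
    public static void sort(int[] a) {
        int n = a.length;
        // Bottom-up heapification: sink every parent node, last parent first.
        for (int i = n / 2 - 1; i >= 0; i--) {
            sink(a, i, n);
        }
        // Repeatedly move the max (index 0) to the end of the heap region.
        for (int end = n - 1; end > 0; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            sink(a, 0, end);   // restore the max heap on a[0..end-1]
        }
    }

    // Sink a[i] within the heap a[0..size-1] until it is >= both children.
    private static void sink(int[] a, int i, int size) {
        while (2 * i + 1 < size) {
            int child = 2 * i + 1;
            if (child + 1 < size && a[child + 1] > a[child]) {
                child++;   // pick the larger child
            }
            if (a[i] >= a[child]) {
                break;
            }
            int tmp = a[i];
            a[i] = a[child];
            a[child] = tmp;
            i = child;
        }
    }
}
```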
Counting (or radix) sorts don't use comparisons, so they can run faster than Θ(N log N). However, counting sorts can only be used if the elements have a radix, which is a fixed alphabet. For example, the radix of English words is the English alphabet, and the radix of integers is the digits 0 through 9. Arbitrary objects can't be sorted with counting sorts, because there is no way to break them down into parts that serve as a radix. If we are sorting dogs, for example, it wouldn't make sense to try to break them down into an alphabet.
Counting sort, as the name suggests, first counts the number of occurrences of each element. Note that since we have a finite radix, we know exactly how many unique elements there are (if we are sorting integers, we know the digits can only be 0 through 9). We store these counts in a counts array, which has size R, where R is the size of the radix. Then we allocate a new array of size N, where N is the length of the unsorted input array. Since we know exactly how many elements there are of each category, we can place them one by one into the sorted array. A starting points array is used to keep track of the next index of each category in the sorted array.
Let's walk through the example below, where we are sorting an array whose radix is the lowercase letters a through z.
- Construct the counts array. Allocate an array of size R = 26 and count the number of occurrences of each letter in the unsorted array.
- Construct the starting points array. Allocate an array of size R = 26 and store the index at which the next element of each letter should be placed. For example, a is the first letter in alphabetical order, so the a's start at index 0 in the sorted array. There is 1 a and there are 0 b's, so we know that the c's must start at index 1 in the sorted array.
- Allocate another array of size N to contain the elements in sorted order. Iterate through the unsorted array, placing each element at the correct index in this new sorted array. Each time an element is placed in the sorted array, the starting points array is updated. For example, after we place the first c at index 1, we know that the next c will be placed at index 2, so we update the starting points array so that the index for c is now 2.
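The walkthrough above maps directly onto code. Here is a minimal counting sort sketch for an array of lowercase letters, with a counts array and a starting points array over the radix R = 26.

```java
public class CountingSort {
    private static final int R = 26;   // size of the radix: letters a through z

    public static char[] sort(char[] input) {
        // Step 1: count the number of occurrences of each letter.
        int[] counts = new int[R];
        for (char c : input) {
            counts[c - 'a']++;
        }
        // Step 2: compute starting points; each letter starts where the
        // counts of all earlier letters end.
        int[] starts = new int[R];
        int index = 0;
        for (int r = 0; r < R; r++) {
            starts[r] = index;
            index += counts[r];
        }
        // Step 3: place each element at its letter's next index, advancing
        // the starting point as we go (this is what makes the sort stable).
        char[] sorted = new char[input.length];
        for (char c : input) {
            sorted[starts[c - 'a']] = c;
            starts[c - 'a']++;
        }
        return sorted;
    }
}
```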
As we will see, LSD and MSD sort are in essence multiple iterations of counting sort.
Let N be the size of the input array and R be the size of the radix. We can derive the runtime by breaking the sort down into its different steps.
- Create array of size N to hold sorted order: Θ(N)
- Create array of size R as the counts array: Θ(R)
- Create array of size R as the starting points array: Θ(R)
- Iterate through the input array to fill out counts and starting points arrays: Θ(N)
- Iterate through the input array to place elements at the correct indices in the sorted array, updating the starting points array each time: Θ(N)
Adding up the runtimes from each step, we get that the overall runtime of counting sort is Θ(N + R).
Counting sort is stable. Because we iterate through the input array in order when placing the elements into the sorted array, equal elements are kept in the same order relative to each other.
LSD sort uses counting sort to sort each digit starting from the least significant digit. As shown in the example below, the numbers are first sorted by the ones' place, then by the tens' place, and so on. Notice that numbers are zero-padded to make them all the same length (34 becomes 0034).
Remember that a single iteration of counting sort takes Θ(N + R). Let W be the length of the longest element. Since there is one iteration of counting sort for every digit, there are W iterations of counting sorts. The overall runtime of LSD is Θ(W(N + R)).
LSD sort is stable. It is simply multiple iterations of counting sort.
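A minimal LSD sort sketch for non-negative integers, assuming every number fits in w decimal digits (the zero-padding in the example above); each pass is one stable counting sort on a single digit, with radix R = 10.

```java
public class LSDSort {
    public static void sort(int[] a, int w) {
        int[] aux = new int[a.length];
        int divisor = 1;
        // One counting sort pass per digit, least significant digit first.
        for (int d = 0; d < w; d++) {
            int[] counts = new int[10];
            for (int x : a) {
                counts[(x / divisor) % 10]++;
            }
            int[] starts = new int[10];
            int index = 0;
            for (int r = 0; r < 10; r++) {
                starts[r] = index;
                index += counts[r];
            }
            // Stable placement: numbers with equal digits keep their order.
            for (int x : a) {
                aux[starts[(x / divisor) % 10]++] = x;
            }
            System.arraycopy(aux, 0, a, 0, a.length);
            divisor *= 10;
        }
    }
}
```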
MSD sort is just like LSD sort except it sorts each digit starting from the most significant digit. In addition, it separates elements into groups with the same prefix on each counting sort iteration. In the example below, after sorting by the most significant place, the elements are separated into groups that have the same most significant digit. Then MSD sort looks at each group separately as the sort continues. When sorting by the next most significant digit, the numbers that begin with 0 are sorted against each other, the numbers that begin with 1 are sorted against each other, and so on. This is necessary, since all numbers with a most significant digit of 0 are less than all numbers with a most significant digit of 1 (after all numbers have been zero-padded so that they are all the same length). The sort terminates when every element is in a group by itself or when it has finished sorting by the least significant digit.
In the best case, every element has a different most significant digit, so only one iteration of counting sort is needed until every element has its own group. This results in a runtime of Θ(N + R). (Convince yourself why LSD sort can't terminate early and must sort from the least significant digit all the way to the most significant digit.)
In the worst case, counting sort must be used to sort every digit, so if W is the length of the longest word, W iterations of counting sort are needed. This results in a runtime of Θ(W(N + R)).
MSD sort is stable. It is simply multiple iterations of counting sort.
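A minimal MSD sort sketch for lowercase strings, assuming they have already been padded to the same length w; each group that shares a prefix is recursively counting-sorted on the next character, and groups of size one are left alone.

```java
public class MSDSort {
    private static final int R = 26;   // radix: letters a through z

    public static void sort(String[] a, int w) {
        sort(a, 0, a.length, 0, w);
    }

    // Counting sort a[lo..hi) on character position d, then recurse on each
    // group of strings that share the same character at position d.
    private static void sort(String[] a, int lo, int hi, int d, int w) {
        if (hi - lo <= 1 || d >= w) {
            return;   // group of one element, or no characters left to sort by
        }
        int[] counts = new int[R];
        for (int i = lo; i < hi; i++) {
            counts[a[i].charAt(d) - 'a']++;
        }
        int[] starts = new int[R];
        int index = lo;
        for (int r = 0; r < R; r++) {
            starts[r] = index;
            index += counts[r];
        }
        int[] groupStarts = starts.clone();   // remember where each group begins
        String[] sorted = new String[hi - lo];
        for (int i = lo; i < hi; i++) {
            int c = a[i].charAt(d) - 'a';
            sorted[starts[c] - lo] = a[i];
            starts[c]++;
        }
        System.arraycopy(sorted, 0, a, lo, hi - lo);
        // Recurse on each group separately, one character further in.
        for (int r = 0; r < R; r++) {
            sort(a, groupStarts[r], groupStarts[r] + counts[r], d + 1, w);
        }
    }
}
```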
Sort | Prof. Hug's Walkthrough | Best Case Runtime | Worst Case Runtime | Stable? |
---|---|---|---|---|
Selection Sort | Demo Link | Θ(N²) | Θ(N²) | Depends |
Insertion Sort | Demo Link | Θ(N) | Θ(N²) | Yes |
Merge Sort | Demo Link | Θ(N log N) | Θ(N log N) | Yes |
Quicksort | Demo Link | Θ(N log N) | Θ(N²) | Depends |
Heap Sort | Demo Link | Θ(N) | Θ(N log N) | No |
Counting Sort | Demo Link | Θ(N + R) | Θ(N + R) | Yes |
LSD Sort | N/A | Θ(W(N + R)) | Θ(W(N + R)) | Yes |
MSD Sort | N/A | Θ(N + R) | Θ(W(N + R)) | Yes |