Sorting Algorithms: Difference between revisions
(44 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
=Internal= | =Internal= | ||
* [[Selection Problem]] | |||
* [[ | |||
=Overview= | =Overview= | ||
Line 7: | Line 6: | ||
Many programs use sorting as an intermediate step, and that is why sorting is considered a fundamental operation in computer science. | Many programs use sorting as an intermediate step, and that is why sorting is considered a fundamental operation in computer science. | ||
The '''sorting problem''' if formally defined as follows: given a sequence of n numbers (a<sub>1</sub>, a<sub>2</sub>, ... a<sub>n</sub>) provided as input, the algorithm must produce as output a permutation (reordering) (a<sup>'</sup><sub>1</sub>, a<sup>'</sup><sub>2</sub>, ... a<sup>'</sup><sub>n</sub>) of the input sequence such that a<sup>'</sup><sub>1</sub> ≤ a<sup>'</sup><sub>2</sub> ≤ ... ≤ a<sup>'</sup><sub>n</sub>. A specific input sequence is called an | =Sorting Problem= | ||
The '''sorting problem''' if formally defined as follows: given a sequence of n numbers (a<sub>1</sub>, a<sub>2</sub>, ... a<sub>n</sub>) provided as input, the algorithm must produce as output a permutation (reordering) (a<sup>'</sup><sub>1</sub>, a<sup>'</sup><sub>2</sub>, ... a<sup>'</sup><sub>n</sub>) of the input sequence such that a<sup>'</sup><sub>1</sub> ≤ a<sup>'</sup><sub>2</sub> ≤ ... ≤ a<sup>'</sup><sub>n</sub>. A specific input sequence is called an [[Algorithms#Instance_of_the_problem|instance]] of the sorting problem. Although conceptually we are sorting a '''sequence''', the input comes to the sorting function as an [[Data_Structures#Arrays|array]]''' with n elements. | |||
The numbers we wish to sort are also known as '''keys'''. In practice, it is rarely the case when the keys exist in isolation. Usually they are part of a larger structure called '''record''', which also contains '''satellite data'''. | The numbers we wish to sort are also known as '''keys'''. In practice, it is rarely the case when the keys exist in isolation. Usually they are part of a larger structure called '''record''', which also contains '''satellite data'''. | ||
Line 14: | Line 15: | ||
* <span id='In-place'></span>'''in place''': a sorting algorithm is said to sort the input numbers "in place" if it rearranges the numbers within the input array, while at most a constant number of elements are stored outside the array at any time. | * <span id='In-place'></span>'''in place''': a sorting algorithm is said to sort the input numbers "in place" if it rearranges the numbers within the input array, while at most a constant number of elements are stored outside the array at any time. | ||
* <span id='Stability'><span>'''stability''' | * <span id='Stability'><span>'''stability''' | ||
=Sorting Algorithms= | =Sorting Algorithms= | ||
Line 23: | Line 20: | ||
==<span id='Comparison_Sort'></span>Key Comparison Sorting Algorithms== | ==<span id='Comparison_Sort'></span>Key Comparison Sorting Algorithms== | ||
A sorting algorithm may compare keys, and in this case it is said to be a '''key comparison algorithm'''. It can be demonstrated that a key comparison algorithm [[Comparison_Sorting_Algorithms_Complexity|cannot perform better than n | A sorting algorithm may compare keys, and in this case it is said to be a '''key comparison algorithm'''. It can be demonstrated that a key comparison algorithm [[Comparison_Sorting_Algorithms_Complexity|cannot perform better than n log n]]. The worst-case running time of comparison sort algorithms is Ω(n lgn). | ||
* [[Insertion Sort#Overview|Insertion sort]] | * [[Insertion Sort#Overview|Insertion sort]] | ||
* [[Selection Sort#Overview|Selection sort]] | * [[Selection Sort#Overview|Selection sort]] | ||
* [[Bubble Sort#Overview|Bubble sort]] | |||
* [[Merge Sort#Overview|Merge sort]] | * [[Merge Sort#Overview|Merge sort]] | ||
* [[Heap Sort#Overview|Heap sort]] | * [[Heap Sort#Overview|Heap sort]] | ||
* [[ | * [[Quicksort#Overview|Quicksort]] | ||
==<span id='Non-Comparison_Sort'></span>Non-Comparison Sorting Algorithms== | ==<span id='Non-Comparison_Sort'></span>Non-Comparison Sorting Algorithms== | ||
These algorithm have in common the fact that they do not compare elements to one another, but they "stare at the guts of an element" to decide what do do next. This implies that you are willing to make assumption on what the data is, in which case you can bypass Ω(n log n) lower bound. | |||
* [[Counting Sort|Counting sort]] | * [[Counting Sort|Counting sort]] | ||
* [[Radix Sort|Radix sort]] | * [[Radix Sort|Radix sort]] | ||
* [[Bucket Sort|Bucket sort]] | * [[Bucket Sort|Bucket sort]] | ||
=Canonical Use= | |||
* [[The 2-SUM Problem]] | |||
=<span id='Sorted_Array'></span>Static Sorted Array= | |||
The result of any sorting algorithm is a '''static sorted array''', which is a [[Data_Structures#Overview|data structure]] that exposes the following operations. | |||
==Static Sorted Array Operations== | |||
{| class="wikitable" style="text-align: left;" | |||
! Operation | |||
! Running Time | |||
! Notes | |||
|- | |||
| <span id='SEARCH'></span>[[Data_Structures#SEARCH.28K.29|SEARCH(K)]] || O(log n) ||It is implemented as a [[Binary_Search#Overview|recursive binary search]] on the sorted array. | |||
|- | |||
| [[Data_Structures#SELECT|SELECT(i<sup>th</sup> order statistics)]] || O(1) || Since the array is already sorted, any [[Selection_Problem#Overview|selection]] reduces to referencing the correct array element. | |||
|- | |||
| [[Data_Structures#MINIMUM.28.29|MINIMUM()]] || O(1) || Return the element in the first position. | |||
|- | |||
| [[Data_Structures#MAXIMUM.28.29|MAXIMUM()]] || O(1) || Return the element in the last position. | |||
|- | |||
| [[Data_Structures#PREDECESSOR.28X.29|PREDECESSOR(X)]] || O(1) || Return the previous element. | |||
|- | |||
| [[Data_Structures#SUCCESSOR.28X.29|SUCCESSOR(X)]] || O(1) || Return the next element. | |||
|- | |||
| [[Data_Structures#RANK|RANK(X)]] || O(log n) || Search for the given element and return its position. | |||
|- | |||
|} | |||
Because the array is static, insertion [[Data_Structures#INSERT.28X.29|INSERT(X)]] and deletion [[Data_Structures#DELETE.28X.29|DELETE(X)]] are not implemented efficiently on a static sorted array - elements must be copied or the whole array must be re-sorted if an element is inserted or removed. This limitation does not apply to [[Binary_Search_Trees#Balanced_Binary_Search_Trees|balanced binary search trees]], which can be thought as the dynamic version of a sorted array. | |||
=Organizatorium= | |||
* https://github.com/jwasham/coding-interview-university#sorting |
Latest revision as of 20:21, 3 November 2021
Internal
Overview
Many programs use sorting as an intermediate step, and that is why sorting is considered a fundamental operation in computer science.
Sorting Problem
The sorting problem if formally defined as follows: given a sequence of n numbers (a1, a2, ... an) provided as input, the algorithm must produce as output a permutation (reordering) (a'1, a'2, ... a'n) of the input sequence such that a'1 ≤ a'2 ≤ ... ≤ a'n. A specific input sequence is called an instance of the sorting problem. Although conceptually we are sorting a sequence, the input comes to the sorting function as an array with n elements.
The numbers we wish to sort are also known as keys. In practice, it is rarely the case when the keys exist in isolation. Usually they are part of a larger structure called record, which also contains satellite data.
Sorting algorithms characteristics:
- in place: a sorting algorithm is said to sort the input numbers "in place" if it rearranges the numbers within the input array, while at most a constant number of elements are stored outside the array at any time.
- stability
Sorting Algorithms
Key Comparison Sorting Algorithms
A sorting algorithm may compare keys, and in this case it is said to be a key comparison algorithm. It can be demonstrated that a key comparison algorithm cannot perform better than n log n. The worst-case running time of comparison sort algorithms is Ω(n lgn).
Non-Comparison Sorting Algorithms
These algorithm have in common the fact that they do not compare elements to one another, but they "stare at the guts of an element" to decide what do do next. This implies that you are willing to make assumption on what the data is, in which case you can bypass Ω(n log n) lower bound.
Canonical Use
Static Sorted Array
The result of any sorting algorithm is a static sorted array, which is a data structure that exposes the following operations.
Static Sorted Array Operations
Operation | Running Time | Notes |
---|---|---|
SEARCH(K) | O(log n) | It is implemented as a recursive binary search on the sorted array. |
SELECT(ith order statistics) | O(1) | Since the array is already sorted, any selection reduces to referencing the correct array element. |
MINIMUM() | O(1) | Return the element in the first position. |
MAXIMUM() | O(1) | Return the element in the last position. |
PREDECESSOR(X) | O(1) | Return the previous element. |
SUCCESSOR(X) | O(1) | Return the next element. |
RANK(X) | O(log n) | Search for the given element and return its position. |
Because the array is static, insertion INSERT(X) and deletion DELETE(X) are not implemented efficiently on a static sorted array - elements must be copied or the whole array must be re-sorted if an element is inserted or removed. This limitation does not apply to balanced binary search trees, which can be thought as the dynamic version of a sorted array.