Sorting Algorithms: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
No edit summary
 
(39 intermediate revisions by the same user not shown)
Line 1: Line 1:
=Internal=
=Internal=
* [[Data Structures and Algorithms#Sorting_Algorithms|Data Structures and Algorithms]]
* [[Selection Problem]]
* [[Selection Problem]]


Line 8: Line 6:
Many programs use sorting as an intermediate step, and that is why sorting is considered a fundamental operation in computer science.
Many programs use sorting as an intermediate step, and that is why sorting is considered a fundamental operation in computer science.


The '''sorting problem''' if formally defined as follows: given a sequence of n numbers (a<sub>1</sub>, a<sub>2</sub>, ... a<sub>n</sub>) provided as input, the algorithm must produce as output a permutation (reordering) (a<sup>'</sup><sub>1</sub>, a<sup>'</sup><sub>2</sub>, ... a<sup>'</sup><sub>n</sub>) of the input sequence such that a<sup>'</sup><sub>1</sub> ≤ a<sup>'</sup><sub>2</sub> ≤ ... ≤ a<sup>'</sup><sub>n</sub>. A specific input sequence is called an '''[[Data Structures and Algorithms#Instance_of_the_problem|instance]]''' of the sorting problem. Although conceptually we are sorting a '''sequence''', the input comes to the sorting function as an '''[[Data_Structures#Arrays|array]]''' with n elements.
=Sorting Problem=
 
The '''sorting problem''' if formally defined as follows: given a sequence of n numbers (a<sub>1</sub>, a<sub>2</sub>, ... a<sub>n</sub>) provided as input, the algorithm must produce as output a permutation (reordering) (a<sup>'</sup><sub>1</sub>, a<sup>'</sup><sub>2</sub>, ... a<sup>'</sup><sub>n</sub>) of the input sequence such that a<sup>'</sup><sub>1</sub> ≤ a<sup>'</sup><sub>2</sub> ≤ ... ≤ a<sup>'</sup><sub>n</sub>. A specific input sequence is called an [[Algorithms#Instance_of_the_problem|instance]] of the sorting problem. Although conceptually we are sorting a '''sequence''', the input comes to the sorting function as an [[Data_Structures#Arrays|array]]''' with n elements.


The numbers we wish to sort are also known as '''keys'''.  In practice, it is rarely the case when the keys exist in isolation. Usually they are part of a larger structure called '''record''', which also contains '''satellite data'''.  
The numbers we wish to sort are also known as '''keys'''.  In practice, it is rarely the case when the keys exist in isolation. Usually they are part of a larger structure called '''record''', which also contains '''satellite data'''.  
Line 20: Line 20:
==<span id='Comparison_Sort'></span>Key Comparison Sorting Algorithms==
==<span id='Comparison_Sort'></span>Key Comparison Sorting Algorithms==


A sorting algorithm may compare keys, and in this case it is said to be a '''key comparison algorithm'''. It can be demonstrated that a key comparison algorithm [[Comparison_Sorting_Algorithms_Complexity|cannot perform better than n lg n]]. The worst-case running time of comparison sort algorithms is Ω(n lgn).
A sorting algorithm may compare keys, and in this case it is said to be a '''key comparison algorithm'''. It can be demonstrated that a key comparison algorithm [[Comparison_Sorting_Algorithms_Complexity|cannot perform better than n log n]]. The worst-case running time of comparison sort algorithms is Ω(n lgn).


* [[Insertion Sort#Overview|Insertion sort]]
* [[Insertion Sort#Overview|Insertion sort]]
* [[Selection Sort#Overview|Selection sort]]
* [[Selection Sort#Overview|Selection sort]]
* [[Bubble Sort#Overview|Bubble sort]]
* [[Merge Sort#Overview|Merge sort]]
* [[Merge Sort#Overview|Merge sort]]
* [[Heap Sort#Overview|Heap sort]]
* [[Heap Sort#Overview|Heap sort]]
Line 29: Line 30:


==<span id='Non-Comparison_Sort'></span>Non-Comparison Sorting Algorithms==
==<span id='Non-Comparison_Sort'></span>Non-Comparison Sorting Algorithms==
These algorithm have in common the fact that they do not compare elements to one another, but they "stare at the guts of an element" to decide what do do next. This implies that you are willing to make assumption on what the data is, in which case you can bypass Ω(n log n) lower bound.


* [[Counting Sort|Counting sort]]
* [[Counting Sort|Counting sort]]
* [[Radix Sort|Radix sort]]
* [[Radix Sort|Radix sort]]
* [[Bucket Sort|Bucket sort]]
* [[Bucket Sort|Bucket sort]]
=Canonical Use=
* [[The 2-SUM Problem]]
=<span id='Sorted_Array'></span>Static Sorted Array=
The result of any sorting algorithm is a '''static sorted array''', which is a [[Data_Structures#Overview|data structure]] that exposes the following operations.
==Static Sorted Array Operations==
{| class="wikitable" style="text-align: left;"
! Operation
! Running Time
! Notes
|-
| <span id='SEARCH'></span>[[Data_Structures#SEARCH.28K.29|SEARCH(K)]] || O(log n) ||It is implemented as a [[Binary_Search#Overview|recursive binary search]] on the sorted array.
|-
| [[Data_Structures#SELECT|SELECT(i<sup>th</sup> order statistics)]] || O(1) || Since the array is already sorted, any [[Selection_Problem#Overview|selection]] reduces to referencing the correct array element.
|-
| [[Data_Structures#MINIMUM.28.29|MINIMUM()]] || O(1) || Return the element in the first position.
|-
| [[Data_Structures#MAXIMUM.28.29|MAXIMUM()]] || O(1) || Return the element in the last position.
|-
| [[Data_Structures#PREDECESSOR.28X.29|PREDECESSOR(X)]] || O(1) || Return the previous element.
|-
| [[Data_Structures#SUCCESSOR.28X.29|SUCCESSOR(X)]] || O(1) || Return the next element.
|-
| [[Data_Structures#RANK|RANK(X)]] || O(log n) || Search for the given element and return its position.
|-
|}
Because the array is static, insertion [[Data_Structures#INSERT.28X.29|INSERT(X)]] and deletion [[Data_Structures#DELETE.28X.29|DELETE(X)]] are not implemented efficiently on a static sorted array - elements must be copied or the whole array must be re-sorted if an element is inserted or removed. This limitation does not apply to [[Binary_Search_Trees#Balanced_Binary_Search_Trees|balanced binary search trees]], which can be thought as the dynamic version of a sorted array.
=Organizatorium=
* https://github.com/jwasham/coding-interview-university#sorting

Latest revision as of 20:21, 3 November 2021

Internal

Overview

Many programs use sorting as an intermediate step, and that is why sorting is considered a fundamental operation in computer science.

Sorting Problem

The sorting problem if formally defined as follows: given a sequence of n numbers (a1, a2, ... an) provided as input, the algorithm must produce as output a permutation (reordering) (a'1, a'2, ... a'n) of the input sequence such that a'1 ≤ a'2 ≤ ... ≤ a'n. A specific input sequence is called an instance of the sorting problem. Although conceptually we are sorting a sequence, the input comes to the sorting function as an array with n elements.

The numbers we wish to sort are also known as keys. In practice, it is rarely the case when the keys exist in isolation. Usually they are part of a larger structure called record, which also contains satellite data.

Sorting algorithms characteristics:

  • in place: a sorting algorithm is said to sort the input numbers "in place" if it rearranges the numbers within the input array, while at most a constant number of elements are stored outside the array at any time.
  • stability

Sorting Algorithms

Key Comparison Sorting Algorithms

A sorting algorithm may compare keys, and in this case it is said to be a key comparison algorithm. It can be demonstrated that a key comparison algorithm cannot perform better than n log n. The worst-case running time of comparison sort algorithms is Ω(n lgn).

Non-Comparison Sorting Algorithms

These algorithm have in common the fact that they do not compare elements to one another, but they "stare at the guts of an element" to decide what do do next. This implies that you are willing to make assumption on what the data is, in which case you can bypass Ω(n log n) lower bound.

Canonical Use

Static Sorted Array

The result of any sorting algorithm is a static sorted array, which is a data structure that exposes the following operations.

Static Sorted Array Operations

Operation Running Time Notes
SEARCH(K) O(log n) It is implemented as a recursive binary search on the sorted array.
SELECT(ith order statistics) O(1) Since the array is already sorted, any selection reduces to referencing the correct array element.
MINIMUM() O(1) Return the element in the first position.
MAXIMUM() O(1) Return the element in the last position.
PREDECESSOR(X) O(1) Return the previous element.
SUCCESSOR(X) O(1) Return the next element.
RANK(X) O(log n) Search for the given element and return its position.

Because the array is static, insertion INSERT(X) and deletion DELETE(X) are not implemented efficiently on a static sorted array - elements must be copied or the whole array must be re-sorted if an element is inserted or removed. This limitation does not apply to balanced binary search trees, which can be thought as the dynamic version of a sorted array.

Organizatorium