The Median Maintenance Problem: Difference between revisions
Line 18: | Line 18: | ||
<font size=-1> | <font size=-1> | ||
while each x: | while each x: | ||
if x ≤ H<sub>low</sub>.MAXIMUM() | if H<sub>low</sub> is empty || x ≤ H<sub>low</sub>.MAXIMUM() | ||
H<sub>low</sub>.INSERT(x) | H<sub>low</sub>.INSERT(x) | ||
else | else |
Revision as of 00:58, 14 October 2021
Internal
Problem
Give a sequence of numbers x1, x2, .... xn, report at each step i the median number among the numbers seen so far. For a given k, the and assuming that x1, x2, .... xk are sorted in an ascending order, the median will be on (k + 1)/2 position if k is odd and on k/2 position is k is even.
Discussion
The problem can be resolved by repeatedly solving the selection problem on the set of numbers seen so far, but the selection problem has a running time of O(n), so repeating the selection algorithm for each number will have a running time of O(n2). The median maintenance problem is a canonical use case for a heap.
Solution
Use two heaps: a max heap Hlow that supports the REMOVE-MAX operation with O(1) running time and a min heap Hhigh that supports the REMOVE-MIN operation, also with O(1) running time.
The idea is to maintain the first smallest half elements in Hlow and the largest half of the elements in Hhigh.
This is implemented by appropriately inserting each new element in the corresponding heap, and then rebalancing the heaps so Hlow and Hhigh have the same number of elements, or Hlow has with at most 1 element more than Hhigh. Rebalancing means extracting the top of the heap from the oversized heap and inserting the element in the other heap.
Maintaining this load for both heaps makes possible to read the current median, at any moment, as the top of Hlow.
while each x: if Hlow is empty || x ≤ Hlow.MAXIMUM() Hlow.INSERT(x) else Hhigh.INSERT(x) REBALANCE(Hlow, Hhigh) record running median as Hlow.MAXIMUM()
More details: