Red-black Tree
External
- https://www.coursera.org/learn/algorithms-graphs-data-structures/lecture/8acpe/red-black-trees
- https://www.youtube.com/watch?v=scfDOof9pww
- https://en.wikipedia.org/wiki/Red%E2%80%93black_tree
Internal
Overview
Red-black trees were invented by Bayer (1972) and Guibas, Sedgewick (1978). A red-black tree is a type of binary search tree that self-balances on insertion and deletion, thus maintaining its height to a minimum, which allows for efficient operations. Almost all binary search tree operations have a running time bounded by the tree height, and in this case the tree height stays constant at log n, yielding O(log n) operations.
Red-black Tree Invariants
- Each node is either red or black.
- The root is always black.
- There are never 2 red nodes in a row (a red node has only black children).
- Every path taken from root to a NULL pointer - unsuccessful search - has the same number of black nodes (red makes a node "invisible" to the invariant 4).
These invariants are in addition to the fact that the red-black tree is a binary search tree, so it has the Binary Search Tree Property.
Red-black Trees Maintain Small Heights
If all these invariants are satisfied at all times, the height of the tree is going to be small. For example, a chain with three nodes cannot be a red-black tree.
Claim: every red-black tree with n nodes has height ≤ 2 log(n+1) (O(log n)).
Proof. We will prove that a red-black tree must look like a perfectly balanced tree with at most factor 2 inflation.
Suppose there is a binary tree where if you start from the root and no matter how you navigate through the tree to a NULL pointer, you have no choice but to see at least k nodes along the way. If this hypothesis is satisfied, then the top of the tree has to be fully filled in. Why? What if for all paths but one I see k nodes, and for two paths I see k + 1? The proof is by contradiction, if you were missing some nodes in any of these top k levels, that would. give you a way of hitting a NULL pointer seeing less than k nodes. This observation gives a lower bound on the population of a search tree as a function of the lengths of its root - NULL paths. The size n of the tree must include at least the number of nodes in a perfectly balanced tree of depth k - 1, which is 2k-1-1.