Optimal Binary Search Trees

From NovaOrdis Knowledge Base
Jump to navigation Jump to search

External

Internal

Overview

A balanced binary search tree exposes all its nodes via at most log n node traversals. However, if some of the search terms are more accessed than others, and we know a priori the relative access frequency, a balanced search tree might not be optimal. Instead, a tree that keeps the most frequently accessed nodes higher up in the tree offers better performance overall. An optimal search tree minimize the average search time with respect to a given set of probabilities over the keys. The cost of searching in the tree is given by the number of node assessments: if the key we look for is in the root, we only need to look at the root, so the number of node we need to assess is 1, if the key is immediately under the root, the cost is 2, and so on.

Optimal Substructure Lemma

If T is an optimal binary search tree for the keys {1, 2, ... n} with root r, and the keys in sorted order, then its subtrees T1 and T2 are optimal binary search trees for the keys {1, 2, ... r-1} and {r+1, ..., n} respectively.