Internal

Hash Tables

Overview

The deduplication problem arises when we process a stream of data and need to eliminate duplicates. By a stream of data we mean either a large static collection or data that becomes available over time.

The goal is to ignore duplicates and only remember the distinct objects in the stream.

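The solution is to use a hash table. For each object that arrives, look it up in the table: if it is absent, insert it and keep the object; if it is already present, discard the object as a duplicate. With a well-behaved hash function, each lookup and insertion takes constant time on average, so the whole stream is processed in time linear in its length.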
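The hash-table approach can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `deduplicate` is hypothetical, Python's built-in `set` stands in for the hash table, and the stream's objects are assumed to be hashable.

```python
from typing import Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T", bound=Hashable)


def deduplicate(stream: Iterable[T]) -> Iterator[T]:
    """Yield each distinct object of the stream once, in first-seen order.

    Hypothetical helper for illustration; assumes the objects are hashable.
    """
    seen = set()  # Python's set is backed by a hash table
    for item in stream:
        if item not in seen:   # average-case constant-time lookup
            seen.add(item)     # remember the object
            yield item         # first occurrence: keep it
        # otherwise: duplicate, ignore it


if __name__ == "__main__":
    print(list(deduplicate([3, 1, 3, 2, 1])))
```

Because the function is a generator, it works both for a static collection and for data that arrives over time. The memory it uses grows with the number of distinct objects, not with the length of the stream.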
