Latest revision as of 20:25, 16 October 2021

Internal

Hash Tables

Overview

The deduplication problem arises when processing a stream of data from which duplicates must be eliminated. A stream of data here means either a large static data set or data that becomes available over time.

The goal is to ignore duplicates and only remember the distinct objects in the stream.

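The solution is to use a hash table: when a new object arrives, look it up in the table. If it is already present, it is a duplicate and is ignored; otherwise it is inserted into the table and kept.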
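As an illustration only, the hash table approach could be sketched in Python as follows. The function name `deduplicate` is hypothetical and does not come from this page, and Python's built-in `set` (which is hash-based) stands in for the hash table. The sketch assumes the objects in the stream are hashable.

```python
from typing import Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T", bound=Hashable)


def deduplicate(stream: Iterable[T]) -> Iterator[T]:
    """Yield each distinct object of the stream once, in first-seen order.

    Hypothetical helper for illustration; a built-in set plays the role
    of the hash table.
    """
    seen = set()  # hash table of the objects encountered so far
    for item in stream:
        if item not in seen:   # lookup: has this object been seen before?
            seen.add(item)     # insert: remember the new distinct object
            yield item         # pass it on; duplicates are silently skipped


if __name__ == "__main__":
    # Works on a static collection or on any iterator that produces
    # objects over time.
    for obj in deduplicate(["a", "b", "a", "c", "b"]):
        print(obj)
```

Because the function is a generator, it handles both cases mentioned above: a static set can be passed in as a list, and data that arrives over time can be passed in as an iterator. Each object costs one lookup and at most one insertion, and the memory used grows with the number of distinct objects rather than with the length of the stream.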


The Deduplication Problem: Difference between revisions
