Hash Table: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
Line 31: Line 31:
The purpose of a hash table is to maintain a dynamic set of records, each record identified by a unique key, and provide [[#Constant_Time|constant time]] insertion, deletion and search.  
The purpose of a hash table is to maintain a dynamic set of records, each record identified by a unique key, and provide [[#Constant_Time|constant time]] insertion, deletion and search.  


The hash table provides constant time access by converting the arbitrary key into a numeric index of a '''bucket''' where the key and the associated value is maintained, by applying a hash function to the key. The conversion itself - the hash function running time - should be constant time.  
This is implemented by converting the arbitrary key into a numeric index of a '''bucket''' where the key and the associated value is stored, by applying a hash function to the key. The conversion itself - the hash function running time - should be constant time.  


The constant time guarantee only holds if the [[Hash Functions#Overview|hash function]] the table uses the key to bucket index is properly implemented. What "properly implemented" means will be addressed in length in the [[Hash Functions#Overview||Hash Functions]] section.
The constant time guarantee only holds if the [[Hash Functions#Overview|hash function]] the table uses the key to bucket index is properly implemented. What "properly implemented" means will be addressed in length in the [[Hash Functions#Overview||Hash Functions]] section.

Revision as of 20:49, 16 October 2021

External

Internal

Overview

Hash tables are one of the most used data structures in programming. They don't have that many operations (INSERT(), DELETE() and SEARCH()), but what they do, they do really well. Conceptually, a hash table is an array that provides immediate random access for constant time insertion, deletion and lookup based on an arbitrary key and not an integral index, for a dynamic set of keys. The hash map implementations do store data in proper arrays, and the mapping between the arbitrary key and the integral position in the array is provided by a hash function. Hash maps are some times referred to as dictionaries, which is fine as long as the dictionary is not assumed to support queries on totally ordered sets: hash maps do not maintain the ordering of the elements that they contain.

Canonical Use

Hash tables are used in situations where we need to do repeated fast lookups of arbitrary keys.

Supported Operations

INSERT(X)

Add a new key/value pair INSERT(X) to the hash map in O(1) running time.

DELETE(K)

Delete a key and the corresponding value DELETE(K) from the hash map in O(1) running time.

SEARCH(K)

Return the value corresponding to the provided key K SEARCH(K), or NULL if the key does not exist, in O(1) running time.

Hash Table Implementation Discussion

The purpose of a hash table is to maintain a dynamic set of records, each record identified by a unique key, and provide constant time insertion, deletion and search.

This is implemented by converting the arbitrary key into a numeric index of a bucket where the key and the associated value is stored, by applying a hash function to the key. The conversion itself - the hash function running time - should be constant time.

The constant time guarantee only holds if the hash function the table uses the key to bucket index is properly implemented. What "properly implemented" means will be addressed in length in the |Hash Functions section.

Chaining

Linear Probing

Hash Functions

Hash Functions

Constant Time Discussion

The constant time guarantees applies to properly implemented hash tables. Hash tables are easy to implement badly, on in those case, the constant time guarantee does not hold.

Also, the guarantee does not hold for every possible data set. For a specific hash function, there is always a pathological data set for which all elements of the set hash to the same value, and in this case the constant time operations guarantee does not hold. More details about pathological data sets and techniques to work around this issue in Hash Functions section.

TO DEPLETE

HashTableTODEPLETE