Hash tables explained [step-by-step example] (2024)

yourbasic.org

  • Basics
  • Hashing with chaining (simplified example)
  • Realistic hash function example
  • Resizing in constant amortized time

Basics

Hash tables are used to implement map and set data structures in most common programming languages. In C++ and Java they are part of the standard libraries, while Python and Go have built-in dictionaries and maps.

A hash table is an unordered collection of key-value pairs, where each key is unique.

Hash tables offer a combination of efficient lookup, insert and delete operations. Neither arrays nor linked lists can achieve this:

  • a lookup in an unsorted array takes linear worst-case time;
  • in a sorted array, a lookup using binary search is very fast, but insertions become inefficient;
  • in a linked list an insertion can be efficient, but lookups take linear time.

Hashing with chaining (simplified example)

The most common hash table implementation uses chaining with linked lists to resolve collisions. This combines the best properties of arrays and linked lists.

Hash table operations are performed in two steps:

  • A key is converted into an integer index by using a hash function.
  • This index decides the linked list where the key-value pair record belongs.

This hash table consists of an array with 1000 entries, each of which refers to a linked list of key-value pairs.

Let’s start with a somewhat simplified example: a data structure that can store up to 1000 records with random integer keys.

To distribute the data evenly, we use several short lists. All records with keys that end with 000 belong to one list, those with keys that end with 001 belong to another one, and so on. There is a total of 1000 such lists. This structure can be represented as an array of lists:

var table = new LinkedList[1000]

where LinkedList denotes a linked list of key-value pairs.

Inserting a new record (key, value) is a two-step procedure:

  • we extract the three last digits of the key, hash = key % 1000,
  • and then insert the key and its value into the list located at table[hash].

hash = key % 1000
table[hash].AddFirst(key, value)

This is a constant time operation.
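
The two insertion steps can be sketched in Python as follows. This is only an illustration under a few assumptions: a `collections.deque` stands in for each linked list, and the names `TABLE_SIZE`, `table` and `insert` are made up for this sketch. Like AddFirst above, it does not check whether the key is already present.

```python
from collections import deque

TABLE_SIZE = 1000  # number of lists, as in the example above

# One (initially empty) chain per possible hash value.
# A deque stands in for the linked list; appendleft runs in constant time.
table = [deque() for _ in range(TABLE_SIZE)]

def insert(key, value):
    """Add a record to the front of the chain chosen by the key's last three digits."""
    h = key % TABLE_SIZE               # step 1: compute the hash
    table[h].appendleft((key, value))  # step 2: add the record to that chain

insert(42007, "a")
insert(9007, "b")   # also ends in 007, so it shares a chain with 42007
insert(123, "c")
```

Both steps take constant time regardless of how many records are already stored, because the new record is always placed at the front of its chain.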

A lookup is implemented by

value = table[key%1000].Find(key)

Since the keys are random, there will be roughly the same number of records in each list. Since there are 1000 lists and at most 1000 records, there will likely be very few records in the list table[key%1000] and therefore the lookup operation will be fast.

The average time complexity of both the lookup and insert operations is O(1). Using the same technique, deletion can also be implemented in constant average time.
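
Lookup and deletion might look like the following Python sketch. It is a simplified illustration, not a reference implementation: plain Python lists stand in for the linked lists, and the function names `insert`, `find` and `delete` are assumptions made for this example.

```python
TABLE_SIZE = 1000
table = [[] for _ in range(TABLE_SIZE)]  # Python lists stand in for linked lists

def insert(key, value):
    table[key % TABLE_SIZE].append((key, value))

def find(key):
    """Return the value stored under key, or None if the key is absent."""
    for k, v in table[key % TABLE_SIZE]:  # scan only the one relevant chain
        if k == key:
            return v
    return None

def delete(key):
    """Remove the record with this key; return True if a record was removed."""
    chain = table[key % TABLE_SIZE]
    for i, (k, _) in enumerate(chain):
        if k == key:
            del chain[i]
            return True
    return False

insert(42007, "a")
insert(9007, "b")
print(find(9007))     # b
print(delete(42007))  # True
print(find(42007))    # None
```

Each operation touches a single chain, so its cost is proportional to the length of that chain rather than to the total number of records.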

Realistic hash function example

We want to generalize this basic idea to more complicated keys that aren’t evenly distributed. The number of records in each list must remain small, and the records must be evenly distributed over the lists. To achieve this we just need to change the hash function, the function which selects the list where a key belongs.

The hash function in the example above is hash = key % 1000. It takes a key (a positive integer) as input and produces a number in the interval 0..999.

In general, a hash function is a function from E to 0..size-1, where E is the set of all possible keys, and size is the number of entry points in the hash table. We want this function to be uniform: it should map the expected inputs as evenly as possible over its output range.
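
To see why uniformity depends on the expected inputs, the short Python experiment below counts chain lengths for hash = key % 1000 on two made-up key sets. The helper name `chain_lengths` and both key sets are assumptions chosen for illustration.

```python
from collections import Counter

SIZE = 1000

def chain_lengths(keys):
    """Count how many keys land in each chain under hash = key % SIZE."""
    return Counter(key % SIZE for key in keys)

# The keys 0, 1, ..., 999 spread perfectly: one key per chain.
even = chain_lengths(range(1000))

# 1000 keys that are all multiples of 1000 collide in chain 0.
skewed = chain_lengths(range(0, 1_000_000, 1000))

print(max(even.values()))    # 1
print(max(skewed.values()))  # 1000
```

The same hash function is uniform for the first key set and degenerate for the second, which is why keys that aren't evenly distributed call for a different function.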

Java’s implementation of hash functions for strings is a good example. The hashCode method in the String class computes the value

s[0]·31^(n-1) + s[1]·31^(n-2) + … + s[n-1]

using int arithmetic, where s[i] is the i-th character of the string, and n is the length of the string.

This method can be used as a hash function like this:

hash = Math.abs(s.hashCode() % size)

where size is the number of entry points in the hash table.
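
For readers who want to experiment without a Java runtime, the sketch below emulates the polynomial formula and the index computation in Python. The function names `java_string_hash` and `table_index` are hypothetical. The sketch assumes every character fits in one 16-bit Java char; for characters outside that range, Python's `ord` would not match Java's behavior.

```python
def java_string_hash(s):
    """Evaluate s[0]*31^(n-1) + ... + s[n-1] with 32-bit signed (Java int) arithmetic."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep only the low 32 bits
    # Reinterpret the 32-bit pattern as a signed value, as a Java int would be.
    return h - (1 << 32) if h >= (1 << 31) else h

def table_index(s, size):
    """Python counterpart of Math.abs(s.hashCode() % size).

    Java's % takes the sign of the dividend, so the Java expression equals
    abs(h) % size; Python's % behaves differently for negative numbers.
    """
    return abs(java_string_hash(s)) % size

print(java_string_hash("hello"))   # 99162322
print(table_index("hello", 1000))  # 322
```

The masking step is what "using int arithmetic" amounts to: long strings overflow 32 bits, the hash may come out negative, and the absolute value keeps the index inside 0..size-1.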

Note that this function

  • depends on all characters in the string,
  • and that the value changes when we change the order of the characters.

These are two properties that should hold for a good hash function.
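
A small comparison can make the second property concrete. The Python sketch below contrasts a deliberately weak hash that simply adds character codes with the polynomial scheme using multiplier 31. Both function names are made up for this example, and `poly_hash` uses unbounded Python integers rather than Java's int arithmetic.

```python
def sum_hash(s):
    """A weak hash: adds character codes, so the order of characters is ignored."""
    return sum(ord(ch) for ch in s)

def poly_hash(s):
    """Polynomial hash with multiplier 31 (unbounded integers, no overflow)."""
    h = 0
    for ch in s:
        h = 31 * h + ord(ch)
    return h

print(sum_hash("ab"), sum_hash("ba"))    # 195 195
print(poly_hash("ab"), poly_hash("ba"))  # 3105 3135
```

With the additive hash, every rearrangement of the same characters lands in the same list; the polynomial hash separates "ab" from "ba".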

Resizing in constant amortized time

The efficiency of a hash table depends on the fact that the table size is proportional to the number of records. If the number of records is not known in advance, the table must be resized when the lists become too long:

  • a new larger table is allocated,
  • each record is removed from the old table,
  • and inserted into the new table.

If the table size is increased by a constant factor for each resizing, e.g. by doubling its size, this strategy gives amortized constant time performance for insertions.
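
The doubling strategy can be sketched in Python as follows. This is an illustrative toy under stated assumptions: the class name `ChainedTable`, the initial size of 8, the rule "resize when there is on average one record per list", and the `moves` counter are all choices made for this example. It handles integer keys only and does not check for duplicates.

```python
class ChainedTable:
    """Chained hash table for integer keys that doubles in size when it fills up."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]
        self.count = 0
        self.moves = 0  # records re-inserted during resizes (for illustration)

    def insert(self, key, value):
        # Resize once there is, on average, one record per chain.
        if self.count >= len(self.buckets):
            self._resize(2 * len(self.buckets))
        self.buckets[key % len(self.buckets)].append((key, value))
        self.count += 1

    def _resize(self, new_size):
        old = self.buckets
        self.buckets = [[] for _ in range(new_size)]  # allocate a larger table
        for chain in old:                             # move every record over
            for key, value in chain:
                self.buckets[key % new_size].append((key, value))
                self.moves += 1

t = ChainedTable()
for k in range(1000):
    t.insert(k, str(k))
print(len(t.buckets), t.moves)  # 1024 1016
```

After 1000 insertions the resizes have moved 8 + 16 + … + 512 = 1016 records in total, which is about one extra move per insertion. This is the sense in which the cost per insertion is constant when averaged over the whole sequence.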

For more on the performance of this strategy, see Amortized time complexity.


FAQs

What is a hash table with an example? ›

A hash table is a data structure that stores information so that it can be retrieved efficiently. Each key is mapped by a hash function to an index in an array, and the value is stored at that index. The index therefore tells exactly where in the array the value is stored.

What is hashing explain with an example? ›

Hashing is designed to solve the problem of needing to efficiently find or store an item in a collection. For example, if we have a list of 10,000 words of English and we want to check if a given word is in the list, it would be inefficient to successively compare the word with all 10,000 items until we find a match.

What is a hash table for dummies? ›

A hash table, also known as a hash map, is a data structure that maps keys to values. It is one part of a technique called hashing, the other of which is a hash function. A hash function is an algorithm that produces an index of where a value can be found or stored in the hash table.

What are the steps of hashing? ›

Hashing is implemented in two steps: An element is converted into an integer by using a hash function. This integer is then used as an index at which the original element is stored in the hash table, where it can be quickly retrieved using the hashed key.

What is a good example of hash function? ›

One example is the mid-square method. If the input is 123,456,789 and the hash table size is 10,000, squaring the key produces 15,241,578,750,190,521, and the hash code is taken as the middle 4 digits of the 17-digit number (ignoring the high digit): 8750.
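
The arithmetic in this answer can be checked with a few lines of Python. The function name `mid_square_hash` is hypothetical, and the handling of short squares (zero-padding to four digits) is an assumption added so that the sketch works for small keys.

```python
def mid_square_hash(key):
    """Square the key and return its 4 middle digits as an integer."""
    squared = str(key * key)    # "15241578750190521" for 123456789
    if len(squared) % 2 == 1:
        squared = squared[1:]   # ignore the high digit to get an even length
    squared = squared.zfill(4)  # assumption: pad short squares with zeros
    mid = len(squared) // 2
    return int(squared[mid - 2: mid + 2])

print(mid_square_hash(123456789))  # 8750
```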

What does a hash table look like? ›

A hash table is defined as a data structure used to insert, look up, and remove key-value pairs quickly. It operates on the hashing concept, where each key is translated by a hash function into an index in an array. The index functions as a storage location for the matching value. Different keys can be translated into the same index, which is known as a collision.

What is a real life example of hashing? ›

Every time you attempt to log in to your email account, your email provider hashes the password you enter and compares this hash to the hash it has saved. Only when the two hashes match are you authorized to access your email.
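
The login check described here might be sketched as follows with Python's standard library. This is a hedged illustration, not a description of how any particular provider works: the function names, the 16-byte random salt and the iteration count of 100,000 are assumptions made for this example.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest) for a password, using PBKDF2 with HMAC-SHA-256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a newly stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest

def verify_password(password, salt, stored_digest):
    """Hash the entered password with the stored salt and compare the two hashes."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)  # constant-time comparison

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

Only the salt and the digest need to be saved; the password itself is never stored.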

Which of the following is an example of hashing? ›

MD5 and SHA-1 are examples of cryptographic hash functions, although both are now considered broken with respect to collision resistance. Triple DES is an encryption algorithm (a block cipher), not a hash function. Cryptographic hash functions are important in cyber security because they help verify the integrity of data.

What is a hash table in data structure? ›

Hash tables are a type of data structure in which the address/ index value of the data element is generated from a hash function. This enables very fast data access as the index value behaves as a key for the data value.

Why should I use hash table? ›

Hash tables are good for doing a quick search on things. For instance, suppose we have an array full of data (say 100 items). If we knew the position at which a specific item is stored in the array, we could access it quickly.

What makes a good hash table? ›

We have three primary requirements in implementing a good hash function for a given data type: It should be deterministic—equal keys must produce the same hash value. It should be efficient to compute. It should uniformly distribute the keys.

What is the basics of hashing? ›

A hashing algorithm is a mathematical function that takes an input (like a piece of text or a file) and converts it into a fixed-length string of characters, usually numbers or letters. This string, called a "hash", acts like a fingerprint for the input.

How do you use a hashing algorithm? ›

The user enters a message into a computer running the algorithm. The system then transforms the message, which might be of any length, into an output of a predetermined bit size. Typically, programs break the message into a series of equal-sized blocks, and each one is compressed in sequence.

What is the purpose of a hash table? ›

A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found. Advantages: In a well-dimensioned hash table, the average cost for each lookup is independent of the number of elements stored in the table.


Author: Tish Haag
