Data Structures and Algorithms: Hash Functions (2024)

8.3.3 Hashing Functions

Choosing a good hashing function, h(k), is essential for hash-table based searching. h should distribute the elements of our collection as uniformly as possible to the "slots" of the hash table. The key criterion is that there should be a minimum number of collisions.

If the probability that a key, k, occurs in our collection is P(k), then if there are m slots in our hash table, a uniform hashing function, h(k), would ensure:

sum of P(k) over all keys k with h(k) = j  =  1/m,  for each slot j = 0, 1, ..., m-1

Sometimes, this is easy to ensure. For example, if the keys are uniformly distributed in [0, r), then

h(k) = floor((mk)/r)

will provide uniform hashing.
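
As a minimal sketch of this formula in Python, assuming real-valued keys drawn uniformly from the half-open interval [0, r) so that the result always lies in 0..m-1. The name scale_hash and the values chosen for m and r are illustrative only and do not come from the original notes.

```python
import math
import random
from collections import Counter

def scale_hash(k: float, m: int, r: float) -> int:
    """Return floor(m * k / r) for a key k in [0, r)."""
    return math.floor(m * k / r)

# Hash 10,000 uniformly distributed keys into m = 8 slots.
rng = random.Random(0)            # fixed seed so the run is repeatable
m, r = 8, 100.0
counts = Counter(scale_hash(rng.random() * r, m, r) for _ in range(10_000))
for slot in range(m):
    # Each count should be close to 10_000 / 8 = 1250.
    print(slot, counts[slot])
```

If the keys are not uniformly distributed over the interval, the same function will fill some slots much more heavily than others.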

Mapping keys to natural numbers

Most hashing functions will first map the keys to some set of natural numbers, say (0,r]. There are many ways to do this: for example, if the key is a string of ASCII characters, we can simply add the ASCII representations of the characters mod 255 to produce a number in the range 0 to 254, or we could xor them, or we could add them in pairs mod 2¹⁶-1, or ...
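
The three string mappings mentioned above can be sketched in Python as follows. The function names are hypothetical, and padding an odd-length key with a zero byte in the pairwise version is an assumption made here, not something the notes specify.

```python
from functools import reduce

def sum_mod_255(key: str) -> int:
    """Add the character codes and reduce mod 255 (result in 0..254)."""
    return sum(key.encode("ascii")) % 255

def xor_bytes(key: str) -> int:
    """XOR the character codes together (result in 0..127 for ASCII)."""
    return reduce(lambda acc, b: acc ^ b, key.encode("ascii"), 0)

def pair_sum_mod(key: str) -> int:
    """Read the key as 16-bit pairs and add the pairs mod 2**16 - 1."""
    data = key.encode("ascii")
    if len(data) % 2:
        data += b"\x00"          # assumption: pad odd-length keys with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return total % (2**16 - 1)

print(sum_mod_255("ab"), xor_bytes("ab"), pair_sum_mod("ab"))
```

Note that the additive and xor versions ignore character order, so anagrams such as "abc" and "cba" always map to the same number. The pairwise version is sensitive to the order of characters within each pair.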

Having mapped the keys to a set of natural numbers, we then have a number of possibilities.

Key terms

Universal hashing: A technique for choosing a hashing function randomly so as to produce good average performance.

, 1998

The key concepts covered in the article on hashing functions are summarized below:

Hashing Functions and Uniform Distribution:

The article emphasizes the importance of choosing a good hashing function, denoted as h(k), especially for hash-table based searching. The goal is to distribute elements uniformly across the slots of the hash table to minimize collisions.

Mapping Keys to Natural Numbers:

The process involves mapping keys to a set of natural numbers, typically in the range (0, r]. Various methods, such as adding ASCII representations, XORing, or adding in pairs mod (2^16 - 1), are mentioned. This initial mapping sets the stage for further processing.

Mod Function and Prime Numbers:

The use of a mod function (h(k) = k mod m) is discussed, with a cautionary note about avoiding certain values of m, particularly powers of 2. Prime numbers close to powers of 2 are recommended for achieving a good distribution, reducing the likelihood of collisions.
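
A small Python sketch of why a power of 2 can be a poor choice for m. The key set (multiples of 16) and the table sizes 64 and 61 are illustrative assumptions chosen to make the effect visible.

```python
def mod_hash(k: int, m: int) -> int:
    """Division-method hash: h(k) = k mod m."""
    return k % m

# Keys whose four low-order bits are always zero.
keys = [16 * i for i in range(1000)]

# With m = 64 = 2**6 only the six low-order bits of each key are used.
slots_pow2 = {mod_hash(k, 64) for k in keys}
# With the prime m = 61 every bit of the key influences the result.
slots_prime = {mod_hash(k, 61) for k in keys}

print(len(slots_pow2))    # 4 distinct slots out of 64
print(len(slots_prime))   # 61 distinct slots out of 61
```

Because the low-order bits of these keys carry no information, the power-of-2 table leaves 60 of its 64 slots empty, while the prime-sized table uses every slot.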

Multiplication Method:

An alternative method involves multiplying the key by a constant A (0 < A < 1), extracting the fractional part, multiplying it by m, and taking the floor of the result. The choice of A and the value of m are crucial. The article suggests using A = (sqrt(5)-1)/2 as a good choice and opting for m as a power of 2 for efficiency.
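
The multiplication method can be sketched in Python as follows, using floating-point arithmetic for clarity. The name mult_hash and the sample key and table size are illustrative assumptions. Floating point loses precision for very large keys, so treat this as a demonstration of the formula rather than a production implementation.

```python
import math

A = (math.sqrt(5) - 1) / 2       # about 0.6180339887

def mult_hash(k: int, m: int) -> int:
    """h(k) = floor(m * frac(k * A)), where frac is the fractional part."""
    frac = (k * A) % 1.0         # fractional part of k * A
    return math.floor(m * frac)

print(mult_hash(123456, 2**14))  # 67
```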

Universal Hashing:

To mitigate the risk of a malicious adversary causing poor hash behavior, universal hashing is introduced. This technique involves randomly choosing the hashing function from a collection of hash functions. The randomness helps achieve good average performance and reduces the likelihood of intentional key choices causing issues.
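
The summary does not say which collection of hash functions to draw from. One well-known choice, assumed here purely for illustration, is the family h(k) = ((a*k + b) mod p) mod m, where p is a prime larger than any key and a and b are chosen at random. The names P and make_universal_hash are hypothetical.

```python
import random

P = 2_147_483_647   # the prime 2**31 - 1; keys are assumed to lie in 0..P-1

def make_universal_hash(m: int, rng: random.Random):
    """Pick one hash function at random from the family ((a*k + b) mod P) mod m."""
    a = rng.randrange(1, P)      # a is chosen from 1..P-1
    b = rng.randrange(0, P)      # b is chosen from 0..P-1

    def h(k: int) -> int:
        return ((a * k + b) % P) % m

    return h

# Each table would draw its own function when it is created.
h = make_universal_hash(10, random.Random())
print([h(k) for k in (1, 2, 3, 1000, 1001)])
```

Because a and b are not known in advance, an adversary cannot pick a set of keys that is guaranteed to collide for the function actually in use.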

In summary, the article provides insights into key aspects of hashing functions, from initial key mapping to methods like mod and multiplication, and introduces the concept of universal hashing to enhance overall performance.

FAQs

What is a hash function in data structures and algorithms?

A hash function is a function that converts a given numeric or alphanumeric key to a small, practical integer value. The mapped integer value is used as an index in the hash table. In simple terms, a hash function maps a large number or string to a small integer that can be used as the index in the hash table.

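What is an example of a hash function?

Here's a simple example: a toy hash function that keeps only the first three characters of a string, so that the hash of "Hello world!" is "Hel". If you're given "Hel", you cannot recreate "Hello world!". This shows how hashing discards information, but it would be a poor hash function in practice, because every string that begins with "Hel" clashes with it.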

What type of data structure is a hash?

Hash tables are a type of data structure in which the address or index of a data element is generated from its key by a hash function. This enables very fast data access, because the location of a value can be computed directly from its key.

How do you calculate a hash function?

With modular hashing, the hash function is simply h(k) = k mod m for some m (usually, the number of buckets). The value k is an integer hash code generated from the key. If m is a power of two (i.e., m=2^p), then h(k) is just the p lowest-order bits of k.
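
The power-of-two case can be checked with a short Python sketch. The function names are hypothetical, and k is assumed to be a non-negative integer hash code.

```python
def mod_pow2(k: int, p: int) -> int:
    """k mod m, where m = 2**p."""
    return k % (1 << p)

def low_bits(k: int, p: int) -> int:
    """Keep only the p lowest-order bits of k with a bit mask."""
    return k & ((1 << p) - 1)

k = 0b10110111                                 # 183 in decimal
print(mod_pow2(k, 4), low_bits(k, 4))          # 7 7
```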

What is the most famous hash function?

The MD5 algorithm, defined in RFC 1321, is probably the most well-known and widely used hash function. It is the fastest of all the .NET hashing algorithms, but it uses a smaller 128-bit hash value, making it the most vulnerable to attack over the long term.

What is the most commonly used hash function?

Commonly used hash functions:

SHA-1: SHA-1 is a 160-bit hash function that was widely used for digital signatures and other applications. ...
SHA-2: SHA-2 is a family of hash functions that includes SHA-224, SHA-256, SHA-384, and SHA-512.

What is the simplest hash function?

The simplest example of a hash function encodes the input in the same way as the output range and then discards everything that exceeds the output range. For example, if the output range of the hash function is 0–9, then we can interpret every input as a base-10 integer and discard all but the last digit.

What are the three commonly used hash functions in data structure?

The primary types of hash functions are the division method, the mid-square method, and the folding method.

Which hashing technique is best in data structure?

The most popular algorithms include the following:

MD5: A widely used hashing algorithm that produces a 128-bit hash value.
SHA-1: A popular hashing algorithm that produces a 160-bit hash value.
SHA-256: A more secure hashing algorithm that produces a 256-bit hash value.

What is the strongest hashing algorithm?

At the time of writing, SHA-256 is still widely regarded as secure. No practical collision or preimage attack against it has been published, and it is used by many software organizations and institutions, including the U.S. government, to protect sensitive information.

How do hashing algorithms work?

A message is supplied to a program running the algorithm. The algorithm transforms the message, which might be of any length, into an output of a predetermined bit size. Typically, programs break the message into a series of equal-sized blocks, and each one is compressed in sequence.
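
The fixed output size and block-by-block processing can be observed with Python's standard hashlib module. SHA-256 is used here only as an example, since the answer above does not name a specific algorithm.

```python
import hashlib

# Inputs of very different lengths produce digests of the same size.
short_digest = hashlib.sha256(b"abc").digest()
long_digest = hashlib.sha256(b"x" * 1_000_000).digest()
print(len(short_digest), len(long_digest))      # 32 32 (bytes, i.e. 256 bits)

# SHA-256 consumes its input in 64-byte blocks.
print(hashlib.sha256().block_size)              # 64

# Feeding the message in pieces gives the same result as hashing it at once.
streamed = hashlib.sha256()
for chunk in (b"Hello, ", b"world!"):
    streamed.update(chunk)
print(streamed.digest() == hashlib.sha256(b"Hello, world!").digest())  # True
```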

Why do we need hashing in data structure?

Hashing gives a more reliable and flexible method of retrieving data than many other data structures. It is quicker than searching lists and arrays. In the same space, hashing can retrieve in about 1.5 probes anything that is stored in a tree, where a lookup would otherwise take on the order of log n probes.

What makes a bad hash function?

A poor choice of hash function is likely to lead to clustering behavior, in which the probability of keys mapping to the same hash bucket (i.e. a collision) is significantly greater than would be expected from a random function.

Why is hashing irreversible?

A hash function is generally not reversible because it is not lossless: inputs of any length are mapped to a fixed-size output, so information is discarded (perhaps with the exception of extremely short plain texts). That is why hash collisions are possible: one hash may represent more than one plain text.

What is the purpose of a hash function?

Hash functions are used for data integrity and often in combination with digital signatures. With a good hash function, even a 1-bit change in a message will produce a different hash (on average, half of the bits change). With digital signatures, a message is hashed and then the hash itself is signed.
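
The effect of a 1-bit change can be measured with a short Python sketch. SHA-256 from the standard hashlib module is an assumption chosen for illustration, and bit_difference is a hypothetical helper name.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-256 digests of a and b."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

msg1 = b"hello world"
msg2 = b"hello worle"   # 'd' (0x64) became 'e' (0x65): exactly one input bit differs

# The result is expected to be somewhere near half of the 256 output bits.
print(bit_difference(msg1, msg2))
```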

What is an example of a hashing function in a hash table?

An example of one of the most common hashing functions is modular hashing. In this method, the number of buckets available for key-value pairs (M) should be a prime number, which helps to minimize collisions. For any positive key k, the hash is the remainder after dividing k by M (k % M).

What is the MD5 hash function?

MD5 (Message-Digest algorithm 5) is a widely used cryptographic hash function that results in a 128-bit hash value. The 128-bit (16-byte) MD5 hashes (also termed message digests) typically are represented as 32-digit hexadecimal numbers (for example, ec55d3e698d289f2afd663725127bace).
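
The 32-digit hexadecimal form can be reproduced with Python's standard hashlib module. The input string below is an arbitrary, commonly used test sentence rather than the source of the digest quoted above. MD5 is shown for illustration only, since it is no longer considered collision-resistant.

```python
import hashlib

digest = hashlib.md5(b"The quick brown fox jumps over the lazy dog").hexdigest()
print(digest)        # 9e107d9d372bb6826bd81d3542a419d6
print(len(digest))   # 32 hexadecimal digits, i.e. 128 bits
```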

What are the two hash functions in double hashing?

In double hashing, the first hash function is used to compute the initial hash value, and the second hash function is used to compute the step size for the probing sequence.
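
This description matches double hashing, an open-addressing scheme for resolving collisions. A minimal Python sketch of the probe sequence follows. The particular choices h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)) are a common textbook pairing assumed here, not something the answer above specifies, and m is assumed to be prime.

```python
def probe_sequence(k: int, m: int) -> list[int]:
    """Return the slots examined for key k in a table of prime size m."""
    h1 = k % m                   # first hash: the initial slot
    h2 = 1 + (k % (m - 1))       # second hash: the step size, never zero
    return [(h1 + i * h2) % m for i in range(m)]

print(probe_sequence(14, 13))
# [1, 4, 7, 10, 0, 3, 6, 9, 12, 2, 5, 8, 11]
```

Because m is prime and the step size is between 1 and m - 1, the sequence visits every slot exactly once before repeating.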

What is an example of a hash function division method?

In the division method of hash functions, we map a key k into one of the m slots by taking the remainder of k divided by m, i.e., the hash function is h(k) = k mod m. For example, if the hash table has size m = 12 and the key is k = 100, then h(k) = 4.
