Computer Science – 13.2 File organisation and access | e-Consult
Modular arithmetic is a fundamental component of many hash functions. It works with the remainder left over after a division. In a hash function, the hash value is often calculated as the input key modulo the size of the hash table, h(k) = k mod m for a table of m slots. This guarantees that the hash value falls within the valid range of indices for the table (0 to m - 1).
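As a minimal sketch of the idea above, the function below maps integer keys to slot indices by taking the remainder. The name mod_hash, the table size of 11 and the sample keys are illustrative assumptions, not values from the question.

```python
def mod_hash(key: int, table_size: int) -> int:
    """Map an integer key to a slot index in the range 0..table_size-1."""
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    # Python's % returns a non-negative result for a positive modulus,
    # so negative keys also land inside the table.
    return key % table_size


TABLE_SIZE = 11  # illustrative size, not taken from the text
keys = [15, 27, 42, 1008]
slots = {k: mod_hash(k, TABLE_SIZE) for k in keys}
print(slots)  # {15: 4, 27: 5, 42: 9, 1008: 7}
```

Every result lies between 0 and 10, so it can be used directly as an index into an 11-slot table.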
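The choice of the modulus value significantly affects the distribution of hash values. A prime modulus generally gives a more uniform distribution, because regular patterns in the keys (for example, keys that increase in fixed steps) rarely share a factor with the prime, which reduces the likelihood of clustering. A modulus that is a power of 2, say 2^n, uses only the lowest n bits of the key and ignores the rest. Distribution is then poor whenever the keys differ mainly in their higher bits; in the extreme case, keys that are all multiples of 2^n map to index 0.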
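The effect can be seen with a small experiment. The sketch below uses a made-up key set (16 multiples of 8) and compares a table of size 8 with a table of size 7; the helper slot_counts is a hypothetical name introduced for illustration.

```python
from collections import Counter


def slot_counts(keys, table_size):
    """Count how many keys land in each slot under key % table_size."""
    return Counter(k % table_size for k in keys)


# Hypothetical key set: 16 keys that are all multiples of 8.
keys = [8 * i for i in range(16)]

pow2_counts = slot_counts(keys, 8)   # modulus is a power of 2
prime_counts = slot_counts(keys, 7)  # modulus is prime

# Number of slots used, and the size of the fullest slot.
print(len(pow2_counts), max(pow2_counts.values()))    # 1 16
print(len(prime_counts), max(prime_counts.values()))  # 7 3
```

With the modulus 8 every key collides in slot 0, while the modulus 7 spreads the same keys over all seven slots with at most three keys in any one slot.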
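Universal hashing uses a family of hash functions rather than a single fixed one: when the table is created, one function is chosen at random from the family. The key benefit is a provable guarantee that holds regardless of the input keys: for any two distinct keys, the probability (over the random choice of function) that they collide is at most 1/m, where m is the number of slots in the table. The expected number of collisions therefore stays small for every set of keys, and no fixed set of keys can reliably force the worst-case behaviour that is possible with a single, fixed hash function.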
Example of a universal hash function: Consider a family of hash functions of the form: h(k) = ((a * k + b) mod p) mod m, where:
- k is the input key.
- a is a randomly chosen integer between 1 and p-1.
- b is a randomly chosen integer between 0 and p-1.
- p is a prime number larger than the largest possible key value.
- m is the size of the hash table, so the final mod m maps the result to a valid index (0 to m-1).
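Choosing 'a' and 'b' randomly does not make any single function perfectly uniform. What it ensures is that the behaviour no longer depends on the pattern of the keys: any two distinct keys are unlikely to collide (the probability is at most 1 divided by the number of slots in the table), so a set of keys that would collide heavily under one fixed hash function is spread well on average.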
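The family described above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: it assumes the common form with a final reduction mod m (the table size) so that results are valid table indices, and the names make_universal_hash and collision_fraction, the prime 101, the table size 10 and the two test keys are all made up for the example.

```python
import random


def make_universal_hash(p: int, m: int, rng=random):
    """Draw one function h(k) = ((a*k + b) mod p) mod m from the family.

    Assumes p is prime and every key k satisfies 0 <= k < p.
    """
    a = rng.randrange(1, p)  # a is in 1..p-1
    b = rng.randrange(0, p)  # b is in 0..p-1

    def h(k: int) -> int:
        return ((a * k + b) % p) % m

    return h


def collision_fraction(x: int, y: int, p: int, m: int) -> float:
    """Fraction of all (a, b) choices for which keys x and y share a slot."""
    hits = 0
    total = 0
    for a in range(1, p):
        for b in range(p):
            total += 1
            if ((a * x + b) % p) % m == ((a * y + b) % p) % m:
                hits += 1
    return hits / total


P = 101  # prime, assumed larger than any key used here
M = 10   # number of slots (illustrative)

h = make_universal_hash(P, M)
print(h(3), h(47))  # two slot indices in 0..9; values vary from run to run

# Checking every member of the family: the collision fraction for two
# distinct keys should not exceed 1/M.
print(collision_fraction(3, 47, P, M) <= 1 / M)  # True
```

The exhaustive check is only practical because p is tiny here. In practice one function is drawn once when the table is created, and the bound is relied on rather than measured.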