Computer Science – 13.2 File organisation and access | e-Consult
13.2 File organisation and access (1 questions)
A good hash function should possess the following properties:
- Uniformity: It should distribute keys evenly across the hash table, minimizing collisions.
- Deterministic: For a given key, it should always produce the same hash value.
- Efficiency: It should be computationally fast to calculate.
The load factor of a hash table is the ratio of the number of elements stored in the table to the size of the table (number of slots). It is calculated as: Load Factor = (Number of Elements) / (Table Size). A high load factor indicates that the table is becoming full, leading to more collisions and slower performance. As the load factor increases, the average lookup time increases towards O(n) in the worst case.
Strategies for managing the load factor include:
- Resizing: When the load factor exceeds a certain threshold (e.g., 0.75), the hash table is resized to a larger table. All elements are rehashed and reinserted into the new table.
- Choosing a suitable table size: Selecting a prime number for the table size can help to reduce collisions, especially when using certain hash functions.
Example of a simple hash function: A very basic hash function could be summing the ASCII values of the characters in a string and taking the result modulo the table size. For example, for the string "abc" with a table size of 10, the hash value would be (97 + 98 + 99) % 10 = 294 % 10 = 4. Weakness: This function is prone to collisions because different strings can easily have the same sum of ASCII values. Strings with similar characters will have similar hash values, leading to clustering.