The size of a hash table is another parameter that needs to be carefully adjusted. As with the hash function, an intuitive analysis shows the effects of choosing extreme table sizes. If the size is too small, say 1, the table again degrades into a linked list. If the size is very large, memory efficiency is lost, because a large number of positions are wasted, each holding an empty collision list.
When collision lists are used, the table size is chosen based on a concept called "density" (elsewhere often called the load factor). The density of a hash table is defined as the quotient between the total number of elements stored in the table and the size of the array implementing it. Assuming the hash function produces a uniform distribution of indexes, a high density value means the collision lists are too long; a low density value means the collision lists are short but too much memory is being used.
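As a quick illustration, the density can be computed directly from this definition. The following Python sketch is not from the original text; the function name and the example values are illustrative only.

```python
def density(num_elements, array_size):
    # Density: total elements stored divided by the size of the
    # array implementing the hash table.
    return num_elements / array_size

# For example, 12 elements stored in an array of 8 positions:
print(density(12, 8))   # 1.5 -> collision lists hold 1.5 elements on average
print(density(2, 8))    # 0.25 -> short lists, but mostly empty positions
```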
The most complex hash table implementations include two thresholds for the density. On each insertion or deletion operation over the table, these two values are checked. If the density rises above the upper threshold, a bigger table is created and all elements are transferred from the old table to the new one, lowering the density. Analogously, if a deletion drops the density below the lower threshold, a smaller table is created and the same procedure as before is performed, raising the density. Resizing the table is expensive in terms of CPU time, but the assumption is that this cost will be compensated by enough future search operations, because the collision lists are kept at a reasonable length.
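The two-threshold scheme can be sketched as follows. This is a minimal illustration in Python, assuming separate chaining with collision lists; the class name, method names, and the threshold values 0.25 and 0.75 are hypothetical choices for the example, not taken from the text.

```python
class HashTable:
    """Separate-chaining hash table that resizes on density thresholds."""

    LOW, HIGH = 0.25, 0.75   # illustrative lower and upper density thresholds

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]   # one collision list per position
        self.count = 0                             # elements currently stored

    def density(self):
        # Elements stored divided by the size of the backing array.
        return self.count / len(self.buckets)

    def _resize(self, new_size):
        # Expensive step: every element is rehashed into a new array.
        old = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for bucket in old:
            for key, value in bucket:
                self.buckets[hash(key) % new_size].append((key, value))

    def insert(self, key, value):
        # Duplicate keys are not handled in this sketch.
        self.buckets[hash(key) % len(self.buckets)].append((key, value))
        self.count += 1
        if self.density() > self.HIGH:             # too dense: grow the table
            self._resize(2 * len(self.buckets))

    def delete(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.count -= 1
                break
        if self.density() < self.LOW and len(self.buckets) > 1:
            self._resize(len(self.buckets) // 2)   # too sparse: shrink the table
```

Because the sketch doubles or halves the array on each resize, the cost of a rebuild is spread across many cheap insertions and deletions, which is why it is expected to be recovered by subsequent search operations.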
Check your understanding of this document
Answer the following questions to see if you understood the content of this document:
For a given hash table, its density is: