UC3M

Telematic/Audiovisual Syst./Communication Syst. Engineering

Systems Architecture

September 2017 - January 2018

7.6.  Table Size

The size of a hash table is another parameter that needs to be carefully adjusted. As with the hash function, an intuitive analysis can be done to see the effects of choosing extreme table sizes. For example, if the size is too small, let us say 1, again the behavior of the table degrades to a linked list. However, if table size is very large, memory efficiency is lost, because there is a large number of positions wasted with an empty collision list.

When collision lists are used, table size is calculated based on a concept called density. The density of a hash table is defined as the quotient between the total number of elements stored in the table and the total size of the array implementing the hash table. Assuming the hash function produces a uniform distribution of indexes, a high density value means collision lists too large. A small density value denotes a short length in the collision lists and therefore too much memory being used.

The most complex hash table implementation include two thresholds for the density. For each insertion or deletion operation over the table, these two values are checked. If the density goes over its upper threshold, a bigger table is created and all elements are transferred from the old table to achieve a lower density. Analogously, if an element is deleted and the density goes below its lower threshold, a new smaller table is created and the same procedure than before is performed to achieve a higher density. Resizing the table is expensive in terms of CPU time, but it is assumed that it will be compensated with enough future searching operations because the collision lists are kept to a reasonable length.

Check with these questions if you understood this document

Answer the following questions to see if you understood what the content of this document:

  1. For a given hash table, its density is:

    • The maximum number of indexes that could retrieve its asociated hash function.

    • The ratio between the size of the hash table and the number of stored elements.

    • The ratio between the number of stored elements and the size of the hash table.