Symbol Tables / Dictionaries

A symbol table (also called a dictionary) is an abstract data type that associates different keys (a.k.a. symbols) with various values. We use the word "value" here in a very general sense -- it often the case that the value in question is a reference to an object that contains a considerable amount of information.

As a simple example, consider an actual physical dictionary -- the kind you find on your bookshelf. Every word is associated with some definition, as the table below suggests:

word (key) definition (value)
aardvarka large, nocturnal, burrowing ant-eating mammal
boisterousrough and noisy; clamorous
cringeto shrink, bend or crouch, especially in fear

The primary operations a symbol table needs to accomplish are:

  1. the insertion of a new key-value pair (which includes changing the value of an existing key to something new); and
  2. being able to search the symbol table for a given key to return its associated value.

To understand why, think again about our example of a physical dictionary. Someone had to write it -- all of these word-definition pairs had to be initially "inserted" into the book. Further, think about how a dictionaries are used -- one can quickly search for some unrecognized word (the key) to find its definition (the value), but they are generally not used in the other direction (i.e., to find a word with a given definition).

Some example applications of symbol tables in various contexts, along with their respective keys sought and associated values include:

application purpose of search key value
dictionary find definition word definition
book index find relevant pages term list of page numbers
file share find song to download name of song computer ID
account management process transactions account number transaction details
web search find relevant web pages keyword list of page names
compiler find type and value variable name type and value
routing table route internet packets destination best route
DNS find IP address domain name IP address
reverseDNS find domain name IP address domain name
genomics find markers DNA string known positions
file system find file on disk filename location on disk

In some of these other contexts it may also be useful to be able to determine the number of pairs the symbol table holds or if it is empty; to determine if there is a value associated with a given key; to delete a key (and its associated value) from the symbol table; and to be able to iterate through all of the keys in the table.

With these other uses in mind, we can use the following as a reasonable SymbolTable interface:

public interface SymbolTable<Key, Value> {
   void put(Key key, Value value);  // inserts a key-value pair into the table,
                                    // or updates the value associated with a key if it 
                                    // is already present
   boolean contains(Key key);       // returns true if the table contains the specified key
   Value get(Key key);              // returns the value paired with given key
   void delete(Key key);            // removes the given key (and its associated value)
                                    // from the table
   boolean isEmpty();               // returns true if the table is empty
   int size();                      // returns the number of key-value pairs in the table
   Iterable<Key> keys();            // returns an Iterable collection of all keys in the table

While the values involved can be any generic type, having the keys be of a type that implements the Comparable interface will increase our options when it comes to how the symbol table can be implemented.

Additionally, it is a best practice that the keys be of an immutable type (e.g., Integer, Double, String, etc.)

Some Easy, but Inefficient Implementations

There is, however, another and better way -- one involving a new data structure called a binary search tree...