Exercises - Hash Tables

  1. Explain what the following function does and how it does it. Be specific.

    private int hash(Key key) {
      return (key.hashCode() & 0x7fffffff) % m;
    }
    

    key.hashCode() could be negative or bigger than or equal to $m$. We need hash(Key key) to return an integer suitable as an index of an element in an array of size $m$. This means we must produce a non-negative integer less than $m$. The bitwise AND ('&') of hashCode() with 0x7fffffff forces the first bit of key.hashCode() to be zero (without changing anything else), which creates a positive integer, and finding this value mod $m$ ensures the result is less than $m$.

  2. Eight objects with hash codes of 217, 209, 265, 226, 234, 201, 207, and 223 are to be stored in an initially empty hash table using an array of size 8 with separate chaining to resolve collisions. Find the lengths of the chains involved after these eight objects are inserted into the hash table.

    $3$ chains are formed of lengths $4$, $2$, and $2$, with the rest $0$ long.

  3. In a given hash table with an associated array of size $11$, integer keys of 9, 26, 50, 15, 2, 21, 36, 22, and 32, are inserted in that order. For each of the collision-resolution methods listed below, draw the internal structure of the hash table after these insertions have been made.

    1. linear probing (assume array is never resized)
    2. separate chaining

    1. [22,32,2,36,26,15,50,-,-,9,21]
      
    2.  0 : 22
       1 : -
       2 : 2
       3 : 36
       4 : 15 26
       5 : -
       6 : 50
       7 : -
       8 : - 
       9 : 9
      10 : 32 21
      
  4. You may find the table below of ASCII values and the largest positive int value of $2147483647$ helpful as you answer the following question: $$\begin{array}{c|c|c|c|c|c|c|c|c|c|c|c|c} A&B&C&D&E&F&G&H&I&J&K&L&M\\\hline 65&66&67&68&69&70&71&72&73&74&75&76&77\\\hline\hline N&O&P&Q&R&S&T&U&V&W&X&Y&Z\\\hline 78&79&80&81&82&83&84&85&86&87&88&89&90 \end{array}$$ The strings "THEF", "ERNA", "PAUC", and "LSIH" are stored in an initially empty hash table h, that uses linear probing to resolve collisions and and resizes by doubling in size before insertions when half full. The hash table that results is shown below. $$\begin{array}{c|c|c|c|c|c|c|c} 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7\\\hline \texttt{"ERNA"} & - & - & - & - & \texttt{"THEF"} & \texttt{"LSIH"} & \texttt{"PAUC"} \end{array}$$
    1. Determine the value of "LARGE".hashCode()"

    2. Determine the sign of "LARGER".hashCode()" (i.e., positive or negative)

    3. If "LSIH" is removed from the hash table, how many values are rehashed as a result?
      i) 0   ii) 1   iii) 2   iv) 3

    1. $72205083$

    2. negative, (specifically, the value is $-2056609641$)

    3. Two (namely, "PAUC" and "ERNA")

  5. Complete the class HashTableQuadraticProbe so that it implements a hash table using the quadratic probing where the array size $m$ is always an integer power of $2$ and the cell with index $p(k,i)$ as defined below is probed after the $i^{th}$ collision in seeking to insert (or retrieve) key $k$ with hash function $h$.

    $$p(k,i) = \left[ h(k) + \frac{1}{2}i + \frac{1}{2}i^2 \right]\pmod{m}$$

    Recall, this means that for the insertion (or retrieval) of key $k$ the sequence of indices probed starts at $h(k)$ and makes successively larger jumps to the right with each collision, wrapping back to the start of the array as needed. In particular, the jumps made are of size $1,2,3,\ldots$.

    Your HashTableQuadraticProbe class should support deletion of key-value pairs and resizing (double size when 1/2 full, halve size when 1/8th full). Do not add any instance variables or methods to this class beyond those already provided.

    The HashTableQuadraticProbeTest class, which takes the following actions, can be used to test your code.

    1. The user will first be prompted to enter some lowercase characters to store as keys in the hash table, using their uppercase equivalents as the corresponding values.

    2. Then, the internal state of the resulting hash table is printed.

    3. The uppercase values stored using the lowercase characters as keys are then retrieved and printed (in the same order they were inserted into the hash table).

    4. Finally, to test retrieval after one or more deletions, the user is then prompted to enter some lowercase keys to delete from the hash table (along with their corresponding values, of course). These are deleted and again the uppercase values stored using the lowercase characters as keys are retrieved and printed (in the same order they were inserted into the hash table). Of course, some of these will now be missing. Dashes will be printed for these missing keys.

    A sample run is provided below to help you debug your code.

    $ java HashTableQuadraticProbe↵
    Enter some lowercase characters to store as keys in the hash table 
    (Their uppercase equivalents will be used as the corresponding values):
    quadratic↵
    
    0 : null/null
    1 : q/Q
    2 : a/A
    3 : r/R
    4 : d/D
    5 : u/U
    6 : c/C
    7 : t/T
    8 : null/null
    9 : i/I
    10 : null/null
    11 : null/null
    12 : null/null
    13 : null/null
    14 : null/null
    15 : null/null
    
    Testing retrieval:
    QUADRATIC
    
    Now enter one or more lowercase character keys to delete from the hash table: 
    rq↵
    
    Testing retrieval after deletion:
    -UAD-ATIC
    

    If one simply deletes a key, then a later search for some other key that followed the deleted key's same probe chain may fail. To combat this potential problem, use the provided "tombstone" array to mark deleted keys. Insertions should be allowed to put an item in a position marked with a tombstone, but searches for a key shouldn't stop at a tombstone.

  6. Complete the UniqueFun class, which relies on the IndexedStringCollection to the following ends:

    In an indexed collection, like an array, define the window at $i$ of size $k$ to be the collection of $k$ elements from index $i$ (inclusive) to index $i+k$ (exclusive). Note, not every index $i$ has a window of size $k$.

    The existing code in UniqueFun fills an indexed collection of strings with 15 names randomly selected from a list of 8. Obviously some will be duplicated. Add code in the specified area of the class UniqueFun (and only there), so that for each window of size 5, starting at $1,2,3,\ldots$ the code computes and prints the number of unique names in that window (i.e., the names that appear only once), as shown in the sample run below. For example, in the second window one has the names {elli, harold, fred, harold, harold}, in which only 2 names appear once -- elli and fred. Consequently, the second integer printed is $2$. You can change the seed used in the random number generator to test your code on different lists of random names, if desired.

    There are two challenging limitations that you must adhere to as you address this program. These are spelled out in the comments, but to reiterate them here: In your UniqueFun class, you are limited to working with only the existing declared variables and one additional String variable as it might be needed to support for-each loops. Second, the IndexedStringCollection class has the peculiar behavior that any single index can only be accessed with get() twice before it throws an exception. This last restriction essentially forces you to count the unique elements in each window with a $O(n)$ algorithm.

    $ java HashTableQuadraticProbe↵
    [fred,elli,harold,fred,harold,harold,bob,ginny,harold,harold,bob,doug,elli,doug,bob]
    1 2 2 3 2 2 1 3 3 3 1
    

    You have Hashtable imported at the beginning of UniqueFun -- use it!