# Hash Table Operations

The `HashTable`

class has three attributes that provide the basic data structure and parameters we need to efficiently manage our table.

`size`

- captures the number of tuples stored in the table. It is initialized to`0`

in line 7 of the constructor.`loadFactor`

- is the load factor used to determine when to increase the capacity of the array. It is initialized to`0.75`

in line 8 of the constructor.`hashTable`

- is an array of doubly linked lists. The array is actually created in line 4 of the constructor and the doubly linked lists for each location in the array are created in the loop in lines 5 and 6 of the constructor.

```
class HashTable
int size (1)
double loadFactor = 0.75 (2)
doubleLinkedList[] hashTable (3)
function HashTable()
hashTable = new doubleLinkedList[16] (4)
for i = 0 to hashTable.length (5)
hashTable[i] = new doubleLinkedList() (6)
end for
size = 0 (7)
loadFactor = 0.75 (8)
end function
```

## Compute Index

We assume we are using the language-specific hash code function to actually generate our hash code. However, we need to convert that number (typically a very large integer) to an index for our array, which has a limited capacity. Therefore, we use the modulo function in line 1 to convert the hash code into the range of `0`

to the capacity of the array. The `computeIndex`

operation runs in constant time.

```
function computeIndex(object key) returns integer
return hashCode(key) % getCapacity() (1)
end function
```

## Put

The `put`

operation is an important hash table operation, whose goal is to place a key-value pair into the right linked list. We start by calling `computeIndex`

to find which linked list we should store the key-value pair in, which we can access using `hashTable[index]`

. Next, we check to see if the `key`

has already been stored in the table by iterating through the list found at that location in the table. If the `key`

has already been stored, we should update the value in that tuple to be the new value in line 6 and return on line 7. If the `key`

isn’t found, then we actually store the key-value pair in the table by calling the `append`

operation for the `hashTable[index]`

linked list on line 9. Notice, we create a new `Tuple`

using the `key`

and `value`

input parameters before we call append. Since we successfully added the new tuple, we increment the `size`

attribute to keep track of the number of tuples stored in the table.

```
function put(object key, object value)
index = computeIndex(key) (1)
hashTable[index].reset() (2)
current = hashTable[index].getNext() (3)
while current != null (4)
if current.key == key (5)
current.value = value (6)
return (7)
end if
current = hashTable[index].getNext() (8)
end while
hashTable[index].append(new Tuple(key, value)) (9)
size = size + 1 (10)
if size/capacity > loadFactor (11)
doubleCapacity() (12)
end if
end function
```

Since we have added a new tuple to the table, we need to check to see if the size of the table exceeds our load factor. Therefore, in line 11, we check to see if the value of `size/capacity`

exceeds our `loadFactor`

. If it does, we call the `doubleCapacity`

operation to double the capacity of our array and rehash the table. The `doubleCapacity`

operation is defined below.

Since we are looping through a list of elements in the hash table, it can be difficult to analyze the running time of this method. So, we have to look at both the best case and worst case scenario. In the best case, the linked list is empty, so the entire operation runs in constant time. As long as we have a good hash function and the keys are equally distributed across the hash table, this should be the case.

However, if the hash function is poor, it could cause all of the elements in the hash table to be placed in the same list. In that case, the operation would run on the order of $N$ time, since it would have to iterate across all elements in the list.

## Get

The `get`

operation is another important hash table operation, whose goal is to retrieve (without deleting) a key-value pair from the table. Like the `put`

operation, the first thing we need to do is to compute the index of the `key`

we are looking for in line 1. Once we know the `index`

, we call the `reset`

operation on the `hashTable[index]`

linked list so we can call the `getNext`

iterator function in line 3. Lines 4 -7 are a loop that walks through each key-value pair in the linked list. If we find the `key`

we are looking for in line 5, we return `true`

and the operation ends. If we do not find the `key`

, we call the `getNext`

function to get the next element in the linked list. If we end up going through the entire loop until `current != null`

becomes false, we fall out of the loop and return null in line 8, indicating that we did not find `key`

in the table.

```
function get(object key) returns object
index = computeIndex(key) (1)
hashTable[index].reset() (2)
current = hashTable[index].getNext() (3)
while current != null (4)
if current.key == key (5)
return current.value (6)
end if
current = hashTable[index].getNext() (7)
end while
return null (8)
end function
```

As discussed above, although we do end up looping through one of the linked lists in the hash table, these lists are much, much smaller than the overall size of the hash table under most normal circumstances. Thus, we say that the `get`

operation runs in constant time in the best case.

## Remove

The `remove`

operation is much like the `get`

operation above, except that when we find the `key`

we are searching for, we remove the key-value pair from the appropriate linked-list and return the `value`

to the calling function. The only real difference in the operations lies in the loop intervals in lines 6 and 7. Here, when we find the `key`

in the list, we call the `removeCurrent`

operation to remove the key-value pair from the linked list and then decrement size by 1. Line 8 then returns `current.value`

.

```
function remove(object key) returns object
index = computeIndex(key) (1)
hashTable[index].reset() (2)
current = hashTable[index].getNext() (3)
while current != null (4)
if current.key == key (5)
hashTable[index].removeCurrent() (6)
size = size – 1 (7)
return current.value (8)
end if
current = hashTable[index].getNext() (9)
end while
return null (10)
end function
```

Like the `get`

operation above, we loop through one of the linked lists in the hash table. However, given the relatively small size of the list, we assume the `remove`

operation runs in constant time in the best case.

## Contains Key

The `containsKey`

operation returns a Boolean value based on whether we find the requested `key`

in the table. Since the `get`

operation already finds the key-value pair associated with a given `key`

, we simply call `get`

and then compute whether the key-value pair returned from `get`

exists to compute the `containsKey`

return value. The `containsKey`

operation runs in constant time in the best case.

```
function containsKey(object key) returns boolean
return get(key) != null (1)
end function
```

## Contains Value

The `containsValue`

operation is not as simple as `containsKey`

since we don’t already have an operation to use to search for a value in our hash table. Since we are not given a `key`

to search for, we cannot use the `computeIndex`

to tell us which linked list to search for our `value`

. Therefore, we need to start at the beginning of our array and loop through each element, searching each of the linked lists stored there for our `value`

.

Line 1 is our outside loop that walks through each linked list stored in our array. Within that loop, we follow the same search process we used in the `get`

operation to search through `hashTable[i]`

. We set up the iterator in lines 2 and 3 and then walk through the linked list in the loop in lines 4-7. The only difference in this search process is what we are searching for. In line 5, instead of checking if the keys are equal, we check to see if the values are equal. If they are, we return true. However, if we do not find the value somewhere in the table, we must search through every key-value pair in the table, thus the time complexity for `containsValue`

is order $N$.

```
function containsValue(object value) returns boolean
for i = 0 to getCapacity() (1)
hashTable[i].reset() (2)
current = hashTable[i].getNext() (3)
while current != null (4)
if current.value == value (5)
return true (6)
current = hashTable[i].getNext() (7)
end while
end for
return false (8)
end function
```

## Get Size

The `getSize`

function is very simple. It simply returns the HashTable class’ `size`

attribute.

```
function getSize() returns integer
return size (1)
end function
```

## Get Capacity

Like the `getSize`

function, `getCapacity`

simply returns the length of the `hashTable`

array.

```
function getCapacity() returns integer
return length of the hashTable array (1)
end function
```

## Is Empty

The `isEmpty`

operation simply returns the Boolean value of whether or not the `size`

attribute is 0.

```
function isEmpty() returns boolean
return size == 0 (1)
end function
```

## Copy

The `copy`

operation is similar to the `containsValue`

operation in that it must walk through the entire hash table to get all the key-value pairs in the table and `put`

them into the new hash table. Line 1 creates the new empty hash table, which we call `copy`

.

The `for`

loop in line 2 walks through the `hashTable`

array, allowing us to access each linked list using `hashTable[i]`

. Within the loop we use the linked list iterator functions (lines 3, 4, and 7) to walk through each key-value pair in the linked lists. For each key-value pair, we call `copy.put`

to insert that key-value pair into the `copy`

hash table. Once we have completed both loops, we return the copy in line 8. Like the `containsValue`

operation, since we walk through each key-value pair in the hash table, the `copy`

operation runs in order $N$ time.

```
function copy() returns HashTable
HashTable copy = new HashTable() (1)
for i = 0 to getCapacity() (2)
hashTable[i].reset() (3)
current = hashTable[i].getNext() (4)
while current != null (5)
copy.put(current.key, current.value) (6)
current = hashTable[i].getNext() (7)
end while
end for
return copy (8)
end function
```

## To String

The `toString`

operation is almost identical to the `copy`

operation above, except that instead of inserting each key-value pair into a second hash table, we append the string representation of each key-value pair to an `output`

string. In fact, the only differences to the pseudocode come in lines 1, 6, and 7. Line 1 creates a new empty string and line 8 returns that string after walking through the hash table. Finally, line 6 is where we append the current key-value pair’s string representation to the `output`

string. We also append a comma to separate the key-value pairs in `output`

. Like the `copy`

operation, the `toString`

operation runs in order $N$ time.

```
function toString() returns string
string output = null (1)
for i = 0 to getCapacity() (2)
hashTable[i].reset() (3)
current = hashTable[i].getNext() (4)
while current != null (5)
output += current.toString() + ", " (6)
current = hashTable[i].getNext() (7)
end while
end for
return output (8)
end function
```

## Double Capacity

The `doubleCapacity`

operation is similar to the same operations for the array-based stack and queue implementations that we covered earlier. First, we create a new array with twice the capacity of the existing `hashTable`

. Next, we “rehash” each of the key-value pairs in `hashTable`

into the new table. Finally, we point the `hashTable`

attribute at the new table.

The implementation of this process is captured in the following pseudocode. In line 1, we double the size of `capacity`

. It is important to update `capacity`

first so we can use the new value when creating the new hash table array. It is especially important to use this new capacity when calculating the indexes for key-value pairs in the new table.

```
function doubleCapacity()
capacity = capacity * 2 (1)
doubleLinkedList[] newTable = new doubleLinkedList[getCapacity()] (2)
for i = 0 to getCapacity() (3)
newTable[i] = new doubleLinkedList() (4)
end for
for i = 0 to getCapacity() / 2 (5)
hashTable[i].reset() (6)
current = hashTable[i].getNext() (7)
while current != null (8)
index = computeIndex(current.key) (9)
newTable[index].append(current) (10)
current = hashTable[i].getNext() (11)
end while
end for
hashTable = newTable (12)
end function
```

Line 2 creates a new array called `newTable`

with twice the capacity of the existing table. In lines 3 and 4, we create a new doubly linked list for each element of `newTable`

. Then, in lines 5 – 11 we employ the same looping structure used above in `copy`

and `toString`

to walk through each key-value pair in `hashTable`

. Then for each key-value pair, we compute its new `index`

and then append the key-value pair to the linked list at `hashTable[index]`

. Once we have completed the copying process, we fall out of the loop. Our final action is to point the `hashTable`

attribute at `newTable`

. Like the `copy`

and `toString`

operations, the run time of `doubleCapacity`

is order $N$.

## Iterator Operations

While we do not present the iterator operations here, they are useful operations for hash tables. They are implemented similarly to the other iterator functions we have studied, except that in order to walk through the entire hash table we need to use nested loops where the outside loop walks through the array and the internal loop walks through each linked list. This is very similar to the looping structure in `doubleCapacity`

above.