# Hash Table Operations

You won’t be asked to implement any of these operations since we will just use the built-in class in our chosen language. However, we want you to se behind the scenes and understand how a hash table is implemented directly in code.

The `HashTable`

class has three attributes that provide the basic data structure and parameters we need to efficiently manage our table.

`size`

- captures the number of tuples stored in the table. It is initialized to`0`

in line 11 of the constructor.`loadFactor`

- is the load factor used to determine when to increase the capacity of the array. It is initialized to`0.75`

in line 12 of the constructor.`hashTable`

- is an array of doubly linked lists. The array is actually created in line 7 of the constructor and the doubly linked lists for each location in the array are created in the loop in lines 8 and 9 of the constructor.

```
1class HashTable
2 int size
3 double loadFactor = 0.75
4 doubleLinkedList[] hashTable
5
6 function HashTable()
7 hashTable = new doubleLinkedList[16]
8 for i = 0 to hashTable.length
9 hashTable[i] = new doubleLinkedList()
10 end for
11 size = 0
12 loadFactor = 0.75
13 end function
```

## Compute Index

We assume we are using the language-specific hash code function to actually generate our hash code. However, we need to convert that number (typically a very large integer) to an index for our array, which has a limited capacity. Therefore, we use the modulo function in line 2 to convert the hash code into the range of `0`

to the capacity of the array. The `computeIndex`

operation runs in constant time.

```
1function computeIndex(object key) returns integer
2 return hashCode(key) % getCapacity()
3end function
```

## Put

The `put`

operation is an important hash table operation, whose goal is to place a key-value pair into the right linked list. We start by calling `computeIndex`

to find which linked list we should store the key-value pair in, which we can access using `hashTable[index]`

. Next, we check to see if the `key`

has already been stored in the table by iterating through the list found at that location in the table. If the `key`

has already been stored, we should update the value in that tuple to be the new value in line 9 and return on line 10. If the `key`

isn’t found, then we actually store the key-value pair in the table by calling the `append`

operation for the `hashTable[index]`

linked list on line 15. Notice, we create a new `Tuple`

using the `key`

and `value`

input parameters before we call append. Since we successfully added the new tuple, we increment the `size`

attribute to keep track of the number of tuples stored in the table.

```
1function put(object key, object value)
2 index = computeIndex(key)
3
4 hashTable[index].reset()
5 current = hashTable[index].getNext()
6
7 while current != null
8 if current.key == key
9 current.value = value
10 return
11 end if
12 current = hashTable[index].getNext()
13 end while
14
15 hashTable[index].append(new Tuple(key, value))
16 size = size + 1
17
18 if size/capacity > loadFactor
19 doubleCapacity()
20 end if
21end function
```

Since we have added a new tuple to the table, we need to check to see if the size of the table exceeds our load factor. Therefore, in line 18, we check to see if the value of `size/capacity`

exceeds our `loadFactor`

. If it does, we call the `doubleCapacity`

operation to double the capacity of our array and rehash the table. The `doubleCapacity`

operation is defined below.

Since we are looping through a list of elements in the hash table, it can be difficult to analyze the running time of this method. So, we have to look at both the best case and worst case scenario. In the best case, the linked list is empty, so the entire operation runs in constant time. As long as we have a good hash function and the keys are equally distributed across the hash table, this should be the case.

However, if the hash function is poor, it could cause all of the elements in the hash table to be placed in the same list. In that case, the operation would run on the order of $N$ time, since it would have to iterate across all elements in the list.

## Get

The `get`

operation is another important hash table operation, whose goal is to retrieve (without deleting) a key-value pair from the table. Like the `put`

operation, the first thing we need to do is to compute the index of the `key`

we are looking for in line 2. Once we know the `index`

, we call the `reset`

operation on the `hashTable[index]`

linked list so we can call the `getNext`

iterator function in line 4. Lines 6 - 8 are a loop that walks through each key-value pair in the linked list. If we find the `key`

we are looking for in line 7, we return `true`

and the operation ends. If we do not find the `key`

, we call the `getNext`

function to get the next element in the linked list. If we end up going through the entire loop until `current != null`

becomes false, we fall out of the loop and return null in line 13, indicating that we did not find `key`

in the table.

```
1function get(object key) returns object
2 index = computeIndex(key)
3 hashTable[index].reset()
4 current = hashTable[index].getNext()
5
6 while current != null
7 if current.key == key
8 return current.value
9 end if
10 current = hashTable[index].getNext()
11 end while
12
13 return null
14end function
```

As discussed above, although we do end up looping through one of the linked lists in the hash table, these lists are much, much smaller than the overall size of the hash table under most normal circumstances. Thus, we say that the `get`

operation runs in constant time in the best case.

## Remove

The `remove`

operation is much like the `get`

operation above, except that when we find the `key`

we are searching for, we remove the key-value pair from the appropriate linked-list and return the `value`

to the calling function. The only real difference in the operations lies in the loop intervals in lines 8 and 9. Here, when we find the `key`

in the list, we call the `removeCurrent`

operation to remove the key-value pair from the linked list and then decrement size by 1. Line 10 then returns `current.value`

.

```
1function remove(object key) returns object
2 index = computeIndex(key)
3 hashTable[index].reset()
4 current = hashTable[index].getNext()
5
6 while current != null
7 if current.key == key
8 hashTable[index].removeCurrent()
9 size = size – 1
10 return current.value
11 end if
12 current = hashTable[index].getNext()
13 end while
14
15 return null
16end function
```

Like the `get`

operation above, we loop through one of the linked lists in the hash table. However, given the relatively small size of the list, we assume the `remove`

operation runs in constant time in the best case.

## Contains Key

The `containsKey`

operation returns a Boolean value based on whether we find the requested `key`

in the table. Since the `get`

operation already finds the key-value pair associated with a given `key`

, we simply call `get`

and then compute whether the key-value pair returned from `get`

exists to compute the `containsKey`

return value. The `containsKey`

operation runs in constant time in the best case.

```
1function containsKey(object key) returns boolean
2 return get(key) != null
3end function
```

## Contains Value

The `containsValue`

operation is not as simple as `containsKey`

since we don’t already have an operation to use to search for a value in our hash table. Since we are not given a `key`

to search for, we cannot use the `computeIndex`

to tell us which linked list to search for our `value`

. Therefore, we need to start at the beginning of our array and loop through each element, searching each of the linked lists stored there for our `value`

.

Line 1 is our outside loop that walks through each linked list stored in our array. Within that loop, we follow the same search process we used in the `get`

operation to search through `hashTable[i]`

. We set up the iterator in lines 3 and 4 and then walk through the linked list in the loop in lines 6 - 9. The only difference in this search process is what we are searching for. In line 7, instead of checking if the keys are equal, we check to see if the values are equal. If they are, we return true. However, if we do not find the value somewhere in the table, we must search through every key-value pair in the table, thus the time complexity for `containsValue`

is order $N$.

```
1function containsValue(object value) returns boolean
2 for i = 0 to getCapacity()
3 hashTable[i].reset()
4 current = hashTable[i].getNext()
5
6 while current != null
7 if current.value == value
8 return true
9 current = hashTable[i].getNext()
10 end while
11 end for
12
13 return false
14end function
```

## Get Size

The `getSize`

function is very simple. It simply returns the HashTable class’ `size`

attribute.

```
1function getSize() returns integer
2 return size
3end function
```

## Get Capacity

Like the `getSize`

function, `getCapacity`

simply returns the length of the `hashTable`

array.

```
1function getCapacity() returns integer
2 return length of the hashTable array
3end function
```

## Is Empty

The `isEmpty`

operation simply returns the Boolean value of whether or not the `size`

attribute is 0.

```
1function isEmpty() returns boolean
2 return size == 0
3end function
```

## Copy

The `copy`

operation is similar to the `containsValue`

operation in that it must walk through the entire hash table to get all the key-value pairs in the table and `put`

them into the new hash table. Line 2 creates the new empty hash table, which we call `copy`

.

The `for`

loop in line 4 walks through the `hashTable`

array, allowing us to access each linked list using `hashTable[i]`

. Within the loop we use the linked list iterator functions (lines 5, 6, and 10) to walk through each key-value pair in the linked lists. For each key-value pair, we call `copy.put`

to insert that key-value pair into the `copy`

hash table. Once we have completed both loops, we return the copy in line 14. Like the `containsValue`

operation, since we walk through each key-value pair in the hash table, the `copy`

operation runs in order $N$ time.

```
1function copy() returns HashTable
2 HashTable copy = new HashTable()
3
4 for i = 0 to getCapacity()
5 hashTable[i].reset()
6 current = hashTable[i].getNext()
7
8 while current != null
9 copy.put(current.key, current.value)
10 current = hashTable[i].getNext()
11 end while
12 end for
13
14 return copy
15end function
```

## To String

The `toString`

operation is almost identical to the `copy`

operation above, except that instead of inserting each key-value pair into a second hash table, we append the string representation of each key-value pair to an `output`

string. In fact, the only differences to the pseudocode come in lines 2, 9, and 10. Line 2 creates a new empty string and line 13 returns that string after walking through the hash table. Finally, line 9 is where we append the current key-value pair’s string representation to the `output`

string. We also append a comma to separate the key-value pairs in `output`

. Like the `copy`

operation, the `toString`

operation runs in order $N$ time.

```
1function toString() returns string
2 string output = null
3
4 for i = 0 to getCapacity()
5 hashTable[i].reset()
6 current = hashTable[i].getNext()
7
8 while current != null
9 output += current.toString() + ", "
10 current = hashTable[i].getNext()
11 end while
12 end for
13 return output
14end function
```

## Double Capacity

The `doubleCapacity`

operation is similar to the same operations for the array-based stack and queue implementations that we covered earlier. First, we create a new array with twice the capacity of the existing `hashTable`

. Next, we “rehash” each of the key-value pairs in `hashTable`

into the new table. Finally, we point the `hashTable`

attribute at the new table.

The implementation of this process is captured in the following pseudocode. In line 2, we double the size of `capacity`

. It is important to update `capacity`

first so we can use the new value when creating the new hash table array. It is especially important to use this new capacity when calculating the indexes for key-value pairs in the new table.

```
1function doubleCapacity()
2 capacity = capacity * 2
3 doubleLinkedList[] newTable = new doubleLinkedList[getCapacity()]
4
5 for i = 0 to getCapacity()
6 newTable[i] = new doubleLinkedList()
7 end for
8
9 for i = 0 to getCapacity() / 2
10 hashTable[i].reset()
11 current = hashTable[i].getNext()
12
13 while current != null
14 index = computeIndex(current.key)
15 newTable[index].append(current)
16 current = hashTable[i].getNext()
17 end while
18 end for
19
20 hashTable = newTable
21end function
```

Line 2 creates a new array called `newTable`

with twice the capacity of the existing table. In lines 5 and 6, we create a new doubly linked list for each element of `newTable`

. Then, in lines 9 - 16 we employ the same looping structure used above in `copy`

and `toString`

to walk through each key-value pair in `hashTable`

. Then for each key-value pair, we compute its new `index`

and then append the key-value pair to the linked list at `hashTable[index]`

. Once we have completed the copying process, we fall out of the loop. Our final action is to point the `hashTable`

attribute at `newTable`

. Like the `copy`

and `toString`

operations, the run time of `doubleCapacity`

is order $N$.

## Iterator Operations

While we do not present the iterator operations here, they are useful operations for hash tables. They are implemented similarly to the other iterator functions we have studied, except that in order to walk through the entire hash table we need to use nested loops where the outside loop walks through the array and the internal loop walks through each linked list. This is very similar to the looping structure in `doubleCapacity`

above.