As I continue my study of algorithms and data structures using [Wayne and Sedgewick Algorithms] text, I figured it was time to write some notes about symbol tables.

Symbol tables are a useful ADT for performing quick inserts and lookups given a key-value pair.

Fundamental behaviors of a symbol table:

• a value is inserted with a specified key; a key is unique (no duplicates)
• furthermore, when a client puts a key-value pair into a symbol table that contains the key, it updates the value instead of adding a new entry
• given a key, search for a corresponding value
• a key cannot be `null`

The API for a symbol table generally has the following methods:

``````    void put(Key k, Value v)    // put k-v pair into the table (a[k] = v)
Value get(Key k)            // value paired with key (a[k])
void delete(Key k)          // remove key (and value) from table
boolean contains(Key k)     // is there a value pareid with key?
boolean isEmpty()           // is the table empty?
int size()                  // number of k-v pairs in the table
Iterable<Key> keys()        // all keys in the table
``````

While Java does not offer the `a[k] = v` syntax for `put` or `a[k]` for `get`, these expressions eloquently display the idea behind how they work.

The symbol table ADT can be implemented with a linked list, array, or BST as its underlying data structure. While the first two examples are not viable options to implement a symbol table, it’s helpful to analyze these implementations to get a firmer grasp of why linked lists and ordered arrays can fall short as the underlying data structures.

### Sequential Search Symbol Table (linked list implementation)

``````public class SymbolTable<Key, Value> {
private Node first;

private class Node {
Key k;
Value v;
Node next;

public Node(Key k, Value v, Node next) {
this.k = k;
this.v = v;
this.next = next;
}
}

// search for key, return associated value
public Value get(Key k) {
for (Node x = first; x != null; x = x.next)
if (k.equals(x.k)) return x.v;
// search miss
return null;
}

// search for key. update value if found. grow table if new
public void put(Key k, Value v) {
for (Node x = first; x != null; x = x.next) {
// search hit. update
if (k.equals(x.k)) {
x.v = v;
return;
}
}

// search miss: add new node
// ("old" first is now "next" when key is new)
first = new Node(k, v, first);
}
}
``````

A linked list symbol table will perform poorly for large datasets. `put` and `get` will run in linear time in the worst case, as both methods require traversing the list to perform their respective operations.

### Binary Search Symbol Table (ordered array implementation)

Another way to go about implementing a symbol table is with an ordered array. The reason the array is ordered is to utilize binary search for insertions and retrievals.

``````public class SymbolTable extends Comparable<Key, Val> {
private Key[] keys;
private Val[] vals;
private int N;

public  BinarySearchST(int capacity) {
// also would use array resizing code
keys = (Key[]) new Comparable[capacity];
vals = (Value[]) new Object[capacity];
}

public int size() { return N; }

public Value get(Key key) {
if (isEmpty()) reutrn null;

int i = rank(key);

if (i < N && keys[i].compareTo(key) == 0) return vals[i];
else return null;
}

// recursive binary search. returns index of key in table
public int rank(Key key, int lo, int hi) {
if (hi < lo) return lo;

int mid = lo + (hi - lo) / 2;
int cmp = key.compareTo(keys[mid]);

// search key less than mid value (search left half of subarray)
if (cmp < 0) return rank(key, lo, mid - 1);

// search key greater than mid value (search right half of subarray)
else if (cmp > 0) return rank(key, mid + 1, hi);

// key in middle and search key are equal
else return mid;
}

// search for key. update value if found. grow table if new
public void put(Key key, Value val) {
int i = rank(key);

// update
if (i < N && keys[i].compareTo(k) == 0) {
vals[i] = val;
return;
}

// move items over in parallel arrays to make room for new entry
for (int j = N; j > i; j--) {
keys[j] = keys[j-1];
vals[j] = vals[j-1];
}

// perform insertions in the parallel arrays
keys[i] = key;
vals[i] = val;
N++;
}
}
``````

Note: for the sake of simplicity, I left out some beneficial overloaded constructor methods and array resizing code.

`put()` in this case is `O(n)`, as items have to be shifted in the parallel arrays to make room for new entries. `get()` is `O(log n)` (the same worst-case time complexity as binary search).

Because `rank()` is simply a binary search, key classes are required to extend Java’s `Comparable` interface.

Unfortunately inserts take too much time to make this a viable implementation for the symbol table.

### BST Implementation

The BST-symbol table does not look much different than the generic BST example in my previous blog post. `search` has been modified to accept a key and return a value as opposed to a `boolean` if the value is present, and is renamed to `get()`.

``````public class BST<Key extends Comparable<Key>, Value> {
// root of BST
private Node root;

private class Node {
// key
private Key key;
// associated value
private Value val;
private Node left, right;
// nodes in subtree rooted here
private int N;

public Node(Key key, Value val, int N) {
this.key = key;
this.val = val;
this.N = N;
}
}

public int size() { return size(root); }

private int size(Node x) {
if (x == null) return 0;
else return x.N;
}

public Value get(Key key) { return get(root, key); }

// return value associated with key in the subtree rooted at x;
// return null if key not present in subtree rooted at x
private Value get(Node x, Key key) {
if (x == null) return null;

int cmp = key.compareTo(x.key);

if (cmp < 0) return get(x.left, key);
else if (cmp > 0) return get(x.right, key);
else return x.val;
}

// search for key. update value if found; grow table if new.
public void put(Key key, Value val) {
root = put(root, key, val);
}

// add k:v pair or change key's value to `val` if key in subtree rooted at x.
private Node put(Node x, Key key, Value val) {
if (x == null) return new Node(key, val , 1);

int cmp = key.compareTo(x.key);

// recursively call `put` on the left or right subtree
if (cmp < 0) x.left = put(x.left, key, val);
else if (cmp > 0) x.right = put(x.right, key, val);
// otherwise update the current value
else x.val = val;

// records the size of subtree for each node, for public `size` method
x.N = size(x.left) + size(x.right) + 1;

return x;
}
}
``````