The binary tree is a tree where each node has at most two children.
The are a handful of ways to describle characteristics of binary trees:
Binary trees can be stored dynamically using pointers or in an array. However, using an array is only true for complete binary trees. Because I was planning on writing about binary heaps, I will leave dynamically creating a binary tree to a later post.
A binary heap is an array representation of a heap-ordered, complete binary tree. The level ordered of the the complete binary tree map to indices in the array.
What does heap ordered mean? Depending on whether the order of keys in the binary tree is ascending or descending, it means:
The binary heap is not a sorted data structure, but can be regarded as partially sorted; the heap will always keep the minimum value (or max value) as the first item in the array.
The heap is an excellent data structure for a priority queue (PQ) ADT. Priority queues are characterized by removing the smallest item (or largest depending on heap order) of a collection.
A typical example for the use of a PQ is given a stream of n items, find the largest m items. While sorting is the traditional way to solve this problem, PQs are handy when there is not enough memory to store all items for a sort.
A typical PQ API will have the following methods. Note: I will be using a max-heap ordered binary heap for this example.
public class MaxPQ<Key extends Comparable<Key>>
MaxPQ() // create an empty PQ
MaxPQ(Key[] a) // create a PQ with given keys
void insert(Key v) // insert a key into the PQ
Key delMax() // return and remove the largest key
boolean isEmpty() // is the PQ empty?
Key max() // return the largest key
int size() // number of entries in the PQ
The binary heap starts at the 1-index position of the array for simplifying arithmetic operations used to update the PQ.
When a child node in the heap becomes a larger key than its parent key, it needs to be swapped with the parent. This process continues until heap order is restored. The method responsible for this work is swim()
.
Insertions into the PQ are added to the end of the array, and swim()
is then called to move the new node up the tree, if needed.
Conversely, when a parent becomes smaller than one or both of its children (after removing the max item from the heap), the sink()
operation is called on that parent. This is accomplished by swapping the parent node with the largest of the children. This process continues until the violation is corrected.
delMax()
removes and returns the maximum value in the binary heap by caching the value in 1-index position, swapping it with the last item in the heap, then sinking the item put in 1-index position until order is restored.
Below is a simplified version of how a PQ works. Array resizing and iteration code were omitted to focus the main concepts of how a PQ works.
public class MaxPQ<Key extends Comparable<Key>> {
private Key[] pq;
private int N; // total entries in the PQ
public MaxPQ(int capacity) {
// cast to `Key[]` - Java does not allow generic arrays
pq = (Key[]) new Comparable[capacity + 1];
}
public boolean isEmpty() { return N == 0; }
public void insert(Key key) {
pq[++N] = key; // add item to last position
swim(N); // restore order
}
public Key max() {
if (isEmpty()) throw new NoSuchElementException("Priority queue underflow");
return pq[1];
}
public Key delMax() {
key max = pq[1]; // cache max key (first item in array)
exch(1, N--); // exchange with last item
pq[N+1] = null; // prevent loitering (+1 because 1-based index)
sink(1); // restore heap order
return max;
}
private void swim(int k) {
// parent of `k` is k/2`
// while the parent of k is less than k, promote k
while (k > 1 && less(k/2, k)) {
exch(k/2, k);
k = k/2;
}
}
private void sink(int k) {
// child nodes are `2*k` and `2*k+1`
// determine largest child; exchange parent with largest child
while (2*k <= N) {
int j = 2*k;
if (j < N && less(j, j+1)) j++;
if (!less(k, j)) break;
k = j;
}
}
// helper methods
private boolean less(int i, int j) {
return pq[i].compareTo(pq[j]) < 0;
}
private void exch(int i, int j) {
Key temp = pq[i];
pq[i] = pq[j];
pq[j] = temp;
}
}
In regards to performance, insert()
and delMax()
are guaranteed to perform in O(log n)
time. max()
performs in O(1)
time.