Sorting in Java
I am currently reviewing sorting in the Coursera/Princeton Algorithms, Part 1 course. While I have covered selection sort, insertion sort, and shell sort, I will be writing about mergesort and quicksort.
Sorting means arraying an array of N
items into ascending order by a natural order.
A natural order is generally defined as an order that is natural to humans. For instance, 1, 2, 3, 10, 20, 30
naturally ordered as opposed to a classic alphanumeric ordering, 1, 10, 11, 12, 2, 20, 21, 3
.
The items must be of the same type, and is enforced by Java’s type system.
The implementation details of how a data type should be sorted is seperate from the sorting mechanism. In Java, this is achieved by requiring objects to be sorted to implement the Comparable
interface:
1 2 3 

The data type itself has to determine how the compareTo()
method will work. Every data type that implements Comparable
is required to have a compareTo()
method. Below is the Java’s String
compareTo
method:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 

The following logic is expected by a client using the method (i.e. a sort implementation). Let’s assume v
is the item to be compared to another item, w
:
 return a negative integer when
v
is less thanw
 return
0
whenv
andw
are equal  return a positive integer when
v
is greater thanw
You can see String
’s compareTo
works in this fashion, as it uses the natural order of alphanumeric characters or the length of string.
Sorting Considerations
Certain considerations need to be taken when sorting. The following is not an exhaustive list.
 certification: are the elements really sorted? how do we check?
 time complexity: how is performance affected by array reads and writes?
 space complexity: does the sort work inplace or does it hold another array copy during a sort?
 inplace takes
O(1)
extra memory
 inplace takes
 stability: if a keys are equal, does the sort preserve the relative order of the items?
 is it recursive or nonrecursive? how will that affect memory usage?
Common operations in a sort
The are two common operations in sorting that are used across all sort methods. These are abstracted away into helper methods because they are so commonly used. They are the less()
and exch()
methods.
less()
compares two Comparable
items and returns a boolean value depending on whether the first item is less than or greater than the second item:
1 2 3 

exch()
swaps two items in the array, which is a common operation in sorting. The current value of arr[i]
is temporarily cached because arr[i]
gets overwritten by value arr[j]
, and arr[j]
is then set to the cached value:
1 2 3 4 5 

Mergesort
Profile of Mergesort:
 is stable; keys maintain relative order
 can be implemented in recursive (topdown) and iterative (bottomup) variations
 worst case running time is
O(N * logN)
 it’s prime disadvantage is it uses extra space proportional to array size
N
, so it’s grows by an order ofO(N)
“topdown” mergesort works as a divide and conquer algorithm. Divide and Conquer problems are recursively broken down into subproblems until some base case is reached. In Mergesort’s case, arrays are broken down into subarrays. The smallest “problem” is a single entry array, because a one item array is sorted by default.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 

Quicksort
Profile of Quicksort:
 unstable; items in the array do not necessarily stay in their natural order
 the average case running time is
O(N * logN)
 the worst case running time is
O(N^2)
, but it is almost impossible for time complexity to be this bad (see below)  average case space complexity is
O(logN)
and works inplace (versus Mergesort’s requirement for an auxillary array)  don’t use for small arrays
Quicksort is another D&C algorithm. It works as an inplace sorting method using partitioning to break arrays into subarrays. The point of partition for each array is called the pivot.
Array contents must be shuffled before any sorting or partitioning takes place in order to avoid the worst case running time. If the array is sorted or reversesorted, it will take quadratic time.
The basic steps are:
 shuffle the array (ie using Knuth Shuffle)
 select a pivot and partition the array so:
 entries smaller than the pivot are to the left of it
 entries larger than the pivot are to the right of it
 recursively call step 2 on the subarrays until the base case of 1 array entry is reached
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 

Final Notes
Java uses Mergesort to sort objects and Quicksort to sort primitive types. Based on this Stackoverflow question, the reasons are:
 Being that Quicksort is not stable, it might cause unwanted swaps for an Object array. For primitive types, this is not an issue as they have no identity
 Memory usage – because Mergesort doubles the memory necessary with an
O(n)
space complexity, it’s generally expected the reference types used in an array of objects will take up more memory, so adding slightly more overhead for the benefit of a stable sort is trivial.