taylorialcom/ Sorting

# Shell Sort

Insertion Sort has good best case performance $$O(n)$$ but mediocre worst case performance $$O(n^2)$$. Shell Sort attempts to optimize Insertion Sort to improve the worst case time complexity.

The worst case performance of Insertion Sort suffered when we had long chains of swaps when needing to move a small item for the yet-to-be-sorted partition to the beginning of the sorted partition. Shell Sort seeks to minimize long chains of swaps.

Consider:

238561

Here the first three are already sorted, so no swaps are needed, but then

• swap 5 with 8 (total swaps: 1)
• swap 6 with 8 (total swaps: 1)
• finally, to get 1 in the right spot, we have to swap with 8, 6, 5, 3, and 2 (total swaps: 7)
238561
235861
235681
123568

Suppose we started by swapping 8 and 1 first, and then execute as we normally would.

• swap 1 with 3, then 2 (total swaps: 2)
231568
123568
• After that, no more swaps are needed, so we can sort in 3 swaps instead of 7.
• If we can do a swap with a big gap in the the middle (swap third position and sixth position), that saves us a bunch of swapping.

Shell Sort uses this idea by making multiple passes through the array, starting with a large gap and progressively reducing it. In 1959, Dr. Donald Shell authored a two page paper introducing Shell Sort. In it he suggested using a gap of (( \lfloor \frac{n}{2^k} \rfloor \) for the $$k^{th}$$ iteration.

## Worst case performance

This requires fewer swaps in the worst case.

Over the past 60+ years computer scientists have published papers describing the merits of different criteria for selecting the gap sequence. In 1986, Dr. Robert Sedgewick published a paper demonstrating that the worst case time complexity could be as low as $$O(n^{1.33})$$ with an appropriately chosen gap sequence.