Quicksort worst-case and averages-case analysis

Quicksort algorithm was developed by a British computer scientist Tony Hoare in 1959.  Quick Sort is capable of sorting a list of data elements significantly faster (twice or thrice faster) than any of the common sorting algorithms and that’s why it is the favourite topic of interviewers in programming interviews. It is one of the most efficient sorting algorithms and is an in-place sorting algorithm. 

Quicksort(A, start, end) { 

	if (start < end) { 
		partitionIndex = partition(A, start, end); 
		Quicksort(A, start, partitionIndex-1); 
		Quicksort(A, partitionIndex+1, end); 
	} 

} 

Now partition procedure, rearranges the array such that pivot element is at its correct position. The total time taken to rearrange the array is always 𝑂(𝑛) where n = end-start+1.

partition (arr[], low, high) {

    // selecting pivot – Element at end position (right most)
    pivot = arr[high];
    i = (low - 1);

    for (j = low; j <= high-1; j++) {
        // If current element is smaller than the pivot, swap the element 
        // with pivot to the left
        if (arr[j] < pivot) {
            i++;    // increment index of smaller element
            swap(arr[i], arr[j]);
        }
    }

    swap(arr[i + 1], arr[high]);

    return (i + 1);
}

You could read more about Quicksort and its implementation here.

Let us suppose that the pivot has divided the array into two parts, one of size k and other of size n-k. Both of these parts will need to be sorted. This gives us the following relation

𝑇(𝑛) = 𝑇(𝑘) + 𝑇(𝑛−𝑘) + Θ(𝑛),

where 𝑇(𝑛) is time taken by algorithm to sort n elements and Θ(𝑛) is the cpu time to rearrange elements.
Now, let’s analyse, worst case and best case.

Worst case

Consider the case when pivot is the least element of array, we have k = 1.

𝑇(𝑛) = 𝑇(1) + 𝑇(𝑛−1) + Θ(𝑛)

Solving recurrence by substitution:

𝑇(𝑛) = 𝑇(1) + 𝑇(𝑛−2) + Θ(𝑛−1) + 𝑇(1) + Θ(𝑛)

𝑇(𝑛) = 𝑇(𝑛−2) + 2𝑇(1) + Θ(𝑛−1 + 𝑛)

Likewise,

𝑇(𝑛) = 𝑇(𝑛−3) + 3𝑇(1) + Θ(𝑛−2+𝑛−1+𝑛)

𝑇(𝑛) = 𝑇(𝑛−𝑖) + 𝑖𝑇(1) + Θ\sum\limits_{j=0}^{i-1} (n-j)

Now, such a recurrence can only go upto i = n-1, because otherwise n-i would be less than 1. Therefore,

T(n) = T(1) + (n-1)T(1) + Θ\sum\limits_{j=0}^{n-2} (n-j)

T(n) = T(1) + (n-1)T(1) + Θ((n^2+n-2)/2)

which is O(n^2) .

This happens when the array is already sorted.

Best case

The best case occurs when the pivot divides the array into two equal parts in every step. Thus we have, k = n/2.

Therefore the recurrence becomes,

𝑇(𝑛) = 2𝑇(𝑛/2) + Θ(𝑛)

Using substitution:

𝑇(𝑛) = 2(2(𝑇(𝑛/4) + Θ(𝑛/2))) + Θ(𝑛)

T(n) = 2^2 T(n/4) + 2Θ(n)

Similarly,

T(n) = 2^k T(n/2^k) + kΘ(n)

This will continue until n = 2^k i.e. until k = logn . Thus,

𝑇(𝑛) = 𝑛𝑇(1) + Θ(𝑛)𝑙𝑜𝑔𝑛

which is 𝑂(𝑛𝑙𝑜𝑔𝑛).

Leave a Reply

Your email address will not be published. Required fields are marked *