Saturday, October 13, 2012

Why Merge Operation in MergeSort is O(n)?

I have been asked this a couple of times by the people learning or ramping on sorting algorithm. Why does merge operation in merge sort takes O(n)? Most of the times it turns out that people think it's two sorted array, just append 'em. It's intuitive -- wrong intuition.

The idea is we have two sorted array, and not two arrays with one of them having it's smallest element larger than that the other's highest. What I meant to say is, it's not [1, 3, 9, 12] and [13, 17, 19]. But it's more like [3, 12, 13, 17] and [1, 9, 19].

You see, in later case you can't append them. So, how do you combine? You follow the following algorithm:
  1. Lets say the first array has m elements, and the second has n. You make an auxiliary array of size m+n. This will store the final combined and sorted array.
  2. Assign two cursors -- i and j, pointing to the heads of  the arrays.
  3. Compare the elements at i and j; choose the smaller one. Copy the value auxiliary array, and move the smaller cursor ahead.
  4. Repeat #3, till one of the arrays exhausts; and then copy the rest of the elements from unexhausted array to the auxiliary array.
So, in our case, you compare (the [] denotes the element selected and copied to aux array) (3, [1]); ([3], 9); (12, [9]); ([12], 19); ([13], 19); ([17], 19); (--, [19]) . You see? How may comparisons? (m+n). Assuming each operation O(1). The merge process for an array of length (m+n) is O(m+n).

In best case, you will have sorted array as input. In such case, both the subarrays to be merged will just require joining. But we will apply the generic algorithm mentioned above. In that case, you will have to compare min(m,n); and then copy max(m,n) element to auxiliary array. In anyway, you are going to have total (m+n) operations.

No comments:

Post a Comment