Complexity (Part 2)
Resources
Video Script
So now that we’ve learned about merge sort and quicksort, let’s take a look at the complexity of one of these algorithms. Just to understand how that works. For this example, we’re going to look at the complexity of merge sort. Let’s consider the example where we’re doing merge sort on eight numbers. So here we have the numbers 1,2,3,4,5,6,7,8. So the first step of Merge Sort would have us divide those in half into groups, 1,2,3,4, and group 5,6,7,8, then we would divide each of those in half again, ending up with four groups 1-2,3-4,5-6, and 7-8t. So this diagram helps us understand the complexity of this algorithm. And we need to measure two things. We need to measure how many swaps it can make, and then we need to measure how many divisions it makes. So let’s look at swaps first, these are all in the correct order, but we have to assume worst case. So in the worst case, we would make 1,2,3,4 swaps. So we know we need four swaps for one of our numbers. And we need to compare that to the size of the input, which is eight. So how does four compared to eight, an easy way to think of it is four is just our input size eight divided by two. That’s pretty easy. The second step is a little trickier. Because what we need to do is we need to look at how many times we have to divide the numbers to get all the way down to groups of two. And in this case, we have three levels. So we have three here, but how does three compared to eight?
That’s a little trickier to answer. But let’s think about what it would look like if we had four levels. How many numbers would we need to get all the way to four levels and fill four levels all the way up? As it turns out, to do that we would need to double the amount of numbers we would need to have 16 in order to fill up four layers. So how does three relate to eight in the same way that four relates to 16? This can be kind of tricky, but it actually lies in the idea of powers. Consider this, two to the power of three is eight. And likewise, two to the power four is 16. That’s where our answer lies. And it actually makes sense. We’re dividing these in half. And we do that three times for eight numbers. Likewise, if we’re dividing it in half, we would do it four times to get the 16 numbers. So how do we express this in terms of our number eight? This relies on a little bit of algebra and calculus, but the answer is in logarithms, and so this would be the logarithm of base two of N. And so that tells us that two to Two the whatever it is, is equal to eight. So we can put this all together by combining in divided by two and log base two event. And usually when we do this, we ignore the divided by two. And so we just end up with the answer in times log of in. And here specifically, we’re using lg as the shorthand for log base two. If you’re familiar with calculus, you know that In is log of e.
In computer science, we usually use lg as a shorthand for log base two. So as you can see, based on this analysis, mergesort runs in the complexity of nlog(n) if we graph that we find that nlog(n) is actually quite a bit shorter than n^2, meaning that as the input gets larger, Merge Sort will take many fewer steps to complete, then algorithms such as bubble sort, and Insertion Sort. Quicksort is a little bit difference. Quicksort has this really interesting worst case, where if the numbers are already sorted, and you always pick the pivot item as the last element in the list, then you’ll end up basically doing Insertion Sort every single time. And so it’s worst case is actually big O of n^2. However, in practice, if you pick a pivot item that is close to the middle of your input, then the average case for quicksort is big O of nlog(n) and in practicality, quicksort is usually the fastest of the sorting algorithms on random input data. So let’s take that idea a little bit further, we can look at these different sorting algorithms and determine exactly when they might be useful.
For example, Insertion Sort might be pretty useful in certain cases where there’s a small set of numbers, or maybe while we’re getting numbers one at a time and we just want to insert them in the proper place in an array bubblesort is really useful when we know the data is nearly sorted, because we only have to bubble a few items around till we get to the point where bubblesort has Completed because there’s no more swaps to make. Merge Sort is really good when we know nothing about our data, or if we’re worried about the size of our data not being big enough, or not being small enough to fit in memory. And of course, quicksort besides that really bad worst case is generally the fastest performing algorithm. So as long as we’re sure that we’re not going to run into that worst case, very often, we can generally use quicksort as a really great sorting algorithm in our code.