TOP Results

Video Transcription

Welcome back everyone. In this video, we are going to take a look at how we pick certain specify the number of rows that we actually want to return. And this is without the actual where filter and things like that all the even though, we can actually include that. So the first one that we’re going to talk about here is top. Now, top is not standard in SQL, you will only find it in certain database management systems. For example, SQL Server actually has the top keyword or top filter, I will talk about the anti standard here in just a minute. But top filters are rows based on order. So you can say, let’s say I want the top five rows by my results. And so based off of the ordering of the results, it will take the first x or n number of rows from it. So if I did top five, it gives me the top five rows. So you say top and then there’s, you specify the number or using an expression that you want to pick. And then you can specify whether you want the top in rows, or top in percent rows, right.

So if I say top five, but I could also say give me the top 5%. So depending on your use case, that would be an extremely useful bit of information. And we can also do ties as well. By default, it’s, well, I’ll actually show show the results here, what tarp does by default, but we can do with and without ties. So if there is, let’s say, I want the top two rows. But the third row is the same as the second the second row, we we can exclude or include that row. So as I mentioned, right top is non standard, and, and the following video here, I will actually cover the standard way of doing this particular filtering in your results of the query. But first, let’s take a look at how we run top. Let’s apply our top filter. Now, as I mentioned, when we were discussing tarp, remember that top is dependent on the ordering. So execution here is going to go from group by select and then if there is an ORDER BY clause, it would do the order by and then the top filter. So top happens last in this case. So let’s execute this, see what we get.

There we go. Alright, so what I am doing here is from sales from the sales orders table, grouped by the year, so give me all of the orders by year, and then give me the top two years in terms of order count. So our selection has the order year or account, first date and last state. But it’s going to essentially give us the two years that gave us the most orders, which is an extremely useful bit of information. But it doesn’t necessarily do that. At least that’s my intent here with the top two. If I take off the top two, and execute this query here again, we actually notice that the top two, so the the years with the most sales is actually 2014 and 2015. But when I add the top two here, you see my my default ordering is 2013 first 2016, then 2014 and 2015. And so that is the order that the rows come out by default. And so that’s what top depends on.

But if we want to be explicit, which is a very important thing to be, when you’re dealing with SQL queries, we actually need to expressly tell SQL what we actually want to order by so here we can order by our new year. Order. Count, right? I want the top the to the most The the most productive years and torn a most productive two years in terms of sales. So if I run this, ah, there we go, ooh, there we go, what happened here? Well remember, our ordering is ascending by default. So that didn’t really change anything. But if I do that, there we go, that’s the result that we want. So we have order year 15, and 14. So the most sales is 2015. And then the second most sales is 2014. And then if of course, if we change this number here to be, let’s say three, and run that, now I get 2013, which had 19,000, sale and 19,000 orders. Now, let’s explore this a little bit more. So by default, let’s run this.

Okay. So this gives me the top 10 orders with ties. So, top or sorry, not top 10 orders, the top 10 customers based off of their order count. So give me the most loyal customers, right the customers, the top 10 customers that made the most orders, essentially what we’re what’s going on here, very similar to the previous query that we did. But the new thing that I added here is the width ties. And so if we scroll down here, this is kind of important, because if we look here, 10, the 10th customer had 140 orders, but we had three more customers that also had 140 orders. So without the with ties specification here, I run this. So you know, see number 10 is 598 585 80. Right? See that? My my 10th customer, there is no longer 598. So if I add this back in and run this again, 10 is now 598. And we have 11, which is 580. So this really highlights the fact that top by itself is non deterministic, right.

So when there are ties, you’re not going to be guaranteed to get a specific row, right, because there when when the last row has ties, the sequel, the database management system has no idea which one to choose. And so it’s just going to pick one and go with that. And so the width ties will give you all of the things that tied with the final row, which makes things a little bit more deterministic, right. So it’ll give you all the things you won’t get to differentiating results between runs of the query. We can also tack on a percent to this as well. So let me take off the ties and give me the top 10% of the customers so 10% turns out to be quite a few customers as a result. So if we switch over here, we got 67 customers so 10%, the the top 10% of our order base, we have 67 customers and all of our order counts there. So the top 10 The percent is a is a useful feature, I find that I use it a little bit less than the raw number. But nonetheless, it is a useful feature to add to associate with the top function or the top filter. But that’ll be it for at least covering top for now. And the next video, we’ll be taking a look at the ancy standard offset fetch