Single Table Queries Part 2
Video Transcription
Welcome back, everyone. In this video series, we're going to continue our talk on single table queries. Last time, we covered a lot about the SELECT clause, which is paired with FROM to choose which tables you're picking from, and the WHERE clause, which filters which rows are being picked. We also did a little bit with GROUP BY and the HAVING clause, which filters those groups. So we'll review those a little bit today, and then we will cover ORDER BY, DISTINCT, TOP, and OFFSET-FETCH, and also the order in which those clauses are logically processed inside SQL.
So let's first start out with the HAVING clause, since this is where we ended in the previous video. Remember, the HAVING clause is a post-group filter: it does for groups what the WHERE clause does row by row. We have SELECT, FROM, WHERE. The FROM clause picks which tables, so the data source, and the WHERE clause filters out any rows that you don't want. But the WHERE clause works only row by row, and HAVING works group by group, so HAVING and WHERE cannot be used interchangeably. HAVING can't be used to filter out individual rows, and the WHERE clause can't be used to filter out groups. That's why they are separate.
So the big benefit of HAVING over the WHERE clause is that we can now use aggregates. Things like COUNT, AVG, and SUM can't be used inside the WHERE clause, because it again looks at single rows at a time, whereas the HAVING clause filters whole groups at a time. So we can use aggregates like COUNT, as we did last time, to filter out groups that don't meet minimum specifications, and I'll show another example of that in just a moment.
But I also wanted to introduce another clause that we didn't get to last time, which is ORDER BY. ORDER BY allows us, as it sounds, to order our results, which helps quite a lot in providing some consistency. By default, your query results aren't sorted in any particular order. For the most part, the results you retrieve from your database come back in the order in which the rows were entered, but that is not always the case, because sometimes that order can get flip-flopped or shuffled. So if you ever need a specific order, the ORDER BY clause is a must to get that consistency.
So the ORDER BY clause can sort in ascending or descending order. Those are the two supported sorts, and the keywords ASC or DESC can follow the expression. ASC, the ascending keyword, is the default, so you can just write ORDER BY with a column without specifying which order you want, and it will sort ascending. So if you want to sort any other way, you will need the DESC keyword, but the ASC keyword is never required.
Let's take a look at a couple of examples. I did want to review slightly first: this is one of the queries that we started to cover last time. We have order year, order count, first order date, and last order date for all groups of orders by year. Remember that the FROM clause is executed first, then the GROUP BY in this case, and then the SELECT clause, which is not the top-to-bottom processing order you are used to from programming languages.
But here, we're just grouping all orders by year, which lets us count the total number of orders for each year, and we can also pick the min and max date. But again, remember that we cannot have plain columns in the SELECT clause that are not in the GROUP BY. If we do have columns in the SELECT clause that are not in the GROUP BY, they must be presented as aggregates, as we have here with COUNT, MIN, and MAX. The order date in its raw form is not part of the GROUP BY, and neither is the star here, but I can compute these aggregates per group and project them in the SELECT clause. Remember that projection is the set operation handled by SELECT: it picks which columns, so the vertical parts of my set. The FROM and GROUP BY deal more with the selection operation, specifying which rows we're going to have in our result.
We also talked about the HAVING clause last time. As a quick review, the HAVING clause filters your groups: it cannot filter row by row, but it can filter group by group, and the benefit is that we can use aggregates in the filter. So we can execute this. There we go. Now we only have the order years that have more than 20,000 orders in that particular year.
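The grouped query reviewed above can be sketched in a runnable form. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of the SQL Server database from the video, the `Orders` table and its six rows are made up, `strftime('%Y', ...)` stands in for SQL Server's `YEAR()` function, and the HAVING threshold is lowered from 20,000 to 2 so the tiny sample has something to filter.

```python
import sqlite3

# Hypothetical stand-in for the orders table used in the video.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT, CustomerID INTEGER)"
)
conn.executemany(
    "INSERT INTO Orders (OrderID, OrderDate, CustomerID) VALUES (?, ?, ?)",
    [
        (1, "2015-01-05", 10),
        (2, "2015-03-17", 11),
        (3, "2015-11-02", 10),
        (4, "2016-02-09", 12),
        (5, "2016-07-21", 11),
        (6, "2017-01-03", 10),
    ],
)

# strftime('%Y', ...) plays the role of YEAR(); every non-grouped column
# in the SELECT list is wrapped in an aggregate (COUNT, MIN, MAX).
query = """
    SELECT strftime('%Y', OrderDate) AS OrderYear,
           COUNT(*)                  AS OrderCount,
           MIN(OrderDate)            AS FirstOrderDate,
           MAX(OrderDate)            AS LastOrderDate
    FROM Orders
    GROUP BY strftime('%Y', OrderDate)
    HAVING COUNT(*) >= 2   -- group filter: an aggregate is allowed here
    ORDER BY OrderYear
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data, 2017 has a single order, so its group is removed by HAVING while the 2015 and 2016 groups remain.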
So these are pretty useful operations, and we can start to do some more interesting things with the data that's stored in our database. Let's take a look at ORDER BY now as well. ORDER BY is new; we did not have that last time. ORDER BY comes after your FROM clause. You can also order your groups, by the way: if you want to order your groups, the ORDER BY comes after the GROUP BY and HAVING clauses. But for the sake of simplicity, let's take the grouping out and try to run this. Here, if I order by my order ID, you can see that I have 1, 2, 3, 4, 5, 6, 7, and so on. Now I can show you what that looks like without the ORDER BY, and you can see that my order IDs end up coming out the same, so I didn't really change much there. But without ORDER BY, my order is not guaranteed. For example, assume my orders are entered in canonical order, so order 1 is first, then the next order, and so on. If I go back and delete or modify a couple of them, say this wasn't supposed to be order 4, it is order 20, then the ordering is no longer guaranteed. So even though in this situation the order of my results doesn't change, adding the ORDER BY clause guarantees the ordering of my results rather than betting on chance.
Now we can take out the ASC here, and it gives me the same ordering. As I mentioned just a little bit ago, ASC is the default ordering. But we can order in descending order as well, so we get the last first instead of the other way around. Let's say we also want to order by multiple things. Let's switch this to order date, in ascending order, and then add O.CustomerID and run that. This becomes a little more powerful, because I can order by multiple columns, and I can switch this up as well: I can order in descending order on one and ascending order on the other. It does not matter which one is which, so you can order ascending or descending on each of multiple columns.
Let's go back real quick and talk about the processing order for these as well. We have now seen most of the major parts of a SELECT statement, so most of the common things you'll see in a SELECT query. The standard processing order, again, is not top to bottom. Even though we write our queries top to bottom, so SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, the SQL is not going to be processed in that same order.
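The sorting behavior demonstrated earlier, with ASC as the default and a mix of directions across multiple columns, can be sketched as follows. This is a hedged illustration using Python's built-in sqlite3 module rather than SQL Server; the `Orders` table and its four rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT, CustomerID INTEGER)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (1, "2016-01-02", 30),
        (2, "2016-01-01", 20),
        (3, "2016-01-02", 10),
        (4, "2016-01-01", 40),
    ],
)

# ASC is the default, so these two queries sort identically.
explicit_asc = conn.execute("SELECT OrderID FROM Orders ORDER BY OrderID ASC").fetchall()
implicit_asc = conn.execute("SELECT OrderID FROM Orders ORDER BY OrderID").fetchall()

# Mixed directions: newest date first, then customers ascending within a date.
sorted_rows = conn.execute(
    """
    SELECT O.OrderID, O.OrderDate, O.CustomerID
    FROM Orders AS O
    ORDER BY O.OrderDate DESC, O.CustomerID ASC
    """
).fetchall()
for row in sorted_rows:
    print(row)
```

The second sort key only breaks ties left by the first one, which is why the two rows dated 2016-01-02 come out ordered by customer ID.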
So the FROM clause is first: we pick our data source, the tables that we're pulling from. Then we filter the rows out of those tables. That is a first pass of filtering, and again, the WHERE clause works row by row. Then we group those rows together based on some expression, so GROUP BY color, for example, or GROUP BY order year. Then we filter the groups using the HAVING clause, which lets us use aggregates in our filters: counts, averages, sums, things like that. The SELECT clause happens after that, so we pick the columns that we want. Again, this is the projection operation. Everything from one through four here is the selection operation, as far as sets go, and then we do the projection, where we pick which columns we want. And then we order. ORDER BY is written last, and in this situation it is also the last thing that is executed. We do not order before we pick the columns because, if we think about efficiency, there's no sense in ordering more than we need to. So we pick the columns that we want, and then we pick which of those columns to order by.
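To make the six logical steps concrete, here is a small sketch that mimics them with plain Python data structures, one step per statement. It is a teaching model only, not how a database engine actually executes a query; the `orders` list and the cutoff values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for an orders table.
orders = [
    {"OrderID": 1, "OrderDate": "2015-01-05"},
    {"OrderID": 2, "OrderDate": "2015-03-17"},
    {"OrderID": 3, "OrderDate": "2016-02-09"},
    {"OrderID": 4, "OrderDate": "2016-07-21"},
    {"OrderID": 5, "OrderDate": "2016-08-30"},
    {"OrderID": 6, "OrderDate": "2017-01-03"},
]

# 1. FROM: pick the data source.
source = orders

# 2. WHERE: filter row by row (only raw columns exist at this point).
filtered = [r for r in source if r["OrderDate"] < "2017-01-01"]

# 3. GROUP BY: bucket the surviving rows by order year.
groups = defaultdict(list)
for r in filtered:
    groups[r["OrderDate"][:4]].append(r)

# 4. HAVING: filter group by group, so aggregates like a count are available.
kept = {year: rs for year, rs in groups.items() if len(rs) >= 2}

# 5. SELECT: project the output columns; the aliases are created here.
projected = [{"OrderYear": year, "OrderCount": len(rs)} for year, rs in kept.items()]

# 6. ORDER BY: sort last, so the aliases from step 5 can be used.
result = sorted(projected, key=lambda r: r["OrderYear"])
print(result)
```

Because `OrderYear` does not exist until step 5, only step 6 can refer to it, which mirrors the alias rule for WHERE and ORDER BY.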
But let's take a look at a couple of examples of things that don't quite work as far as our processing order is concerned. Again, these are going to be larger queries. What kind of query do we have here? We are selecting the order year, order month, and order count from the Sales Orders table, where the order date falls between January 1, 2015 and January 1, 2017. So that gives me all of the orders that are in 2015 and 2016, excluding all orders in 2017. Then we group by the year and month, having a count of more than 1,000, and then we order by the year and then by the month. So what we're essentially doing is getting the total number of orders for each month in 2015 and 2016. If we run this, there we go. Here are all of the orders for 2015, from January all the way to December, and it looks like every month in 2015 had an order count of more than 1,000. We can scroll down and look at 2016. 2016 wasn't such a great year: we only had five months with more than 1,000 orders.
But also notice that it is again ordered by year and then by month. When we have multiple columns in our ORDER BY clause, the sort is applied from left to right, so the year first here and then the month. Now, even though the query is not executed in the order SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, the order in which we write the clauses does matter. If I cut out the FROM clause, for example, and put it first, you'll see that it gives me incorrect syntax. Even though logically the FROM clause is executed first, syntactically it does not come first; syntactically, we put the SELECT clause first. So let's undo that. The order in which we write our query matters: we do not write it in logical processing order, we write it in this syntactic order. But when the query executes, logically the FROM clause happens first, then the WHERE clause, then the GROUP BY, then the HAVING, and then we jump all the way back up, pick out our columns, and then order them. So really, the highlight is that the position of each element is mandatory: the order in which the elements are listed inside the query is required by the syntax of SQL.
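The point that clause position is mandatory can be checked with a quick sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, with a hypothetical one-row `Orders` table; the exact error text differs between database engines, but a query written in logical order is rejected as a syntax error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT)")
conn.execute("INSERT INTO Orders VALUES (1, '2015-01-05')")

# Written in the required syntactic order: SELECT ... FROM ... WHERE.
written_order = "SELECT OrderID FROM Orders WHERE OrderID = 1"
# Written in logical processing order: FROM first. This is not valid SQL.
logical_order = "FROM Orders WHERE OrderID = 1 SELECT OrderID"

ok_rows = conn.execute(written_order).fetchall()

try:
    conn.execute(logical_order)
    rejected = False
except sqlite3.OperationalError as exc:
    # sqlite3 reports syntax errors as OperationalError.
    rejected = True
    print("rejected:", exc)
```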
But the logical processing order of those clauses is different. So let's modify this example real quick. Can I use aliases here in the ORDER BY clause? Well, since the ORDER BY clause is logically executed after the SELECT, I can. So aliases are perfectly valid to use here. The aliases declared in the SELECT clause and the FROM clause can be used in any clause that is logically executed after the clause that declares them. For example, since the FROM clause is logically executed first, I can use the table aliases in all clauses that are executed after the FROM clause. Likewise, the order year, order month, and order count aliases from my SELECT clause can be used down here in my ORDER BY, because the ORDER BY happens after the SELECT. So let's use order year here, and you can see that it even comes up in IntelliSense. Then let's replace this one with its alias as well and run that, and we get the same results out. So let's look at the last query for this segment.
What's wrong here? If we read it top down, we're selecting order ID, order date, customer ID, and the year of the order as order year from the Sales Orders table, where the order year is 2016, and then we order by the order year. So what's wrong with this? If we run it, we get an invalid column name error for order year. As we showed in the previous example, aliases only work in clauses that are logically executed after the clause where the alias is defined. Here, my order year alias is defined in the SELECT clause, but my WHERE clause is executed logically before the SELECT clause. We go FROM, WHERE, SELECT, then ORDER BY, so we can use order year in the ORDER BY, but we cannot use order year in the WHERE clause. Instead, we have to use the YEAR function here, and we can only use the columns provided to us through the FROM clause, because those are the only thing the WHERE clause is aware of. So we put YEAR of order date there, and then this query will execute. Those are just a couple of examples of why this processing order matters. Syntactically, our SQL statements are written in this particular order: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY. But logically they are processed as FROM, WHERE, GROUP BY, HAVING, SELECT, and then ORDER BY.
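The corrected form of that last query can be sketched as below: the year expression is repeated in the WHERE clause, while the alias is used only in ORDER BY. This is an illustration with Python's built-in sqlite3 module and a hypothetical `Orders` table, with `strftime('%Y', ...)` standing in for SQL Server's `YEAR()`. One caveat: SQLite is more permissive about aliases than the SQL Server behavior described in the video, so the "invalid column name" error itself is not reproduced here; the sketch only shows the portable way to write the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT, CustomerID INTEGER)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (1, "2015-06-01", 10),
        (2, "2016-02-09", 12),
        (3, "2016-07-21", 11),
        (4, "2017-01-03", 10),
    ],
)

# WHERE runs logically before SELECT, so it repeats the expression on the raw
# OrderDate column. ORDER BY runs after SELECT, so it may use the alias.
rows_2016 = conn.execute(
    """
    SELECT OrderID, OrderDate, CustomerID,
           strftime('%Y', OrderDate) AS OrderYear
    FROM Orders
    WHERE strftime('%Y', OrderDate) = '2016'
    ORDER BY OrderYear, OrderID
    """
).fetchall()
for row in rows_2016:
    print(row)
```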