The Apply Operator
YouTube VideoVideo Transcription
Welcome back everyone. And this video, we’re going to be taking a look at the Apply operator. So this is going to be our last table expression that we’re going to be looking at for this course. But again, the general syntax of the Apply operator is included up here. But I will of course, show some examples of how that looks. But what is the Apply operator. So apply itself is non standard syntax apply is unique to SQL Server, if you are looking in other SQL based languages, you’ll be looking for what is called a lateral join. But SQL Server calls this the apply operation. But in general, how this operates is apply as a table value a table operator, and so your the left and right hand side need to be table expressions. Or rephrase that. On the left hand side, we must have an actual table and the right hand side of the Apply can actually be a table expression. So it could be a CTE, it could be a T provided sub query or or anything alike, but some table expression on the right hand side. Now, the main benefits that we get out of using the apply or otherwise known as the lateral join, it allows us to do a join on a correlated sub query, right.
So before when we did derive tables, and we even showed a situation with a join on to derive tables, but we couldn’t actually use a correlated sub query right, because a correlated sub query requires a reference to the outside a column on the outside query. And since the table since our table valued expressions that we’ve been using so far have been part of the from clause. No, the from clause has to execute in order for us to retrieve a reference from it. But the Apply allows us to reference that. So I’ll show some examples of how this looks here in just a moment. But there’s two kinds of applies that we may do. And now more so as the anti standard, right lateral join. Generally speaking, a lateral join makes a little bit more sense than apply. But nonetheless, right, we have cross apply and outer apply. And the same goes for the anti standard, we’ll have a cross lateral join and an outer lateral join. But their primary differences here is that the cross supply is more related to what you would expect for an inner join, right, it’ll apply the table expression on the right to every row and the table on the left, right. That’s why it’s applied to applying the expression table expression on the right to every row on the table in the left hand side. Typically, that involves matching on some predicates and things like that. And so that’s where the cross apply, is somewhat related towards what an inner join might look like, the outer apply is closer to what we would expect to a left outer join.
So it’s exact same as the cross apply. So we have the first steps, first step is the same applies the right table expression to every row of the left table, but then it’s going to add the outer rows from the or it adds the rows from the left table that did not actually have result in any rows in the table expression on the right. So when we apply the table expression, in step one, for both cross apply and outer apply, if that table expression results in no rows being returned, that is excluded from the join. Ok. So the row on the left is excluded from the join. Now, in the outer apply the rows that results, the rows that result in nothing be returned after being applied, or after the table valued expression being applied, then those are actually included in step two. So we do the cross apply. That’s step one. And then just like the outer joins that we used to do, anything that didn’t match, we add on the end. And so here, we’re not actually matching on a predicate, but the the expression, the rows for which the expression returned, nothing for those are added on the end. So that’s why it’s so closely related to it. An outer join. Now. Why would we need to use this? Well, ply provides particularly two kinds of benefits over other sub query types or approaches, particularly when combined with top or another words offset fetch. The first one here is that we can actually pull multiple rows, right? When we actually working with self contained sub queries that are that results. Here, we can actually have more than one more than one row being returned. But with apply, we can actually pull out more than one row.
So this provides an exclusive benefit compared to the self contained queries that we were pulling before. Likewise, we can pull multiple columns, even if we only have one single row. So really, otherwise, a scalar subquery can only be used in the SELECT clause. Because these are a single valued sub query, right, we can’t actually use anywhere else. Mainly because it’s only returning one value. But apply provides us extra benefit on top of that, right, we can actually pull more than one row and more than one column, if needed. This will make a little bit more sense. I’ll show an example of these two benefits here. And just a few moments. But let’s take a look at a basic apply first, remember here how we pulled the last order ID for every customer. So this is back in the video where we started talking about scalar sub queries. So again, I can’t have this, this has to be a single valued query, right, we can’t actually have more than one row being returned as a result here. Likewise, this can’t be more than one column, either, because this is pulled us is placed inside of our select clause. But what if we also but what if we want the last orders order date along with the Order ID? Well, we could actually do two correlated scalar valued sub queries, right? So we could do, we can have a sub query for each column name that we actually need to pull, which would work. But again, right, that’s not very efficient, right? Because we’re having to pull the same table multiple times in order to get that information out. Or we can do a derive table. But even if we use a derive table, we still have to have two references. So this is what it would look like as a derived table. Right?
But again, right, we have one row, right, one row here, right, so we have a select Customer ID Max order ID, last order ID from sales orders group by customer ID join on, right enter join on ID, customer ID equals customer ID. So this result here, right, for every row on the left hand side customers, I have one row one and only one row on the right hand side that I’m joining on. But we’re still referencing that table more than once, right, we’re still referencing the table more than once. So what if we did the apply operation. So with the apply operation, we actually only need to pull the table once we have only one reference here. And so instead of using Max, now though, we actually have top right, so top one, and I’ll show an example here, that shows the difference here. But let’s go ahead and execute this query here. Sorry, executed now. So here we go. We have order ID, but now with the order dates. So this is identical to this is identical to the query that we just had before with the derive table. So here’s that derived table that we did, identical in every way. But here we have customers inner join on the derive table, enter join on orders. So I’m actually having to pull all of the data from orders twice in order to achieve the information that I actually need. But here, this is where a table valued expression can actually save me some some more efficiency or get me some more efficiency out of my SQL is that I’m actually only pulling from my orders table once and only once, right? I’m not, I don’t actually pull all that data twice, multiple times. So this is the primary benefit that I get from using apply in this situation. So let’s go ahead and run this. There we go.
And so I can also, by the way, I can also change this to two, for example. That works, right, that’s pretty cool. We get now for every, for every customer, we get the top two, the last two orders, along with their order dates. And so this is the biggest, one of the biggest benefits. So out of those two bullet points, right, one of the one of the big benefits that we get with apply in combination with top versus a scalar valued sub query with like Max and so on. Likewise, we get some performance increase. Because before, if we wanted to do these queries, with just a scalar valued sub query, we’d actually have to pull that data multiple times from the table in order to achieve the same results. But here, we only have to pull the table once. In addition to this, how can we return all customers along with their most recent order of 2015? Only? Well, similar kind of thing here, I can actually add filters, add that filter inside and my sub query here. So we just have our order date between and then in 2015. But what does this return? Well, as General? Sorry, let me run this again. There we go. So this top order, and this is, we get how many rows we get 657 650 7am I missing any missing any customers here? Well, if you remember back a little ways, we can actually show that there are only 663 customers, as far as let’s select count, star. And then. We need distinct here cil. C dot customer. Id There we go. Now let’s give this a run. There we go. So this is the exact same query, basically the exact same count as I get up here. But now if I switch this here to a left join to get all customers. Right, we can see that we have 663 customers in total. So we have 600. And if I go back to enter join here, we have 606 157 customers that actually placed an order in 2015. And then we’ve got six other customers that didn’t place anything.
So how did we how do we actually get the customers out? That didn’t actually place an order? Well, simply enough, right? Since the way I showed down just just a second ago with the left join, we can do an outer apply. Outer apply, instead of a cross apply, give this a run. And if we switch over here, you notice I have 663 rows as a result. And then to actually see the last six the six rows that were added. Let’s go all the way down to the bottom. Here we are. Okay, here we are your all of the customers that did not actually place an order in 2015. Right? All the customers that didn’t place an order in 20 15 This makes a lot of sense. So if we think about the processing order, so switch back over here. So the cross supply is just the expression on the table expression on the right applied for every row on the left. And we only get, we only have the results of rows that actually result in a value or rows being returned from the table expression. And the outer apply, we do the exact same step. But then we have the step two, where I’m adding the rows on the left, that resulted in nothing being returned from the table expression on the right. And so when we apply the table expression here to pull the the top order from the Orders table for that particular customer, and 2015, we don’t get any orders as the results, right, because they didn’t place an order in 2015. But since I’m doing an outer apply, those customers are added back in at the end and step two of the outer apply the outer play operation to be included, just like what we did, what we kind of similar to what we would do with a left join. So this is a big benefit that we can get using an outer apply versus a cross apply.
So said cross apply is somewhat like an inner join. So we’re getting the results of the right table expressions applied to the rows on the left table, and only getting the results where we actually had rows returned from the table expression. And then the outer apply is the same operation but with the addition of the rows that had nothing returned from the table expression. And in those situations, no is substituted in. And so we can do very similar. We can do. If we wanted more than one row very similar to like what we did before, we can say top two, but now we get 12 We get still, in theory, right, we would get 12 if we if they actually placed orders. But since those are actually applied. And since these are actually added after the fact, we actually don’t get to rows for every customer that didn’t place an order, we just get the customers that didn’t place an order. Compared to previously, we get all of the customers that placed there the the customers that did place an order in 2015, we get their top two, top two results. Now, keep in mind though, if we did have a customer then we placed one order in 2015. That customer would only have one row because the inner query right the the table expression here would only return one row as as as it is. But that really concludes our examples for the apply operation in SQL Server. And remember that with the Apply operator, if you’re looking at the SQL standard, or an other database languages like Postgres, for example, you’d be looking for what they call a lateral join.