Offset Window Functions

Video Transcription

Welcome back everyone, In this video, we’re gonna be taking a look at offset window functions. So our syntax is very similar to other window functions. So we have our function that we are going to apply over our window that is defined here. And remember, we can define our window using a partition. Without the partition, our window is going to encompass our entire query set, then we can order the rows within our window. And then we can also define which rows we are pulling, which are which rows are being applied to within that window. So we are primarily going to cover four different offset window functions here, these are going to differ a little bit when compared to offset fetch, which primarily just focused on pulling a top five or top 10 rows off of our query set, or, you know starting five rows. And instead of starting with the first row, we can do similar things with offset window functions. But in this case, the offsets are being applied by window instead of the entire query set. Although we can achieve identical results if our window is the entire query set, but we have four different functions here lag lead first value last value, first and last sound just as they are they obtained the first value inside of my window or the last value inside of my window. And then we also have lag and lead lag is going to be a before the current row and lead is after the current row. So whatever we’re looking at, we can lag in number of rows behind or lead in number of rows ahead. And we can calculate things based off of that which can be really beneficial in terms of things like running totals, these window functions, just like the previous ones that we covered are applied using the over clause and the offset window function, we are able to partition so we can define our window. Ordering is actually required, again, similar to what we saw with rank. And framing is also supported. But framing is only supported for first value and last value, not for lag and lead.

Because lag lag could actually go outside of your your frame. So we don’t want to use framing inside lagging lead. Without further ado, let’s take a look at some examples. Because I think that makes things a little bit more clear, rather than just talking about things. So here is our first example. And in this example, we’re going to be taking a look at lag and lead. So remember, lag is going to be before the current row and lead is after our current row. So similar kind of thing that we had before we have a CTE that’s going to calculate every order total, along with pulling out the customer ID and order ID. And then down here, I actually have the query that applies my window functions. And I have quite a few different window functions here. So I have two lags and two leads. One, the first one here is lag one lag to lead one lead to so the parameters for the functions here for the window function on on lag and lead. The first one is the column that you want to pull or use as part of the apply to as part of the function and then the number of rows you want to pull that value from. So here, I’m going to pull the order total from one row before the current one. And so that’s what this previous value is. And then here I’m going to pull the order total two rows before the current row. And then lead does the same way except I’m pulling the pulling the value one row after and two rows after. And so the column that we use as part of the lead can be any column that you have as part of that you can put as part of the SELECT clause. Lagging lead can be a little bit confusing at first because we actually end up as a result with a lot of different null values.

So we’re actually partitioning our query results by our customer ID. So every window that we apply lead and lag on to is actually the same here. We could have different windows that we apply to each each window functions, why it’s very powerful and or expressive. But in this case, to make things easier to read and understand, I have the same window defined for each one. And the same ordering defined for each one, the only thing that’s different is which window function is being applied to. But if we partitioned by customer here, see that here. This is one window. Since we’re partitioning by my customer ID in all cases here, then that means this whole these, these first three rows is my first window. Now, if I look down inside here, we can see my order total. And then the results of each of my window functions. Now row one, you can see here has a no value for the previous value column. That’s because there is no previous row right row one, the first row here is the first row in my result set. And so there is no previous row to pull the order total from. And so that will be no same thing for previous two. But next value, and next to value. So this is 3092. That is this row here. And then next, next to value 188, is right here. So that’s where those two columns are the values are being actually pulled from, we can go down to the second row in this window. And so we have our previous value now works because we have a row to refer to. So 2225, here is pointing or pulling the value from the previous rows order total. Previous two is no because again, that’s all the way up here, and we don’t have two rows before me to actually compare to. And then we have next value, which is this row right here. And then next second, notice that doesn’t actually pull eight 980 97 here, that’s because that goes outside of my window. So the lead and lag applies only to the rows within the window that is being applied to. So if you run out of rows, either in the lag or the lead, then the value that actually gets pulled as a result of the window function will be no. And then last here for our last row and our window function, we have previous value in previous to value actually has, can actually pull value now because we have two rows. And then our next and next two values are both No, because this is the last window our last row in our window.

That’s the lag in lead, I’m not going to go through every every window here. But you kind of get the general gist here for lagging lead. And that can be any number of rows that can lag or lead. And you can kind of see the beginning benefits of this and creating unique columns to pull information from. But let’s look at a another example here for first value in last value. So same CTE that I had order total. But now instead of lead and lag, I have first value and last value. So just two window functions, I have these same partition. So this is the window that I’m defining. So my window is based off of the customer ID, I’m ordering based off of the Order ID and then my frame is a pretty long one here rows between unbounded preceding and unbounded following. So essentially, what this is, is the default behavior of our framing in all cases of window functions. So if I do not specify a frame, then my the rows that are included as part of my window are all rows. And this is just a long winded way of saying give me all rows within the window. You can say a number here. So rows between unbounded order like preceding one or following one, or two or three or so on and so forth, which essentially says cut off the the first or the first or last row or first or last in rows. I will include some more documentation on on this or the framing as part of the uploading canvas. But more or less I find myself specifying the framing less often than I actually Need to. But there are situations where you do need to expressly identify the frame when you don’t want all rows inside of the window. But unless, let’s go and give us a quick run, and you can see here is again, same windows that I had before, I have my first window here for customer one. And you can see the first value is the first order total. And the last value is the value in the last order for this customer. So 188. And those values stay the same throughout the entire window. Because first and last don’t actually change at all. But we could modify this to change the frame to exclude rows from this calculation as well. But let me just to showcase that this is truly the default behavior, just to show that this is the default behavior. Let me actually comment out one of these for the frame, and execute this again. And you can see that the results are identical to what I had before. So I included this in here just so you can see what framing can look like. And I’ll include all the different options and things that you can actually place as part of defining the frame and notes. But nonetheless, that will conclude our brief examples for lag lead first value and last value. Offset window functions are, I think a little bit simpler and easier to understand when compared to ranking and aggregate functions, but they can still be very useful and expressive in writing our SQL queries