Closure Sets

Video Transcription

Welcome back everyone. In this video, we’re going to get to our discussion on extracting more functional dependencies from our database. So before we ended on the Armstrong rules, which worked fairly well in pulling out more functional dependencies, but we found that it was a little bit tedious to actually do. And if we applied that to a much larger schema or much larger database becomes a little bit impractical and use. So that brings us to the idea of a closure set. So a closure set is defined as a set of attributes a one through a n, and the closure set of the that set of attributes is defined here, using this plus, right, so this is, if we have a one through a n with the curly brackets and the plus sign. That means this is the closure set of those attributes. The closure set of those attributes is the set of attributes b such that a one through a n implies E. So essentially, everything that you can imply for a given set of attributes, that is a closure set everything that we can imply from a set of attributes. So if we consider our previous example with name implies color category implies store and color category implies price to what are the closure set of these functional dependencies. So to start out, by computing the closure sets, we’ll list out each of the functional dependencies that we have. So name. So we’ll first start out with name. So what can we imply from name. So initially, we can just say, we can imply name and color, right name being the trivial implication, because we can name implies name. And we can get name and color, we can get color from our original functional dependency up here. So what about name and category? So what about name and category? So I’m just listing some of the individual attributes down here.

This isn’t necessarily a direct one to one mapping up here to our functional dependencies, although we’ll use our known functional dependencies to compute our closure sets. So what can we imply using name and category? Well, initially, we can just put the trivial ones first, so name and categories and then category. But what else can we imply? Well, name itself implies color based off of our functional dependency appear. And this existing closure sets, we can also imply store primarily because we have category up here as part of our functional dependency. So category implies store so we can include store here. And since we have color and category now, on the side here, that means we can also imply price based off of this last functional dependency. And so if we have the name and category, we can actually imply all of the other attributes as parts. But with color alone, we can’t imply anything other than just color. Now, this isn’t all closure sets. Based off of these examples. If we wanted to compute all closure sets, we would exhaustively go through all the combinations of attributes on the left hand side, but a lot of those would actually result in the exact same closure set. So we don’t have to compute all closure sets, because a lot of them are going to be duplicates of each other or equivalent in nature. But there is a nice handy algorithm that we can utilize to compute our closure sets. So if we have a set of attributes x a one through a n, we’re going to repeat this algorithm until x does not change. If b one through B, N is a implies c is a functional dependency and b one through B in our all in x, then we’ll add C to x. So let’s take a look at an example here. So if we take a look at our name, and category, right, so we can look through all of our functional dependencies here, right, so we have a name, category, color, store price. So name gets added, because name and category gets added by default, because those are the trivial functional dependencies. That name implies color. So color gets added first, we’ll loop back up and then try a another sets. So or try another. So if we can extract another functional dependency out from here.

So we can now that we have color, we have category and color implies price. So that is another functional dependency, right. And so that means we can add C, which is price to our closure set. And now that we add price here, we don’t actually have any other functional dependencies that we can extract. So our algorithm stops. So hence, right, we have a new functional dependency, right? Name and category implies color store and price, which is something which is a functional dependency that we did not have before, before we started calculating our closure sets. So this is one benefit that we can actually do, in terms of extracting more functional dependencies is computing the closure sets and see what other functional dependencies we can actually pull out. Typically, it is normal to in our closure set include the trivial, the trivial inferences here, like name and category. But when we, when we define a functional dependency, typically our functional pin are we don’t include in our functional dependencies, the trivial attributes, so we don’t include named category as part of on the right hand side as part of our functional dependency. So we stripped those out. Let’s take a look at a quick example here of this working in action. So if we have a relationship, a relation or table, A, B, has columns A, B, C, D, and F. And we have these functional dependencies here. What can we actually achieve here? Well, first off, we’re going to compute two closure sets, just as an example as part of this. So we have kala a closure set of A and B and a closure set of A and F. And we start out by essentially just putting the trivial ones out there, right, the columns that are on the left, we can just immediately put those on the right, so let’s take a look at our individual functional dependencies here.

So first off, we have A and B. So A and B are going to imply C, so we can include that down here and our closure sets. So let’s put C here. And then with the other of the other inferences, that we can make b implies d, so we can add D to this closure set. And then we also now are, we now have a and d as part of our closure set. So that means we can also imply E. So those are the attributes that we include as part of the closure set of A and B, let’s take a look at a an F now. So a an F, by itself, a f is a functional dependency implies B. So that is the first one we include. Now that we have B, we have a and b as our functional dependency up here at the top, so we can put C and R our closure set. Now that we have, we have B and R, we have B as well. So we can add D to our closure set. And now that we have A and D in our closure set, we can use this functional dependency to include E. So this is a I think a much quicker and easier way of pulling out all of the things that we can imply, given a set of attributes. And that allows us to identify more functional dependencies. And in this case, we just identified a couple new functional dependencies. So A and B, also a and b implies C, D, and then a f implies B, C, D E. And so those are two new functional dependencies that we’re able to extract by and by creating our closure sets. In general, why do we care? Why do we need closure? Well, with a closure set, we can find all functional dependencies as part of a relation. And with that, after we if we have a closure set, we can actually confirm whether or not a set of attributes implies another. So if we compute x plus or the closure set of x, so if a is in the closure set, we can confirm that that functional dependency does exist. So this will become a little bit more apparent as we work and build on this foundational database. You Particularly around defining our tables. And so one of the big reasons why we want to extract all of our functional dependencies is so we can make sure that our tables are in normal form. And we can use this as a way to identify keys as part of our relation as well. But that will be continued on in a another video.