Lossless Decomposition

Video Transcription

Welcome back everyone, we’re going to continue our discussion on database normalization and obtaining good database design. So previously, what we’ve seen, we’ve talked about third normal form and Boyce Codd Normal Form, how we could calculate the functional dependencies and the super keys, and how we can use those to find that or how we can use those to decompose our relation to break it down into Boyce Codd Normal Form. We’ve defined our functional dependencies, and how we can find all functional dependencies using Closure sets. And whether or not a dependency violates Boyce Codd. Normal Form. We also defined a super key, which is just a set of attributes that imply all other attributes. So a primary key. A super key is a minimal super key when there is no other super key in that set of attributes. So you have so we have no subset of attributes that also is a super key. And that minimal super key is what we use as our primary key for our database, or for our tables. And then we also learned that we can decompose our tables into Boyce Codd Normal Form relations or tables, using those bad functional dependencies as well if we have a bad functional dependency. So Boyce Codd Normal Form as a refresher as well, it’s in a relation R is in Boyce Codd Normal form if and only if for all functional dependencies. x implies a x implies a as a funk as a trivial dependency, or x is a super key. So for all x, either the closure set of x is x, or the closure set of x is all attributes.

So we have trivial functional dependency and the super key right so trivial, super key. Now, let’s take a deeper dive into decomposition. So So we’ve shown the example with our ID, name, and department and phone number. Here we have our name price category table that we’ve seen before. And so we can actually decompose this into name and price and name and category. But because we have a functional dependency name implies price. But this is a problem, right? This is a problem. Because if we do decompose it, well, not necessarily problem. When we decompose it, we have iPad 529 twice here. So this is actually a good thing because we do lose, we lose a row. But we we actually D duplicate, right, we d do D duplicate, we don’t actually have to have iPad 529 twice now, because we have different categories, right? We just have iPad 5.9. So this is last less decomposition. Last less decomposition. But let’s take a look at this example. If we decompose this into name, category and price category what’s incorrect here? What’s incorrect? So we have iPad, tablets, iPhone gadget, iPad gadget, we have 529 tablet for 29 Gadget 529 gadget. Uh huh. Okay. But how do we actually get this information back? Right? So the individual tables are okay, we didn’t lose rows of data. But can we re can we recombine our tables back into the original? Can we recombine this back into the original? Well, the short answer is no, no, we can’t.

Because if we look at if we look at this here, iPad 529 tablet, iPad 5.9 gadget, we can actually pair these back with the correct rows can no longer pair these back with the correct rows. So this is something that is in Boyce Codd. This isn’t normal form right this is a normal form, but this is loss he decomposition. So a decomposition is lost less if and only if we can recover the exact information we started with. So if we start off with a table ABC, and we decompose that into a, b and AC, if we can recover a, b, c, then we have achieved last less decomposition is lossy if we cannot recover ABC as a result of combining those tables again after they’ve been decomposed. So we don’t want that we don’t want to decompose our table, but lose the relationship that we had originally started with, right. So we do have to be careful when we start decomposing our tables. So, in general, if we have a relation are such that we have attributes a one through a n b, one through B M and C one through CP, we decompose that into our one a one through a n u one through BM and a one through a n c one through CP. If a one through a n implies b one through BM, then the decomposition is lossless, because then we can actually re compose the original relation, right? If that is not the case, then we have lossy decomposition. If we are doing Boyce Codd Normal Form, right Boyce Codd Normal Form decomposition is always lossless because they’re based off of our functional dependencies, our closure sets. So keep that in mind. We’re doing our decompositions we can’t just decompose our tables to try to make them better. We do have to be careful because if we decompose arbitrarily, we can have a situation where we actually lose the information we actually lose the related the relation, which is what we’re trying to leverage as part of SQL