Other Normal Forms

Video Transcription

Welcome back everyone. In this video, we’ll be taking a look at other normal forms. So so far we’ve only we focused on primarily Boyce Codd Normal Form. And we’ve mentioned third normal form. But if you remember this image here, there’s a lot of other things that we could actually achieve here. So we have normal forms one through five, as well as Boyce Codd Normal form, which is somewhere here in the middle here. So if we are in Boyce Codd Normal form, which is one that we’ve been focusing a lot on so far, then that if your relation R is in Boyce Codd Normal Form, then it is also in third normal second normal and first normal form, but it is not guaranteed to be in fourth or fifth normal form. So typically, what we’re going to be targeting on is hopefully, achieving Boyce Codd Normal Form. In this class, we’re not going to cover the fourth or fifth normal form. These are less common in industry, although you may encounter them and in some situations. But before we continue further into discussing our other normal forms, I do want to cover a couple more vocabulary terms here, particularly candidate keys, which is just another name for a minimal super key. And we call them candidate keys because these are keys that we will use as our primary key for that table. We also have prime attributes, which are attributes of a candidate key name, and non prime attributes. So these attributes do not occur in any candidate key. So basically, we have columns that are part of a key, and then columns that are not part of a key. So let’s keep those in mind as we start discussing our other normal forms.

So in this class, we’re going to cover our in this video, in particular, we’re going to cover normal forms one through three. So the first normal form is set is essentially covering the fact that a relationship only have simple attributes, it should only have simple attributes. This, this, in general means that a table should only have a single valued or atomic attribute or columns. So basically, the value stored inside of the column should be of the same domain. And, you know, of course, we still have all columns, of course have unique names, the order of the data here does not matter. But this table in particular is going to violate the First Normal Form because I Can I have more than one piece of information that’s being stored in the same column, right? So place of origin. And the place of origin here is the you know, Liverpool, UK, right? So city country, and that is bad form, right, this is bad form. So in order to normalize this and make it adhere to the first normal form, we would actually split those two pieces of information to their own columns, origin and country, for example. So anything that has any multivalued column or attributes will violate normal form one. And that incense also makes it very difficult. If something is not in first normal form, it’s almost nearly impossible to make it adhere to a third normal form. Because that data is all coupled together in a single column. This type of normalization is not as prevalent in things like no SQL databases, like MongoDB, for example. But we’ll have another lecture series later in the course, that talks about no SQL.

But this is our our first normal form, right? Our columns should have our column should only contain single values, data, single value data, not lists, not you know, multiple pieces of information like City State, Zip or, or anything like that. So one piece of information, one single piece of information. Our second normal form, here we have a relation for albums, and the attribute on the right, captures the country of the artist, right? So we have album and artist ID, the label for that album and then we take me off of the screen here. And then we also have artist country. Now the second normal form says that As artists nine which is the Beatles is a British Group, all their albums are from the UK. So artist implies country. So, but the country right itself should not be an attribute or should be an attribute of the artist not of the album right. So, the album should not be able to imply artists country. Simply put, though a relation is in second normal form if it is in first normal form, and every non prime attribute of the relation is dependent on the whole of every candidate key. So, this in general violates our second normal form because of this. So, album artist is our our key here, album and artists together imply label and country. So that’s our super key or minimal superkey. But we have artists implies country. And so in order to get this in second normal form, we actually need to split this out into two relations. Here, we have artist album and label and then artist name and country name I just added in there just so we can keep track of whose artists name but this is in second normal form, or as this table is not, so more or less, right? Everything. And if we have a composite if, if we have a composite key, all non prime attributes must depend on the full key, right, so we can’t have any sub dependencies as part normalization level three, right? Non prime attributes cannot depend on each other, right non prime attributes cannot depend on each other.

So here again, we have a relation for albums, and the attribute on the right captures the country of the recording studio. So we have studio implies country. In the same way, if the studio Abbey Road here is or so if the studio is Abbey Road, then the recording could only take place in the UK. And so we might want a table with all studios and countries where they’re located. But here right this violates our third normal form, right, because we have our album artists, which again is our super key are minimal superkey. But we have studio implies studio country and so this violates our third normal form, because we have non prime attributes right non prime attributes that are dependent on each other. So if if when you know one attribute, then you always know the second the second one should be in another table, right? So if that is the case, right? This should be in a separate table, if we know one and we know the other than that should be taken out and put into another table, right? This is just a functional dependency, right? A functional dependency that is a non prime attribute, right non prime attribute, you know, one, you know the other. So essentially, third normal form, we don’t want any transitive dependencies. So every non key or non prime attribute must provide a fact about the key, the whole key and nothing but the key. So if we split this out into a, or if we want to normalize this according to third normal form, we would have something like this, where we split the studio out into its own table, and then we link that and to the other table using a foreign key with a studio ID there. So this would adhere to our third normal form. But those are normal forms one through three and we will cover a little bit more particularly on the differences between third normal form and Boyce Codd Normal Form and following videos. And also remember, there are the fourth and fifth normal forms, but we will not be covering those two normal forms for this class.