Key, Check, and Default Constraints

Video Transcription

Welcome back, everyone. This is part two of our introduction into tables and constraints. But now let’s start talking about key constraints. Key constraints are really important as part of our database design, because it enforces relationships and uniqueness of data throughout our individual tables. And across the relationships between those tables. Our first one here is a primary key constraint. Primary Key constraints make particular values for a column mandatory are required because a primary key uniquely identifies a row as part of an part of a table. That value being that uniquely identifies a particular record inside of a table must be unique. So we can’t have two identifiers are two values and as a primary key that are the same otherwise, we wouldn’t be able to uniquely identify a record or row in a table. And in our previous discussion, that is one of the major requirements in a relational database for table is that we must be able to uniquely identify a single tuple or row inside of a table to that primary key is really important to be unique. Also, notice here that in my ER diagram, I’ve changed a little bit of things here, the website column is gone, all of my columns here are also not italicized anymore, so everything is going to be required. And then we have our first key constraint added to our ER diagram here, PK for primary key, now more than one column can be a primary key.

So we can have a composite primary key. But for now, we’re going to shoot for just doing a primary key of school ID, which is of type Ent. But let’s take a look at a few examples here. So here’s another example of us creating a the demo dot school table. This is exactly what I have up here in my slide as well. So the new addition here is that we have a new column called school ID of type int, that cannot be null. And it is a primary key, right. So this is our primary key constraint and our not null constraint all together. And one. Also quick note here naming convention wise that we’re going here, you’ve noticed that my column, all my column names are capitalized. And then Personally, I do capital I capital D, but you may also see a lowercase D as part of this. But we have our typical insert statement down here. Notice that we do have school ID added as part of this now as well. And then school ID is listed throughout there. But if we run this, our SQL query fails, right, so invalid column name, invalid column name, school ID. So do if you are continuing on the same database that we had from our previous video, you may also have to add a go statement in between these two, or just execute these separately, because we have, we’re adding a new column here. And so our insert statement would actually fail. Since these queries are these statements are all ran on the database management system in one big batch. And again, we will talk about batch how SQL SQL Server runs queries in batches in a future video. But for now, you can run the drop and create in one in one cell in Azure Data Studio, or add the go statement in between these two.

But now if we run this, we can see that there is a violation of primary key constraint, PK school cannot insert duplicate key an object demo dot school, the duplicate key value is eight. So note that all of our key constraints are all of our constraints. inside of our SQL Server instance, for our database objects will be automatically named by SQL Server. We will talk about in a future video on how we could actually name those constraints to be a little bit more user friendly. But for now, we will use the auto generated names for them. But the reason why we have an issue here is that line 25 and line 26. Both of these tuples have the same school ID so you need University of Nebraska University of Nebraska I have the same name, same ID, but one is you when l. So University of Nebraska Lincoln, and one is University of Nebraska, Omaha. And so what we can do to fix that issue is just to say, nine here. So if we run this again, Ah, there we go. So nine rows affected. And then if we go down a little bit here and add a select star, and then from demo dot school and run that, we get our nine records out of it. Alright, so that is the basic implementation of how we would work with a primary key. Like I said, we’ll get more into more complicated keys in the future. But this gets your feet wet into some of the basics and how those primary keys actually work, and how that’s enforced when we insert data into our tables.

Now, let’s look at our next constraints. So unique constraints are very similar to how primary key constraints work. Unique constraints enforce uniqueness within that column or within those columns, it can be a composite unique key. But unlike a primary key, you can have more than one unique constraint. So this will allow this will allow Knowles inside of it as well, where a primary key cannot be no. So that is some of the primary differences between a unique key and a primary key. So we can have a composite primary key meaning that more than one column together can be the primary key. But with a unique key we can have. So for example, we could have name be unique key. And then we could also have, let’s say colors be unique or nickname be a unique key. And they could be two entirely different keys separated out with each other. But again, when we start talking more about database design and consensus consistency of data, we’ll talk a little bit more about the more complicated key constraints that we can have as part of our database. But for now, let’s just take a look at an example. So if you notice, and my ER diagram on my slide, we have the UK in the key in the key area of the diagram next to name so we’re trying to enforce unique names as part of our university table. And so what we want to do first here, let’s go ahead and do a new create statement. And again, if you get an error, you might have to add a go statement in between these. But we’re going to run the CREATE TABLE school again, primary key school ID. But now we added the unique constraint to our name. So if we run this, ah, notice that my unique, my unique key constraint was violated in our insert duplicate key and object demo dot School, which is our table, the duplicate key is University of Nebraska. So even though our primary key is not violated, the unique constraint that we have as part of our name column is because we still have this two same names University of Nebraska, and a university in Nebraska.

But what we could do is we could just add something that is a little bit more unique as part of the name University of Nebraska Lincoln, and University of Nebraska, Omaha here. And if we run this now, all nine rows were affected. And now if we run this, we can see that all of my records here are added and I correctly get University of Nebraska Lincoln and Omaha down here at the bottom. Keep in mind though, while unique keys do operate similarly to primary keys, they aren’t necessarily don’t necessarily do the exact same thing as being noted above there in the corner in our slide. And again, once we start talking about database design, and specifically normalizing our tables normalizing our information that we store in our database, we’ll get more into the harder details of why there is a big difference between these two types of PII constraints.

Alright, so then let’s take a look at our next constraint, which is check constraints. So check constraints are really cool feature inside our relational database languages, check constraints that allow us to enforce certain kinds of information for our particular columns. If you would have noticed earlier, when I’ve been doing my insert statements, we had that last row that had zero for the for the year established. So check constraints could actually enforce a particular range of data, it could can enforce a discrete set. So if we had a column we could enforce yes, no, or maybe, or yes, no, true false. And we can enforce a lot of different things here. So specific range of data, discrete values, we could also use a comparison here. So we could say, well, this column has to be between, or the end date has to be greater than or equal to the start date.

So essentially, any predicate that we can use, comparing columns of the table can be a valid check constraint. So we can use any of the columns as part of the table. And we can pretty much do any Boolean operation on those columns. That’s what we’re referring to as a predicate here. So we can do less than greater than not equal to, and, and so on. So this is a really powerful tool to verify certain, verify that our column, the data that’s being inserted into it or modified, meet certain conditions. But for now, let’s take a look at how this looks in our SQL code. And you’ll notice that I didn’t really include there’s no ER diagram here, some er diagrams, and I’ll draw some later in the class. But some er diagrams will include check constraints as part of it, but it’s not very common. But you may see them on some database design documents. But let’s see what we can do for check constraints. So the same this year, similar kind of statement that we just had before. But now we’re doing a CREATE TABLE demo dot school, and still have the unique constraint and the primary key constraint. But now what I’m actually adding here is a check constraint.

So for a year established, because we have a zero, a pesky zero down here and part of our data, so we don’t want to get an invalid year, or some really weird years. And even this, we’re I’m maxing out at your 9999. But just for a second example, we can kind of show this here. But what we’re doing here is we’re checking that the year established column. So the data that is contained inside of this column is between 1009 1999. And of course, we could put more realistic values here. But let’s go ahead and run this as an example. And so you can see that an error is generated that says that we have violated our check constraint for this particular for this particular column. So what we can do is actually create or improve our data accuracy here. So u and o was founded in 1908. So if we change that value, run it our table gets created, and all of our data gets inserted just fine. And there we have it. So there we have universe, Nebraska, Omaha with the correct date in mind. But check constraints are a very, very useful tool to enforce data integrity, data consistency, as well. Now in practice, not not all data can be cleaned this way through the database. There is some data, some data cleaning that must be done at the application level. So the tool that’s being used the to create data in certain data into the database, that tool or application recode program will need to actually do some of that data cleaning before it gets to the database. But the database management system can do some of that for you, which is a really cool thing.

Lets take a look at a another another key constraint, which is the foreign key constraint. Again, we will be talking about foreign keys throughout this class. This is just a gentle introduction. But foreign key constraints enforces what we call referential integrity. These are there are a few rules that go along with foreign keys. So foreign keys are going to just as they sound before, and so a foreign key is going to be in let’s say, table B. So table B has a foreign key. And that foreign key is going to reference a column in another table. And so the columns that are referenced must either be a primary key or a unique key name. So it can’t be just some random column, it has to be a unique inside of that particular table. So unique columns are unique keys, of course, and then primary keys which are forced to be unique in the referencing column. So the actual foreign key itself must match the type of the reference column. So the column in table B must be the same type and column A. But the constraint does work in both directions. So it’s a bi directional inference. So the referencing table is checked when the foreign key value is inserted. And when it’s updated, as well as deleted, but when when a value is deleted, and the referenced table is checked, the reference table is checked. So if a new record is inserted, the referencing table is chaps. And then vice versa for when a record is actually deleted. But this will make a lot more sense when we start looking into relationships. And we’ll talk more about foreign keys.

So this may be a little bit much initially. But again, we’ll be explaining this in much greater detail here in the new few near future. But this is what a foreign key constraint might look like inside of a database. And this is going to be the example that I’m going to show here. But we have demo dot school here, but we now have a new table called demo conference. And then notice here that we have a foreign key now and a school table or school table. So conference, which is a foreign key fk, and then we have a conference table. And then we have a relationship that’s drawn between these. So this is a crows foot notation to reference the, the, the relationship between these two tables, so one, so one and only one conference. So a school can be can be in one and only one conference. And a conference can be in it can be found in zero or more schools. And again, we’ll talk more about the crows foot notation in a future lecture. But just note this, this is what you’ll see for how a foreign key is referenced in the ER diagram. But again, let’s go ahead and take a look at an example because things make a little bit more sense when you actually see it in practice. So I’ll leave my diagram up there at the top corner for you. But let’s take a look at some new code here.

So this is a lot of code. But all I’m doing here is creating my demo conference table and my demo school. I have a bunch of data that’s being inserted here. And notice here that we have a foreign key constraint now in our school table. So conference, that foreign key references demo conference nickname. So references that table up there, they’ll be run this invalid name, invalid column name conference. Again, that’s because we’re doing we have batch processing issues. So the conference table must be created first. So if you’re using as your Data Studio, and using a Jupyter Notebook, you’ll need to make sure that that conferences ran first, and I’ll try to make sure that this is set up more properly in your SQL notes as well. But let’s go ahead and run this again. Ah, there we go. So five rows affected eight rows affected there. So if we run demo dot school, so all of that is and we can actually see the conference name up here now. And then if we also change this to a conference, we can actually see all of our conferences were inserted as well.

So cool, everything looks good so far. Let’s go ahead and look at how we can insert some more values. So demo dot school, inserting into school and inserting into conference. So here I am trying to insert University of Nebraska Omaha, which isn’t here yet. I took that out. We have University of Missouri University, Nebraska Lincoln, and both put them on here. And they’re going to be in the summit conference. And I’m also inserting summit into the conference table. But if I try to run this, haha, alright, so the insert statement conflicted with the foreign key constraint. School conference, the conflict occurred and the database cc 520 table demo conference column and nickname. So foreign key constraint, right. So the issue here, hey, let me for the sub summit, it does not actually. So we have you and oh here and then 1908, Mavericks, black, crimson, Omaha, everything here is the same. But this line here was not actually ran yet. So since summit and Summit League, right, because summit, if we scroll up here, the conference, right is that primary key is nickname.

And so since the summit here, did not exist when I ran this line of code, this one. So this failed, because of our foreign key constraint, we were still able to insert this into our conference table, because this executed so that’s what this one row affected down here. This one worked, but our insert did not because the summit did not exist yet. But now that it does exist, we can actually execute this in our our table gets in our value gets inserted just fine. And now when we actually run our demo dot school, University of Nebraska, Omaha is there. And then we can also pull from our conference table as well. And see that summit is there. Um, but that is the basics of how our foreign key constraints are done. But let’s go ahead and look at our next constraint thing.

Our next one here is a default constraint. Default constraints are extremely useful for information or columns that are knowable, right? So if a column is knowable, but you want data to actually be there, we can use a default value, which is really, really handy. So of course, just as it sounds, assigns a default value for a column, right, this is used on inserts when a value is not provided for that particular insult insert. So if a column is knowable, and that column is not provided as part of an insert, then a D, the default value is provided as part of the default constraint. So you can provide a value, you can still override that default value. So just because you have a default constraint does not mean that you can’t have a value insert into that column. It just means that if you don’t insert a value, the default value is actually used, I think, we can also utilize what we call an identity property that provides similar behavior. Although for an identity column, you’re not actually supposed to override those values. So that’s a little bit different. And identity columns are a little bit special. And again, right we will be talking more about what the identity column does in a future lecture. But I’ll showcase the basics here for this demo. But for now, let’s take a look at what this looks like in code. So let’s go ahead and re create our table here.

And so similar kind of thing. We had a school here. So our same school table that we had before, we do have our foreign keys still. So demo conference. So that still exists. I haven’t taken that on out. But I’ve added two new constraints here, right, two new constraints. So I’ve added an identity constraint. So identity column is going to actually auto increment that column. So every record that gets inserted, it will be one more than the previous record that was inserted. So in this case, I am telling it to start counting at one and counting up. Sorry, I’m telling it to start at one and then count up by one each time. So increment each value up by one. The other new constraint that added here is the default constraint down here. This is the identity. And this is the default constraint. In this case, I’m using assist date, time offset. So what this is going to do is it’s going to get the current time on the SQL Server instance, when this data or when this record is actually inserted. And that’s the value that will be defaulted there, created or updated on and that sort of thing is a very useful or very useful columns to provide some extra information in terms of debugging, and tracking, when records are actually inserted into your database. So notice down here, for my insert statement here, I don’t have school ID or created on as part of my insert those records are left out, because school ID is taken care of for us by the identity column.

So that will count up by one, and then my created on and I don’t need to create, I don’t need to add that manually, right? It takes the timestamp of when that record is inserted and adds that timestamp to our comm by default. So if we scroll down here, we can see our results. Notice that my school ID is auto incrementing. So k state was the first record u and l was the last one that I added and incremented by one each time I actually inserted it for that column. And then we also have my created on column where it did a timestamp when these records were actually inserted now looks like it appears like it is all at the same time. Let’s just because all these records were able to be inserted instantaneously because there’s not very many. But that does change as we add more information. So for example, if I add a new record here. So notice your an O is missing again. So if I add you and oh, I can actually provide a timestamp if I wanted to. So let’s do you just like to 21 run that. Notice that here is my 110101 University of Nebraska, Omaha, my dignity column is there. And then we can also leave that off if we wanted to as well just like what we did up here. Now, you’re not supposed to do this, where we where we actually put a value in for an identity column. So you cannot actually explicitly insert value for an identity column. But you can override that behavior.

So in SQL Server, there is a a setting for your actual database itself, where you can turn this protective feature off. If you wanted to, I do not recommend this. That’s the whole point of an identity column. If you find yourself needing to override the value and identity column, this probably means that you’re using identity column in a wrong spot. But rather than using an identity column, what we can actually do is use a sequence object. And again, we can cover this more in detail in future. But a sequence object is very similar to an identity column and the sense that The sequence object is going to auto increment, it’s going to be a default value. But instead, we’re actually having more control over that value. And so this is my sequence object up here. So create sequence, demo dot school ID, and then as int, so that’s the data type. And then this is the value that it starts at. So min value is two increments by two every time and then we have a no cycle.

So our sequence doesn’t wrap around itself. So you can have a pattern that way too, if you want your sequence to wrap, which is something that the identity column cannot do. And so to use this as part of our table, we use a default constraint. And then inside of that, we’re going to say next value four, and then our sequence object, that would be the syntax here. And so now if we run this, you can kind of notice down here, school ID now starts at zero and goes up by two every time because that is what our sequence object up here starts. But if I wanted to make it just like what I had up above, and my identity column, I can say one and one year. Aha, there we go. Now 12345678. But just two different ways to do the exact same or very similar idea very similar concept. Again, though, this can also be done programmatically in an application that’s utilizing this database as well. But that is going to conclude our introduction into database constraints for our tables. Next time we’ll take a deeper dive into some SQL code, and look how all of this works in action.