Database Design Principles

Video Transcription

Welcome back everyone. And in this video, we’re going to be starting our discussion on what makes a database a good database. We’ve talked a lot about so far how we might make a SQL query to retrieve data. And later, we’ll also talk about SQL queries that we can use to actually insert and modify and delete data as well. But what makes a database design a good database. So there’s a couple primary things that we’re after as part of a good design. And we’ve talked about some of these items, when we talked about the motivation for why we need a database over something like an Excel sheet, for example. But data integrity is one of the big ones, right? So the data that we actually store is, is sent, received and stored in the same way that the users are actually translating it as so you know, if we need to store some number, or string or whatever it may be inside of our database, that same number that is transmitted is stored. And then when we retrieve that data, that same data is actually retrieved as well, it didn’t actually change or get corrupted or, or anything like that. But the big point behind data integrity is that we’re trying to avoid unintentional changes of our data, whether that be in transmission, writing, or retrieving our information that we’re actually working with. But that also also kind of helps us with maintaining our data as well. So this includes anything, any data that gets updated, deleted, all those sorts of things, and also the maintenance of our code. A poorly designed database is going to typically impede our ability to maintain a consistent state of information, and especially maintaining that consistent state across a larger period of time. poorly designed databases often include multiple copies of the same information, and summon, some copies get updated, some don’t, some get deleted, some don’t. And a variety of other things in between. And I’ll show some examples of this here in just a little bit.

But that also helps us write better code as well, allowing us to maintain good quality SQL code that we write as part of our database alongside application code as well that utilizes the results from that database. This also helps translate usually to better performance, a well designed database will typically take up less storage, especially with a database that doesn’t store duplicate information. Of course, less duplicate information means that we’re having to store the same information only once, rather than multiple times. So that usually translates to better performance as far as storage have better efficiency for storage. And typically, better performance on SQL queries. Although sometimes, we do sacrifice some performance to improve the footprint or storage of our database. But again, it really depends on the use case. Here, we talked about good design that provides data integrity, maintenance of that data, as well as better code as a result for this class. Overall, we’re going to talk about a couple primary, a couple primary design principles that we want to follow. And these are of course, covered at a very high level. And we’ll have some more videos that cover these in a little bit more detail. But these are some of the things that you’ll want to strive for as you’re designing your databases. But the first one here is to avoid unnecessary complexity. Sometimes when we’re working with data and information, it can be a big pitfall to overcomplicate things. So sometimes we make extra tables when we don’t need extra tables, adding more columns, and we don’t need more columns, and a variety of things in between. But unnecessary complexity, of course, also increase in our database also increases the complexity of our SQL that we have to write in turn complicating our code that we have to write, which makes your your program your programming your SQL code, and the program that uses it more susceptible to bugs and issues down the road. This also can affect our data integrity as a result.

So unnecessary complexity is a kind of a difficult one to avoid because it’s not always apparent that you’re going down a road that is more complex than others. But this does come with practice and we’ll see some examples of this as we start designing more databases, our other general design principle that we’re going to focus on here is avoiding redundancy. Most of the time, this just equates to not storing the same data more than once. So this is going to be kind of our mantra as we start going through writing our databases. So I’ll show I’ll, I’ll show this in my upcoming example here. But this is also something that you should consider following when you work on your final projects and go out into the real world and start designing databases as well. This is probably the most common issue that I actually see in database designs, not only in industry, but especially for new database engineers, or designers that come out. So typically, it is easier up front for someone to store the same data in in one column and store that exact same data in another column in another table. But that duplication of data can, again, as we talked about a little bit ago, influence our data integrity, right, the data consistency gets lost, because if one record is updated, but the other is not, then we have an inconsistent state of data of the same kind of record that we’re actually storing. So one of the things that we’re going to really try to focus on here is trying to avoid storing the same information more than once. On occasion, we can’t avoid this, but on more, more often than not, we can avoid this kind of issue. But those are two primary design principles that we’re going to be focusing on as part of this course. And here in the next video, we’ll take a look at an example of the beginning of designing a database