Character Types

Video Transcription

Welcome back everyone. In this video, we’re going to be exploring a little bit more into different data types in SQL. So this first bit is going to be mostly on just text. So text comes in a variety of a couple different forms and databases, little bit more so in SQL Server, but we have a fixed length type, so char, and then in char, so in char being Unicode type, and we’ll talk about that here in just a minute. But this allows you to specify a column to be a fixed number of characters. So like, for example, a state is only two characters, long i k s, because this will help us reduce for one control the data that’s being inputted into our table, which will become a little bit more important as we talk about normalization and things like that. But it also helps with storage space as well, for anything that is, say you had a fixed character of size 10. But the value that you stored only had five characters, what the database system will actually do is pad the rest of that with spaces to consume that specific number of characters. But anything larger than that will not be able to be inserted.

But we can also have variable length data types. So our variable variable length strings, like var char and N var char. So in this sense, we can specify again, the number of characters that we expect, but nothing is actually padded. And so the storage is fit as not fixed anymore, right. So it adjusts to how many characters are actually in that given string. Now, this because this starts to showcase just a slightly different slight difference with how with how SQL Server manages strings versus other database management systems. So by default, now, if you are working with SQL Server 2019, or newer, which is what we’re working with, it supports UTF, eight encoded strings. But if you have anything that is not able to be encoded in UTF, in the specification, UTF eight, then you will have to use in char or n bar char.

So char and bar char versus in char and in bar char has a little bit more restriction as far as what what strings can be stored. So char and var char do not support all Unicode characters, and shower and Invar char will. So this becomes more important when you have user inputs. So if you have like say a form on a website, and not text then gets inputted into your database in some way. Your char and var char won’t necessarily be guaranteed to be able to consume and store those characters if they’re in different encoding. Specifically, if you have different languages and things like that different like English like English versus Spanish versus Italian versus Japanese and things like that, because Japanese characters are different Unicode values than English characters. But if you have a program and in between that is going to filter and filter and verify the the encoding of the text before it goes into your database.

Char and var char I would actually be the preferred data type to use because chart and bar char uses one byte per character to store verses in char and Invar char which uses two bytes per character to store. So char and var char are a little bit cheaper to store in your database versus an in char in bar char. So just like your programming languages, we have a variety of string functions that we can apply here. So both for ASCII and ASCII and characters, so char index substring, left and right I’ll show some examples of what this looks like here in just a few minutes. But substring char index, upper lower, very equivalent to most of the operations that you see in Python and Java. We can trim the whitespace off from the left side or the right side of the string. So left trim and right trim. Very similar to replace replicate and stuff. concat to bind strings together. So we’ll use concat. Quite, quite a lot. But let’s take a look at a few examples of these. And I’m not going to show examples of every single one of these functions. And again, I will link to some of the standard documentation for these. So you can explore and play with these a little bit more.

But let’s, let’s showcase a couple of these, to see them in action. Alright, so let’s take a look at this example here. So this is a big piece of this is a big piece of SQL, I have two SELECT statements here. This first one is equivalent or not equivalent to this first one is going to showcase concat. So concat is going to be equivalent to the plus operator, just like what you would expect in most. So if I wanted to add a comma to the country name, so country name, comma continent, and then this down here is the same thing. But using the concat. So concat, this with that, and that. So both of those are equivalent. So let’s highlight that and execute that code. And so you can see the country continent and country continent to being the exact same value. So the the string function can cat is equivalent to using the plus operator. But, but the difference here is that concat will actually replace null values with an empty value.

So that’s a little bit different functionality versus what you would expect with the plus operator. So concat would be the preferred way of joining strings inside of your SQL statements. We can also also wanted to showcase substring, which is another common one that you’ll be using. So we can showcase this so sub i can get parts of a string. So this is equivalent to the slice operation in Python, and the substring operation in Java. But we can pull out individual characters or range of characters from a given text or string. But again, I’m not going to take time to record a video to cover each and every single one of those, each and every single one of these string functions. Just because the video would be way too long and a little bit dry. So if you are interested in learning more about each of these, you can explore the documentation and I would encourage you to do so because some of these like upper lower, we might see and might use in a later exercise, homework or project