In this module, we’re going to discuss a lot of topics related to cybersecurity. Cybersecurity is another important research area in computer science, and it’s one that directly impacts a lot of computer users in their daily lives. So in cybersecurity, we really are asking ourselves one big question: how do we keep our data secure? That’s really all it comes down to. We’re trying to secure data both on our computers and as we transmit it across the Internet and any other communication technologies we might be using. And so we’re going to talk about some different ways that we can keep our data secure on our computers.
Before we get into that, a word of warning: I’m encouraging each of you to put your white hat on when we talk about this. So in computer science, we talk about different types of hackers. Typically, you have the black hat hackers, which are the ones that hack maliciously. But you can also have white hat hackers, which are hackers that use their skills for benevolent ends, to help companies find security holes in their infrastructure and hopefully patch those holes and become a little bit safer. And so some of the things we’re going to talk about today, if used maliciously, could be very illegal. They could be felonies, and they’re very, very dangerous things for you to use maliciously. But as a computer scientist, it’s important for you to understand those topics so that you can defend against them and know what they are in case they get used against you. And so I’m encouraging us all to put our white hats on at this point, and come at this topic from the view of doing this for the good of everybody else and trying to help them secure and protect their data. Okay, let’s get started.
First, we need to talk about authentication. Authentication is a very, very important part of anything in cybersecurity. Authentication mainly deals with one thing: determining if the person is who they say they are. So when you sit down at a computer and type in your password, that is a form of authentication. You’re letting your computer authenticate the fact that you are who you say you are. Now, typically, authentication draws on three different kinds of factors. There are ownership factors, which are something the user has. For example, an ownership factor could be a physical key to a building, it could be a USB drive that has a token on it, or it could be some other symbol or device that the user has to authenticate that they are who they say they are. A police badge, for example, is a form of authentication that authenticates who that person is. The second kind is knowledge factors. A knowledge factor is something the user knows that authenticates them. Typically, we think of this as passwords and PINs or anything else the user has memorized. But it could also be other facts such as birth dates, mothers’ maiden names, and Social Security numbers, things that the user would know very quickly. Those are what make up a knowledge factor. And for most computer systems, we use knowledge factors as the primary form of authentication. The third kind is an inherent factor. An inherent factor is something the user inherently is. Inherent factors would be things such as a retinal scan, a fingerprint scan, or a DNA test, something that the user really can’t change about themselves.
And so to authenticate a user in a computer system, we typically use at least one of these factors. There is also something called multi-factor authentication or two-factor authentication, which you’ve probably come across, especially if you do online banking or play certain video games. And that is pretty much exactly what you think it is: two different factors of authentication combined to provide greater security. Typically, they combine something the user knows, such as a password or a PIN, with something the user has, such as a credit card or, in a lot of cases, access to a mobile phone or an email account. That is something the user has or possesses as the second factor of authentication. So think about some places that you run into two-factor authentication: video games, online banking. At K-State, a lot of faculty and staff now use two-factor authentication whenever they authenticate at K-State. But a big question to ask yourself is, would you as a student like to have two-factor authentication on your K-State account? Why or why not? Do you think it’s worth the extra security to really protect all of your important academic records? Or is it just an extra hassle if you have to get your phone out every single time you want to log in? Currently, for faculty and staff, we log in with our phones, but then we can have it remember our device for up to 10 days before we have to do that two-factor authentication again. So it’s a little bit of extra hassle, but it’s not super inconvenient if we’re using the same computer day after day. But I could see that if you’re using lab computers and have to authenticate every single time, that might be a little extra hassle for you. So it’s something to think about.
So in this lecture, we’re going to focus on one of the most common authentication factors, which is the use of a password. A password is a very traditional system used to authenticate users on computer systems, on websites, just about anywhere. But I think there are a lot of misconceptions about how to make secure passwords and how secure passwords really are. And so we’re going to rely on some information to really look at passwords, how they can be made more secure, and some of the ways that they may be less secure than we thought. So this is a comic from XKCD. It’s one of the great comics that he does. And here, he’s talking about how we make a particular password. We’re going to come back to this comic. But here he shows a pretty common kind of password. We start with an uncommon, but non-gibberish, word. Then we add a number and a punctuation mark, because on almost every website you need to have at least a number and a punctuation mark. And we do some common substitutions from leet speak, so zeros for O’s and fours for A’s. Usually we have caps, and 99% of the time, if you’re going to capitalize a letter, it’s going to be the first letter of your password. So what we have here is a password that has a few different bits of entropy. And in fact, if you calculate it out, there are just about 16 bits of entropy in this first box, the word itself. That means there are roughly 65,536 different words you could have started with. Then you add some numbers, punctuation, substitutions and things on top of that, based on these rules, and each of those adds a few more bits. And so 16 bits of entropy just for the word, that sounds pretty powerful.
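To make the link between "bits of entropy" and "number of possibilities" concrete, here is a minimal Python sketch. The function names are made up for this illustration, and it assumes every possibility is equally likely, which is the simplification the comic relies on as well.

```python
import math


def combinations_from_bits(bits: int) -> int:
    """Number of equally likely possibilities described by `bits` bits of entropy."""
    return 2 ** bits


def bits_from_combinations(count: int) -> float:
    """Bits of entropy needed to describe `count` equally likely possibilities."""
    return math.log2(count)


print(combinations_from_bits(16))     # 65536
print(bits_from_combinations(65536))  # 16.0
```

Each extra bit doubles the number of possibilities, which is why a handful of additional bits can matter a lot.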
But how would we go about cracking this password? What would that look like? So let’s look at some different ways you could possibly crack this password. Obviously, the first thing you could do is try brute force. You start with aaaaaa, that didn’t work; aaaaab, that didn’t work; and so on. Brute force hacking does work in certain scenarios, especially for things like combination locks. If you’ve ever done an escape room, one thing you might realize is they give you those combination locks with four dials on them. And hopefully, you’re smart enough to realize that if you only get three of the four dials, and you can’t quite figure out the fourth one, you could brute force it in about 10 seconds. So you really don’t have to get all of the dials, you can just brute force a little bit. And so with combination locks and things, sometimes it’s very, very easy to brute force them. And in fact, with simple passwords, like on old websites that required your password to be eight characters or less, you could actually brute force a password very quickly. For example, here is a six-character password. There are about 308 million different six-character, all-lowercase passwords. It seems like a lot, but suppose we try about 1,000 a second, which is pretty common. I mean, even on a bad website, you could try 1,000 passwords a second. At that rate it would only take us about three and a half days to try every one of them and crack that password. And in the grand scheme of things, three and a half days is not that long.
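The numbers above can be checked with a short Python sketch. The helper names and the `check` callback are hypothetical stand-ins for "submit this guess to a system you are authorized to test", and the 1,000 guesses per second rate is the one assumed in the lecture.

```python
import itertools
import string

ALPHABET = string.ascii_lowercase  # the 26 lowercase letters


def keyspace(length, alphabet=ALPHABET):
    """How many different passwords of this length exist over the alphabet."""
    return len(alphabet) ** length


def days_to_exhaust(candidates, guesses_per_second=1000):
    """Worst-case time, in days, to try every candidate at the given rate."""
    return candidates / guesses_per_second / 86400


def brute_force(check, length, alphabet=ALPHABET):
    """Try every string of the given length in order (aaa, aab, ...).

    `check` is any function that returns True for the correct guess.
    Returns the first accepted guess, or None if nothing matched.
    """
    for letters in itertools.product(alphabet, repeat=length):
        guess = "".join(letters)
        if check(guess):
            return guess
    return None


print(keyspace(6))                             # 308915776
print(round(days_to_exhaust(keyspace(6)), 1))  # 3.6
print(brute_force(lambda g: g == "cat", 3))    # cat
```

The three-letter demo finishes instantly; the same loop over six letters is the 308 million case, and every extra character multiplies the work by 26.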
So let’s look at another way. How about things like common passwords? For example, this slide shows the 25 most common passwords used on the internet, according to some research reported by Gizmodo a few years ago. And looking at this list, it is pretty disappointing. You’ll see passwords such as 123456, or 123456789. But you’ll also see things like sunshine, qwerty, iloveyou, admin, abc123, and certain profanities. And so these passwords are really not all that great. In fact, it’s really easy to go online and find lists of the most common passwords. Another thing that we can look at is what’s called rainbow tables. A rainbow table is a password lookup table in which someone has precalculated the protected versions of all of these passwords. For example, on older versions of Windows, when you set your Windows password, it would actually be stored in the Windows registry using a hash. A hash, if you remember from our previous module, is an algorithm that takes a piece of text and converts it to a number using a one-way algorithm. And so the theory is that if you type in a password and run it through the same hash algorithm, and you get the same output, you know they put in the right password. But of course, what you could do is put in all possible passwords, store all of the possible hashes, and create a table that matches them up. And so that’s what a rainbow table is: it basically creates a rainbow of all the different possible password combinations and the hashes that those create. So if you have an older Windows computer and can get the password hash out of the registry, you can go online to websites that have rainbow tables and just put in that hash, and they will look up the password for you, or at least a password that creates that hash.
And so for a lot of weak or older algorithms, such as the early Windows algorithm or MD5, rainbow tables have already been created, and you could just go out and look up a password based on its hash.
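The lookup idea described above can be sketched in a few lines of Python. This is a plain precomputed hash-to-password dictionary, which matches the description in the lecture; real rainbow tables use a more compact chained structure to save space, which this sketch does not attempt. The password list is a small sample taken from the ones mentioned on the slide, MD5 is used only because it is named above as a weak algorithm, and the function names are made up for illustration.

```python
import hashlib

# A few of the common passwords mentioned above (sample only).
COMMON_PASSWORDS = [
    "123456", "123456789", "sunshine", "qwerty",
    "iloveyou", "admin", "abc123",
]


def md5_hex(password):
    """Hash a password with MD5 and return the hex digest."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def build_lookup_table(passwords):
    """Precompute hash -> password for every candidate password."""
    return {md5_hex(p): p for p in passwords}


def crack(hash_hex, table):
    """Return a password that produces this hash, or None if it is not in the table."""
    return table.get(hash_hex)


table = build_lookup_table(COMMON_PASSWORDS)
stolen_hash = md5_hex("sunshine")   # pretend this came from a leaked database
print(crack(stolen_hash, table))    # sunshine
print(crack(md5_hex("correct horse battery staple"), table))  # None
```

All of the expensive work happens once, when the table is built; after that, every lookup is a single dictionary access, which is why unsalted hashes of common passwords offer so little protection.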
So between brute forcing, common passwords, and rainbow tables, there are a lot of different ways that you can crack really easy passwords. So let’s go back and look at that password example we saw earlier and talk about entropy. He calculates that there would be about 28 bits of entropy in a common password like that one, and 2 to the 28th guesses works out to about three days at 1,000 guesses a second. It’s really similar to what we saw with brute forcing. Even though it’s a much longer, more complex password, it’s actually pretty easy to break a password like that. Now, here’s the hard part. Can you remember what that password was on that slide a few minutes ago? Don’t look, don’t rewind the video and look, but see if you can write down that password that we saw earlier. Did you get it? Now you can go look and see if you got it. And so it turns out that we’re creating passwords that are really easy to crack if we understand the structure of the password, but very hard for us to remember. It’s troubadour with an & and a three, but I don’t remember exactly what order, so it’s hard to remember.
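The three-day figure for 28 bits can be checked with a quick calculation. This is a minimal sketch; the function name is made up, and the 1,000 guesses per second rate is the assumption used in the lecture.

```python
SECONDS_PER_DAY = 86400


def days_to_try_all(bits, guesses_per_second=1000):
    """Days needed to try every possibility for a password with `bits` bits of entropy."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_DAY


print(2 ** 28)                        # 268435456
print(round(days_to_try_all(28), 1))  # 3.1
```

On average an attacker would hit the right password after trying about half of the possibilities, so the expected time is closer to a day and a half.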
And so what he’s arguing here is that we’re creating passwords that are basically easy to crack and hard to remember, when what we really should focus on is creating passwords that are hard to crack but easy to remember. How do you suppose we would do that? It turns out that to make stronger passwords, there is exactly one rule that you need to follow, and that is: make them longer. That’s it. No special characters, no capitalization, punctuation, lowercase, uppercase, numbers, symbols, foreign words. None of that matters. The only thing that matters to make your password more secure is making it longer. So here we have four random common words. You start with a list of about 2,000 common words in the English language, which means you have about 11 bits of entropy per word. And you pick four of them at random: correct horse battery staple. Purple monkey umbrella dishwasher. Very, very simple. All you have to remember is those four words. I can even go out there and say, here’s my list of words, and my password is four words from it separated by spaces. And that right there would have 44 bits of entropy. So even if I gave you the list of words and told you exactly how my password was set up, it could take you about 550 years to try all possible combinations of that password at 1,000 guesses a second. That is much, much harder to do. But it’s very easy to remember. You probably already remember that password, correct horse battery staple.
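Here is a minimal Python sketch of the four-random-words idea, together with the entropy and time arithmetic. It assumes a 2,048-word list, the size that gives exactly 11 bits per word and therefore 44 bits for four words. The `WORDS` list is a placeholder of made-up entries and would be replaced by a real list of common words; the function names are also just for illustration. Words are picked with Python’s `secrets` module, since the argument only holds if the words are chosen at random.

```python
import math
import secrets


def make_passphrase(wordlist, num_words=4):
    """Pick `num_words` words uniformly at random and join them with spaces."""
    return " ".join(secrets.choice(wordlist) for _ in range(num_words))


def passphrase_bits(wordlist_size, num_words=4):
    """Entropy in bits when each word is chosen uniformly from the list."""
    return num_words * math.log2(wordlist_size)


def years_to_try_all(bits, guesses_per_second=1000):
    """Years needed to try every possibility at the given guess rate."""
    return 2 ** bits / guesses_per_second / (365.25 * 86400)


# Placeholder list of 2,048 made-up entries; swap in real common words.
WORDS = [f"word{i}" for i in range(2048)]

print(make_passphrase(WORDS))       # four random entries, different on each run
print(passphrase_bits(2048))        # 44.0
print(round(years_to_try_all(44)))  # 557
```

The attacker is assumed to know both the word list and the four-word pattern; the strength comes entirely from the random choice, not from keeping the scheme secret.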
And so that’s the whole idea behind making secure passwords. We see a lot of websites today that tell you that you have to have a number and a symbol and special characters and whatnot. It doesn’t matter. The only thing they should do is set a minimum password length of 20 or 30 characters and just tell you to make a long password. That right there will make your password more secure, with a big asterisk on it. Understand that when we talk about security here, we’re only talking about security against cracking the password using some sort of brute force method or some sort of dictionary-based method. We are not saying that the password is secure against all attacks. For example, take correct horse battery staple. If you write that down on a Post-it note and stick it under your keyboard, it would only take somebody about two seconds to read that password off of the Post-it note and remember it instantly. There’s nothing special about it. Just because it’s easy for you to remember doesn’t mean it wouldn’t also be easy for someone else to remember. Likewise, if you don’t pick four random words, if you pick four words like the names of your four grandparents or something like that, that could be much easier for people to crack. And so while it’s practically uncrackable from a computer standpoint, there are other parts of cybersecurity that we’ll get into a little bit later that make this password maybe less secure than what you want.