CC 310 Textbook
This is the homepage
This page is the main page for Review Control Flow, I/O and Exceptions
^[https://pxhere.com/en/photo/1172040]
Programming is the act of writing source code for a computer program in such a way that a modern computer can understand and perform the steps described in the code. There are many different programming languages that can be used, such as high-level languages like Java and Python.
To run code written in those languages, we can use a compiler to convert the code to a low-level language that can be directly executed by the computer, or we can use an interpreter to read the code and perform the requested operations on the computer.
At this point, we have most likely written some programs already. This chapter will review the important aspects of our chosen programming language, giving us a solid basis to build upon. Hopefully most of this will be review, but there may be a few new terms or concepts introduced here as well.
In this course, we will primarily be learning different ways to store and manipulate data in our programs. Of course, we could do this using the source code of our chosen programming language, but in many cases that would defeat the purpose of learning how to do it ourselves!
Instead, we will use several different ways to represent the steps required to build our programs. Let’s review a couple of them now.
One of the simplest ways to describe a computer program is to simply write what it does using our preferred language, such as English. Of course, natural language can be very ambiguous, so we must be careful to make our written descriptions as precise as possible. So, it is a good idea to limit ourselves to simple, clear sentences that aren’t written as prose. It may seem a bit boring, but this is the best way to make sure our intent is completely understood.
A great example is a recipe for baking. Each step is written clearly and concisely, with enough descriptive words used to allow anyone to read and follow the directions.
One method of representing computer programs is through the use of flowcharts. A flowchart consists of graphical blocks representing individual operations to be performed, connected with arrows which describe the flow of the program. The image above gives the basic building blocks of the flowcharts that will be used in this course. We will mostly follow the flowchart design used by the Flowgorithm program available online. The following pages in this chapter will introduce and discuss each block in detail.
We can also express our computer programs through the use of pseudocode. Pseudocode is an abstract language that resembles a high-level programming language, but it is written in such a way that it can be easily understood by any programmer who is familiar with any one of several common languages. The pseudocode may not be directly executable as written, but it should contain enough detail to be easily understood and adapted to an actual programming language by a skilled programmer.
There are many standards that exist for pseudocode, each with its own unique features and uses. In this course, we will mostly follow the standards from the International Baccalaureate Organization. In the following pages in this chapter, we’ll also introduce pseudocode for each of the flowchart blocks shown above.
This page is the main page for Java
Let’s discuss some of the basic concepts we need to understand about the Java programming language.
To begin, let’s look at a simple Hello World program written in Java:
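A minimal sketch of such a program is shown below. The class name `HelloWorld` comes from the text; the printed message is an assumption, and any message would work just as well.

```java
// HelloWorld.java - a minimal sketch; the printed message is an assumption
class HelloWorld {
    public static void main(String[] args) {
        // Print a single line of text to the terminal
        System.out.println("Hello World");
    }
}
```
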
This program contains multiple important parts:
1. The program is defined in a class named `HelloWorld`, and it will be stored in a file called `HelloWorld.java`.
2. The contents of the class are surrounded by curly braces `{}`.
3. The class contains a `main()` method. The method declaration of this main method should exactly match what is shown here. We’ll discuss these keywords in more detail in a later chapter.
4. The code to be executed when the `main()` method is called should directly follow the method declaration. As before, the contents of the method are surrounded by curly braces `{}`.
5. Each statement ends with a semicolon `;`.
Of course, this is a very brief overview of the Java programming language. To learn more, feel free to refer to the references listed below, as well as the textbook content for previous courses.
See if you can use the code above to write your own Hello World program in the `HelloWorld.java` file that is open to the left. We’ll learn how to compile and run that program on the next page.
Now that we’ve written our first Java program, we must compile and run the program to see the fruits of our labors. There are many different ways to do this using the Codio platform. We’ll discuss each of them in detail here.
Codio includes a built-in Linux terminal, which allows us to perform actions directly on a command-line interface just like we would on an actual computer running Linux. We can access the Terminal in many ways:
Additionally, some pages may already open a terminal window for us in the left-hand pane, as this page so helpfully does. As we can see, we’re never very far away from a terminal.
No worries! We’ll give you everything you need to know to compile and run your Java programs in this course.
If you’d like to learn a bit more about the Linux terminal and some of the basic commands, feel free to check out this great video on YouTube:
Let’s go to the terminal window and navigate to our program. When we first open the Terminal window, it should show us a prompt that looks somewhat like this one:
There is quite a bit of information there, but we’re interested in the last little bit of the last line, where it says `~/workspace`. That is the current directory, or folder, our terminal is looking at, also known as our working directory. We can always find the full location of our working directory by typing the `pwd` command, short for “Print Working Directory,” in the terminal. Let’s try it now!
Enter `pwd` in the terminal, and we should see output showing that the full path to our working directory is `/home/codio/workspace`. This is the default location for all of our content in Codio, and it’s where everything shown in the file tree to the far left is stored. When working in Codio, we’ll always want to store our work in this directory.
Next, let’s use the `ls` command, short for “LiSt,” to see a list of all of the items in that directory.
We should see a whole list of items appear in the terminal. Most of them are directories containing examples for the chapters of this textbook, including the `HelloWorld.java` file that we edited on the last page. Thankfully, the directories are named in a very logical way, making it easy for us to find what we need. For example, to find the directory for Chapter 1 that contains examples for Java, look for the directory with the name starting with `1j`. In this case, it would be `1j-hello`.
Finally, we can use the `cd` command, short for “Change Directory,” to change the working directory. To change to the `1j-hello` directory, type `cd` into the terminal window, followed by the name of that directory, as in `cd 1j-hello`.
We are now in the `1j-hello` directory, as we can see by observing the `~/workspace/1j-hello` on the current line in the terminal. Finally, we can run the `ls` command again to see the files in that directory.
We should see our `HelloWorld.java` file! If it doesn’t appear, try using this command to get to the correct directory: `cd /home/codio/workspace/1j-hello`.
Once we’re at the point where we can see the `HelloWorld.java` file, we can move on to actually compiling and running the program.
To compile a Java program in the terminal, we’ll use the `javac` command, short for Java Compiler, followed by the name of the Java file we’d like to compile. So, in our case, we’ll enter `javac HelloWorld.java`.
If it works correctly, we shouldn’t get any additional output. The compiler will look through our Java file and create a new file containing the Java bytecode for our program, called `HelloWorld.class`. We can use the `ls` command to see it.
If the `javac` command gives you any output, or doesn’t create a `HelloWorld.class` file, that most likely means that your code has an error in it. Go back to the previous page and double-check that the contents of `HelloWorld.java` exactly match what is shown at the bottom of the page. You can also read the error message output by `javac` to determine what might be going wrong in your file.
We’ll cover information about simple debugging steps on the next page as well. If you get stuck, now is a great time to go to Piazza and ask for assistance. You aren’t in this alone!
Finally, we can now run our program! Once it is compiled, just type `java HelloWorld` in the terminal to run it. Notice that the `java` command takes the name of the class, without the `.class` extension.
That’s all there is to it! We’ve now successfully compiled and run our first Java program. Of course, we can run the program as many times as we want by repeating the previous `java` command. If we make changes to the `HelloWorld.java` file, we’ll need to recompile it using the previous `javac` command first. Then, if those changes instruct the computer to do something different, we should see those changes when we run the program after compiling it.
See if you can change the `HelloWorld.java` file to print out a different message. Once you’ve changed it, use the `javac` and `java` commands to compile and run the updated program. Make sure you see the correct output!
In many of the Codio projects and tutorials in this course, the Run Menu will be populated with helpful commands. The Run Menu can be found at the top of the screen, right here:
Each Codio project or tutorial may have different items in this menu, since they can be configured by the author of the project. For this book, there will always be the following options:
To use these commands, we must simply open up the file we’d like to use, then select the appropriate option from the Run Menu. It will automatically use the currently open file in the command.
So, to compile and run our file, we must simply open `HelloWorld.java` in the panel to the left, then click the arrow in the Run Menu and first select Java - Compile File. It should open up a Terminal tab and show output similar to the following:
It looks very similar to the command we entered manually. The only difference is that it uses the folder name along with the filename in the command, which ensures that it gets the correct file without even opening that directory.
Once we’ve compiled the file, we can go back to that tab and select the Java - Run File option. It should show output similar to this:
Again, it looks very similar to the commands we performed earlier. Since the Java bytecode file is in a directory, we have to use a `-classpath` option to let Java know where to find the file.
Make another change to the `HelloWorld.java` file, and then see if you can use the options in the Run Menu to compile and run it. Make sure you see the correct output!
Last, but not least, many of the Codio tutorials and projects in this program will include assessments that we must solve by writing code. Codio can then automatically run the program and check for specific things, such as the correct output, in order to give us a grade. For most of these questions, we’ll be able to make changes to our code as many times as we’d like to get the correct answer. Try the example below!
{Check It!|assessment}(code-output-compare-146573703)
As we can see, there are many different ways to compile and run our code using Codio. Feel free to use any of these methods throughout this course.
Codio also includes an integrated debugger, which is very helpful when we want to determine if there is an error in our code. We can also use the debugger to see what values are stored in each variable at any point in our program.
To use the debugger, find the Debug Menu at the top of the Codio window. It is to the right of the Run Menu we’ve already been using. On that menu, we should see an option for Java - Debug File. Select that option to run our program in the Codio debugger.
As we build more complex programs in this course, we’ll be able to configure our own debugger configurations that allow us to test multiple files and operations.
The Codio debugger only works with input from a file, not from the terminal. So, to use the debugger, we’ll need to make sure the input we’d like to test is stored in a file, such as [input.txt](open_file 1j-hello/input.txt), before debugging. We can then give that file as an argument to our program in our debugger configuration, and write our program to read input from a file if one is provided as an argument.
Learning how to use a debugger is a hands-on process, and is probably best described in a video. So, here are a couple of videos that should help us get up to speed on working in the Codio debugger.
Computational Core - Java Debugging Tutorial
Codio Documentation - Debugging
We can always use the debugger to help us find problems in our code.
^[https://www.codio.com/blog/python-tutor-codio-visualizer]
Codio now includes support for Python Tutor, allowing us to visualize what is happening in our code. We can see that output in the second tab that is open to the left. It even works in Java!
Unfortunately, students are not able to open the visualizer directly, so it must be configured by an instructor in the Codio lesson. If you find a page in this textbook where you’d like to be able to visualize your code, please post in Piazza and let us know!
A variable in a programming language is an abstraction that stores one value at any instant of time, but this value can change as the program executes. A variable can be represented as a box holding a value. If the variable is a container, e.g., a list (or array or vector), a matrix, a tuple, or a set of values, each box in the container contains a single value.
A variable is characterized by:
- A name, such as `results`, `numberOfNodes`, or `numberOfEdges`. For writing variable names composed of two or more words in Java, we can use “CamelCase,” writing words without spaces and with the first letter of each word after the first one uppercase.
Depending on the programming language, we could also specify for a variable:
A programming language allows us to perform two basic operations with a variable:
Programming languages also provide several kinds of operators for working with the values stored in variables:
- Arithmetic operators, such as addition `+` and subtraction `-`. They allow performing basic arithmetic operations with numbers.
- Comparison operators, such as less than `<` and greater than `>`. Usually, they allow comparing two operands, each of which could be a variable. The result of the comparison is either the Boolean value `true` or the Boolean value `false`.
- Logical operators, such as and `&&`, or `||`, and not `!`. These operators allow us to relate logical conditions together to create more complex statements.
- String operators, such as using `+` to concatenate the string “Hello ” and the string “world” to produce the string “Hello world”. These operators allow us to manipulate strings.
- The assignment operator, as in `a = b`.
The table below lists the flowchart blocks used to represent variables, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Declare | (image) | X = 0 |
Assign | (image) | X = 5 |
Declare & Assign | (image) | X = 5 |
Notice that variables must be assigned a value when declared in pseudocode. By default, most programming languages automatically assign the value $0$ to a new integer variable, so we’ll use that value in our pseudocode as well.
Likewise, variables in a flowchart are given a type, whereas variables in pseudocode are not. Instead, the data type of those variables can be inferred by the values stored in them.
Variables in Java must be declared with a type and a name. Once declared, a variable can only store the type of data it was declared to store.
There are several primitive data types we can use in Java. The following table lists several of the numeric types:
Name | Type | Size | Range |
---|---|---|---|
Byte | byte | 8 bits | $-128$ to $127$ |
Short | short | 16 bits | $-32,768$ to $32,767$ |
Integer | int | 32 bits | $-2,147,483,648$ to $2,147,483,647$ |
Long | long | 64 bits | $-2^{63}$ to $2^{63} - 1$ |
Float | float | 32 bits | $ \pm 10^{\pm 38} $ |
Double | double | 64 bits | $ \pm 10^{\pm 308} $ |
In addition, there is the `boolean` type, which can store a single Boolean value, either `true` or `false`. Finally, there is the `char` primitive data type, which can store a single character of text.
To declare a variable, we can simply place the type of the variable before the name in our code:
We can then assign a value to that variable using an assignment statement:
We can even combine them into a single statement:
We can also convert, or cast, data between different types. When we do this, the results may vary a bit due to how computers store and calculate numbers. So, it is always best to fully test any code that casts data between data types to make sure it works as expected.
To cast, we can simply place the new type in parentheses before the value in a statement:
This will convert the floating point value stored in `x` to an integer value stored in `y`.
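The declaration, assignment, and cast statements described above can be sketched as follows. The class name `VariableDemo`, the helper `toInt`, and the specific values are assumptions chosen for illustration; only the names `x` and `y` come from the text.

```java
class VariableDemo {
    // Cast a floating point value to an integer; the fractional part is discarded
    static int toInt(double x) {
        int y = (int) x;
        return y;
    }

    public static void main(String[] args) {
        int count;          // declare a variable with a type and a name
        count = 5;          // assign a value to it
        double x = 1.75;    // declare and assign in a single statement
        int y = toInt(x);   // cast: the value is truncated, not rounded
        System.out.println(count + " " + x + " " + y);
    }
}
```

Note that casting a `double` to an `int` drops everything after the decimal point, which is one reason code that casts between types deserves careful testing.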
The conditional statement, also known as the If-Then statement, is used to control the program’s flow by checking the value of a Boolean statement and determining if a block of code should be executed based on that value. This is the simplest conditional instruction. If the condition is true, the block enclosed within the statement is executed. If it is false, then the code in the block is skipped.
A more advanced conditional statement, the If-Then-Else or If-Else statement, includes two blocks. The first block will be executed if the Boolean statement is true. If the Boolean statement is false, then the second block of code will be executed instead.
Simple conditions are obtained by means of the relational operators, such as `<`, `>`, and `==`, which allow you to compare two elements, such as two numbers, or a variable and a number, or two variables. Compound conditions are obtained by composing two or more simple conditions through the logical operators and `&&`, or `||`, and not `!`.
Recall that the Boolean logic operators and `&&`, or `||`, and not `!` can be used to construct more complex Boolean logic statements.
For example, consider the statement `x <= 5`. This could be broken down into two statements, combined by the or `||` operation: `x < 5 || x == 5`. The table below, called a truth table, gives the result of the or operation based on the values of the two operands:
Operand 1 | Operand 2 | Operand 1 or Operand 2 |
---|---|---|
False | False | False |
False | True | True |
True | False | True |
True | True | True |
As shown above, the result of the or operation is `true` if at least one of the operands is `true`.
Likewise, to express the mathematical condition `3 < a < 5` we can use the logical operator and `&&` by dividing the mathematical condition into two logical conditions: `a > 3 && a < 5`. The table below gives the result of the and operation based on the values of the two operands:
Operand 1 | Operand 2 | Operand 1 and Operand 2 |
---|---|---|
False | False | False |
False | True | False |
True | False | False |
True | True | True |
As shown above, the result of the and operation is `true` if both of the operands are `true`.
Finally, the not `!` logical operator is used to reverse, or invert, the value of a Boolean statement. For example, we can express the logical statement `x < 3` as `!(x >= 3)`, using the not operator to invert the value of the statement. The table below gives the result of the not operation based on the value of its operand:
Operand | not Operand |
---|---|
False | True |
True | False |
In propositional logic, the completeness theorem shows that all other logical operators can be obtained by appropriately combining the and, or and not operators. So, by just understanding these three operators, we can construct any other Boolean logic statement.
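The three equivalences used as examples in this section can be checked directly in Java. This is only a small sketch; the class and method names are hypothetical and were chosen for illustration.

```java
class BooleanDemo {
    // x <= 5 written as two simple conditions joined by or
    static boolean atMostFive(int x) {
        return x < 5 || x == 5;
    }

    // 3 < a < 5 written as two simple conditions joined by and
    static boolean betweenThreeAndFive(int a) {
        return a > 3 && a < 5;
    }

    // x < 3 written by inverting the opposite condition with not
    static boolean lessThanThree(int x) {
        return !(x >= 3);
    }

    public static void main(String[] args) {
        System.out.println(atMostFive(5));
        System.out.println(betweenThreeAndFive(4));
        System.out.println(lessThanThree(3));
    }
}
```
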
The table below lists the flowchart blocks used to represent conditional statements, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
If-Then | (image) | |
If-Then-Else | (image) | |
The mechanism for determining which block an If-Then-Else statement executes is the following:
To understand how a conditional statement works, let’s look at this example of a simple If-Then-Else statement. Consider the following flowchart:
In this case, if `a` is less than zero, the output message will be “The value of a is less than zero”. Otherwise, if `a` is not less than zero (that is, if `a` is greater than or equal to zero), the output message will be “The value of a is greater than or equal to zero”.
We can also nest conditional statements together, making more complex programs.
Consider the following flowchart:
In this case, if `a` is less than zero, the output message will be “The value of a is less than zero”. Otherwise (that is, if `a` is greater than or equal to zero), the block checks whether `a` is equal to zero; if so, the output message will be “The value of a is equal to zero”. Otherwise, both conditions are false: `a` is not less than zero and `a` is not equal to zero. Taken together, these are the same as the condition `a > 0`, so the output message will be “The value of a is greater than zero”.
To see how conditional statements look in Java, let’s recreate them from the flowcharts shown above.
As we can see in the examples above, we must use curly braces `{}` to separate each block of code. In addition, we typically indent the code inside of each block, making it easier to read and follow.
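As a sketch of how the two flowcharts described above might translate to Java, the methods below return the message each flowchart would output. The class and method names are assumptions; the messages are taken from the descriptions of the flowcharts.

```java
class ConditionalDemo {
    // If-Then-Else: two possible outcomes
    static String describeSimple(int a) {
        if (a < 0) {
            return "The value of a is less than zero";
        } else {
            return "The value of a is greater than or equal to zero";
        }
    }

    // Nested If-Then-Else: the second check only happens when a >= 0
    static String describeNested(int a) {
        if (a < 0) {
            return "The value of a is less than zero";
        } else {
            if (a == 0) {
                return "The value of a is equal to zero";
            } else {
                return "The value of a is greater than zero";
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(describeSimple(-4));
        System.out.println(describeNested(0));
        System.out.println(describeNested(7));
    }
}
```
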
Loops are another way we can control the flow of our program, this time by repeating steps based on a given criterion. A computer is able to repeat the same instructions many times. There are several ways to tell a computer to repeat a sequence of instructions:
- An infinite loop, such as `while true`. This construct is useful in software applications such as servers that will offer a service. The service is supposed to be available forever.
- A counted loop, such as `Repeat 10 times` or `for i = 1 to 10`. This loop can be used when you know the number of repetitions. There are also loops that allow you to repeat as many times as there are elements of a collection, such as `for each item in list`.
- A conditional loop, such as the `while` loop, which repeats while the condition is true.
In repeat while loops, the number of repetitions depends on the occurrence of a condition: the cycle repeats if the condition is true. Loops can also be nested, just like conditional statements.
The table below lists the flowchart blocks used to represent loop statements, as well as the corresponding pseudocode:
To see how loops look in Java, let’s recreate them from the flowcharts shown above.
As we can see in the examples above, we must use curly braces `{}` to separate each block of code. In addition, we typically indent the code inside of each block, making it easier to read and follow.
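A short sketch of a `while` loop and a `for` loop in Java follows. Both methods add up the integers from 1 to `n`; the class name, method names, and the task itself are assumptions made for illustration.

```java
class LoopDemo {
    // While loop: repeats as long as the condition is true
    static int sumWithWhile(int n) {
        int sum = 0;
        int i = 1;
        while (i <= n) {
            sum = sum + i;
            i = i + 1;
        }
        return sum;
    }

    // For loop: used when the number of repetitions is known
    static int sumWithFor(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum = sum + i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithWhile(10));
        System.out.println(sumWithFor(10));
    }
}
```
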
At this point, we’ve covered enough material to build a simple program. So, let’s see if we can complete the following example program before continuing.
Write a program that reads an integer from either the terminal, or a file if one is provided as a command-line argument. It should not worry about handling any exceptions encountered.
The program should compute and print the sum of all integers from 1 up to and including the integer provided as input, except those integers which are evenly divisible by 3. If the provided input is not a positive integer, the program should simply print 0.
Since we haven’t covered how to handle input yet, we can use the following skeleton code to help us build our program.
This code will create a `Scanner` variable called `reader`, and initialize it to either read from a file provided as a command-line argument, or from the terminal if an argument is not provided. It will then read a single integer from the input, storing it in the variable `x`.
To complete this exercise, we can continue to write this program where the `MORE CODE GOES HERE` comment is in the skeleton code.
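A skeleton matching that description might look like the sketch below. The class name `Example` is hypothetical, and the exact layout is an assumption; only the names `reader` and `x` and the `MORE CODE GOES HERE` comment come from the text.

```java
import java.io.File;
import java.util.Scanner;

class Example {
    // Exceptions are not handled here, so main declares that it may throw them
    public static void main(String[] args) throws Exception {
        Scanner reader;
        if (args.length > 0) {
            // A file name was given as the first command-line argument
            reader = new Scanner(new File(args[0]));
        } else {
            // No argument given, so read from the terminal
            reader = new Scanner(System.in);
        }

        int x = reader.nextInt();

        // MORE CODE GOES HERE

        reader.close();
    }
}
```
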
{Check It!|assessment}(code-output-compare-4282752777)
^[File:USPS Post office boxes 1.jpg. (2017, May 17). Wikimedia Commons, the free media repository. Retrieved 18:17, November 5, 2018 from https://commons.wikimedia.org/w/index.php?title=File:USPS_Post_office_boxes_1.jpg&oldid=244476438.]
Arrays allow us to store multiple values in the same variable, using an index to determine which value we wish to store or retrieve from the array. We can think of arrays like a set of post office boxes. Each one has the same physical address, the post office, but within the post office we can find an individual box based on its own box number.
Some programming languages, such as Java, use arrays that are statically sized when they are first created, and those arrays cannot be resized later. In addition, many languages that require variables to be declared with a type only allow a single variable type to be stored in an array.
Other languages, such as Python, use lists in place of arrays. Lists can be resized, and in dynamically typed languages such as Python they can store different data types within the same list.
The table below lists the flowchart blocks used to represent arrays, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Declare Array | (image) | |
Store Item | (image) | |
Retrieve Item | (image) | |
Let’s review the syntax for working with arrays in Java.
To declare an array in Java, we must give it a type and a name, with square brackets `[]` included after the type:
Once the array is declared, we can initialize it using the `new` keyword, followed by the type and then the size in square brackets `[]`:
Of course, we can combine these two statements into a single statement as well:
Finally, if we already know the values which we want to store in the array, we can use a shortcut syntax to initialize the array and place those values directly within it:
Once the array is created, we can access individual items in the array by placing the index in square brackets `[]` after the array’s variable name:
Java arrays can also be created with multiple dimensions, simply by adding additional square brackets `[]` to represent and access items in each dimension:
There are several operations that can be performed on arrays in Java as well:
Finally, we can use a special form of loop, called an Enhanced For loop, to iterate through items in an array in Java:
One important thing to note is that the loop variable in an Enhanced For loop holds a copy of each item, so assigning a new value to it does not change the array. In effect, the array is read only within this loop: we can access the values stored in it, but we cannot replace them. If we want to change them, we should use a standard For loop to iterate through the indices of the array:
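The array syntax reviewed in this section can be sketched in one small program. The class name, variable names, and values are all assumptions chosen for illustration.

```java
class ArrayDemo {
    // Standard For loop over the indices: this can change the stored values
    static void doubleAll(int[] values) {
        for (int i = 0; i < values.length; i++) {
            values[i] = values[i] * 2;
        }
    }

    // Enhanced For loop: reads each value in turn
    static int sum(int[] values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] a;                      // declare
        a = new int[5];               // initialize with a size of 5
        int[] b = new int[5];         // declare and initialize together
        int[] c = {1, 2, 3, 4, 5};    // shortcut syntax for known values
        a[0] = 10;                    // store an item at index 0
        b[4] = a[0];                  // retrieve an item and store it elsewhere
        int[][] grid = new int[2][3]; // two dimensions: 2 rows, 3 columns
        grid[1][2] = 7;
        doubleAll(c);
        System.out.println(sum(c) + " " + b[4] + " " + grid[1][2]);
    }
}
```
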
Variables in our programs can be used in a variety of different roles. The simplest role for any variable is to store a value that does not change throughout the entire program. Most variables, however, fit into one of several roles throughout the program.
To help us understand these roles, let’s review them in detail here. As we move forward in this course, we’ll see many different data structures that use variables in these ways, so it helps to know each of them early on!
In this role, the variable is used to hold a value. This value can be changed during the program execution. In the example, a variable `operand` of type Integer is declared.
In this role, variables are used to hold a sequence of values known beforehand. In the example, the variable `counter` holds values from 1 to 10 and these values are conveyed to the user.
In this role, the variable is used to hold a value that aggregates, summarizes, and synthesizes multiple values by means of an operation such as sum, product, mean, geometric mean, or median. In the example, we calculate the sum of the first ten numbers in the accumulator variable `sum`.
In this role, the variable `answer` contains the last value encountered so far in a data series, such as the last value that the program receives from the user.
In this role, the variable contains the value that is most appropriate for the purpose of the program, e.g. the minimum or the maximum. The instruction `scores[counter] > max` checks if the list item under observation is greater than the maximum. If the condition is true, the value of the maximum variable is changed.
A variable, such as `second`, to which you assign the value of another variable that will be changed immediately after. In the example, the second variable contains the second largest value in a list.
A flag variable is used to report the occurrence or not of a particular condition, e.g. the occurrence of an error, the first execution, etc.
A variable used to hold a temporary value. For example, to exchange two variables, you must have a temporary variable `temp` to store a value before it is replaced.
A variable used to indicate the position of the current item in a set of elements, such as the current item in an array of elements. The `index` variable here is a great example.
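Several of these roles can be seen together in a short sketch. The variable names `sum`, `counter`, `scores`, `max`, `second`, and `temp` follow the descriptions above; the class, the methods, and the way they are combined are assumptions made for illustration.

```java
class RolesDemo {
    // Sum of the first ten numbers: sum aggregates, counter steps from 1 to 10
    static int sumFirstTen() {
        int sum = 0;
        for (int counter = 1; counter <= 10; counter++) {
            sum = sum + counter;
        }
        return sum;
    }

    // Second largest value in scores, assuming scores has at least two items
    static int secondLargest(int[] scores) {
        int max = Math.max(scores[0], scores[1]);    // best value seen so far
        int second = Math.min(scores[0], scores[1]);
        for (int counter = 2; counter < scores.length; counter++) {
            if (scores[counter] > max) {
                second = max;              // takes the old maximum just before it changes
                max = scores[counter];
            } else if (scores[counter] > second) {
                second = scores[counter];
            }
        }
        return second;
    }

    // Exchange the two items of pair using a temporary variable
    static void swap(int[] pair) {
        int temp = pair[0];
        pair[0] = pair[1];
        pair[1] = temp;
    }

    public static void main(String[] args) {
        System.out.println(sumFirstTen());
        System.out.println(secondLargest(new int[] {3, 9, 4, 7}));
    }
}
```
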
Strings are another very important data type in programming. A string is simply a sequence of characters that represents text in our programs. We can then write programs that use and manipulate strings in a variety of ways, allowing us to easily work with textual data.
The table below lists the flowchart blocks used to represent strings, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Create String | (image) | |
Access Character | (image) | |
String Length | (image) | |
Let’s review the syntax for working with strings in Java.
Strings in Java are declared just like any other variable:
Notice that strings are enclosed in double quotation marks `"`, whereas a single character is enclosed in single quotation marks:
There are several special characters we can include in our strings. Here are a few of the more common ones:
- `\'` - Single Quotation Mark (usually not required)
- `\"` - Double Quotation Mark
- `\n` - New Line
- `\t` - Tab
Most of the time, we will need to be able to parse strings in order to read input from the user. This is easily done using the `Scanner` class in Java. Let’s refer to the skeleton code given in the earlier exercise:
This code will initialize a `Scanner` to read input from a file if one is provided as a command-line argument. Otherwise, input will be read from the terminal, or `System.in` in Java.
Once we have a `Scanner` initialized, we can use several methods to read data from the input:
We can find a list of all `Scanner` methods in the Java API Documentation.
Finally, if we have read an entire string of input consisting of multiple parts, we can use the `split` method to split the string into tokens that are separated by a special delimiter. When we do this, we’ll have to use special methods to convert the strings to other primitive data types. Here’s an example:
In this example, we are able to split the first string variable into $5$ parts, each one separated by a space in the original string. Then, we can use methods such as `Integer.parseInt()` to convert each individual string token into the desired data type.
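A minimal sketch of this technique is shown below. The input string, class name, and method name are assumptions; the calls to `split` and `Integer.parseInt()` are the ones named in the text.

```java
class SplitDemo {
    // Split a line on single spaces and add up the integer tokens
    static int sumTokens(String line) {
        String[] tokens = line.split(" ");
        int total = 0;
        for (String token : tokens) {
            total += Integer.parseInt(token);  // convert each token to an int
        }
        return total;
    }

    public static void main(String[] args) {
        String input = "1 2 3 4 5";
        String[] parts = input.split(" ");
        System.out.println(parts.length);      // number of tokens
        System.out.println(sumTokens(input));  // sum of the tokens
    }
}
```
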
When reading an unknown number of lines of input, we can use a loop in Java such as the following example:
This will read input until either a blank line is received (usually via the terminal), or there is no more input available to read (from a file).
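One way to write such a loop is sketched here, assuming a `Scanner` named `reader`. The class name and the choice to count lines are hypothetical; a real program would process each line instead.

```java
import java.util.Scanner;

class ReadLinesDemo {
    // Count lines until a blank line appears or the input runs out
    static int countLines(Scanner reader) {
        int count = 0;
        while (reader.hasNextLine()) {
            String line = reader.nextLine();
            if (line.length() == 0) {
                break;  // blank line: stop reading
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Scanner reader = new Scanner(System.in);
        System.out.println(countLines(reader));
        reader.close();
    }
}
```
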
There are also several operations we can perform on Strings in Java:
Additional methods can be found in the Java API Documentation.
Strings can also be used to create formatted output in Java through the use of the `format()` method. Here’s a short example:
When we run this program, the output will be:
Each item in the formatted output can also be given additional attributes such as width and precision. More details can be found in the Java API Documentation.
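As a brief sketch of width and precision, the example below uses `String.format()` with a few common format specifiers. The class name, method names, and sample values are assumptions; `Locale.US` is passed so that the decimal separator is always a period.

```java
import java.util.Locale;

class FormatDemo {
    // %s is a string, %d an integer, and %.2f a number with two decimal places
    static String describe(String name, int count, double average) {
        return String.format(Locale.US,
                "%s has %d scores with an average of %.2f", name, count, average);
    }

    // %5d pads the integer with spaces on the left to a width of 5
    static String padded(int value) {
        return String.format("%5d", value);
    }

    public static void main(String[] args) {
        System.out.println(describe("Ada", 3, 91.5));
        System.out.println(padded(42));
    }
}
```
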
An exception is an error that a program encounters when it is running. While some errors cannot be dealt with directly by the program, many of these exceptions can be caught and handled directly in our programs.
There isn’t really a standard way to display exceptions in flowcharts and pseudocode, but we can easily create a system that works well for our needs. Below are the flowchart blocks and pseudocode examples we’ll use in this course to represent exceptions and exception handling:
Let’s review the syntax for working with exceptions in Java.
In Java, we can use a Try-Catch statement to detect and handle exceptions in our code:
In this example, the program will try to open a file using the first command-line argument as a file name. There are several exceptions that could occur in this code, such as an ArrayIndexOutOfBoundsException
, a FileNotFoundException
, an IOException
, and more. They can also be handled individually:
If desired, we can also throw our own exceptions in Java:
This will cause an exception to be thrown if the value of y
is equal to $0.0$.
We can also add a Finally block at the end of each Try-Catch block. This code will be executed whenever the control exits the Try-Catch block, even through the use of a return
statement to return from a method.
When working with resources such as files in Java, we can also use a Try with Resources block to ensure that those resources are properly closed when we are done with them. In addition, a Try with Resources block will automatically catch and suppress any exceptions that result from trying to close the resource after an exception has occurred, preventing us from being bombarded by unavoidable exceptions. Here’s an example:
In this example, we are opening a Scanner
object within parentheses after a try
keyword. That Scanner
will automatically be closed once the program leaves the Try with Resources block where it is declared.
One of the major features of a modern computer is the ability to store and retrieve data from the computer’s file system. So, we need to be able to access the file system in our code in order to build useful programs. Thankfully, most modern programming languages include a way to do this.
Most operations working with files in code take the form of method calls. So, we will primarily use the call block to represent operations on files:
Operation | Flowchart | Pseudocode |
---|---|---|
Open File | ![]() ![]() | |
Read from File | ![]() ![]() | |
Write to File | ![]() ![]() | |
Let’s review the syntax for working with files in Java.
To open a file in Java, we can use methods from the NIO library. Here is an example:
In this example, the program will try to open a file provided as the first command-line argument. If no argument is provided, it will automatically read from standard input instead. However, if an argument is provided, it uses Paths.get()
to get a reference to that file and tries to open it. In addition, we can use a Try with Resources statement to make sure the file is properly closed once we are done with it.
Once we have opened the file, we can read the file just like we would any other input:
To write to a file, we must open it a different way. In Java, we can use a BufferedWriter
to write data to a file:
This example shows how to open a file for writing by creating a BufferedWriter
object inside of a Try with Resources statement. It also lists several of the common exceptions and their causes.
It is important to be able to easily grasp both the design choices and the code structure of a project, even long after it has been completed. The documentation process starts by commenting the code. Code comments are usually intended for software developers and aim to clarify the code by giving details of how it works. They are usually written as inline or multi-line comments using the language’s comment syntax.
As we’ve seen before, we can add single-line or inline comments to our Java programs using two forward slashes //
before a line in our source file:
Java also includes the ability to add a comment that spans multiple lines, without requiring each line to be prefixed with forward slashes. Instead, we can use a forward slash followed by an asterisk /*
to start a comment, and then an asterisk followed by a forward slash to end it */
, as shown below:
Finally, Java also includes a secondary type of comment that spans multiple lines, specifically for creating documentation. Instead of a single asterisk, we use a double asterisk after the forward slash at the beginning /**
, but the ending is the same as before */
.
In addition, these comments typically include an asterisk at the beginning of each line, aligned with the first asterisk of the start of the comment. Thankfully, most code editors will do this for us automatically, including Codio!
These comments are specifically designed to provide information about classes and methods in our code. Here’s a quick example, using the IntTuple
class developed earlier in this module:
Once we’ve written this documentation in our code, we can use a special tool included with Java, called Javadoc, to generate HTML files that describe what our code does. In fact, the Java API files, such as the one for Scanner, are generated using this tool!
For more information about writing comments for the Javadoc tool, as well as some great examples, consult the documentation.
To make your code easier to read, many textbooks and companies use a style guide that defines some of the formatting rules that you should follow in your source code. However, this is a point of contention, and many folks disagree over which format is best. These formatting rules do not affect the actual code itself, only how easy it is to read.
For this book, most of the examples will be presented in a variant of the K&R Style used by most Java developers, which places the opening brace on the same line as the declaration, but the closing brace is placed on a line by itself and indented at the same level as the declaration. The code inside will be indented by four spaces.
Google provides a comprehensive style guide that is recommended reading if you’d like to learn more about how to format your source code.
Codio also includes a special assessment that will validate the coding style of your code based on the Google style guide. Use the assessment below to make sure your solution to Exercise 1 meets the expected coding style standard.
We will also be enforcing this style on many projects and assignments in this class, so it is very important to become familiar with proper coding style!
Let’s build another sample program to review the content we’ve covered thus far.
Write a program that accepts three files as command line arguments. The first two represent input files, and the third one represents the desired output file. If there aren’t three arguments provided, either input file is not an existing file, or the output file is an existing directory, print “Invalid Arguments” and exit the program. The output file may be an existing file, since it will be overwritten.
The program should open each input file and read the contents. Each input file will consist of a list of numbers, one per line. The numbers may be integers or floating point numbers. If there are any errors parsing the contents of either file, the program should print “Invalid Input” and exit. As the input is read, the program should keep track of both the count and sum of all positive and negative inputs.
Once all input is read, the program should open the output file and print the following four items, in this order, one per line: number of positive inputs, sum of positive inputs, number of negative inputs, sum of negative inputs.
Finally, when the program is done, it should simply print “Complete” to the terminal and exit. Don’t forget to close any open files! Your program must catch and handle all possible exceptions, printing either “Invalid Arguments” or “Invalid Input” as described above.
We can use any of the code examples on previous pages to help us complete this exercise.
This exercise uses a custom grading program to grade submissions, which will be used extensively throughout this course. The grading program will create two files in your work directory showing more detailed output. To open the HTML file as a webpage, right-click on it and select Preview Static. The log file may contain helpful debugging messages if your program experiences an unhandled exception.
{Check It!|assessment}(test-2449394732)
This page is the main page for Python
Let’s discuss some of the basic concepts we need to understand about the Python programming language.
To begin, let’s look at a simple Hello World program written in Python:
This program contains multiple important parts:
- First, we define a function named main(). Python does not require us to do this, since we can write our code directly in the file and it will execute. However, since we are going to be building larger programs in this course, it is a good idea to start using functions now.
- The function definition ends with a colon :, and then the code inside of that function comes directly after it. The code contained in the function must be indented a single level. By convention, Python files should use 4 spaces to indent the code. Thankfully, Codio does that for us automatically.
- Finally, we must call the main() function to run the program.
Of course, this is a very brief overview of the Python programming language. To learn more, feel free to refer to the references listed below, as well as the textbook content for previous courses.
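As a point of reference, a Hello World program with the parts described above might look like the following minimal sketch. The exact message printed is an assumption; any string would work the same way.

```python
# Define a function named main() to hold the program's code.
# The definition line ends with a colon.
def main():
    # The body of the function is indented by four spaces.
    print("Hello World")


# Call main() at the bottom of the file to actually run the program.
main()
```

Without the final call to main(), the file would define the function but produce no output when run.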
See if you can use the code above to write your own Hello World program in the HelloWorld.py
file that is open to the left. We’ll learn how to compile and run that program on the next page.
Now that we’ve written our first Python program, we must run the program to see the fruits of our labors. There are many different ways to do this using the Codio platform. We’ll discuss each of them in detail here.
Codio includes a built-in Linux terminal, which allows us to perform actions directly on a command-line interface just like we would on an actual computer running Linux. We can access the Terminal in many ways:
Additionally, some pages may already open a terminal window for us in the left-hand pane, as this page so helpfully does. As we can see, we’re never very far away from a terminal.
No worries! We’ll give you everything you need to know to run your Python programs in this course.
If you’d like to learn a bit more about the Linux terminal and some of the basic commands, feel free to check out this great video on YouTube:
Let’s go to the terminal window and navigate to our program. When we first open the Terminal window, it should show us a prompt that looks somewhat like this one:
There is quite a bit of information there, but we’re interested in the last little bit of the last line, where it says ~/workspace
. That is the current directory, or folder, our terminal is looking at, also known as our working directory. We can always find the full location of our working directory by typing the pwd
command, short for “Print Working Directory,” in the terminal. Let’s try it now!
Enter this command in the terminal:
and we should see output similar to this:
In that output, we’ll see that the full path to our working directory is /home/codio/workspace
. This is the default location for all of our content in Codio, and it’s where everything shown in the file tree to the far left is stored. When working in Codio, we’ll always want to store our work in this directory.
Next, let’s use the ls
command, short for “LiSt,” to see a list of all of the items in that directory:
We should see a whole list of items appear in the terminal. Most of them are directories containing examples for the chapters of this textbook, including the HelloWorld.py
file that we edited in the last page. Thankfully, the directories are named in a very logical way, making it easy for us to find what we need. For example, to find the directory for Chapter 1 that contains examples for Python, look for the directory with the name starting with 1p
. In this case, it would be 1p-hello
.
Finally, we can use the cd
command, short for “Change Directory,” to change the working directory. To change to the 1p-hello
directory, type cd
into the terminal window, followed by the name of that directory:
We are now in the 1p-hello
directory, as we can see by observing the ~/workspace/1p-hello
on the current line in the terminal. Finally, we can do the ls
command again to see the files in that directory:
We should see our HelloWorld.py
file! If it doesn’t appear, try using this command to get to the correct directory: cd /home/codio/workspace/1p-hello
.
Once we’re at the point where we can see the HelloWorld.py
file, we can move on to actually running the program.
To run it, we just need to type the following in the terminal:
That’s all there is to it! We’ve now successfully run our first Python program. Of course, we can run the program as many times as we want by repeating the previous python3
command. If we make changes to the HelloWorld.py file that instruct the computer to do something different, we should see those changes the next time we run the file.
If the python3
command doesn’t give you any output, or gives you an error message, that most likely means that your code has an error in it. Go back to the previous page and double-check that the contents of HelloWorld.py
exactly match what is shown at the bottom of the page. You can also read the error message output by python3
to determine what might be going wrong in your file.
Also, make sure you use the python3
command and not just python
. The python3
command references the newer Python 3 interpreter, while the python
command is used for the older Python 2 interpreter. In this book, we’ll be using Python 3, so you’ll need to always make sure you use python3
when you run your code.
We’ll cover information about simple debugging steps on the next page as well. If you get stuck, now is a great time to go to Piazza and ask for assistance. You aren’t in this alone!
See if you can change the HelloWorld.py
file to print out a different message. Once you’ve changed it, use the python3
command to run the file again. Make sure you see the correct output!
In many of the Codio projects and tutorials in this course, the Run Menu will be populated with helpful commands. The Run Menu can be found at the top of the screen, right here:
Each Codio project or tutorial may have different items in this menu, since they can be configured by the author of the project. For this book, there will always be the following options:
To use these commands, we must simply open up the file we’d like to use, then select the appropriate option from the Run Menu. It will automatically use the currently open file in the command.
So, to run our file, we must simply open HelloWorld.py
in the panel to the left, then click the arrow in the Run Menu and select Python - Run File. It should open up a Terminal tab and show output similar to the following:
It looks very similar to the command we entered manually. The only difference is that it uses the folder name along with the filename in the command, which ensures that it gets the correct file without even opening that directory.
Make another change to the HelloWorld.py
file, and then see if you can use the options in the Run Menu to run it. Make sure you see the correct output!
Last, but not least, many of the Codio tutorials and projects in this program will include assessments that we must solve by writing code. Codio can then automatically run the program and check for specific things, such as the correct output, in order to give us a grade. For most of these questions, we’ll be able to make changes to our code as many times as we’d like to get the correct answer. Try the example below!
{Check It!|assessment}(code-output-compare-1653086498)
As we can see, there are many different ways to compile and run our code using Codio. Feel free to use any of these methods throughout this course.
Codio also includes an integrated debugger, which is very helpful when we want to determine if there is an error in our code. We can also use the debugger to see what values are stored in each variable at any point in our program.
To use the debugger, find the Debug Menu at the top of the Codio window. It is to the right of the Run Menu we’ve already been using. On that menu, we should see an option for Python - Debug File. Select that option to run our program in the Codio debugger.
As we build more complex programs in this course, we’ll be able to configure our own debugger configurations that allow us to test multiple files and operations.
The Codio debugger only works with input from a file, not from the terminal. So, to use the debugger, we’ll need to make sure the input we’d like to test is stored in a file, such as [input.txt](open_file 1p-hello/input.txt), before debugging. We can then give that file as an argument to our program in our debugger configuration, and write our program to read input from a file if one is provided as an argument.
Learning how to use a debugger is a hands-on process, and is probably best described in a video. So, here are a couple of videos that should help us get up to speed on working in the Codio debugger.
Computational Core - Python Debugging Tutorial
Codio Documentation - Debugging
We can always use the debugger to help us find problems in our code.
^[https://www.codio.com/blog/python-tutor-codio-visualizer]
Codio now includes support for Python Tutor, allowing us to visualize what is happening in our code. We can see that output in the second tab that is open to the left.
Unfortunately, students are not able to open the visualizer directly, so it must be configured by an instructor in the Codio lesson. If you find a page in this textbook where you’d like to be able to visualize your code, please post in Piazza and let us know!
A variable in a programming language is an abstraction that stores one value at any given time, though that value can change as the program executes. A variable can be represented as a box holding a value. If the variable is a container, e.g., a list (or array or vector), a matrix, a tuple, or a set of values, each box in the container holds a single value.
A variable is characterized by a name, such as results, number_of_nodes, or number_of_edges. To write variable names composed of two or more words in Python, we can use underscores to separate the words.
Depending on the programming language, we may also need to specify additional attributes for a variable, such as its data type.
A programming language allows us to perform two basic operations with a variable: reading the value it holds and assigning it a new value. Programming languages also provide several kinds of operators for working with values:
- Arithmetic operators, such as addition + and subtraction -. They allow performing basic arithmetic operations with numbers.
- Relational operators, such as less than < and greater than >. Usually, they allow comparing two operands, each of which could be a variable. The result of the comparison is either the Boolean value true or the Boolean value false.
- Logical operators, such as and, or, and not. These operators allow us to relate logical conditions together to create more complex statements.
- String operators, such as using + to concatenate the string “Hello ” and the string “world” to produce the string “Hello world”. These operators allow us to manipulate strings.
- The assignment operator, such as a = b.
The table below lists the flowchart blocks used to represent variables, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Declare | ![]() ![]() | X = 0 |
Assign | ![]() ![]() | X = 5 |
Declare & Assign | ![]() ![]() | X = 5 |
Notice that variables must be assigned a value when declared in pseudocode. By default, most programming languages automatically assign the value $0$ to a new integer variable, so we’ll use that value in our pseudocode as well.
Likewise, variables in a flowchart are given a type, whereas variables in pseudocode are not. Instead, the data type of those variables can be inferred by the values stored in them.
Variables in Python are simply defined by giving them a value. The type of the variable is inferred from the data stored in it at any given time, and a variable’s type may change throughout the program as different values are assigned to it.
To define a variable, we can simply use an assignment statement to give it a value:
We can also convert, or cast, data between different types. When we do this, the results may vary a bit due to how computers store and calculate numbers. So, it is always best to fully test any code that casts data between data types to make sure it works as expected.
To cast, we can simply use the new type as a function and place the value to be converted in parentheses:
This will convert the floating point value stored in x
to an integer value stored in y
.
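The following short sketch illustrates both ideas: defining variables through assignment and casting between types. The variable names and values other than x and y are made up for illustration.

```python
# Defining variables by assigning values; the type is inferred from the value.
count = 5            # an integer
price = 2.75         # a floating point number
label = "total"      # a string (hypothetical example value)

# A variable's type can change when a different kind of value is assigned.
value = 10
value = "ten"

# Casting: use the new type as a function around the value to convert.
x = 1.9
y = int(x)           # int() truncates toward zero rather than rounding
z = float(count)     # converts the integer 5 to the float 5.0
```

Note that int() discards the fractional part instead of rounding, which is one reason it is worth testing any code that casts between types.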
The conditional statement, also known as the If-Then statement, is used to control the program’s flow by checking the value of a Boolean statement and determining if a block of code should be executed based on that value. This is the simplest conditional instruction. If the condition is true, the block enclosed within the statement is executed. If it is false, then the code in the block is skipped.
A more advanced conditional statement, the If-Then-Else or If-Else statement, includes two blocks. The first block will be executed if the Boolean statement is true. If the Boolean statement is false, then the second block of code will be executed instead.
Simple conditions are obtained by means of the relational operators, such as <
, >
, and ==
, which allow you to compare two elements, such as two numbers, or a variable and a number, or two variables. Compound conditions are obtained by composing two or more simple conditions through the logical operators and
, or
, and not
.
Recall that the Boolean logic operators and
, or
, and not
can be used to construct more complex Boolean logic statements.
For example, consider the statement x <= 5
. This could be broken down into two statements, combined by the or
operation: x < 5 or x == 5
. The table below, called a truth table, gives the result of the or operation based on the values of the two operands:
Operand 1 | Operand 2 | Operand 1 or Operand 2 |
---|---|---|
False | False | False |
False | True | True |
True | False | True |
True | True | True |
As shown above, the result of the or operation is True
if at least one of the operands is True
.
Likewise, to express the mathematical condition 3 < a < 5
we can use the logical operator and
by dividing the mathematical condition into two logical conditions: a > 3 and a < 5
. The table below gives the result of the and operation based on the values of the two operands:
Operand 1 | Operand 2 | Operand 1 and Operand 2 |
---|---|---|
False | False | False |
False | True | False |
True | False | False |
True | True | True |
As shown above, the result of the and operation is True
if both of the operands are True
.
Finally, the not
logical operator is used to reverse, or invert, the value of a Boolean statement. For example, we can express the logical statement x < 3
as not (x >= 3)
, using the not operator to invert the value of the statement. The table below gives the result of the not operation based on the value of its operand:
Operand | not Operand |
---|---|
False | True |
True | False |
In propositional logic, the completeness theorem shows that all other logical operators can be obtained by appropriately combining the and, or and not operators. So, by just understanding these three operators, we can construct any other Boolean logic statement.
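The three rewrites discussed above can be checked directly in Python. This is only an illustrative sketch; the function names are hypothetical and chosen to mirror the examples in the text.

```python
def at_most_five(x):
    # x <= 5 written as two simple conditions joined by "or"
    return x < 5 or x == 5


def between_three_and_five(a):
    # 3 < a < 5 written as two simple conditions joined by "and"
    return a > 3 and a < 5


def less_than_three(x):
    # x < 3 written by inverting the opposite condition with "not"
    return not (x >= 3)


# Print the rows of the "or" and "and" truth tables side by side.
for p in (False, True):
    for q in (False, True):
        print(p, q, p or q, p and q)
```

Each function returns the same result as the single comparison it replaces, which is a quick way to convince ourselves that the compound condition is equivalent.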
The table below lists the flowchart blocks used to represent conditional statements, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
If-Then | ![]() ![]() | |
If-Then-Else | ![]() ![]() | |
The mechanism for determining which block an If-Then-Else statement executes is the following:
To understand how a conditional statement works, let’s look at this example of a simple If-Then-Else statement. Consider the following flowchart:
In this case, if a
is less than zero, the output message will be “The value of a is less than zero”. Otherwise, if a is not less than zero (that is, if a is greater than or equal to zero), the output message will be “The value of a is greater than or equal to zero”.
We can also nest conditional statements together, making more complex programs.
Consider the following flowchart:
In this case, if a is less than zero, the output message will be “The value of a is less than zero”. Otherwise (that is, if a is greater than or equal to zero), the program checks whether a is equal to zero; if so, the output message will be “The value of a is equal to zero”. Otherwise, both conditions are false: a is not less than zero (a >= 0) and a is not equal to zero. Taken together, as if joined by a logical and, these two facts are the same as the condition a > 0, so the output message will be “The value of a is greater than zero”.
To see how conditional statements look in Python, let’s recreate them from the flowcharts shown above.
As we can see in the examples above, we must carefully indent each block of code to help set it apart from the other parts of the program. In addition, each line containing if
, elif
and else
must end in a colon :
.
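As a rough sketch, the two flowcharts described earlier could be written in Python as follows. The function names are hypothetical, and the functions return the message instead of printing it so that the result is easy to check.

```python
def describe_simple(a):
    # If-Then-Else: exactly one of the two blocks runs.
    if a < 0:
        return "The value of a is less than zero"
    else:
        return "The value of a is greater than or equal to zero"


def describe_nested(a):
    # Chained conditions: elif is checked only when the first condition is false.
    if a < 0:
        return "The value of a is less than zero"
    elif a == 0:
        return "The value of a is equal to zero"
    else:
        return "The value of a is greater than zero"


print(describe_simple(-3))
print(describe_nested(0))
```

The elif form is equivalent to nesting a second if statement inside the else block, but it avoids an extra level of indentation.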
Loops are another way we can control the flow of our program, this time by repeating steps based on given criteria. A computer is able to repeat the same instructions many times. There are several ways to tell a computer to repeat a sequence of instructions:
- Repeat forever, such as while true. This construct is useful in software applications such as servers that offer a service that is supposed to be available forever.
- Repeat a fixed number of times, such as Repeat 10 times or for i = 1 to 10. This loop can be used when you know the number of repetitions. There are also loops that allow you to repeat as many times as there are elements of a collection, such as for each item in list.
- Repeat while a condition holds, such as a while loop, which repeats while the condition is true.
In repeat while loops, the number of repetitions depends on the occurrence of a condition: the cycle repeats if the condition is true. Loops can also be nested, just like conditional statements.
The table below lists the flowchart blocks used to represent loop statements, as well as the corresponding pseudocode:
To see how loops look in Python, let’s recreate them from the flowcharts shown above.
As we can see in the examples above, we must carefully indent each block of code to help set it apart from the other parts of the program. In addition, each line containing for
and while
must end in a colon :
. Finally, notice that the range()
function in Python does not include the second parameter in the output. So, to get the numbers $1$ through $10$, inclusive, we must use range(1, 11)
in our code.
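Here is a small sketch of a For loop and a While loop in Python. The function names are hypothetical; each function collects the values it visits in a list so the behavior of range() is easy to see.

```python
def count_to_ten():
    numbers = []
    # range(1, 11) produces 1 through 10; the second parameter is excluded.
    for i in range(1, 11):
        numbers.append(i)
    return numbers


def countdown(start):
    values = []
    # The loop repeats while the condition is true.
    while start > 0:
        values.append(start)
        start = start - 1
    return values


print(count_to_ten())
print(countdown(3))
```

If the condition of the While loop is false on the first check, the body never runs, so countdown(0) returns an empty list.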
At this point, we’ve covered enough material to build a simple program. So, let’s see if we can complete the following example program before continuing.
Write a program that reads an integer from either the terminal, or a file if one is provided as a command-line argument. It should not worry about handling any exceptions encountered.
The program should compute and print the sum of all integers from 1 up to and including the integer provided as input, except those integers which are evenly divisible by 3. If the provided input is not a positive integer, the program should simply print 0.
Since we haven’t covered how to handle input yet, we can use the following skeleton code to help us build our program.
This code will create a variable called reader, and initialize it to read either from a file provided as a command-line argument, or from the terminal if an argument is not provided. It will then read a single integer from the input, storing it in the variable x.
To complete this exercise, we can continue to write this program where the MORE CODE GOES HERE
comment is in the skeleton code.
{Check It!|assessment}(code-output-compare-2708968079)
^[File:USPS Post office boxes 1.jpg. (2017, May 17). Wikimedia Commons, the free media repository. Retrieved 18:17, November 5, 2018 from https://commons.wikimedia.org/w/index.php?title=File:USPS_Post_office_boxes_1.jpg&oldid=244476438.]
Arrays allow us to store multiple values in the same variable, using an index to determine which value we wish to store or retrieve from the array. We can think of arrays like a set of post office boxes. Each one has the same physical address, the post office, but within the post office we can find an individual box based on its own box number.
Some programming languages, such as Java, use arrays that are statically sized when they are first created, and those arrays cannot be resized later. In addition, many languages that require variables to be declared with a type only allow a single variable type to be stored in an array.
Other languages, such as Python, use lists in place of arrays. Lists can be resized, and in dynamically typed languages such as Python they can store different data types within the same list.
The table below lists the flowchart blocks used to represent arrays, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Declare Array | ![]() ![]() | |
Store Item | ![]() ![]() | |
Retrieve Item | ![]() ![]() | |
Let’s review the syntax for working with lists in Python.
To define a list in Python, we can simply place values inside of a set of square brackets []
, separated by commas ,
:
We can also create an empty list by simply omitting any items inside the square brackets:
Once we’ve created a list in Python, we can add items to the end of the list using the append()
method:
Once the list is created, we can access individual items in the list by placing the index in square brackets []
after the list’s variable name:
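The basic list operations described so far can be sketched as follows. The variable names and values are arbitrary examples.

```python
# Create a list with some initial values.
numbers = [5, 10, 15]

# Create an empty list, then add an item to the end of it.
empty = []
empty.append(42)

# Append another item to the end of the first list.
numbers.append(20)

# Retrieve items by index; indices start at 0.
first = numbers[0]
last = numbers[3]

# Store a new value at an existing index.
numbers[1] = 11
```

Using an index that does not exist in the list, such as numbers[4] here, raises an IndexError.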
Python lists can also be created with multiple dimensions, simply by appending lists as elements in a base list.
They can also be created through the use of lists as individual elements in a list when it is defined:
To access elements in a multidimensional list, simply include additional sets of square brackets containing an index []
for each dimension:
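A brief sketch of both ways to build a two-dimensional list, along with element access, is shown below. The names grid and table are hypothetical.

```python
# Build a multidimensional list by appending lists to a base list.
grid = []
grid.append([1, 2, 3])
grid.append([4, 5, 6])

# Build the same structure directly when the list is defined.
table = [[1, 2, 3], [4, 5, 6]]

# Use one set of square brackets per dimension: row first, then column.
item = table[1][2]

# Elements can be replaced the same way.
table[0][0] = 10
```

Here table[1] selects the second inner list, and the second index selects an element within it.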
There are several operations that can be performed on lists in Python as well:
Finally, we can use a special form of loop, called a For Each loop, to iterate through items in a list in Python:
One important thing to note is that assigning a new value to the loop variable inside a For Each loop does not change the value stored in the list. So, we can read the items using this loop, but we cannot replace them. If we want to change them, we should use a standard For loop to iterate through the indices of the list:
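The difference between the two kinds of loop can be seen in this small sketch, which uses an arbitrary example list.

```python
values = [1, 2, 3]

# For Each loop: assigning to the loop variable does not affect the list.
for item in values:
    item = item * 2

# Take a copy to show that the list still holds the original values.
unchanged = list(values)

# Standard For loop over the indices: items in the list can be replaced.
for i in range(len(values)):
    values[i] = values[i] * 2
```

After the first loop the list is still [1, 2, 3]; only the second loop doubles the stored values.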
Variables in our programs can be used in a variety of different roles. The simplest role for any variable is to store a value that does not change throughout the entire program. Most variables, however, fit into one of several roles throughout the program.
To help us understand these roles, let’s review them in detail here. As we move forward in this course, we’ll see many different data structures that use variables in these ways, so it helps to know each of them early on!
In this role, the variable is used to hold a value. This value can be changed during the program execution. In the example, a variable operand of type Integer is declared.
In this role, variables are used to hold a sequence of values known beforehand. In the example, the variable counter
holds values from 1 to 10 and these values are conveyed to the user.
In this role, the variable is used to hold a value that aggregates, summarizes, and synthesizes multiple values by means of an operation such as sum, product, mean, geometric mean, or median. In the example, we calculate the sum of the first ten numbers in the accumulator variable sum
.
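A counter and an accumulator often appear together, as in this minimal Python sketch. The accumulator is named total here, which is an assumption made to avoid hiding Python's built-in sum() function.

```python
# Accumulator: starts at zero and collects a running sum.
total = 0

# Counter: steps through the known sequence 1, 2, ..., 10.
for counter in range(1, 11):
    total = total + counter

print(total)
```

The accumulator must be given its starting value before the loop begins; otherwise the first addition would fail.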
In this role, the variable answer
contains the last value encountered so far in a data series, such as the last value that the program receives from the user.
In this role, the variable contains the value that is most appropriate for the purpose of the program, e.g. the minimum or the maximum. The instruction scores[counter] > max
checks if the list item under observation is greater than the maximum. If the condition is true the value of the maximum variable is changed.
A variable, such as second
, to which you assign the value of another variable that will be changed immediately after. In the example, the second variable contains the second largest value in a list.
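The two roles just described can be combined in one loop, as in the sketch below. The function name largest_two is hypothetical, the list is assumed to be non-empty, and the variable is named maximum instead of max to avoid hiding Python's built-in max() function.

```python
def largest_two(scores):
    # maximum holds the best value seen so far (assumes a non-empty list).
    maximum = scores[0]
    # second follows maximum; None means no second value has been seen yet.
    second = None
    for counter in range(1, len(scores)):
        if scores[counter] > maximum:
            # The follower takes the old maximum just before it changes.
            second = maximum
            maximum = scores[counter]
        elif second is None or scores[counter] > second:
            second = scores[counter]
    return maximum, second


print(largest_two([3, 9, 4, 7]))
```

The order of the two assignments matters: if maximum were updated first, its old value would be lost before the follower could store it.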
A flag variable is used to report whether or not a particular condition has occurred, e.g. the occurrence of an error, the first execution, etc.
A variable used to hold a temporary value. For example, to exchange two variables, you must have a temporary variable temp
to store a value before it is replaced.
A variable used to indicate the position of the current item in a set of elements, such as the current item in an array of elements. The index
variable here is a great example.
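The original examples for these roles are not reproduced here, so the following is only a sketch of how several of them might look together in Python. The scores list and the failing threshold of 70 are assumptions made for illustration.

```python
scores = [72, 95, 88, 61, 90]

total = 0              # accumulator: running sum of all scores
maximum = scores[0]    # extreme: largest score seen so far
second = None          # follower: receives the old maximum just before it changes
found_failing = False  # flag: becomes True if any score is below 70

for index in range(len(scores)):   # index: position of the current item
    total += scores[index]
    if scores[index] < 70:
        found_failing = True
    if index == 0:
        continue                   # the first score is already the maximum
    if scores[index] > maximum:
        second = maximum           # follow the old maximum...
        maximum = scores[index]    # ...just before it is replaced
    elif second is None or scores[index] > second:
        second = scores[index]

# temporary: hold one value while two variables are exchanged
a, b = 1, 2
temp = a
a = b
b = temp

print(total, maximum, second, found_failing)  # 406 95 90 True
print(a, b)                                   # 2 1
```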
Strings are another very important data type in programming. A string is simply a sequence of characters that represents text in our programs. We can then write programs that use and manipulate strings in a variety of ways, allowing us to easily work with textual data.
The table below lists the flowchart blocks used to represent strings, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Create String | (flowchart image) | |
Access Character | (flowchart image) | |
String Length | (flowchart image) | |
Let’s review the syntax for working with strings in Python.
Strings in Python are declared just like any other variable:
Notice that strings are enclosed in double quotation marks ". Since Python does not have a data type for a single character, we can do the same for single character strings as well:
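A small sketch of both declarations, using hypothetical variable names greeting and letter:

```python
greeting = "Hello World"   # a string variable
letter = "a"               # a single character is just a string of length 1

print(type(greeting).__name__)  # str
print(type(letter).__name__)    # str
print(len(letter))              # 1
```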
There are several special characters we can include in our strings. Here are a few of the more common ones:
\' - Single Quotation Mark (usually not required)
\" - Double Quotation Mark
\n - New Line
\t - Tab
Most of the time, we will need to be able to parse strings in order to read input from the user. This is easily done using Python. Let’s refer to the skeleton code given in the earlier exercise:
This code will initialize a variable called reader to read input from a file if one is provided as a command-line argument. Otherwise, input will be read from the terminal, or sys.stdin in Python.
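The skeleton code from the earlier exercise is not shown on this page, so the following is only a sketch of what such initialization could look like. The helper name open_reader is hypothetical, and error handling is left out to keep it short.

```python
import sys


def open_reader(argv):
    """Return a file object for argv[1] if it is given, otherwise sys.stdin."""
    if len(argv) > 1:
        return open(argv[1])
    return sys.stdin


# With no file name on the command line, input comes from the terminal.
reader = open_reader(["program.py"])
print(reader is sys.stdin)  # True
```

In a real program we would pass sys.argv to this helper instead of a hand-written list.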
Once we have a reader initialized, we can read a line of data from the input as follows:
If we know that line will contain a single item of a different data type, such as an integer, we can also convert that input using the appropriate method:
Finally, if we have read an entire string of input consisting of multiple parts, we can use the split method to split the string into tokens that are separated by a special delimiter. When we do this, we’ll have to use special methods to convert the strings to other primitive data types. Here’s an example:
In this example, we are able to split the first string variable into $5$ parts, each one separated by a space in the original string. Then, we can use methods such as int() to convert each individual string token into the desired data type.
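The three steps described above (reading a line, converting it, and splitting it into tokens) can be sketched as follows. Here an io.StringIO object stands in for reader so the sketch runs without a file or terminal; the input text is made up for illustration.

```python
import io

# io.StringIO stands in for the reader so the sketch runs on its own.
reader = io.StringIO("42\n1 2 3 4 5\n")

line = reader.readline()          # "42\n", including the newline
number = int(line.strip())        # convert the whole line to an integer

line = reader.readline().strip()  # "1 2 3 4 5"
tokens = line.split(" ")          # five string tokens
values = [int(token) for token in tokens]

print(number)   # 42
print(tokens)   # ['1', '2', '3', '4', '5']
print(values)   # [1, 2, 3, 4, 5]
```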
When reading an unknown number of lines of input, we can use a loop in Python such as the following example:
This will read input until either a blank line is received (usually via the terminal), or there is no more input available to read (from a file).
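One way such a loop might be written is sketched below, again with io.StringIO standing in for the reader. The loop ends on a blank line or at the end of the input, whichever comes first.

```python
import io

reader = io.StringIO("first\nsecond\n\nignored\n")

lines = []
line = reader.readline()
# readline() returns "" at the end of input; a blank line is "\n",
# which becomes "" after strip(), so both cases end the loop.
while len(line.strip()) > 0:
    lines.append(line.strip())
    line = reader.readline()

print(lines)  # ['first', 'second']
```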
There are also several operations we can perform on strings in Python:
Additional methods can be found on the Python Built-In Types: str and Python Common String Operations pages.
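The original list of operations is not reproduced here. As a sketch, a few common string operations in Python look like this (the variable names are only illustrative):

```python
first = "Hello"
second = "World"

combined = first + " " + second      # concatenation
print(combined)                      # Hello World
print(len(combined))                 # 11
print(combined[0])                   # H
print(combined[6:])                  # World
print(combined.upper())              # HELLO WORLD
print(combined.find("World"))        # 6
print(combined.replace("l", "L"))    # HeLLo WorLd
print(first == "Hello")              # True
print("World" in combined)           # True
```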
Strings can also be used to create formatted output in Python through the use of the format() method. Here’s a short example:
When we run this program, the output will be:
Each item in the formatted output can also be given additional attributes such as width and precision. More details can be found on the Python Format String Syntax page.
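A short sketch of format() with and without width and precision attributes, using made-up values:

```python
name = "Willie"
count = 3
average = 87.456

message = "{} has {} scores".format(name, count)
rounded = "Average: {:.2f}".format(average)   # precision: two decimal places
padded = "[{:>6}]".format(count)              # width: right-aligned in 6 columns

print(message)  # Willie has 3 scores
print(rounded)  # Average: 87.46
print(padded)   # [     3]
```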
An exception is an error that a program encounters when it is running. While some errors cannot be dealt with directly by the program, many of these exceptions can be caught and handled directly in our programs.
There isn’t really a standard way to display exceptions in flowcharts and pseudocode, but we can easily create a system that works well for our needs. Below are the flowchart blocks and pseudocode examples we’ll use in this course to represent exceptions and exception handling:
Let’s review the syntax for working with exceptions in Python.
In Python, we can use a Try-Except statement to detect and handle exceptions in our code:
In this example, the program will try to open a file using the first command-line argument as a file name. There are several exceptions that could occur in this code, such as a ValueError, an IndexError, a FileNotFoundError, and more. They can also be handled individually:
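As a sketch of handling exceptions individually, the hypothetical helper try_open below takes the argument list as a parameter and returns a message for each case. The messages and the file name are assumptions for illustration.

```python
def try_open(argv):
    """Return a short message describing what happened when opening argv[1]."""
    try:
        reader = open(argv[1])   # IndexError if no argument was given
        reader.close()
        return "opened"
    except IndexError:
        return "no file name given"
    except FileNotFoundError:
        return "file not found"
    except OSError as error:
        # any other operating system error, such as a permissions problem
        return "could not open file: {}".format(error)


print(try_open(["program.py"]))  # no file name given
print(try_open(["program.py", "no_such_file_for_this_example.txt"]))  # file not found
```

The more specific FileNotFoundError is listed before OSError, since Python uses the first matching Except clause.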
If desired, we can also raise our own exceptions in Python:
This will cause an exception to be thrown if the value of y is equal to $0.0$.
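A minimal sketch of raising an exception when y is equal to 0.0. The function name divide and the choice of ValueError are assumptions, since the original example is not shown.

```python
def divide(x, y):
    """Return x / y, raising our own exception when y is 0.0."""
    if y == 0.0:
        raise ValueError("Cannot divide by zero")
    return x / y


try:
    divide(1.0, 0.0)
except ValueError as error:
    message = str(error)

print(message)           # Cannot divide by zero
print(divide(6.0, 3.0))  # 2.0
```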
We can also add Else and Finally blocks at the end of each Try-Except block. A Finally block will be executed whenever the control exits the Try-Except block, even through the use of a return statement to return from a method. The Else block will be executed only if the Try block completes without any exceptions being raised:
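The order in which these blocks run can be sketched with a hypothetical helper parse that records each block it passes through:

```python
def parse(text):
    """Convert text to an integer, recording which blocks were executed."""
    events = []
    try:
        value = int(text)
    except ValueError:
        events.append("except")
        value = None
    else:
        events.append("else")      # runs only when no exception was raised
    finally:
        events.append("finally")   # runs in every case
    return value, events


print(parse("12"))   # (12, ['else', 'finally'])
print(parse("abc"))  # (None, ['except', 'finally'])
```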
When working with resources such as files in Python, we can also use a With block to ensure that those resources are properly closed when we are done with them. The resource is closed when control leaves the With block, even if an exception is raised inside it, so we do not need a separate Finally block just to clean up. Here’s an example:
In this example, we are opening a file using the open() method inside of the With statement. That file will automatically be closed once the program leaves the With statement.
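A sketch of a With statement in use. To keep it self-contained, it first creates a small temporary file (the file name example.txt is arbitrary) and then reads it back.

```python
import os
import tempfile

# Create a small temporary file so the sketch has something to open.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as writer:
    writer.write("hello\n")

with open(path) as reader:
    first_line = reader.readline().strip()

# Outside the With block the file has already been closed for us.
print(first_line)      # hello
print(reader.closed)   # True
```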
One of the major features of a modern computer is the ability to store and retrieve data from the computer’s file system. So, we need to be able to access the file system in our code in order to build useful programs. Thankfully, most modern programming languages include a way to do this.
Most operations working with files in code take the form of method calls. So, we will primarily use the call block to represent operations on files:
Operation | Flowchart | Pseudocode |
---|---|---|
Open File | (flowchart image) | |
Read from File | (flowchart image) | |
Write to File | (flowchart image) | |
Let’s review the syntax for working with files in Python.
To open a file in Python, we can simply use the open() method. Here is an example:
In this example, the program will try to open a file provided as the first command-line argument. If no argument is provided, it will automatically read from standard input instead. However, if an argument is provided, it will try to open it as a file. In addition, we can use a With statement to make sure the file is properly closed once it is open.
Once we have opened the file, we can read the file just like we would any other input:
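The following sketch combines opening and reading. The helper read_lines is hypothetical; it reads from the file named in the argument list if there is one and from sys.stdin otherwise. A temporary file is created only so the example can run on its own.

```python
import os
import sys
import tempfile


def read_lines(argv):
    """Read all lines from the file named in argv[1], or from stdin if absent."""
    if len(argv) > 1:
        with open(argv[1]) as reader:
            return [line.strip() for line in reader]
    return [line.strip() for line in sys.stdin]


# Build a small input file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w") as writer:
    writer.write("1\n2\n3\n")

print(read_lines(["program.py", path]))  # ['1', '2', '3']
```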
To write to a file, we must open it a different way. In Python, we pass "w" as the optional second argument to the open() method call to make the file writable:
This example shows how to open a file for writing using the open() method inside of a With statement. It also lists several of the common exceptions and their causes.
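A sketch of writing to a file with some of the exceptions that can occur. The helper write_report and its messages are assumptions for illustration; which exception is raised for a given problem can vary by operating system.

```python
import os
import tempfile


def write_report(path, lines):
    """Write each item of lines to path, returning a message describing the result."""
    try:
        with open(path, "w") as writer:
            for line in lines:
                writer.write("{}\n".format(line))
        return "Complete"
    except IsADirectoryError:
        return "path is a directory"
    except PermissionError:
        return "not allowed to write to path"
    except OSError as error:
        # for example, the parent directory does not exist
        return "could not write file: {}".format(error)


directory = tempfile.mkdtemp()
output = os.path.join(directory, "out.txt")
print(write_report(output, [1, 2]))  # Complete
```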
It is important to be able to easily grasp both the design choices and the code structure of a project, even long after it has been completed. The documentation process starts by commenting the code. Code comments are usually intended for software developers and aim at clarifying the code by giving details of how it works. They are usually written as inline or multi-line comments using the language’s syntax.
As we’ve seen before, we can add single-line comments to our Python programs using a hash symbol # before a line in our source file:
Finally, Python also includes a secondary type of comment that spans multiple lines, specifically for creating documentation. A docstring is usually the first line of text inside of a class or method definition, and is surrounded by three double quotes """ with one set of three on each end.
These comments are specifically designed to provide information about classes and methods in our code. Here’s a quick example using a simple class:
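A sketch showing both kinds of comments on a simple class. The class name Counter is hypothetical and the docstring wording follows no particular format.

```python
# A single-line comment starts with a hash symbol.

class Counter:
    """Keep a running count that starts at zero."""

    def __init__(self):
        """Create a counter with a count of zero."""
        self.count = 0  # comments can also follow code on the same line

    def increment(self):
        """Add one to the count and return the new value."""
        self.count += 1
        return self.count


# Docstrings are stored with the object and can be read at run time.
print(Counter.__doc__)  # Keep a running count that starts at zero.
```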
Unfortunately, Python does not enforce a particular style for these docstrings, so there are many different formats used in practice. To learn more, we can consult the following references.
To make your code easier to read, many textbooks and companies use a style guide that defines some of the formatting rules that you should follow in your source code. In Python, these rules are very important, as the structure of your code is defined by the layout. We’ll learn more about that in a later module.
For this book, most of the examples will be presented using the guidelines in the Style Guide for Python. However, by default Codio used to use 2 spaces for an indentation level instead of 4, so that is the format that will be used in some examples in this book.
Google also provides a comprehensive style guide that is recommended reading if you’d like to learn more about how to format your source code.
Codio also includes a special assessment that will validate the coding style of your code based on the Google style guide. Use the assessment below to make sure your solution to Exercise 1 meets the expected coding style standard.
We will also be enforcing this style on many projects and assignments in this class, so it is very important to become familiar with proper coding style!
Let’s build another sample program to review the content we’ve covered thus far.
Write a program that accepts three files as command line arguments. The first two represent input files, and the third one represents the desired output file. If there aren’t three arguments provided, either input file is not an existing file, or the output file is an existing directory, print “Invalid Arguments” and exit the program. The output file may be an existing file, since it will be overwritten.
The program should open each input file and read the contents. Each input file will consist of a list of numbers, one per line. The numbers may be integers or floating point numbers. If there are any errors parsing the contents of either file, the program should print “Invalid Input” and exit. As the input is read, the program should keep track of both the count and sum of all positive and negative inputs.
Once all input is read, the program should open the output file and print the following four items, in this order, one per line: number of positive inputs, sum of positive inputs, number of negative inputs, sum of negative inputs.
Finally, when the program is done, it should simply print “Complete” to the terminal and exit. Don’t forget to close any open files! Your program must catch and handle all possible exceptions, printing either “Invalid Arguments” or “Invalid Input” as described above.
We can use any of the code examples on previous pages to help us complete this exercise.
This exercise uses a custom grading program to grade submissions, which will be used extensively throughout this course. The grading program will create two files in your work directory showing more detailed output. To open the HTML file as a webpage, right-click on it and select Preview Static. The log file may contain helpful debugging messages if your program experiences an unhandled exception.
That’s a quick overview of the basics we’ll need to know before starting the new content in this course. The next module will provide a quick review of object-oriented programming concepts as well as the model-view-controller or MVC architecture, both of which will be used heavily in this course.
This page is the main page for Review Object Oriented Programming
^[File:CPT-OOP-inheritance.svg. (2014, June 26). Wikimedia Commons, the free media repository. Retrieved 01:22, January 14, 2020 from https://commons.wikimedia.org/w/index.php?title=File:CPT-OOP-inheritance.svg&oldid=127549650.]
Object-oriented programming uses the idea of objects and classes to provide many improvements over other programming paradigms. The key concept of object-oriented programming - encapsulation - allows our data and the operations that manipulate that data to be bundled together within a single object.
Functions are small pieces of reusable code that allow you to divide complex programs into smaller subprograms. Ideally, functions perform a single task and return a single value. (It should be noted that some programming languages allow for procedures, which are similar to functions but return no values. Except for the return value, it is safe to group them with functions in our discussion below.)
Functions can be thought of as black boxes. When we talk about black boxes we mean that users cannot look inside the box to see how it actually works. A good example of a black box is a soda machine. We all use them and know how to operate them, but very few of us actually know how they work inside. Nor do we really want to know. We are happy to simply use the machine and have it give us a nice cold soda when we are thirsty!
To be able to reuse functions easily, it is important to define what a function does and how it should be called.
Before we can call a function, we must know the function’s signature. A function’s signature includes the following.
A signature allows us to actually call the function in code. Of course, to use functions effectively, we must also know exactly what the function is supposed to do. We will talk more about how we do this in the next module on programming by contract. For now, we can assume that we just have a good description of what the function does.
While we do not need to know exactly how a function actually performs its task, the algorithm used to implement the function is vitally important as well. We will spend a significant amount of time in this course designing such algorithms.
The lifecycle of a function is as follows.
When the function is called, the arguments, or actual parameters, are copied to the function’s formal parameters and program execution jumps from the “call” statement to the function. When the function finishes execution, execution resumes at the statement following the “call” statement.
In general, parameters are passed to functions by value, which means that the value of the calling program’s actual parameter is copied into the function’s formal parameter. This allows the function to modify the value of the formal parameter without affecting the actual parameter in the calling program.
However, when passing complex data structures such as objects, the parameters are passed by reference instead of by value. In this case, a pointer to the parameter is passed instead of a copy of the parameter value. Passing a pointer allows the function to actually make changes to the calling program’s actual parameter.
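The practical effect can be sketched in Python. Python's own parameter passing is usually described as passing object references, so this is an approximation of the by-value and by-reference distinction above rather than an exact match. The function names are illustrative.

```python
def add_one(number):
    number = number + 1     # rebinds the formal parameter only
    return number


def append_one(items):
    items.append(1)         # changes the object the caller also refers to


count = 5
add_one(count)              # the caller's count is not affected
numbers = []
append_one(numbers)         # the caller's list is changed

print(count)    # 5
print(numbers)  # [1]
```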
As you might guess from the name, object-oriented programming languages are made to create and manipulate entities called objects. But what exactly are these objects? Objects were created to help decompose large complex programs with a lot of complex data into manageable parts.
An object is a programming entity that contains related data and behavior.
A good example of an object is a dog. But not just any dog, or all dogs, but a specific dog. Each dog has specific characteristics that are captured as data such as their name, their height, their weight, their breed, their age, etc. We call these characteristics attributes and all dogs have the same type of attributes, although the values of those attributes may be unique. And generally, all dogs exhibit the same behaviors, or methods. Almost all dogs can walk, run, bark, eat, etc.
So, how do we define the basic attributes and behaviors of a dog? We probably start with some kind of idea of what a dog is. How do we describe dogs in general? In object orientation we do that through classes.
A class is a blueprint for an object.
What do we use blueprints for? Well, when we are building a physical structure such as a home or office building, an architect first creates a blueprint that tells the builder what to build and how everything should fit together. That is essentially what a class does. A class describes the types of attributes and methods that an object of that class will have.
Then, to create objects, we create an instance of a class by calling the class’s constructor method, which creates an object instance in memory and makes sure its attributes are properly created. Once the object has been created, the methods defined by the class can be used to manipulate the attributes and internal data of the object.
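The dog example can be sketched in Python as a class with a constructor and one method. The particular attributes chosen and the sample dogs are only illustrative.

```python
class Dog:
    """Blueprint describing the attributes and methods every dog object has."""

    def __init__(self, name, breed, age):
        # The constructor creates the attributes for one specific dog.
        self.name = name
        self.breed = breed
        self.age = age

    def bark(self):
        return "{} says woof".format(self.name)


# Two instances of the same class, each with its own attribute values.
fido = Dog("Fido", "Beagle", 3)
rex = Dog("Rex", "Collie", 5)

print(fido.bark())  # Fido says woof
print(rex.age)      # 5
```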
Two of the most powerful concepts in object orientation are encapsulation and information hiding.
Encapsulation enables information hiding, and information hiding allows us to simplify the interface used to interact with an object. Instead of needing to know everything about a particular class of objects in order to use or interact with those objects, we only need to know their interface. This will make our programs less complex and easier to implement and test. It also makes it easier for you to change the internal implementations of methods without affecting the rest of your program. As long as the method behaves in the same way (i.e., produces the same outputs for a given set of inputs), the rest of your program will not be affected. Thus, we see two key parts of any class:
Encapsulation and information hiding are actually all around us. Take for example, a soda vending machine. There are many internal parts to the machine. However, as a user, we care little about how the machine works or what it does inside. We need to simply know how to insert money or swipe our card and press a couple of buttons to get the soda we desire. If a repair is needed and an internal motor is replaced, we don’t care whether they replaced the motor with the exact same type of motor or the new model. As long as we can still get our soda by manipulating the same payment mechanisms and buttons, we are happy. You and I care only about the interface to the machine, not the implementation hiding inside.
To implement information hiding in our classes, we use visibility. In general, attributes and methods can either be public or private. If we want an attribute or method to be part of the class interface, we define it as public. If we want to hide an attribute or method from external objects, we define it as private. An external object may access public attributes and call public methods, which is similar to using the payment mechanism or the buttons on a soda machine. However, the internals of how the object works are hidden by private attributes and methods, which are equivalent to the internal workings of the soda machine.
To implement information hiding, we recommend that you declare all attributes of a class as private. For any attribute whose value should be readable or changeable by an external object, create special “getter” and “setter” methods that access that private variable. This way, you can make changes to the implementation of the attribute without changing how it is accessed by the external object.
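A sketch of getter and setter methods in Python. Python does not enforce private visibility, so a leading underscore is used here as the usual convention for marking an attribute as private. The class and the validation rule are assumptions for illustration.

```python
class Dog:
    def __init__(self, name, age):
        self._name = name   # leading underscore: private by convention
        self._age = age

    def get_name(self):
        return self._name

    def get_age(self):
        return self._age

    def set_age(self, age):
        # The setter can protect the attribute from invalid values.
        if age < 0:
            raise ValueError("age cannot be negative")
        self._age = age


dog = Dog("Fido", 3)
dog.set_age(4)
print(dog.get_name(), dog.get_age())  # Fido 4
```

Because external code only calls get_age() and set_age(), the way the age is stored could later change without affecting that code.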
Polymorphism is a concept that describes the fact that similar objects tend to behave in similar ways, even if they are not exactly alike. For example, we might have a set of shapes such as a square, a circle, and a rhombus. Each shape shares certain attributes, like having an area and a perimeter. However, each shape is also unique and may have a differing number of sides and angles between those sides, or in the case of a circle, a diameter. We describe this relationship by saying a circle (or rectangle, or rhombus) “is a” shape as shown in the figure below.
Inheritance is a mechanism that captures polymorphism by allowing classes to inherit the methods and attributes from another class. The basic purpose of inheritance is to reuse code in a principled and organized manner. We generally call the inheriting class the subclass or child class, while the class it inherits from is called the superclass or parent class.
Basically, when class ‘B’ inherits from class ‘A’, all the methods and attributes of class ‘A’ are automatically available in class ‘B’. Class ‘B’ can then add additional methods or attributes to extend class ‘A’, or override the implementations of methods in class ‘A’ to specialize it.
When programming, we use inheritance to implement polymorphism. In our shape example, we would have a generic (or abstract) Shape class, which is inherited by a set of more specific shape classes (or specializations) as shown below.
In this example, the Shape class defines the ‘color’ attribute and the ‘getArea’ and ‘getCircumference’ methods, which are inherited by the Rectangle, Circle, and Rhombus classes. Each of the subclasses defines additional attributes that are unique to the definition of each shape type.
Notice that although the Shape class defines the signatures for the ‘getArea’ and ‘getCircumference’ methods, it cannot define the implementation of the methods, since this is unique to each subclass shape. Thus, each subclass shape will specialize the Shape class by implementing its own ‘getArea’ and ‘getCircumference’ methods.
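The Shape hierarchy can be sketched in Python using an abstract base class. The method names are written in Python style (get_area instead of getArea), only two of the subclasses are shown, and the attribute names width, height, and radius are assumptions, since the class diagram is not reproduced here.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract superclass: defines signatures but no implementations."""

    def __init__(self, color):
        self.color = color

    @abstractmethod
    def get_area(self):
        pass

    @abstractmethod
    def get_circumference(self):
        pass


class Rectangle(Shape):
    def __init__(self, color, width, height):
        super().__init__(color)      # reuse the inherited color attribute
        self.width = width
        self.height = height

    def get_area(self):
        return self.width * self.height

    def get_circumference(self):
        return 2 * (self.width + self.height)


class Circle(Shape):
    def __init__(self, color, radius):
        super().__init__(color)
        self.radius = radius

    def get_area(self):
        return math.pi * self.radius ** 2

    def get_circumference(self):
        return 2 * math.pi * self.radius


# Each object "is a" Shape, so the same calls work on all of them.
shapes = [Rectangle("red", 2, 3), Circle("blue", 1)]
for shape in shapes:
    print(shape.color, shape.get_area(), shape.get_circumference())
```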
So far, we have discussed single inheritance, which occurs when a class has only one superclass. However, theoretically, a class may inherit from more than one superclass, which is termed multiple inheritance. While a powerful mechanism, multiple inheritance also introduces complexity into understanding and implementing programs. There is also always the possibility that attributes and methods from the various superclasses contradict each other in the subclass.
For object ‘a’ to be able to call a method in object ‘b’, object ‘a’ must have a reference to (a pointer to, or the address of) object ‘b’. In many cases, objects ‘a’ and ‘b’ will be in a long-term relationship so that one or both objects will need to store the reference to the other in an attribute. When an object holds a reference to another object in an attribute, we call this a link. Examples of such relationships include a driver owning a car, a person living at an address, or a worker being employed by a company.
As we discussed earlier, objects are instances of classes. To represent this in a UML class diagram, we use the notion of an association, which is shown as a line connecting two classes. More precisely, a link is an instance of an association. The figure below shows three examples of an association between class ‘A’ and class ‘B’.
The top example shows the basic layout of an association in UML. The line between the two classes denotes the association itself. The diagram specifies that ‘ClassA’ is associated with ‘ClassB’ and vice versa. We can name the association as well as place multiplicities on the relationships. The multiplicities show exactly how many links an object of one class must have to objects of the associated class. The general form of a multiplicity is ’n .. m’, which means that an object must store at least ’n’, but no more than ’m’ references to other objects of the associated class; if only one number is given such as ’n’, then the object must store exactly ’n’ references to objects in the associated class.
There are two basic types of associations.
The middle example shows a two-way association between ‘ClassA’ and ‘ClassB’. Furthermore, each object of ‘ClassA’ must have a link to exactly three objects of ‘ClassB’, while each ‘ClassB’ object must have a link with exactly one ‘ClassA’ object. (Note that the multiplicity that constrains ‘ClassA’ is located next to ‘ClassB’, while the multiplicity that constrains ‘ClassB’ is located next to ‘ClassA’.)
The bottom example shows a one-way association between ‘ClassA’ and ‘ClassB’. In this case, ‘ClassA’ must have links to either zero or one objects of ‘ClassB’. Since it is a one-way association, ‘ClassB’ will have no links to objects of ‘ClassA’.
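Links and associations can be sketched in Python by storing references in attributes. The classes below borrow the person/address and worker/company examples mentioned earlier; the attribute and method names are hypothetical.

```python
class Address:
    def __init__(self, street):
        self.street = street     # one-way: Address holds no link back


class Person:
    def __init__(self, name):
        self.name = name
        self.address = None      # 0..1 link to an Address


class Company:
    def __init__(self, name):
        self.name = name
        self.workers = []        # links to any number of Worker objects


class Worker:
    def __init__(self, name):
        self.name = name
        self.employer = None     # link back to the Company

    def join(self, company):
        # Two-way association: both objects store a link to the other.
        self.employer = company
        company.workers.append(self)


person = Person("Ada")
person.address = Address("123 Main St")

company = Company("Acme")
worker = Worker("Grace")
worker.join(company)

print(person.address.street)     # 123 Main St
print(worker.employer.name)      # Acme
print(company.workers[0].name)   # Grace
```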
This page is the main page for Java
In Java, each piece of code is broken down into functions, which are individual routines that we can call in our code. Let’s review how to create functions in Java.
The table below lists the flowchart blocks used to represent functions, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Declare Function | (flowchart image) | |
Call Function | (flowchart image) | |
In general, a function declaration in Java needs a few elements. Let’s start at the simplest case:
static void foo(){
  System.out.println("Foo");
  return;
}
Let’s break this example function declaration down to see how it works:
First, we use the keyword static at the beginning of this function declaration. That keyword allows us to use this function without creating an object first. We’ll cover how to create and work with objects in a later module. For now, each function we create will need the static keyword in front of it, just like the main() function.
Then, the second keyword, void, determines the type of data returned by the function. We use a special keyword void when the function does not return a value. We’ve already seen this keyword used in our declaration of the main function.
Next, we have the name of the function, foo. We can name a function using any valid identifier in Java. By convention, function names in Java start with a lowercase letter.
Following the function name, we see a set of parentheses () that list the parameters for this function. Since there is nothing included in this example, the function foo does not require any parameters.
After that, we see a set of curly braces {} that surround the code of the function itself. In this case, the function will simply print Foo to the terminal.
Finally, the function ends with the return keyword. Since we aren’t returning a value, we aren’t required to include a return keyword in the function. However, it is helpful to know that we may use that keyword to exit the function at any time.
Once that function is created, we can call it using the following code:
foo();
In a more complex case, we can declare a function that accepts parameters and returns a value, as in this example:
static int countLetters(String input, char letter){
  int output = 0;
  for(int i = 0; i < input.length(); i++){
    if(input.charAt(i) == letter){
      output++;
    }
  }
  return output;
}
In this example, the function accepts two parameters: input, which is a string, and letter, which is a character. It also declares that it will return an int value.
We can use the parameters just like any other variable in our code. To return a value, we use the return keyword, followed by the value or variable containing the value we’d like to return.
To call a function that requires parameters, we can include values as arguments in the parentheses of the function call:
sum += countLetters("The quick brown fox jumped over the lazy dog", 'e');
Java allows us to create multiple functions using the same name, or identifier, as long as they have different parameters. This could include a different number of parameters, different data types for each parameter, or a different ordering of types. The names of the parameters, however, do not matter here. This is called function overloading.
For example, we could create a function named max() that could take either two or three parameters:
public class Overloading{
  public static void main(String[] args){
    max(2, 3);
    max(3, 4, 5);
  }
  static void max(int x, int y){
    if(x >= y){
      System.out.println(x);
    }else{
      System.out.println(y);
    }
  }
  static void max(int x, int y, int z){
    if(x >= y){
      if(x >= z){
        System.out.println(x);
      }else{
        System.out.println(z);
      }
    }else{
      if(y >= z){
        System.out.println(y);
      }else{
        System.out.println(z);
      }
    }
  }
}
In this example, we have two functions named max(), one that requires two parameters, and another that requires three. When Java sees a function call to max() elsewhere in the code, it will look at the number and types of arguments provided, and use that information to determine which version of max() it should use.
Of course, we could just use the three argument version of max() in both cases:
public class Overloading{
  public static void main(String[] args){
    max(2, 3);
    max(3, 4, 5);
  }
  static void max(int x, int y){
    max(x, y, y);
  }
  static void max(int x, int y, int z){
    if(x >= y){
      if(x >= z){
        System.out.println(x);
      }else{
        System.out.println(z);
      }
    }else{
      if(y >= z){
        System.out.println(y);
      }else{
        System.out.println(z);
      }
    }
  }
}
In this case, we are calling the three parameter version of max() from within the two parameter version. In effect, this allows us to define default parameters for functions such as this. If we only provide two arguments, the code will automatically call the three parameter version, filling in the third argument for us.
Unfortunately, Java does not support any other way of defining default parameters, but we can use function overloading to achieve something similar, as demonstrated above.
Finally, Java allows us to define a single parameter that is a variable length parameter. In essence, it will allow us to accept anywhere from 0 to many arguments for that single parameter, which will then be stored in an array. Let’s look at an example:
public class Overloading{
  public static void main(String[] args){
    max(2, 3);
    max(3, 4, 5);
    max(5, 6, 7, 8);
    max(10, 11, 12, 13, 14, 15, 16);
  }
  static void max(int ... values){
    if(values.length > 0){
      int max = values[0];
      for(int i : values){
        if(i > max){
          max = i;
        }
      }
      System.out.println(max);
    }
  }
}
Here, we have defined a function named max() that accepts a single variable length parameter. To show a parameter is variable length we use three periods ... between the type and the variable name. We must respect three rules when creating a variable length parameter:
So, when we run this program, we see that we can call the max() function with any number of integer arguments, and it will be able to determine the maximum of those values. Inside of the function itself, values can be treated just like an array of integers.
Before we learn about classes and objects, let’s do a quick exercise to review how to create and use functions in our code.
Write a program that accepts input from a file provided as a command-line argument. If an incorrect number of arguments are provided, or if the program is unable to open the file, it should print “Invalid Arguments” and terminate.
The program’s input will consist of a list of 100 integers, one per line. If any line of the input cannot be converted to an integer, the program should print “Invalid Input” and terminate.
The program should determine whether the list of integers is considered a mathematical set. That is, each item in the list should be unique, with no duplicate numbers. If the input is not a set, it should print “Not a set” and terminate.
If the input is a set, then the program should print the sum of the values in the set and then terminate.
This program should consist of three functions:
main(args) - The main function that controls the program. It should accept an array of strings representing the command-line arguments to the program.
isSet(int[] numbers) - A function to determine if the given array is a set. The input should be a single array of integers, and the return value should be a Boolean value.
sumSet(int[] numbers) - A function to find the sum of all the elements in the given array. The input should be a single array of integers, and the return value should be an integer.
There may be easier ways of determining if an array contains duplicate items, but could we simply check that, for each item, there is only one of that item in the list?
This exercise uses a custom grading program to grade submissions, which will be used extensively throughout this course. The first step of the grading process will examine the structure of your code, making sure that it contains the correct classes and functions. The second step will directly examine each function in the program, making sure that they operate as expected. You are welcome to include any additional code to complete the project that is not specified above.
Each step of the grading process will create two files in your work directory showing more detailed output. To open the HTML file as a webpage, right-click on it and select Preview Static. The log file may contain helpful debugging messages if your program experiences an unhandled exception.
import java.util.Scanner;
import java.nio.file.Paths;
import java.nio.file.InvalidPathException;
import java.nio.file.NoSuchFileException;
import java.io.IOException;
public class Functions{
    public static void main(String[] args){
        // exactly one argument, the input file, is expected
        if(args.length != 1){
            System.out.println("Invalid Arguments");
            return;
        }
        try(
            Scanner scanner1 = new Scanner(Paths.get(args[0]));
        ){
            // the input is specified to contain exactly 100 integers
            int[] nums = new int[100];
            int i = 0;
            while(scanner1.hasNext()){
                String line = scanner1.nextLine().trim();
                // throws NumberFormatException if the line is not an integer
                int input = Integer.parseInt(line);
                nums[i++] = input;
            }
            if(isSet(nums)){
                System.out.println(sumSet(nums));
            }else{
                System.out.println("Not a set");
            }
        }catch(InvalidPathException e){
            System.out.println("Invalid Arguments");
            return;
        }catch(NoSuchFileException e){
            System.out.println("Invalid Arguments");
            return;
        }catch(IOException e){
            System.out.println("Invalid Arguments");
            return;
        }catch(NumberFormatException e){
            System.out.println("Invalid Input");
            return;
        }
    }
    // returns true if no value appears more than once in nums
    static boolean isSet(int[] nums){
        for(int i : nums){
            // count how many times i appears in the whole array
            int count = 0;
            for(int j : nums){
                if(i == j){
                    count++;
                }
            }
            if(count > 1){
                return false;
            }
        }
        return true;
    }
    // returns the sum of all values in nums
    static int sumSet(int[] nums){
        int sum = 0;
        for(int i : nums){
            sum += i;
        }
        return sum;
    }
}
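The exercise hints that there may be easier ways to detect duplicates than counting each item. One possibility, sketched here under the assumption that the standard java.util.HashSet class may be used, makes a single pass over the array. The class name SetCheck is hypothetical and is not part of the exercise.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; only the isSet() signature mirrors the exercise.
class SetCheck {
    // Returns true if no value appears more than once in nums.
    static boolean isSet(int[] nums) {
        Set<Integer> seen = new HashSet<>();
        for (int n : nums) {
            // add() returns false when the value is already in the set
            if (!seen.add(n)) {
                return false;
            }
        }
        return true;
    }
}
```

The nested-loop version compares every pair of items, while this version looks each item up once, which tends to matter more as the input grows.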
In programming, a class describes an individual entity or part of the program. In many cases, the class can be used to describe an actual thing, such as a person, a vehicle, or a game board, or a more abstract thing such as a set of rules for a game, or even an artificial intelligence engine for making business decisions.
In object-oriented programming, a class is the basic building block of a larger program. Typically each part of the program is contained within a class, representing either the main logic of the program or the individual entities or things that the program will use.
We can represent the contents of a class in a UML Class Diagram. Below is an example of a class called Person:
Throughout the next few pages, we will realize the design of this class in code.
To create a class in Java, we can simply use the class keyword at the beginning of our file:
public class Person{
}
As we’ve already learned, each class declaration in Java includes these parts:
- public - this keyword is used to identify that the item after it should be publicly accessible to all other parts of the program. Later in this chapter, we’ll discuss other keywords that could be used here.
- class - this keyword says that we are declaring a new class.
- Person - this is an identifier that gives us the name of the class we are declaring.
Following the declaration, we see a set of curly braces {}, inside of which will be all of the fields and methods stored in this class.
Because this class is declared public, Java requires it to be stored in a file called Person.java.
Of course, our classes are not very useful at this point because they don’t include any attributes or methods. Including attributes in a class is one of the simplest uses of classes, so let’s start there.
To add an attribute to a class, we can simply declare a variable inside of our class declaration:
public class Person{
    String lastName;
    String firstName;
    int age;
}
That’s really all there is to it! We can also add default values to these attributes by assigning a value to the variable in the same line as the declaration:
public class Person{
    String lastName = "Person";
    String firstName = "Test";
    int age = 25;
}
However, it is very important to note that we cannot declare an attribute and then set the default value on a separate line. So, code such as this is not allowed:
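As a minimal sketch of this rule, the class below marks the rejected form in a comment and shows two forms the compiler accepts. The class name PersonDefaults is hypothetical, and the constructor it uses is a feature covered later in this chapter.

```java
// Hypothetical class used only to show where default values may be assigned.
class PersonDefaults {
    String lastName = "Person";   // allowed: value assigned in the declaration
    String firstName;
    // firstName = "Test";        // not allowed: a bare assignment statement
    //                            // cannot appear directly in the class body
    int age;

    PersonDefaults() {
        // allowed: assignments inside a constructor (or any other method)
        this.firstName = "Test";
        this.age = 25;
    }
}
```

Uncommenting the marked line should produce a compile error, since only declarations, methods, and similar members may appear at the top level of a class body.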
Finally, we can add either the public keyword to the beginning of each of these attributes to make them available to code outside of this class, or the private keyword to prevent other code from accessing those attributes directly. We denote this by adding a + in front of the attribute in our UML diagram for public attributes, and a - for private attributes. In the diagram above, each attribute is private, so we’ll do that in our code:
public class Person{
    private String lastName;
    private String firstName;
    private int age;
}
We can also add methods to our classes. These methods are used either to modify the attributes of the class or to perform actions based on the attributes stored in the class. Finally, we can even use those methods to perform actions on data provided as arguments. In essence, the sky is the limit with methods in classes, so we’ll be able to do just about anything we need to do in these methods. Let’s see how we can add methods to our classes.
To add a method to our class, we can simply add a function declaration inside of our class. In fact, all of the functions we’ve been creating up to this point have been inside of a class. The only difference is that we’ll now be able to remove the static keyword from our function declarations. We’ll discuss more about exactly what that keyword does later in this chapter.
public class Person{
    private String lastName;
    private String firstName;
    private int age;
    public String getLastName(){ return this.lastName; }
    public String getFirstName(){ return this.firstName; }
    public int getAge(){ return this.age; }
    private void setAge(int age){ this.age = age; }
    public void happyBirthday(){
        this.setAge(this.getAge() + 1);
    }
}
In this example, the first four methods are getter and setter methods. We have three public getter methods that allow us to access the values stored in our private attributes in a read-only way. In addition, we have created a private setter method for the age attribute. This isn’t technically required, since we can always just change it directly from within our code, but it is a good practice to include one.
Lastly, we have created a happyBirthday() method that uses getters and setters to update the person’s age by 1 year.
We’ve already discussed variable scope earlier in this course. Recall that variables declared inside of a block are not accessible outside of the block. Similarly, two different functions may reuse variable names, because they are in different scopes.
The same applies to classes. A class may have an attribute named age, but a method inside of the class may also declare a local variable named age. Therefore, we must be careful to make sure that we access the correct variable, usually by using the this keyword to access the attribute variable. Here’s a short example:
public class Test{
    int age = 15;
    void foo(){
        int age = 12;
        System.out.println(age); // 12
        System.out.println(this.age); // 15
    }
    void bar(){
        System.out.println(age); // 15
    }
}
As we can see, in the method foo() we must be careful to use this.age to refer to the attribute, since there is another variable named age declared in that method. However, in the method bar() we see that age automatically references the attribute, since there is no other variable named age defined in that scope.
This can lead to some confusion in our code. So, we should always get in the habit of using this to refer to any attributes, just to avoid any unintended problems later on.
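One place this habit pays off is when a method parameter shares its name with an attribute. The sketch below uses a hypothetical Counter class (not part of the chapter's examples) to contrast a setter that forgets this with one that uses it.

```java
// Hypothetical class showing why "this" matters when names collide.
class Counter {
    private int count = 0;

    // Bug: both names refer to the parameter, so the attribute never changes.
    void setCountBroken(int count) {
        count = count;
    }

    // Correct: this.count is the attribute, count is the parameter.
    void setCount(int count) {
        this.count = count;
    }

    int getCount() {
        return this.count;
    }
}
```

The broken version compiles without errors, which is why this kind of mistake can be hard to spot.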
A constructor is a special method that is called whenever a new instance of a class is created. It is used to set the initial values of attributes in the class. We can even accept parameters as part of a constructor, and then use those parameters to populate attributes in the class.
Let’s go back to the Person class example we’ve been working on and add a simple constructor to that class:
public class Person{
    private String lastName;
    private String firstName;
    private int age;
    public String getLastName(){ return this.lastName; }
    public String getFirstName(){ return this.firstName; }
    public int getAge(){ return this.age; }
    private void setAge(int age){ this.age = age; }
    public Person(String lastName, String firstName, int age){
        this.lastName = lastName;
        this.firstName = firstName;
        this.age = age;
    }
    public void happyBirthday(){
        this.setAge(this.getAge() + 1);
    }
}
Inside that constructor, notice that we use each parameter to set the corresponding attribute, using the this keyword once again to refer to the current object.
Now that we have fully constructed our class, we can use it elsewhere in our code through the process of instantiation. In Java, we use the new keyword to create a new instance of a class, which calls the constructor, and then we can use dot-notation to access any attributes or methods inside of that object.
Person john = new Person("Smith", "John", 25);
System.out.println(john.getLastName());
john.happyBirthday();
We can also build classes that inherit attributes and methods from another class. This allows us to build more complex structures in our code, better representing the relationships between real world objects.
As we learned earlier in this chapter, we can represent an inheritance relationship with an open arrow in our UML diagrams, as shown below:
In this diagram, the Student class inherits from, or is a subclass of, the Person class.
To show inheritance in Java, we can use the extends keyword after the class name in our class declaration, listing the parent class that we are inheriting from:
public class Student extends Person{
}
From there, we can quickly implement the code for each attribute and getter method in the new class:
public class Student extends Person{
    private int studentID;
    private int gradeLevel;
    public int getStudentID(){ return this.studentID; }
    public int getGradeLevel(){ return this.gradeLevel; }
}
Since the subclass Student also includes a definition for the method happyBirthday(), we say that that method has been overridden in the subclass. We can do this by simply creating the new method in the Student class, and prefixing it with the @Override annotation:
public class Student extends Person{
    private int studentID;
    private int gradeLevel;
    public int getStudentID(){ return this.studentID; }
    public int getGradeLevel(){ return this.gradeLevel; }
    @Override
    public void happyBirthday(){
        super.happyBirthday();
        this.gradeLevel += 1;
    }
}
Here, we are using the keyword super to refer to our parent class. In that way, we can still call the happyBirthday() method as defined in Person, but extend it by adding our own code as well.
In addition, we can use the super() method to call our parent class’s constructor. This must be done as the first line of our subclass’s constructor:
public class Student extends Person{
    private int studentID;
    private int gradeLevel;
    public int getStudentID(){ return this.studentID; }
    public int getGradeLevel(){ return this.gradeLevel; }
    public Student(String lastName, String firstName, int age, int studentID, int gradeLevel){
        super(lastName, firstName, age);
        this.studentID = studentID;
        this.gradeLevel = gradeLevel;
    }
    @Override
    public void happyBirthday(){
        super.happyBirthday();
        this.gradeLevel += 1;
    }
}
In addition to private and public, Java also includes a keyword protected. This modifier prevents external code from accessing attributes and methods, but will allow any subclasses to access them. (In Java, other classes in the same package can also access protected members.) In a UML diagram, the protected keyword is denoted by a hash symbol # in front of the attribute or method.
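To make the difference between private and protected concrete, here is a small sketch. The Animal and Dog classes and their attributes are hypothetical names chosen only for illustration.

```java
// Hypothetical parent class with one protected and one private attribute.
class Animal {
    protected String sound = "...";   // visible to subclasses
    private int id = 1;               // visible only inside Animal

    public String speak() { return this.sound; }
}

// Hypothetical subclass that changes the protected attribute.
class Dog extends Animal {
    public Dog() {
        this.sound = "Woof";   // allowed: protected attribute of the parent
        // this.id = 2;        // would not compile: id is private in Animal
    }
}
```

A subclass can therefore adjust inherited state directly when it is protected, while private state must be reached through methods that the parent class provides.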
Inheritance allows us to make use of polymorphism in our code. Loosely speaking, polymorphism allows us to store an instance of a class in a variable whose data type is any of its parent classes. By doing so, we can only access the methods and attributes defined by that data type, but any overridden methods will use the implementation from the child class.
Here’s a quick example:
Student steveStudent = new Student("Jones", "Steve", 19, 123456, 13);
Person stevePerson = (Person)steveStudent;
// Can access methods in Person
System.out.println(stevePerson.getFirstName());
// Cannot access methods in Student
System.out.println(stevePerson.getStudentID()); // will not compile
// Can call methods from Person
// This will use the code defined in Student,
// even though it is stored as a Person.
stevePerson.happyBirthday();
System.out.println(steveStudent.getGradeLevel()); // 14
Polymorphism is a very powerful tool in programming, and we’ll use it throughout this course as we develop complex data structures.
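A common use of polymorphism is storing objects of different subclasses in one array of the parent type. The sketch below uses simplified stand-ins for the Person and Student classes from this chapter, plus a Teacher class like the one in the exercise later in this chapter. The School class and its celebrateAll() method are hypothetical.

```java
// Simplified stand-in for the chapter's Person class (names omitted for brevity).
class Person {
    private int age;
    Person(int age) { this.age = age; }
    int getAge() { return this.age; }
    void happyBirthday() { this.age += 1; }
}

class Student extends Person {
    private int gradeLevel;
    Student(int age, int gradeLevel) { super(age); this.gradeLevel = gradeLevel; }
    int getGradeLevel() { return this.gradeLevel; }
    @Override
    void happyBirthday() { super.happyBirthday(); this.gradeLevel += 1; }
}

class Teacher extends Person {
    private int salary;
    Teacher(int age, int salary) { super(age); this.salary = salary; }
    int getSalary() { return this.salary; }
    @Override
    void happyBirthday() { super.happyBirthday(); this.salary += 1000; }
}

// Hypothetical helper class for this sketch.
class School {
    // Calls happyBirthday() on everyone; each object runs its own class's version.
    static void celebrateAll(Person[] people) {
        for (Person p : people) {
            p.happyBirthday();
        }
    }
}
```

The loop in celebrateAll() only knows that each item is a Person, yet a Student stored in the array still advances a grade level and a Teacher still receives a raise.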
The other important modifier we can use in Java is the static modifier. Again, we’ve seen this modifier each time we declare the main method in our programs, but we haven’t really been able to discuss exactly what it means. Thankfully, we now have the knowledge we need to talk about the static modifier.
In essence, the static modifier makes an attribute or method part of the class in which it is declared instead of part of objects instantiated from that class. If we think about it, the word static means “lacking in change”, and that’s sort of a good way to think about it.
First, we can use the static modifier with an attribute, attaching that attribute to the class instead of the instance. Here’s an example:
public class Stat{
    public static int x = 5;
    public int y;
    public Stat(int an_y){
        this.y = an_y;
    }
}
In this class, we’ve created a static attribute named x, and a normal attribute named y. Here’s a main() method that will help us explore how the static keyword operates:
public class Main{
    public static void main(String[] args){
        Stat someStat = new Stat(7);
        Stat anotherStat = new Stat(8);
        System.out.println(someStat.x); // 5
        System.out.println(someStat.y); // 7
        System.out.println(anotherStat.x); // 5
        System.out.println(anotherStat.y); // 8
        someStat.x = 10;
        System.out.println(someStat.x); // 10
        System.out.println(someStat.y); // 7
        System.out.println(anotherStat.x); // 10
        System.out.println(anotherStat.y); // 8
        Stat.x = 25;
        System.out.println(someStat.x); // 25
        System.out.println(someStat.y); // 7
        System.out.println(anotherStat.x); // 25
        System.out.println(anotherStat.y); // 8
    }
}
First, we can see that the attribute x is set to 5 as its default value, so both objects someStat and anotherStat contain that same value. Then we can update the value of x attached to someStat to 10, and we’ll see that both objects will now contain that value. That’s because the value is static, and there is only one copy of that value for all instances of the Stat class.
Finally, and most interestingly, since the attribute x is static, we can also access it directly from the class Stat, without even having to instantiate an object. So, we can update the value in that way, and it will take effect in any objects instantiated from Stat.
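One typical use of a shared static attribute is keeping a count across all instances of a class. The following is a sketch only. The Ticket class and its attribute names are hypothetical and do not appear elsewhere in this chapter.

```java
// Hypothetical class: a static attribute shared by every instance.
class Ticket {
    static int issued = 0;   // one copy for the whole class
    private int number;      // one copy per object

    Ticket() {
        // every new Ticket updates the same shared counter
        Ticket.issued += 1;
        this.number = Ticket.issued;
    }

    int getNumber() { return this.number; }
}
```

Each new Ticket sees the value left behind by the previous one, because issued belongs to the class rather than to any single object.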
We can also do the same for static methods.
public class Stat{
    public static int x = 5;
    public int y;
    public Stat(int an_y){
        this.y = an_y;
    }
    public static int sum(int a){
        return x + a;
    }
}
We have now added a static method sum() to our Stat class. The important thing to remember is that a static method cannot access any non-static attributes or methods, since it doesn’t have access to an instantiated object. Likewise, we cannot use the this keyword inside of a static method.
As a tradeoff, we can call a static method without instantiating the class either, as in this example:
public class Main{
    public static void main(String[] args){
        //other code omitted
        Stat.x = 25;
        Stat moreStat = new Stat(7);
        System.out.println(moreStat.sum(5)); // 30
        System.out.println(Stat.sum(5)); // 30
    }
}
This becomes extremely useful in our main() method. Since the main() method is always static, it can only access static attributes and methods in the class it is declared in. So, we can either create all of our additional methods in that class as static methods, or we can instantiate the class it is contained in.
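The second option, instantiating the class that contains main(), can be sketched as follows. The Greeter class and its members are hypothetical names used only for this illustration.

```java
// Hypothetical program class whose main() instantiates the class itself.
class Greeter {
    private String greeting = "Hello";          // non-static attribute

    String greet(String name) {                 // non-static method
        return this.greeting + ", " + name + "!";
    }

    public static void main(String[] args) {
        // Calling greet("World") directly would not compile here,
        // because main() is static and has no object to work with.
        Greeter g = new Greeter();              // instantiate the enclosing class
        System.out.println(g.greet("World"));
    }
}
```

Once the object g exists, main() can reach every non-static attribute and method through it.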
Another major feature of class inheritance is the ability to define a method in a parent class, but not provide any code that implements that function. In effect, we are saying that all objects of that type must include that method, but it is up to the child classes to provide the code. These methods are called abstract methods, and the classes that contain them are abstract classes. Let’s look at how they work!
In the UML diagram above, we see that the describe() method in the Vehicle class is printed in italics. That means that the method should be abstract, without any code provided. To do this in Java, we simply must use the abstract keyword on both the method and the class itself:
public abstract class Vehicle{
    private String name;
    protected double speed;
    public String getName(){ return this.name; }
    protected Vehicle(String name){
        this.name = name;
        this.speed = 1.0;
    }
    public double move(double distance){
        System.out.println("Moving");
        return distance / this.speed;
    }
    public abstract String describe();
}
Notice that the keyword abstract goes after the access modifier, but before the class keyword on a class declaration and the return type on a method declaration.
In addition, since we have declared the method describe() to be abstract, we must place a semicolon after the method declaration, without any curly braces. This is because an abstract method cannot include any code.
Now, any class that inherits from the Vehicle class must provide an implementation for the describe() method. If it does not, that class must also be declared to be abstract. So, for example, in the UML diagram above, we see that the MotorVehicle class does not include an implementation for describe(), so we’ll also have to make it abstract.
We can also declare a class to be abstract without including any abstract methods. By doing so, it prevents the class from being instantiated directly. Instead, the class can only be inherited from, and those child classes can choose to be instantiated by omitting the abstract keyword.
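The chain of abstract and concrete classes described above can be sketched as follows. Vehicle is trimmed down from the version shown earlier, and MotorVehicle is named in the text, but its wheels attribute and the concrete Car class are assumptions made for this illustration since the UML diagram is not reproduced here.

```java
// Trimmed-down version of the chapter's Vehicle class.
abstract class Vehicle {
    private String name;
    protected double speed;

    protected Vehicle(String name) {
        this.name = name;
        this.speed = 1.0;
    }

    public String getName() { return this.name; }
    public abstract String describe();
}

// Does not implement describe(), so it must also be declared abstract.
abstract class MotorVehicle extends Vehicle {
    protected int wheels;   // hypothetical attribute

    protected MotorVehicle(String name, int wheels) {
        super(name);
        this.wheels = wheels;
    }
}

// Hypothetical concrete subclass: it provides describe(), so it can be instantiated.
class Car extends MotorVehicle {
    public Car(String name) { super(name, 4); }

    @Override
    public String describe() {
        return this.getName() + " with " + this.wheels + " wheels";
    }
}
```

Writing new Vehicle("x") or new MotorVehicle("x", 2) would be rejected by the compiler, while a Car can be created and stored in a variable of type Vehicle.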
Let’s build a quick program following the MVC architecture style to review working with classes, objects, inheritance, and polymorphism.
Write a program to store a list of students and teachers at a school. The program should have methods to add a student or a teacher, as well as a method to print the entire list.
The program should conform to the following UML diagram:
The purpose of each method will be further described below.
Person Class
- Person() - constructor that initializes all attributes based on parameters
- getLastName() - getter for lastName attribute
- getFirstName() - getter for firstName attribute
- getAge() - getter for age attribute
- happyBirthday() - method to increase person’s age attribute by 1
- toString() - method that overrides the built-in Object class toString() method. It should return a string in the form "firstName lastName: age"
Student Class
- Student() - constructor that initializes all attributes (including in super class) based on parameters
- getStudentID() - getter for studentID attribute
- getGradeLevel() - getter for gradeLevel attribute
- happyBirthday() - method to increase student’s age and gradeLevel attribute by 1
- toString() - method that overrides the built-in Object class toString() method. It should return a string in the form "firstName lastName: age (studentID - gradeLevel)"
Teacher Class
- Teacher() - constructor that initializes all attributes (including in super class) based on parameters
- getClassroom() - getter for classroom attribute
- getSalary() - getter for salary attribute
- happyBirthday() - method to increase teacher’s age by 1 and salary attribute by 1000
- toString() - method that overrides the built-in Object class toString() method. It should return a string in the form "firstName lastName: age (classroom - $salary)"
View Class
- showMenu() - a method to show a menu of options to the user. The user should be prompted to input exactly one of the options "add student", "add teacher", "list people", or "exit", which is returned as a String. The wording of the menu is up to you. The method should return whatever was input by the user, without any error checking (that is done in the Controller).
- addStudent() - a method to add a new student to the system. The user should input a list of parameters for each attribute as they are listed in the constructor for Student, separated by spaces. The wording of the prompt is up to you. The method should return whatever was input by the user, without any error checking (that is done in the Controller).
- addTeacher() - a method to add a new teacher to the system. The user should input a list of parameters for each attribute as they are listed in the constructor for Teacher, separated by spaces. The wording of the prompt is up to you. The method should return whatever was input by the user, without any error checking (that is done in the Controller).
- listPeople() - a method to list all Person objects in the persons array given as a parameter. Each one should be prefixed by an index starting at 0, incrementing by one for each Person in the array. Remember that unused array slots will contain the value null, so that should be considered in your code.
- showError() - a method to display an error to the user. The parameter error should be printed to the screen, prefixed by “Error: ”.
Controller Class
- main() - the main method for this program. It should simply instantiate a new instance of the Controller class, and then call the run() method of that object.
- Controller() - the constructor for the Controller object. It initializes the persons attribute to an array with a maximum size of 20 items, as well as a View object stored in the view attribute.
- run() - this method consists of a loop that will execute the program until it is terminated. It will call the showMenu() method of the view to show a menu to the user (see above). Then, it will parse the string returned by the call to showMenu() and call additional appropriate methods in the Controller or View class to complete the operation. If the user inputs “exit” then it should terminate. Otherwise, the program will repeatedly display the menu to the user until “exit” is chosen. If at any time the user provides input that cannot be properly parsed, the controller should call the showError() method in the View class and restart the process (loop back to the beginning) by showing the menu again.
- addStudent() - this method will receive the string input by the user from the addStudent() method in View, parse the input, and call the appropriate methods to create a new Student object and add it to the first empty slot in the persons array.
- addTeacher() - this method will receive the string input by the user from the addTeacher() method in View, parse the input, and call the appropriate methods to create a new Teacher object and add it to the first empty slot in the persons array.
- getPersons() - this method will simply return the current persons attribute as an array. This is for testing purposes only.
- setPersons() - this method will replace the persons attribute with the array provided as a parameter. This is for testing purposes only.
A sample execution of the program is shown below.
public class Person{
    private String lastName;
    private String firstName;
    private int age;
    public Person(String lastName, String firstName, int age){
        this.lastName = lastName;
        this.firstName = firstName;
        this.age = age;
    }
    public String getLastName(){ return this.lastName; }
    public String getFirstName(){ return this.firstName; }
    public int getAge(){ return this.age; }
    public void happyBirthday(){
        this.age = this.age + 1;
    }
    @Override
    public String toString(){
        return this.firstName + " " + this.lastName + ": " + this.age;
    }
}
public class Student extends Person{
    private int studentID;
    private int gradeLevel;
    public Student(String lastName, String firstName, int age, int studentID, int gradeLevel){
        super(lastName, firstName, age);
        this.studentID = studentID;
        this.gradeLevel = gradeLevel;
    }
    public int getStudentID(){ return this.studentID; }
    public int getGradeLevel(){ return this.gradeLevel; }
    @Override
    public void happyBirthday(){
        super.happyBirthday();
        this.gradeLevel = this.gradeLevel + 1;
    }
    @Override
    public String toString(){
        return super.toString() + " (" + this.studentID + " - " + this.gradeLevel + ")";
    }
}
public class Teacher extends Person{
    private String classroom;
    private int salary;
    public Teacher(String lastName, String firstName, int age, String classroom, int salary){
        super(lastName, firstName, age);
        this.classroom = classroom;
        this.salary = salary;
    }
    public String getClassroom(){ return this.classroom; }
    public int getSalary(){ return this.salary; }
    // increases age by 1 and salary by 1000, as required by the description above
    @Override
    public void happyBirthday(){
        super.happyBirthday();
        this.salary = this.salary + 1000;
    }
    @Override
    public String toString(){
        return super.toString() + " (" + this.classroom + " - $" + this.salary + ")";
    }
}
import java.util.Scanner;
public class View{
    public String showMenu(){
        System.out.println("Please enter one of the following options:");
        System.out.println(" add student");
        System.out.println(" add teacher");
        System.out.println(" list people");
        System.out.println(" exit");
        try{
            Scanner scanner = new Scanner(System.in);
            String input = scanner.nextLine();
            return input;
        }catch(Exception e){
            return "";
        }
    }
    public String addStudent(){
        System.out.println("Please enter the following items for the new student, all on the same line");
        System.out.println("LastName FirstName Age StudentID GradeLevel");
        try{
            Scanner scanner = new Scanner(System.in);
            String input = scanner.nextLine();
            return input;
        }catch(Exception e){
            return "";
        }
    }
    public String addTeacher(){
        System.out.println("Please enter the following items for the new teacher, all on the same line");
        System.out.println("LastName FirstName Age Classroom Salary");
        try{
            Scanner scanner = new Scanner(System.in);
            String input = scanner.nextLine();
            return input;
        }catch(Exception e){
            return "";
        }
    }
    public void listPeople(Person[] persons){
        System.out.println("The school contains the following people:");
        int i = 0;
        for(Person p : persons){
            // unused array slots are null, so skip them
            if(p == null){
                continue;
            }
            System.out.println(i + ") " + p.toString());
            i++;
        }
    }
    public void showError(String error){
        System.out.println("Error: " + error);
    }
}
public class Controller{
    private Person[] persons;
    private View view;
    public static void main(String[] args){
        new Controller().run();
    }
    public Controller(){
        this.persons = new Person[20];
        this.view = new View();
    }
    public void run(){
        while(true){
            String input = view.showMenu();
            if(input.equals("add student")){
                addStudent(view.addStudent());
            }else if(input.equals("add teacher")){
                addTeacher(view.addTeacher());
            }else if(input.equals("list people")){
                view.listPeople(persons);
            }else if(input.equals("exit")){
                break;
            }else{
                view.showError("Invalid Input!");
            }
        }
    }
    public void addStudent(String input){
        String[] splits = input.split(" ");
        try{
            Person p = new Student(splits[0], splits[1], Integer.parseInt(splits[2]), Integer.parseInt(splits[3]), Integer.parseInt(splits[4]));
            store(p);
        }catch(Exception e){
            view.showError("Unable to parse input!");
        }
    }
    public void addTeacher(String input){
        String[] splits = input.split(" ");
        try{
            Person p = new Teacher(splits[0], splits[1], Integer.parseInt(splits[2]), splits[3], Integer.parseInt(splits[4]));
            store(p);
        }catch(Exception e){
            view.showError("Unable to parse input!");
        }
    }
    // Places p in the first empty (null) slot of persons. Searching for the slot,
    // rather than tracking a separate size counter, keeps this correct even after
    // setPersons() swaps in a different array.
    private void store(Person p){
        for(int i = 0; i < persons.length; i++){
            if(persons[i] == null){
                persons[i] = p;
                return;
            }
        }
        view.showError("Array full!");
    }
    public Person[] getPersons() { return persons; }
    public void setPersons(Person[] input) { persons = input; }
}
This page is the main page for Python
In Python, we can break our programs up into individual functions, which are individual routines that we can call in our code. Let’s review how to create functions in Python.
The table below lists the flowchart blocks used to represent functions, as well as the corresponding pseudocode:
Operation | Flowchart | Pseudocode |
---|---|---|
Declare Function | (flowchart image not available) | (not available) |
Call Function | (flowchart image not available) | (not available) |
In general, a function definition in Python needs a few elements. Let’s start at the simplest case:
def foo():
    print("Foo")
    return
Let’s break this example function definition down to see how it works:
- First, we use the keyword def at the beginning of this function definition. That keyword tells Python that we’d like to define a new function. We’ll need to include it at the beginning of each function definition.
- Next, we have the name of the function, foo. We can name a function using any valid identifier in Python. In general, function names in Python always start with a lowercase letter, and use underscores between the words in the function name if it contains multiple words.
- Following the function name, we see a set of parentheses () that list the parameters for this function. Since there is nothing included in this example, the function foo does not require any parameters.
- Finally, we see a colon : indicating that the indented block of code below this definition is contained within the function. In this case, the function will simply print Foo to the terminal.
- The function ends with the return keyword. Since we aren’t returning a value, we aren’t required to include a return keyword in the function. However, it is helpful to know that we may use that keyword to exit the function at any time.
Once that function is created, we can call it using the following code:
foo()
In a more complex case, we can declare a function that accepts parameters and returns a value, as in this example:
def count_letters(input, letter):
    output = 0
    for i in range(0, len(input)):
        if input[i] == letter:
            output += 1
    return output
In this example, the function accepts two parameters: input, which could be a string, and letter, which could be a single character. However, since Python does not enforce a type on these parameters, they could actually be any value. We could add additional code to this function that checks the type of each parameter and raises a TypeError if they are not the expected type.
We can use the parameters just like any other variable in our code. To return a value, we use the return keyword, followed by the value or variable containing the value we’d like to return.
To call a function that requires parameters, we can include values as arguments in the parentheses of the function call:
count = count_letters("The quick brown fox jumped over the lazy dog", "e")
Python allows us to specify default values for parameters in a function definition. In that way, if those parameters are not provided, the default value will be used instead. So, it may appear that there are multiple functions with the same name that accept a different number of parameters. Other languages achieve this through function overloading. Python does not support overloading directly, so default values are used to get similar behavior.
For example, we could create a function named max() that could take either two or three parameters:
def main():
    max(2, 3)
    max(3, 4, 5)
def max(x, y, z=None):
    if z is not None:
        if x >= y:
            if x >= z:
                print(x)
            else:
                print(z)
        else:
            if y >= z:
                print(y)
            else:
                print(z)
    else:
        if x >= y:
            print(x)
        else:
            print(y)
# main guard
if __name__ == "__main__":
    main()
In this example, we are calling max() with both 2 and 3 arguments from main(). When we only provide 2 arguments, the third parameter will be given the default value None, which is a special value in Python showing that the variable is empty. Then, we can use if z is not None as part of an If-Then statement to see if we need to take that variable into account in our code.
This example also introduces a new keyword, is. The is keyword in Python is used to determine if two variables are exactly the same object, not just the same value. In this case, we want to check that z is exactly the same object as None, not just that it has the same value. In Python, it is common to use the is keyword when checking to see if an optional parameter is given the value None. We’ll see this keyword again in a later chapter as we start dealing with objects.
Python also allows us to specify function arguments using keywords that match the name of the parameter in the function. In that way, we can specify the arguments we need, and the function can use default values for any unspecified parameters. Here’s a quick example:
def main():
    args(1) # 6
    args(1, 5) # 9
    args(1, c=5) # 8
    args(b=7, a=2) # 12
    args(c=5, a=2, b=3) # 10
def args(a, b=2, c=3):
    print(str(a + b + c))
# main guard
if __name__ == "__main__":
    main()
In this example, the args() function has one required parameter, a. It can either be provided as the first argument, known as a positional argument, or as a keyword argument like a=2. The other parameters, b and c, can either be provided as positional arguments or keyword arguments, but they are not required since they have default values.
Also, we can see that when we use keyword arguments we do not have to provide the arguments in the order they are defined in the function’s definition. However, any arguments provided without keywords must be placed at the beginning of the function call, and will be matched positionally with the first parameters defined in the function.
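The ordering rule can be checked directly. The sketch below uses a hypothetical function show() that returns its parameters so we can see how each argument was matched. It then uses the built-in compile() function to show that Python rejects a positional argument placed after a keyword argument before the code ever runs.

```python
def show(a, b=2, c=3):
    # Return the values so we can see how each argument was matched
    return (a, b, c)


print(show(1))            # (1, 2, 3)
print(show(1, c=9))       # (1, 2, 9)
print(show(c=9, a=1))     # (1, 2, 9)

# Supplying the same parameter twice is reported when the call runs
try:
    show(1, a=5)
except TypeError as error:
    print("TypeError:", error)

# A positional argument after a keyword argument is not valid syntax,
# so we compile the call from a string to observe the error safely
try:
    compile("show(a=1, 2)", "<example>", "eval")
except SyntaxError as error:
    print("SyntaxError:", error.msg)
```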
Finally, Python allows us to define a single parameter that is a variable length parameter. In essence, it allows us to accept anywhere from zero to many arguments for that single parameter, which are then collected into a tuple. Let’s look at an example:
def main():
    max(2, 3)
    max(3, 4, 5)
    max(5, 6, 7, 8)
    max(10, 11, 12, 13, 14, 15, 16)

def max(*values):
    if len(values) > 0:
        max = values[0]
        for value in values:
            if value > max:
                max = value
        print(max)

# main guard
if __name__ == "__main__":
    main()
Here, we have defined a function named max() that accepts a single variable length parameter. To show a parameter is variable length we use an asterisk * before the variable name. We must respect two rules when creating a variable length parameter:
So, when we run this program, we see that we can call the max() function with any number of arguments, and it will be able to determine the maximum of those values. Inside of the function itself, values is a tuple, which we can index and loop over just like a list.
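As a quick check of how the arguments are collected, here is a small sketch. The function names largest() and kind_of() are hypothetical; largest() returns its result instead of printing it so that the empty case is easy to see.

```python
def largest(*values):
    # values is a tuple holding every argument that was passed in
    if len(values) == 0:
        return None          # nothing to compare
    result = values[0]
    for value in values:
        if value > result:
            result = value
    return result


def kind_of(*values):
    # Report the type Python used to collect the arguments
    return type(values).__name__


print(kind_of(1, 2, 3))     # tuple
print(largest(2, 3))        # 3
print(largest())            # None

numbers = [5, 6, 7, 8]
print(largest(*numbers))    # 8
```

In the last call, the asterisk is used in the other direction: it unpacks an existing list into separate arguments.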
Before we learn about classes and objects, let’s do a quick exercise to review how to create and use functions in our code.
Write a program that accepts input from a file provided as a command-line argument. If an incorrect number of arguments are provided, or if the program is unable to open the file, it should print “Invalid Arguments” and terminate.
The program’s input will consist of a list of 100 integers, one per line. If any line of the input cannot be converted to an integer, the program should print “Invalid Input” and terminate.
The program should determine whether the list of integers is considered a mathematical set. That is, each item in the list should be unique, with no duplicate numbers. If the input is not a set, it should print “Not a set” and terminate.
If the input is a set, then the program should print the sum of the values in the set and then terminate.
This program should consist of three functions:
main(args) - The main function that controls the program. It should accept an array of strings representing the command-line arguments to the program.
is_set(numbers) - A function to determine if the given array is a set. The input should be a single array of integers, and the return value should be a Boolean value.
sum_set(numbers) - A function to find the sum of all the elements in the given array. The input should be a single array of integers, and the return value should be an integer.
Don’t forget to include a main guard at the end of the file that passes the contents of sys.argv as an argument to the main() function.
There may be easier ways of determining if an array contains duplicate items, but could we simply check that, for each item, there is only one of that item in the list?
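One of those easier ways, shown here only as a sketch, relies on Python's built-in set type, which discards duplicates. The function name is_set_fast() is hypothetical; the exercise itself still expects a function named is_set().

```python
def is_set_fast(numbers):
    # Converting to a set drops repeated values, so the two lengths
    # match exactly when the list had no duplicates to begin with
    return len(set(numbers)) == len(numbers)


print(is_set_fast([1, 2, 3]))   # True
print(is_set_fast([1, 2, 2]))   # False
```

The counting approach from the hint compares every item against every other item, while this version lets the set do the bookkeeping in a single pass.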
This exercise uses a custom grading program to grade submissions, which will be used extensively throughout this course. The first step of the grading process will examine the structure of your code, making sure that it contains the correct classes and functions. The second step will directly examine each function in the program, making sure that they operate as expected. You are welcome to include any additional code to complete the project that is not specified above.
Each step of the grading process will create two files in your work directory showing more detailed output. To open the HTML file as a webpage, right-click on it and select Preview Static. The log file may contain helpful debugging messages if your program experiences an unhandled exception.
import sys

def main(argv):
    if len(argv) != 2:
        print("Invalid Arguments")
        sys.exit()
    try:
        with open(argv[1]) as scanner1:
            nums = []
            for line in scanner1:
                line = line.strip()
                input = int(line)
                nums.append(input)
            if is_set(nums):
                print(sum_set(nums))
            else:
                print("Not a set")
    except FileNotFoundError:
        print("Invalid Arguments")
        return
    except IOError:
        print("Invalid Arguments")
        return
    except ValueError:
        print("Invalid Input")
        return

def is_set(nums):
    for i in nums:
        count = 0
        for j in nums:
            if i == j:
                count += 1
        if count > 1:
            return False
    return True

def sum_set(nums):
    sum = 0
    for i in nums:
        sum += i
    return sum

# main guard
if __name__ == "__main__":
    main(sys.argv)
In programming, a class describes an individual entity or part of the program. In many cases, the class can be used to describe an actual thing, such as a person, a vehicle, or a game board, or a more abstract thing such as a set of rules for a game, or even an artificial intelligence engine for making business decisions.
In object-oriented programming, a class is the basic building block of a larger program. Typically each part of the program is contained within a class, representing either the main logic of the program or the individual entities or things that the program will use.
We can represent the contents of a class in a UML Class Diagram. Below is an example of a class called Person:
Throughout the next few pages, we will realize the design of this class in code.
To create a class in Python, we can simply use the class keyword at the beginning of our file:

class Person:
    pass

As we’ve already learned, each class declaration in Python includes these parts:
class - this keyword says that we are declaring a new class.
Person - this is an identifier that gives us the name of the class we are declaring.
Following the declaration, we see a colon : marking the start of a new block, inside of which will be all of the fields and methods stored in this class. We’ll need to indent all items inside of this class, just like we do with other blocks in Python.
In order for Python to allow this code to run, we cannot have an empty block inside of a class declaration. So, we can add the keyword pass to the block inside of the class so that it is not empty.
By convention, we would typically store this class in a file called Person.py.
Of course, our classes are not very useful at this point because they don’t include any attributes or methods. Including attributes in a class is one of the simplest uses of classes, so let’s start there.
To add an attribute to a class, we can simply declare a variable inside of our class declaration:
class Person:
    last_name = "Person"
    first_name = "Test"
    age = 25
That’s really all there is to it! These are static attributes or class attributes that are shared among all instances of the class. On the next page, we’ll see how we can create instance attributes within the class’s constructor.
Finally, we can make these attributes private by adding two underscores to the beginning of the variable’s name. We denote this on our UML diagram by placing a minus - before the attribute or method’s name. Otherwise, a + indicates that it should be public. In the diagram above, each attribute is private, so we’ll do that in our code:
class Person:
    __last_name = "Person"
    __first_name = "Test"
    __age = 25
Unfortunately, Python does provide a way to get around these restrictions. Instead of referencing __last_name, we can instead reference _Person__last_name to find that value, as in this example, which assumes the constructor and last_name getter that we’ll add to Person over the next few pages:
ellie = Person("Jonson", "Ellie", 29)
ellie._Person__last_name = "Jameson"
print(ellie.last_name) # Jameson
Behind the scenes, Python adds an underscore _ followed by the name of the class to the beginning of any class attribute or method that is prefixed with two underscores __. So, knowing that, we can still access those attributes and methods if we want to. Thankfully, it’d be hard to do this accidentally, so it provides some small level of security for our data.
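We can watch this renaming happen with the built-in vars() function, which lists the attributes actually stored on an object. The class Account and its owner attribute below are hypothetical names chosen for this sketch.

```python
class Account:
    def __init__(self, owner):
        self.__owner = owner          # stored under the name _Account__owner


acct = Account("Ellie")
print(vars(acct))                     # {'_Account__owner': 'Ellie'}
print(hasattr(acct, "__owner"))       # False
print(acct._Account__owner)           # Ellie

# Outside of the class no renaming takes place, so this line creates a
# brand new attribute instead of changing the private one
acct.__owner = "Someone else"
print(acct._Account__owner)           # Ellie
```

The last two lines show why an accidental change is unlikely: code outside the class that assigns to __owner never touches the value the class itself uses.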
We can also add methods to our classes. These methods are used either to modify the attributes of the class or to perform actions based on the attributes stored in the class. Finally, we can even use those methods to perform actions on data provided as arguments. In essence, the sky is the limit with methods in classes, so we’ll be able to do just about anything we need to do in these methods. Let’s see how we can add methods to our classes.
A constructor is a special method that is called whenever a new instance of a class is created. It is used to set the initial values of attributes in the class. We can even accept parameters as part of a constructor, and then use those parameters to populate attributes in the class.
Let’s go back to the Person class example we’ve been working on and add a simple constructor to that class:

class Person:
    __last_name = "Person"
    __first_name = "Test"
    __age = 25

    def __init__(self, last_name, first_name, age):
        self.__last_name = last_name
        self.__first_name = first_name
        self.__age = age
Since the constructor is an instance method, we need to add a parameter to the function at the very beginning of our list of parameters, typically named self. Python automatically supplies the argument for this parameter whenever we call an instance method, and it is a reference to the current instance on which the method is being called. We’ll learn more about this later.
Inside that constructor, notice that we use each parameter to set the corresponding attribute, using the self parameter once again to refer to the current object.
Also, since we are now defining the attributes as instance attributes in the constructor, we can remove them from the class definition itself:
class Person:

    def __init__(self, last_name, first_name, age):
        self.__last_name = last_name
        self.__first_name = first_name
        self.__age = age
We’ve already discussed variable scope earlier in this course. Recall that two different functions may use the same local variable names without affecting each other because they are in different scopes.
The same applies to classes. A class may have an attribute named age, but a method inside of the class may also use a local variable named age. Therefore, we must be careful to make sure that we access the correct variable, using the self reference if we intend to access the attribute’s value in the current instance. Here’s a short example:
class Test:
    age = 15

    def foo(self):
        age = 12
        print(age)       # 12
        print(self.age)  # 15

    def bar(self):
        print(self.age)  # 15
        print(age)       # NameError
As we can see, in the method foo() we must be careful to use self.age to refer to the attribute, since there is another variable named age declared in that method. However, in the method bar() we see that age itself causes a NameError since there is no other variable named age defined in that scope. We have to use self.age to reference the attribute.
So, we should always get in the habit of using self to refer to any attributes, just to avoid any unintended problems later on.
In Python, we can use a special decorator @property to define special methods, called getters and setters, that can be used to access and update the value of private attributes.
In Python, a getter method is a method that can be used to access the value of a private attribute. To mark a getter method, we use the @property decorator, as in the following example:
class Person:

    def __init__(self, last_name, first_name, age):
        self.__last_name = last_name
        self.__first_name = first_name
        self.__age = age

    @property
    def last_name(self):
        return self.__last_name

    @property
    def first_name(self):
        return self.__first_name

    @property
    def age(self):
        return self.__age
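Similarly, we can create another method, called a setter, that can be used to update the value of the age attribute. It uses the same name as the getter and is marked with the @age.setter decorator: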
class Person:

    def __init__(self, last_name, first_name, age):
        self.__last_name = last_name
        self.__first_name = first_name
        self.__age = age

    @property
    def last_name(self):
        return self.__last_name

    @property
    def first_name(self):
        return self.__first_name

    @property
    def age(self):
        return self.__age

    @age.setter
    def age(self, value):
        self.__age = value
However, this method is not required in the UML diagram, so we can omit it.
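To see what omitting a setter means in practice, consider the following sketch. The class Pet is hypothetical; its name property has only a getter, while its age property also has a setter that rejects negative values (that validation rule is an assumption added for illustration).

```python
class Pet:

    def __init__(self, name, age):
        self.__name = name
        self.__age = age

    @property
    def name(self):
        # Getter only, so name is read-only from outside the class
        return self.__name

    @property
    def age(self):
        return self.__age

    @age.setter
    def age(self, value):
        # A setter is a natural place to reject bad values
        if value < 0:
            raise ValueError("age cannot be negative")
        self.__age = value


rex = Pet("Rex", 3)
rex.age = 4                  # calls the setter
print(rex.age)               # 4

try:
    rex.name = "Max"         # no setter was defined for name
except AttributeError:
    print("name is read-only")
```

So leaving out the setter is a design choice: the attribute can still change from inside the class, but code outside the class can only read it.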
To add a method to our class, we can simply add a function declaration inside of our class.
class Person:

    def __init__(self, last_name, first_name, age):
        self.__last_name = last_name
        self.__first_name = first_name
        self.__age = age

    @property
    def last_name(self):
        return self.__last_name

    @property
    def first_name(self):
        return self.__first_name

    @property
    def age(self):
        return self.__age

    def happy_birthday(self):
        self.__age = self.age + 1
Notice that once again we must remember to add the self parameter as the first parameter. This method will increase the private age attribute by one year.
Now that we have fully constructed our class, we can use it elsewhere in our code through the process of instantiation. In Python, we can simply call the name of the class as if it were a function to create a new instance, which calls the constructor, and then we can use dot-notation to access any attributes or methods inside of that object.
from Person import *
john = Person("Smith", "John", 25)
print(john.last_name)
john.happy_birthday()
Notice that we don’t have to provide a value for the self parameter when we use any methods. This parameter is added automatically by Python based on the value of the object we are calling the methods from.
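The following sketch shows what that automatic step amounts to. The class Counter is a hypothetical example; the two calls in the middle do the same thing, once with Python filling in self and once with the object passed by hand.

```python
class Counter:

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1


c = Counter()
c.increment()             # Python passes c as self for us
Counter.increment(c)      # the same call with self written out
print(c.count)            # 2
```

We would almost never write the second form in real code, but it shows that self is an ordinary parameter that happens to be filled in for us.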
We can also build classes that inherit attributes and methods from another class. This allows us to build more complex structures in our code, better representing the relationships between real world objects.
As we learned earlier in this chapter, we can represent an inheritance relationship with an open arrow in our UML diagrams, as shown below:
In this diagram, the Student class inherits from, or is a subclass of, the Person class.
To show inheritance in Python, we place the parent class inside of parentheses directly after the name of the subclass when it is defined:
from Person import *

class Student(Person):
    pass
From there, we can quickly implement the code for each property and getter method in the new class:
from Person import *

class Student(Person):

    @property
    def student_id(self):
        return self.__student_id

    @property
    def grade_level(self):
        return self.__grade_level
Since the subclass Student also includes a definition for the method happy_birthday(), we say that that method has been overridden in the subclass. We can do this by simply creating the new method in the Student class, making sure it accepts the same number of parameters as the original:
from Person import *

class Student(Person):

    @property
    def student_id(self):
        return self.__student_id

    @property
    def grade_level(self):
        return self.__grade_level

    def happy_birthday(self):
        super().happy_birthday()
        self.__grade_level += 1
Here, we are using the function super() to refer to our parent class. In that way, we can still call the happy_birthday() method as defined in Person, but extend it by adding our own code as well.
In addition, we can use the super() function to call our parent class’s constructor.
from Person import *

class Student(Person):

    @property
    def student_id(self):
        return self.__student_id

    @property
    def grade_level(self):
        return self.__grade_level

    def __init__(self, last_name, first_name, age, student_id, grade_level):
        super().__init__(last_name, first_name, age)
        self.__student_id = student_id
        self.__grade_level = grade_level

    def happy_birthday(self):
        super().happy_birthday()
        self.__grade_level += 1
In addition to private and public attributes and methods, UML also includes the concept of protected attributes and methods. This modifier is used to indicate that the attribute or method should not be accessed outside of the class, but will allow any subclasses to access them. Python does not enforce this restriction; it is simply convention. In a UML diagram, the protected keyword is denoted by a hash symbol # in front of the attribute or method. In Python, we then prefix those attributes or methods with a single underscore _.
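The practical difference between one underscore and two shows up as soon as a subclass is involved. In this sketch, the classes Shape and Square and their attributes are hypothetical names used only for illustration.

```python
class Shape:

    def __init__(self, sides):
        self._sides = sides          # protected by convention
        self.__secret = "hidden"     # private, stored as _Shape__secret


class Square(Shape):

    def __init__(self):
        super().__init__(4)

    def describe(self):
        # A subclass can read the single-underscore attribute directly
        return "A square has {} sides".format(self._sides)

    def peek(self):
        # Inside Square this name becomes _Square__secret, which was never set
        return self.__secret


sq = Square()
print(sq.describe())     # A square has 4 sides

try:
    sq.peek()
except AttributeError:
    print("The subclass cannot reach the private attribute this way")
```

This is why a single underscore is the usual choice for anything a child class is expected to use.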
Inheritance allows us to make use of polymorphism in our code. Loosely speaking, polymorphism allows us to treat an instance of a class as if it were an instance of any of its parent classes. When we do so, we should rely only on the methods and attributes defined by the parent class, but any overridden methods will still use the implementation from the child class.
Here’s a quick example:
steve_student = Student("Jones", "Steve", 19, "123456", 13)
# We can now treat steve_student as a Person object
steve_person = steve_student
print(steve_person.first_name)
# We can call happy_birthday(), and it will use
# the code from the Student class, even if we
# think that steve_student is a Person object
steve_person.happy_birthday()
# We can still treat it as a Student object as well
print(steve_person.grade_level) # 14
Polymorphism is a very powerful tool in programming, and we’ll use it throughout this course as we develop complex data structures.
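A common use of polymorphism is to loop over a collection of related objects and call the same method on each one. The sketch below uses two hypothetical classes, Animal and Dog, so that it can run on its own without the Person and Student files.

```python
class Animal:

    def __init__(self, name):
        self.name = name

    def speak(self):
        return "{} makes a sound".format(self.name)


class Dog(Animal):

    def speak(self):
        # Overrides Animal.speak()
        return "{} barks".format(self.name)


animals = [Animal("Generic"), Dog("Rover")]
for animal in animals:
    # The same call runs different code depending on the object's class
    print(animal.speak())

print(isinstance(animals[1], Animal))   # True
```

The loop never checks which kind of object it is holding; each object brings its own version of speak().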
Many programming languages include a special keyword static. In essence, a static attribute or method is part of the class in which it is declared instead of part of objects instantiated from that class. If we think about it, the word static means “lacking in change”, and that’s sort of a good way to think about it.
In a UML diagram, static attributes and methods are denoted by underlining them.
In Python, any attributes declared outside of a method are class attributes, but they can be considered the same as static attributes until they are overwritten by an instance. Here’s an example:
class Stat:
    x = 5                # class or static attribute

    def __init__(self, an_y):
        self.y = an_y    # instance attribute
In this class, we’ve created a class attribute named x, and an instance attribute named y. Here’s a main() method that will help us explore how class attributes behave:
from Stat import *

class Main:

    def main():
        some_stat = Stat(7)
        another_stat = Stat(8)
        print(some_stat.x)     # 5
        print(some_stat.y)     # 7
        print(another_stat.x)  # 5
        print(another_stat.y)  # 8

        Stat.x = 25            # change class attribute for all instances
        print(some_stat.x)     # 25
        print(some_stat.y)     # 7
        print(another_stat.x)  # 25
        print(another_stat.y)  # 8

        some_stat.x = 10       # overwrites class attribute in instance
        print(some_stat.x)     # 10 (now an instance attribute)
        print(some_stat.y)     # 7
        print(another_stat.x)  # 25 (still class attribute)
        print(another_stat.y)  # 8

if __name__ == "__main__":
    Main.main()
First, we can see that the attribute x is set to 5 as its default value, so both objects some_stat and another_stat contain that same value. Interestingly, since the attribute x is static, we can access it directly from the class Stat, without even having to instantiate an object. So, we can update the value in that way to 25, and it will take effect in any objects instantiated from Stat.
Below that, we can update the value of x attached to some_stat to 10, and we’ll see that it now creates an instance attribute for that object that contains 10, hiding the class attribute for that object only. The value attached to another_stat is unchanged.
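The built-in vars() function makes it easier to see where each value lives. This is a minimal sketch using a hypothetical class named Setting with one class attribute, level.

```python
class Setting:
    level = 5                 # class attribute


first = Setting()
second = Setting()

print(vars(first))            # {} (nothing stored on the instance yet)
first.level = 10              # creates an instance attribute on first only
print(vars(first))            # {'level': 10}
print(Setting.level)          # 5 (the class attribute is untouched)
print(second.level)           # 5 (second still reads the class attribute)

del first.level               # remove the instance attribute again
print(first.level)            # 5 (the lookup falls back to the class)
```

Assigning through an instance never changes the class attribute itself. The new instance attribute simply takes priority when we look the name up on that one object.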
Python also allows us to create static methods that work in a similar way:
class Stat:
    x = 5                # class or static attribute

    def __init__(self, an_y):
        self.y = an_y    # instance attribute

    @staticmethod
    def sum(a):
        return Stat.x + a
We have now added a static method sum() to our Stat class. To create a static method, we place the @staticmethod decorator above the method declaration. We haven’t looked at how decorators work yet, but they allow us to tell Python some important information about the code below the decorator.
In addition, it is important to remember that a static method cannot access any non-static attributes or methods, since it doesn’t have a self parameter that refers to an instantiated object.
As a tradeoff, we can call a static method without instantiating the class either, as in this example:
from Stat import *

class Main:

    @staticmethod
    def main():
        # other code omitted
        Stat.x = 25
        more_stat = Stat(7)
        print(more_stat.sum(5))  # 30
        print(Stat.sum(5))       # 30

if __name__ == "__main__":
    Main.main()
This becomes extremely useful in our main() method. Since we aren’t instantiating our Main class, we can use the decorator @staticmethod above the method to clearly mark that it should be considered a static method.
Another major feature of class inheritance is the ability to define a method in a parent class, but not provide any code that implements that function. In effect, we are saying that all objects of that type must include that method, but it is up to the child classes to provide the code. These methods are called abstract methods, and the classes that contain them are abstract classes. Let’s look at how they work!
In the UML diagram above, we see that the describe() method in the Vehicle class is printed in italics. That means that the method should be abstract, without any code provided. To do this in Python, we simply inherit from a special class called ABC, short for “Abstract Base Class,” and then use the @abstractmethod decorator:
from abc import ABC, abstractmethod

class Vehicle(ABC):

    def __init__(self, name):
        self.__name = name
        self._speed = 1.0

    @property
    def name(self):
        return self.__name

    def move(self, distance):
        print("Moving")
        return distance / self._speed

    @abstractmethod
    def describe(self):
        pass
Notice that we must first import both the ABC class and the @abstractmethod decorator from a module helpfully called abc. Then, we can use ABC as the parent class of our class, and mark each abstract method by placing the @abstractmethod decorator before it, similar to how we’ve already used @staticmethod in an earlier module.
In addition, since we have declared the method describe() to be abstract, we can either add some code to that method that can be called using super().describe() from a child class, or we can simply choose to use the pass keyword to avoid including any code in the method.
Now, any class that inherits from the Vehicle class must provide an implementation for the describe() method. If it does not, that class is also abstract and cannot be instantiated. So, for example, in the UML diagram above, we see that the MotorVehicle class does not include an implementation for describe(), so it will be abstract as well.
Python handles this automatically: a subclass of Vehicle stays abstract for as long as any abstract method remains unimplemented, so inheriting from Vehicle alone is enough. If we want to make that intent explicit, we can list both Vehicle and ABC in parentheses after the subclass name, separated by a comma, with Vehicle first.
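Here is a minimal sketch of that hierarchy. The Vehicle class is trimmed down to just its abstract method, and the Car class and the wheels() method are hypothetical additions so that there is one concrete class to instantiate.

```python
from abc import ABC, abstractmethod


class Vehicle(ABC):

    @abstractmethod
    def describe(self):
        pass


class MotorVehicle(Vehicle, ABC):
    # No describe() here, so this class is still abstract

    def wheels(self):
        return 4


class Car(MotorVehicle):

    def describe(self):
        return "A car with {} wheels".format(self.wheels())


print(Car().describe())      # A car with 4 wheels

for cls in (Vehicle, MotorVehicle):
    try:
        cls()
    except TypeError:
        print("{} is abstract and cannot be instantiated".format(cls.__name__))
```

Only Car can be instantiated, because it is the first class in the chain that supplies code for every abstract method.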
Let’s build a quick program following the MVC architecture style to review working with classes, objects, inheritance, and polymorphism.
Write a program to store a list of students and teachers at a school. The program should have methods to add a student or a teacher, as well as a method to print the entire list.
The program should conform to the following UML diagram:
The purpose of each method will be further described below.
Person Class
__init__() - constructor that initializes all attributes based on parameters
last_name() - getter for last_name attribute; it should be implemented as a property
first_name() - getter for first_name attribute; it should be implemented as a property
age() - getter for age attribute; it should be implemented as a property
happy_birthday() - method to increase person’s age attribute by $1$
__str__() - method that overrides the built-in object class __str__() method. It should return a string in the form "first_name last_name: age"

Student Class
__init__() - constructor that initializes all attributes (including in super class) based on parameters
student_id() - getter for student_id attribute; it should be implemented as a property
grade_level() - getter for grade_level attribute; it should be implemented as a property
happy_birthday() - method to increase student’s age and grade_level attribute by $1$
__str__() - method that overrides the built-in object class __str__() method. It should return a string in the form "first_name last_name: age (student_id - grade_level)"

Teacher Class
__init__() - constructor that initializes all attributes (including in super class) based on parameters
classroom() - getter for classroom attribute; it should be implemented as a property
salary() - getter for salary attribute; it should be implemented as a property
happy_birthday() - method to increase teacher’s age by $1$ and salary attribute by $1000$
__str__() - method that overrides the built-in object class __str__() method. It should return a string in the form "first_name last_name: age (classroom - $salary)"

View Class
show_menu() - a method to show a menu of options to the user. The user should be prompted to input exactly one of the options listed below, which is returned as a String. The wording of the menu is up to you. The method should return whatever was input by the user, without any error checking (that is done in the Controller)
add_student() - a method to add a new student to the system. The user should input a list of parameters for each attribute as they are listed in the constructor for Student, separated by spaces. The wording of the prompt is up to you. The method should return whatever was input by the user, without any error checking (that is done in the Controller)
add_teacher() - a method to add a new teacher to the system. The user should input a list of parameters for each attribute as they are listed in the constructor for Teacher, separated by spaces. The wording of the prompt is up to you. The method should return whatever was input by the user, without any error checking (that is done in the Controller)
list_people() - a method to list all Person objects in the persons list given as a parameter. Each one should be prefixed by an index starting at $0$, incrementing by one for each Person in the list.
show_error() - a method to display an error to the user. The parameter error should be printed to the screen, prefixed by “Error: ”
Hint: use sys.stdin.readline() to read an entire line of input anywhere in your code. Don’t forget to import sys as well!

Controller Class
main() - the main method for this program. It should simply instantiate a new instance of the Controller class, and then call the run() method of that object.
__init__() - the constructor for the Controller object. It initializes the persons attribute to an empty list, as well as a View object stored in the view attribute.
run() - this method consists of a loop that will execute the program until it is terminated. It will call the show_menu() method of the view to show a menu to the user (see above). Finally, it will parse the string returned by the call to show_menu() and call additional appropriate methods in the Controller or View class to complete the operation. If the user inputs “exit” then it should terminate. Otherwise, the program will repeatedly display the menu to the user until “exit” is chosen. If at any time the user provides input that cannot be properly parsed, the controller should call the show_error() method in the View class and restart the process (loop back to the beginning) by showing the menu again.
add_student() - this method will receive the string input by the user from the add_student() method in View, parse the input, and call the appropriate methods to create a new Student object and append it to the persons list.
add_teacher() - this method will receive the string input by the user from the add_teacher() method in View, parse the input, and call the appropriate methods to create a new Teacher object and append it to the persons list.
persons() - these methods are a getter and setter for the persons attribute. They should be implemented as a property. It is for testing purposes only.

A sample execution of the program is shown below.
class Person:

    def __init__(self, last_name, first_name, age):
        self.__last_name = last_name
        self.__first_name = first_name
        self.__age = age

    @property
    def last_name(self):
        return self.__last_name

    @property
    def first_name(self):
        return self.__first_name

    @property
    def age(self):
        return self.__age

    def happy_birthday(self):
        self.__age = self.__age + 1

    def __str__(self):
        return "{} {}: {}".format(self.first_name, self.last_name, self.age)

from Person import Person

class Student(Person):

    def __init__(self, last_name, first_name, age, student_id, grade_level):
        super().__init__(last_name, first_name, age)
        self.__student_id = student_id
        self.__grade_level = grade_level

    @property
    def student_id(self):
        return self.__student_id

    @property
    def grade_level(self):
        return self.__grade_level

    def happy_birthday(self):
        super().happy_birthday()
        self.__grade_level = self.__grade_level + 1

    def __str__(self):
        return "{} ({} - {})".format(super().__str__(), self.student_id, self.grade_level)

from Person import Person

class Teacher(Person):

    def __init__(self, last_name, first_name, age, classroom, salary):
        super().__init__(last_name, first_name, age)
        self.__classroom = classroom
        self.__salary = salary

    @property
    def classroom(self):
        return self.__classroom

    @property
    def salary(self):
        return self.__salary

    def happy_birthday(self):
        # Required by the description above: age by 1 and salary by 1000
        super().happy_birthday()
        self.__salary = self.__salary + 1000

    def __str__(self):
        return "{} ({} - ${})".format(super().__str__(), self.classroom, self.salary)

import sys

class View:

    def show_menu(self):
        print("Please enter one of the following options:")
        print("  add student")
        print("  add teacher")
        print("  list people")
        print("  exit")
        try:
            inp = sys.stdin.readline()
            return inp.strip()
        except Exception:
            return ""

    def add_student(self):
        print("Please enter the following items for the new student, all on the same line")
        print("LastName FirstName Age StudentID GradeLevel")
        try:
            inp = sys.stdin.readline()
            return inp.strip()
        except Exception:
            return ""

    def add_teacher(self):
        print("Please enter the following items for the new teacher, all on the same line")
        print("LastName FirstName Age Classroom Salary")
        try:
            inp = sys.stdin.readline()
            return inp.strip()
        except Exception:
            return ""

    def list_people(self, persons):
        i = 0
        for p in persons:
            print("{}) {}".format(i, p))
            i += 1

    def show_error(self, error):
        print("Error: {}".format(error))

from Person import Person
from Student import Student
from Teacher import Teacher
from View import View
import sys

class Controller:

    @staticmethod
    def main(args):
        Controller().run()

    def __init__(self):
        self.__persons = []
        self.__view = View()

    @property
    def persons(self):
        return self.__persons

    @persons.setter
    def persons(self, value):
        self.__persons = value

    def run(self):
        while True:
            inp = self.__view.show_menu()
            if inp == "add student":
                self.add_student(self.__view.add_student())
            elif inp == "add teacher":
                self.add_teacher(self.__view.add_teacher())
            elif inp == "list people":
                self.__view.list_people(self.__persons)
            elif inp == "exit":
                break
            else:
                self.__view.show_error("Invalid Input!")

    def add_student(self, inp):
        splits = inp.split(" ")
        try:
            p = Student(splits[0], splits[1], int(splits[2]), int(splits[3]), int(splits[4]))
            self.__persons.append(p)
        except Exception:
            self.__view.show_error("Unable to parse input!")

    def add_teacher(self, inp):
        splits = inp.split(" ")
        try:
            p = Teacher(splits[0], splits[1], int(splits[2]), splits[3], int(splits[4]))
            self.__persons.append(p)
        except Exception:
            self.__view.show_error("Unable to parse input!")

# main guard
if __name__ == "__main__":
    Controller.main(sys.argv)
This chapter covered the rest of the programming basics we’ll need to know before starting on the new content of this course. By now we should be pretty familiar with the basic syntax of the language we’ve chosen, as well as the concepts of classes, objects, inheritance, and polymorphism in object-oriented programming. Finally, we’ve explored the Model-View-Controller (MVC) architecture, which will be used extensively in this course.
This page is the main page for Programming by Contract and Introduction to Performance
In this course, we will learn how to develop several different data structures, and then use those data structures in programs that implement several different types of algorithms. However, one of the most difficult parts of programming is clearly explaining what a program should do and how it should perform.
So far, we’ve used UML class diagrams to discuss the structure of a program. They can give us information about the classes, attributes, and methods that our program will contain, as well as the overall relationships between the classes. We can even learn if attributes and methods are private or public, and more.
However, to describe what each method does, we have simply relied on descriptions in plain language up to this point, with no specific format at all. In this module, we’ll introduce the concept of programming by contract to help us provide more specific information about what each method should do and the expectations we can count on based on the inputs and outputs of the method.
Specifically, we’ll learn about the preconditions that are applied to the parameters of a method to make sure they are valid, the postconditions that the method will guarantee if the preconditions are met, and the invariants that a loop or data structure will maintain.
Finally, we can put all of that information together to discuss how to prove that an algorithm correctly performs the task it was meant to, and how to make sure that it works correctly in all possible cases.
First, let’s discuss preconditions. A precondition is an expectation applied to any parameters and existing variables when a method or function is called. Phrased a different way, the preconditions should all be true before the method is called. If all of the preconditions are met, the function can proceed and is expected to function properly. However, if any one of the preconditions is not met, the function may raise an exception, prompt the user to correct the issue, or produce invalid output, depending on how it is written.
Let’s consider an example method to see how we can define the preconditions applied to that method. In this example, we’re going to write a method triangleArea(side1, side2, side3) that will calculate the area of a triangle, given the lengths of the sides of the triangle.
So, to determine what the preconditions of that method should be, we must think about what we know about a triangle and what sort of data we expect to receive.
For example, we know that the length of each side should be a number. In addition, those lengths should all be positive, so each one must be strictly greater than $0$.
We can also decide whether we expect the lengths to be whole numbers or floating-point numbers. To make this example simpler, let’s just work with whole numbers.
When looking at preconditions, determining the types and expected range of values of each parameter is a major first step. However, sometimes we must also look at the relationship between the parameters to find additional preconditions that we must consider.
^[File:TriangleInequality.svg. (2015, July 10). Wikimedia Commons, the free media repository. Retrieved 23:22, January 21, 2020 from https://commons.wikimedia.org/w/index.php?title=File:TriangleInequality.svg&oldid=165448754.]
For example, the triangle inequality states that the longest side of a triangle must be strictly shorter than the sum of the other two sides. Otherwise, those sides will not create a triangle. So, another precondition must state that the sides satisfy the triangle inequality.
Altogether, we’ve found the following preconditions for our method triangleArea(side1, side2, side3):
side1, side2 and side3 must each be an integer that is strictly greater than $0$
side1, side2 and side3 must satisfy the triangle inequality
What if our method is called and provided a set of parameters that do not meet the preconditions described above? As a programmer, there are several actions we can take in our code to deal with the situation.
One of the most common ways to handle precondition failures is to simply throw or raise exceptions from our method as soon as it determines that the preconditions are not met. In this way, we can quickly indicate that the program is unable to perform the requested operation, and leave it up to the code that called that method to either handle the exception or ignore it and allow the program to crash.
This approach is best used within the model portions of a program written using the Model-View-Controller or MVC architecture. Doing so allows our controller to react to problems quickly, usually by requesting additional input from the user using the view portion of the program.
In simpler programs, it is common for the code to simply handle the precondition failure by asking the user for new input. This is commonly done in programs that are small enough to fit in a single class, instead of being developed using MVC architecture.
Of course, we could choose to simply ignore these precondition failures and allow our code to continue running. In that case, if the preconditions are not met, then the answer we receive may be completely invalid. On the next page, we’ll discuss how failed preconditions affect whether we can trust our method’s output.
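As a minimal sketch of the first approach, the preconditions for the triangle example could be checked in Python as shown below. The names `check_triangle_preconditions` and `describe_triangle` are hypothetical helpers introduced only for this illustration, and the choice of `TypeError` and `ValueError` is an assumption rather than a requirement of the course.

```python
def check_triangle_preconditions(side1, side2, side3):
    """Raise an exception as soon as any precondition is violated."""
    sides = (side1, side2, side3)
    for side in sides:
        # Precondition 1: each side is an integer strictly greater than 0.
        # (bool is a subclass of int in Python, so it is excluded explicitly.)
        if isinstance(side, bool) or not isinstance(side, int):
            raise TypeError("each side must be an integer")
        if side <= 0:
            raise ValueError("each side must be strictly greater than 0")
    # Precondition 2: the longest side must be strictly shorter than the
    # sum of the other two sides (the triangle inequality).
    longest = max(sides)
    if longest >= sum(sides) - longest:
        raise ValueError("sides must satisfy the triangle inequality")


def describe_triangle(side1, side2, side3):
    """A caller that handles precondition failures, as a controller might."""
    try:
        check_triangle_preconditions(side1, side2, side3)
    except (TypeError, ValueError) as error:
        return "Error: {}".format(error)
    return "valid triangle"
```

Here the checking function plays the role of the model, while `describe_triangle` stands in for calling code that decides how to react to the exception.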
Next, we can discuss postconditions. A postcondition is a statement that is guaranteed to be true after a method is executed, provided all of the preconditions were met. If any one of the preconditions was not met, then we can’t count on the postcondition being true either. This is the most important concept surrounding preconditions and postconditions.
If the preconditions of a method are all true when a method is called, then we may assume the postconditions are true after the method is complete, provided it is written correctly.
On the last page, we discussed the preconditions for a method triangleArea(side1, side2, side3) that will calculate the area of a triangle given the lengths of its sides. Those preconditions are:
side1, side2 and side3 must each be an integer that is strictly greater than $0$
side1, side2 and side3 must satisfy the triangle inequality
So, once the method completes, what should our postcondition be? In this case, we want to find a statement that would always be true if all of the preconditions are met.
Since the method will be calculating the area of the triangle, the strongest postcondition we can use is the most obvious one:
The method returns the area of the triangle with sides of length side1, side2 and side3
That’s really it!
Of course, there are a few other postconditions that we could consider, especially when we start working with data structures and objects. For example, one of the most powerful postconditions is the statement:
The values of the parameters are not modified by the method
When we call a method that accepts an array or object as a parameter, we know that we can modify the values stored in that array or object because the parameter is handled in a call-by-reference fashion in most languages. So, if we don’t state that this postcondition applies, we can’t guarantee that the method did not change the values in the array or object we provided as a parameter.
So, what if the preconditions are not met? Then what happens?
As we discussed on the previous page, if the preconditions are not met, then we cannot guarantee that the postcondition will be true once the method executes. In fact, it may be decidedly incorrect, depending on how we implement the method.
^[File:Triangle with notations 2 without points.svg. (2018, December 5). Wikimedia Commons, the free media repository. Retrieved 00:03, January 22, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Triangle_with_notations_2_without_points.svg&oldid=330397605.]
For example, the simplest way to find the area of a triangle given the lengths of all three sides is Heron’s formula, which can be written mathematically as:
$$ A = \frac{1}{4} \sqrt{(a + b + c)(-a + b + c)(a - b + c)(a + b - c)} $$
Since this is a mathematical formula, it can often be evaluated even when the preconditions are not met. For example, the inputs could be floating-point values instead of integers, or they may not satisfy the triangle inequality. If positive side lengths violate the triangle inequality, the expression under the square root is zero or negative, so any result the function produces will not represent the actual area of the triangle described, because the parameters describe a triangle that cannot exist in the real world. So, we must always be careful not to assume that a method will provide the correct output unless we provide parameters that make all of its preconditions true.
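To make this concrete, here is a small Python sketch of Heron’s formula with no precondition checks at all. The function name `heron_unchecked` is made up for this illustration.

```python
import math


def heron_unchecked(a, b, c):
    """Heron's formula with no precondition checks at all."""
    product = (a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c)
    return 0.25 * math.sqrt(product)


print(heron_unchecked(3, 4, 5))     # valid triangle: prints 6.0
print(heron_unchecked(-3, -4, -5))  # negative sides, yet this also prints 6.0
print(heron_unchecked(1, 2, 3))     # violates the triangle inequality: prints 0.0
```

The second call shows why a failed precondition is dangerous: the output looks perfectly reasonable even though the input describes no real triangle. A call such as `heron_unchecked(1, 1, 5)` fails in a different way, since `math.sqrt` raises a `ValueError` when given a negative number.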
Another concept related to preconditions and postconditions is the loop invariant. A loop invariant is a statement that should be true after each iteration of a loop, provided the loop’s preconditions are all true before the start of the loop. Yes, that’s right: we can think of a loop as a miniature method within a method, with its own set of preconditions and postconditions.
For this example, let’s consider a method maximum(numbers) that gets an array of numbers as input, and returns the maximum value stored in that array. So, we can start by listing the preconditions of the method:
numbers is an array containing at least one numerical value, either a floating point or an integer
Thankfully, that one precondition covers all of our bases. Likewise, we can define our postconditions pretty easily as well:
The method returns the maximum value stored in the numbers array
The numbers array is not modified by the method
There we go! Hopefully it is easy to see how the preconditions and postconditions help us build a better definition of what operations should be performed by the method.
Now that we’ve built our method’s preconditions and postconditions, let’s quickly write the method in pseudocode so we can see what it would look like.
function MAXIMUM(NUMBERS)
    MAX = NUMBERS[0]
    loop I from 1 to length of NUMBERS
        if NUMBERS[I] > MAX
            MAX = NUMBERS[I]
        end if
    end loop
    return MAX
end function
We’ve seen methods like this several times already, since calculating the maximum value from an array of values is a very common task. This time, however, let’s discuss the preconditions, postconditions and invariants of the loop inside of this method.
First, we can establish an important precondition at the beginning of the loop. In this case, the precondition that best describes the value stored in MAX is:
MAX contains the maximum value stored in NUMBERS up to and including index $0$.
In this way, we’ve directly tied our precondition to the values that exist in the method already, and we’ve accurately described the value that MAX is currently storing.
Don’t worry if this doesn’t make sense right now - you won’t be expected to write your own preconditions and invariants at this point. We’re just introducing the concepts so you’ll understand how they work when you see them later in the projects in this course.
Next, we can determine what the loop invariant should be. This is a statement that should be true after each iteration of the loop, provided the precondition is true. Usually we want to try and relate it to the precondition somehow, and how it changes after each iteration of the loop. So, let’s consider the following loop invariant:
MAX contains the maximum value stored in NUMBERS up to and including index $I$.
Hmm, that’s almost exactly the same as the precondition, isn’t it? That’s what we’re going for here. In this way, we can easily describe what the loop does after each iteration based on how it updates the value in MAX, keeping the loop invariant true. We call that “maintaining the loop invariant.”
Finally, we can define the postcondition of the loop. This is pretty simple, since it is the same as the loop invariant, but this time we can say that I has reached the last index of the array. So, our postcondition is simply:
MAX contains the maximum value stored in NUMBERS up to and including the last index.
That’s all there is to it! By defining a precondition, invariant, and postcondition for our loop, we can very accurately describe exactly what that loop should do. On the next page, we’ll see how we can put all of those together to show that our method works correctly.
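One way to see these three statements in action is to write them as `assert` statements in Python. This is only a sketch: it assumes the pseudocode loop stops at the last valid index (so `range(1, len(numbers))` is used), and it leans on Python’s built-in `max` purely as an independent check of each condition.

```python
def maximum(numbers):
    # Method precondition: at least one value is present.
    assert len(numbers) >= 1

    max_value = numbers[0]
    # Loop precondition: max_value is the maximum up to and including index 0.
    assert max_value == max(numbers[:1])

    for i in range(1, len(numbers)):
        if numbers[i] > max_value:
            max_value = numbers[i]
        # Loop invariant: max_value is the maximum up to and including index i.
        assert max_value == max(numbers[:i + 1])

    # Loop postcondition: max_value is the maximum up to the last index.
    assert max_value == max(numbers)
    return max_value
```

If any of the assertions ever failed, we would know exactly which condition was broken and on which iteration. The checks themselves are far more expensive than the loop, so they are useful for study rather than for production code.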
If we have defined the correct preconditions, postconditions, and invariants for our code, we can then use those to prove that our code correctly performs its intended operation.
In this course, we won’t ask you to do any of this yourself, but it is important to understand what is going on in the background and how this process works. We can use the concepts of preconditions and postconditions when grading your code using our Autograder—in fact, that’s really how it works!
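To prove correctness of a method, we can generally follow this process:
Assume that the method’s preconditions are true.
Use the code and the method’s preconditions to show that the loop’s preconditions are true.
Show that the loop invariant is true after each iteration of the loop.
Show that the loop’s postcondition is true once the loop ends.
Use the loop’s postcondition to show that the method’s postconditions are true.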
Let’s do this for our maximum() method described on the previous page. Here is the pseudocode once again:
function MAXIMUM(NUMBERS)
    MAX = NUMBERS[0]
    loop I from 1 to length of NUMBERS
        if NUMBERS[I] > MAX
            MAX = NUMBERS[I]
        end if
    end loop
    return MAX
end function
Here are the associated conditions we established as well:
Method Precondition: NUMBERS is an array containing at least one numerical value, either a floating point or an integer
Loop Precondition: MAX contains the maximum value stored in NUMBERS up to and including index $0$.
Loop Invariant: MAX contains the maximum value stored in NUMBERS up to and including index $I$.
Loop Postcondition: MAX contains the maximum value stored in NUMBERS up to and including the last index.
Method Postconditions: The method returns the maximum value stored in the NUMBERS array, and the NUMBERS array is not modified by the method
Now that we have that information, we can discuss the correctness of the method.
To begin, we assume that the method’s preconditions are true. Therefore, we know that NUMBERS is an array containing at least one numerical value. Earlier in this chapter, we learned that we can’t assume that the method works if the preconditions are false, so we can always begin our proof by assuming they are true.
Next, we have to use the code as well as the method’s preconditions to establish that the loop’s preconditions are true. In the code, we see that we set the value of MAX equal to the first element in the NUMBERS array. So, since there is only that value in the NUMBERS array up to and including index $0$, it is easy to say that it is indeed the maximum value. So, we’ve shown that our loop’s precondition is true.
After that, we have to show that the loop invariant is true after each iteration of the loop. A proper proof would involve a technique called proof by induction, which is more advanced than we want to cover in this course. However, a simple way to think about it is to say that the value in MAX contains the largest value in the list that we’ve seen so far. On each loop iteration, we look at one more value. If it is larger than MAX, then it becomes the new MAX value. Otherwise, we know that the maximum value we’ve seen so far hasn’t changed. So, we can show that our loop invariant is always true after each loop iteration.
Finally, once we reach the end of the loop, we’ve looked at every element including the last index, and found the MAX value, so it is easy to see that our loop postcondition is true.
Then, we can quickly use that loop postcondition to understand that the MAX value is really the largest in the NUMBERS array, which means that the first part of our method’s postcondition is also true.
But wait! What about the second part, that insists that we did not modify the contents of the numbers array? Thankfully, we can quickly look at our code and determine that we are not assigning a value into the array, nor do we call any functions on the array, so it is easy to show that our code has not modified it in any way.
There we go! This is a quick example of how we can use preconditions, postconditions, and invariants to help understand our code and whether it works correctly.
On the next page, we’ll learn about unit testing and how we can write our own code to verify that our code works correctly.
Once we’ve written a program, how can we verify that it works correctly? There are many ways to do this, but one of the most common is unit testing.
Unit testing a program involves writing code that actually runs the program and verifies that it works correctly. In addition, many unit tests will also check that the program produces appropriate errors when given bad input, or even that it won’t crash when given invalid input.
For example, a simple unit test for the maximum() method would be:
function MAXIMUMTEST()
    ARRAY = new array[5]
    ARRAY[0] = 5
    ARRAY[1] = 25
    ARRAY[2] = 10
    ARRAY[3] = 15
    ARRAY[4] = 0
    RESULT = MAXIMUM(ARRAY)
    if RESULT == 25
        print "Test Passed"
    end if
end function
This code will simply create an array that we know the maximum value of, and then confirm that our own maximum() method will find the correct result.
Of course, this is a very simplistic unit test, and it would take several more unit tests to fully confirm that the maximum() method works completely correctly.
However, it is important to understand how this test relates to the preconditions and postconditions that were established on previous pages. Here, the unit test creates a variable ARRAY, which is an array of at least one numerical value. Therefore, it has met the preconditions for the maximum() method, so we can assume that if maximum() is written correctly, then the postconditions will be true once it has executed. This is the key assumption behind unit tests.
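The same idea can be sketched with Python’s standard `unittest` module. The `maximum` function below is a hypothetical implementation included only so the example is self-contained; raising `ValueError` on an empty list is an assumption about how a failed precondition might be reported, not something required by the text.

```python
import unittest


def maximum(numbers):
    """Hypothetical implementation under test."""
    if len(numbers) == 0:
        raise ValueError("numbers must contain at least one value")
    max_value = numbers[0]
    for value in numbers[1:]:
        if value > max_value:
            max_value = value
    return max_value


class MaximumTest(unittest.TestCase):
    def test_known_maximum(self):
        # Preconditions are met, so the first postcondition must hold.
        self.assertEqual(maximum([5, 25, 10, 15, 0]), 25)

    def test_input_not_modified(self):
        # Second postcondition: the array is unchanged by the call.
        numbers = [5, 25, 10, 15, 0]
        maximum(numbers)
        self.assertEqual(numbers, [5, 25, 10, 15, 0])

    def test_empty_input_raises(self):
        # Preconditions are NOT met, so we test the error behavior instead.
        with self.assertRaises(ValueError):
            maximum([])
```

The first two tests supply valid input and check the postconditions, while the third supplies invalid input and checks how the code reacts. Saved in a file, these tests could be run with Python’s `python -m unittest` command.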
In this course, you will be asked to build several data structures and implement algorithms that use those data structures. In the assignment descriptions, we may describe these methods using the preconditions and postconditions applied to them. Similarly, we’ll learn about structural invariants of data structures, which help us ensure that your data structures are always valid.
Then, to grade your work, we use an autograder that contains several unit tests. Those unit tests are used to confirm that your code works correctly, and they do so by providing input that either satisfies the preconditions, meaning that the test expects that the postconditions will be true, or by providing invalid input and testing how your code reacts to those situations.
You’ll see these concepts throughout this course, so it is important to be familiar with them now.
The performance of our algorithms is very important, since the difference between good algorithms and bad algorithms on very large data sets can often be measured in terms of days of execution time. Thus, efficiency will be one of the key issues we will look at when designing algorithms.
For example, one simple problem involves finding the largest sum of contiguous elements in an array. So, if we have the array:
[−2, 1, −3, 4, −1, 2, 1, −5, 4]
we could find that the contiguous sequence of:
[4, −1, 2, 1]
sums to $6$, which is the largest sum of any contiguous subsequence of this array.
A simple solution to this problem might involve finding all possible contiguous subsequences of the array and adding up each one, which could take a very long time. In fact, if the array contains $n$ elements, it might take on the order of $n^3$ steps to solve it. However, with a little ingenuity, we can actually solve this problem using many fewer steps, even as few as roughly $n$ steps.
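The contrast can be sketched in Python as follows. The text does not say which faster method it has in mind; the linear version shown here is the approach commonly known as Kadane’s algorithm, and both function names are chosen just for this illustration.

```python
def max_subarray_cubic(values):
    """Try every contiguous subsequence and add it up: roughly n^3 steps."""
    best = values[0]
    n = len(values)
    for start in range(n):
        for end in range(start, n):
            total = 0
            for k in range(start, end + 1):
                total += values[k]
            if total > best:
                best = total
    return best


def max_subarray_linear(values):
    """Kadane's algorithm: a single pass, roughly n steps."""
    best = current = values[0]
    for value in values[1:]:
        # Either extend the best run ending at the previous element,
        # or start a new run at this element, whichever is larger.
        current = max(value, current + value)
        best = max(best, current)
    return best


data = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray_cubic(data))   # prints 6
print(max_subarray_linear(data))  # prints 6
```

Both functions agree on the example array, but the first one uses three nested loops while the second uses only one.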
When we try to solve a problem, it is often helpful to look at multiple solutions to the problem and compare them before choosing our final design. Just the act of trying to find multiple ways to solve the same problem stimulates creativity and promotes mental elasticity and speed of thought. Selecting a solution from several different choices following rigorous and objective criteria enables us to improve fundamental life skills such as simply knowing how to make decisions! In fact, the very attempt at solving a problem in a variety of ways forces us to look at our problems from different perspectives, which is a catalyst for many scientific discoveries.
Throughout this section, we will develop two solutions to the problem of finding the maximum max and minimum min of a list of N numbers (defined as list[N]) as an example. We will develop both solutions and then evaluate their performances in terms of both execution time and memory space.
Let’s start by considering one number from the list at a time.
When we have received just one number, this number is both the maximum and the minimum. In the initial state, you have max holding the maximum, and min holding the minimum. The invariant is the following:
max holds the maximum of all the numbers considered so far
min holds the minimum of all the numbers considered so far
The algorithm is depicted by the following flowchart and pseudocode:
print "Enter a Number:"
input X
MAX = X
MIN = X
Then, our program will enter a loop to read 10 more numbers from the user. So, we’ll need to perform the following process during each iteration of the loop:
Compare max with this new number and update the max value if the new number is greater. In this way, the invariant is preserved.
Compare min with this new number and update the min value if the new number is smaller. In this way, the invariant is preserved.
This part of the program is depicted by the following flowchart and pseudocode:
loop I from 1 to 10
    print "Enter a Number:"
    input X
    if X > MAX
        MAX = X
    end if
    if X < MIN
        MIN = X
    end if
end loop
After you’ve considered the second number, you end up in the same situation as at the beginning: you have a maximum and a minimum value of the numbers input by the user so far. You have found an invariant if you verify that the preconditions before executing an iteration of the loop are the same as the conditions at the end of the loop, known as postconditions:
Precondition: max holds the maximum value among all the numbers considered so far, and min holds the minimum value among all the numbers considered so far
Invariant: max holds the maximum value among all the numbers considered so far, and min holds the minimum value among all the numbers considered so far
Postcondition: max holds the maximum value among all the numbers considered so far, and min holds the minimum value among all the numbers considered so far
You can then generalize the solution for the nth input: when you consider the nth number, compare it with the values in max and min, updating them if necessary. In each step, we can show that the invariant holds.
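As a rough Python sketch of this linear solution, the function below takes its numbers from a list instead of prompting the user, which is an assumption made so the example is easy to run. The name `linear_min_max` is chosen just for this illustration.

```python
def linear_min_max(numbers):
    """Consider one number at a time, keeping min and max up to date."""
    # Initial state: the first number is both the maximum and the minimum.
    max_value = numbers[0]
    min_value = numbers[0]
    for x in numbers[1:]:
        if x > max_value:
            max_value = x
        if x < min_value:
            min_value = x
        # Invariant: max_value and min_value describe every number seen so far.
    return min_value, max_value


print(linear_min_max([7, 3, 9, 1, 4]))  # prints (1, 9)
```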
A full flowchart of this program can be found by clicking the following link:
It is helpful to have this diagram available in a second browser tab for review on the next few pages.
Another solution consists of comparing the numbers in pairs, instead of one at a time.
When we have received just one number, this number is both the maximum and the minimum. In the initial state, you have max holding the maximum, and min holding the minimum. The invariant is the following:
max holds the maximum of all the numbers considered so far
min holds the minimum of all the numbers considered so far
The algorithm is depicted by the following flowchart and pseudocode:
print "Enter a Number:"
input X
MAX = X
MIN = X
In this program, instead of just considering one number at a time, we’ll ask the user to input two numbers. Then, we can determine which of those two inputs is larger (we’ll call it lastmax), and compare it to the value in max. Similarly, we can do the same for the smaller value (called lastmin) and min. Would this program be more efficient?
The algorithm is depicted by the following flowchart and pseudocode:
loop I from 1 to 10 step by 2
    print "Enter a Number:"
    input X
    print "Enter a Number:"
    input Y
    if X > Y
        LASTMAX = X
        LASTMIN = Y
    else
        LASTMAX = Y
        LASTMIN = X
    end if
    if LASTMAX > MAX
        MAX = LASTMAX
    end if
    if LASTMIN < MIN
        MIN = LASTMIN
    end if
end loop
Once again, we can easily show that the same loop preconditions, postconditions, and invariants work for this loop:
Precondition: max holds the maximum value among all the numbers considered so far, and min holds the minimum value among all the numbers considered so far
Invariant: max holds the maximum value among all the numbers considered so far, and min holds the minimum value among all the numbers considered so far
Postcondition: max holds the maximum value among all the numbers considered so far, and min holds the minimum value among all the numbers considered so far
A full flowchart of this program can be found by clicking the following link:
It is helpful to have this diagram available in a second browser tab for review on the next few pages.
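The pairs solution can be sketched in Python in a similar way. As an assumption for this example, the numbers come from a list holding one initial value followed by whole pairs, so its length must be odd. The name `pairs_min_max` is hypothetical.

```python
def pairs_min_max(numbers):
    """Consider the numbers two at a time after the first one."""
    if len(numbers) % 2 == 0:
        raise ValueError("expected one initial value followed by whole pairs")
    max_value = numbers[0]
    min_value = numbers[0]
    for i in range(1, len(numbers) - 1, 2):
        x, y = numbers[i], numbers[i + 1]
        # First decide which of the two new values is larger.
        if x > y:
            last_max, last_min = x, y
        else:
            last_max, last_min = y, x
        # Then compare only the larger one with max and the smaller with min.
        if last_max > max_value:
            max_value = last_max
        if last_min < min_value:
            min_value = last_min
    return min_value, max_value


print(pairs_min_max([7, 3, 9, 1, 4]))  # prints (1, 9)
```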
It is very useful to compare the two solutions to choose one. In general, you can compare the two solutions by considering:
The time required to run the program
The space required in memory
The complexity of the code
To compare programs in terms of time, we can estimate the running time of each one, which is the time it takes to complete its work. If the program performs operations on a large set of data, you can also measure the running time by using the timer available in your programming language. This type of algorithm profiling is called experimental algorithm evaluation.
You can also estimate the time by counting the number of comparisons and assignments performed by the two programs, since comparisons and assignments are the fundamental operations that allow you to find the smallest and the largest element.
Let’s do this analysis for both the linear and pairs solution to the problem of finding the minimum and maximum values in a list of numbers. For this analysis, we’ll just look at the code inside the loop, and ignore any code before the loop that initializes variables, since both programs have the same code there.
The linear program performs two comparisons, x > max and x < min, each time a new number is considered. If the algorithm considers N numbers, the total number of comparisons is $N * 2$ or $2N$ comparisons.
The pairs program makes three comparisons every time it considers two elements: x > y, lastmax > max and lastmin < min. If the algorithm considers N numbers, the total number of comparisons is $N/2 * 3$ or $3/2 N$ comparisons.
If we assume that our list is holding a large number of values, then we can see a major difference in the time required to complete the program. Put another way, we can observe that the difference in efficiency between the two programs increases as N increases. Assuming N is $1,000$, the linear program makes $2,000$ comparisons, while the pairs program makes $1,500$ comparisons.
Of course, data sizes in real programs can be much larger. For example, Google currently indexes around 50 billion web pages! So, a list that contains $1,000$ numbers looks pretty small by comparison.
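We can count the assignments in the same way. For the linear solution, the following situations could occur on each iteration:
The new number is greater than max, so one assignment is performed.
The new number is less than min, so one assignment is performed.
The new number is neither greater than max nor less than min, so no assignment is performed.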
So, in the worst-case, the algorithm performs $N$ assignments. For example, consider a situation where the list is already sorted in increasing order. In that case, we’ll have to update the maximum value each time, resulting in $N$ assignments.
In the pairs solution, every two numbers considered cause the following assignments to be performed:
Two assignments, to lastmax and lastmin. These two assignments are done in any case.
Up to two more assignments, to min and max, depending on the results of the comparisons.
Therefore, the number of assignments is $2$ in the best case, $4$ in the worst case, and $3$ in the average case. Hence, with $N$ numbers, the number of assignments is $N/2 * 2$ or $N$ in the best case, $2N$ in the worst case and $3/2N$ in the average case. The number of assignments is greater with the second algorithm. But, could we do better in this last case?
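These counts can be checked experimentally with a small Python sketch. The two functions below are hypothetical instrumented versions of the linear and pairs loops that return only the number of comparisons and assignments. As in the analysis above, the initialization before the loop and the reading of the inputs x and y are not counted.

```python
def count_linear(numbers):
    """Return (comparisons, assignments) made inside the linear loop."""
    comparisons = assignments = 0
    max_value = min_value = numbers[0]   # initialization is not counted
    for x in numbers[1:]:
        comparisons += 2                 # x > max and x < min
        if x > max_value:
            max_value = x
            assignments += 1
        if x < min_value:
            min_value = x
            assignments += 1
    return comparisons, assignments


def count_pairs(numbers):
    """Return (comparisons, assignments) made inside the pairs loop."""
    comparisons = assignments = 0
    max_value = min_value = numbers[0]   # initialization is not counted
    for i in range(1, len(numbers) - 1, 2):
        x, y = numbers[i], numbers[i + 1]
        comparisons += 3                 # x > y, lastmax > max, lastmin < min
        if x > y:
            last_max, last_min = x, y
        else:
            last_max, last_min = y, x
        assignments += 2                 # lastmax and lastmin, in any case
        if last_max > max_value:
            max_value = last_max
            assignments += 1
        if last_min < min_value:
            min_value = last_min
            assignments += 1
    return comparisons, assignments


# One initial value plus N = 1000 more values, sorted in increasing order.
data = list(range(1001))
print(count_linear(data))  # prints (2000, 1000)
print(count_pairs(data))   # prints (1500, 1500)
```

With this sorted input, the linear loop hits its worst case of $N$ assignments, while the pairs loop performs three assignments per pair, matching the $3/2N$ figure above.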
So, how can we determine which is better? Let’s look at a couple of situations.
For example, if we have a list of 1000 numbers, we can find the following number of steps for each program. First, let’s consider the worst case performance, taking the largest values for each.
As we can see, the pairs program requires more steps in total than the linear program. So, it appears that the linear program might be the better choice.
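As a quick check, the worst-case totals can be worked out from the counts derived earlier, taking $N = 1,000$ and simply adding comparisons and assignments together (treating each as one step is a simplifying assumption):

$$ \text{linear (worst case)}: 2N + N = 3N = 3,000 \text{ steps} $$
$$ \text{pairs (worst case)}: \frac{3}{2}N + 2N = \frac{7}{2}N = 3,500 \text{ steps} $$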
However, not every program will be a worst case. So, let’s look at the best case performance for each one:
Again, we see that the pairs program still requires more steps than the linear program, though both programs run faster in the best case than the worst case.
Finally, we can do the same for the average case performance:
There we go! As we can see, in each case the linear program actually performs better than the pairs program, even though we know that the pairs program will only run the loop half as many times as the linear program. The extra comparisons and assignments make the program take more time!
We can also look at a program based on the amount of space, or memory, that it uses. In this case, we are looking at the number of variables that are needed, and also the size of any lists, arrays, or more advanced data structures used.
The comparison in terms of space leads us to observe that the linear solution uses the input value x for each iteration of the loop, along with the max and min variables. So, there are just 3 variables to keep track of.
The pairs solution uses the two input values x and y for each iteration of the loop, two variables for lastmax and lastmin, and the overall max and min variables. So, there are 6 variables to keep track of in this solution.
Therefore, the linear solution is more space-efficient since it requires only three variables.
Let’s consider a bigger example program, just to see the impact of space complexity when analyzing a program. Consider a program that will compute the result of multiplying two numbers from $1$ through $10$. So, there are 10 numbers we need to consider for each operand.
One possible solution would be to pre-compute all possible answers in a 10 by 10 array, which would contain 100 elements. A diagram of this is shown below.
^[Wikipedia contributors. (2020, January 25). Multiplication table. In Wikipedia, The Free Encyclopedia. Retrieved 02:06, January 28, 2020, from https://en.wikipedia.org/w/index.php?title=Multiplication_table&oldid=937470066]
This is a very inefficient program in terms of space complexity, since it requires $N^2$ spaces in memory for $N$ possible operand values.
Of course, we already know that we could write a program that could simply calculate and return the answer based on the input values, so this example is not very useful in practice. However, on some small, embedded systems such as a smart watch or nano-machine, we might discover that it is better to use memory than to spend time calculating a result on such a slow processor, so there are times where having higher space complexity helps save time in the long run.
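The trade-off can be sketched in a few lines of Python. The names `SIZE`, `TABLE`, `multiply_lookup` and `multiply_compute` are made up for this illustration.

```python
SIZE = 10

# Space-heavy approach: precompute every product once (SIZE * SIZE entries).
TABLE = [[a * b for b in range(1, SIZE + 1)] for a in range(1, SIZE + 1)]


def multiply_lookup(a, b):
    """Look up a * b for operands in 1..SIZE; no arithmetic at call time."""
    return TABLE[a - 1][b - 1]


def multiply_compute(a, b):
    """Compute the product on demand; no table is stored in memory."""
    return a * b


entries = sum(len(row) for row in TABLE)
print(entries)                # prints 100
print(multiply_lookup(7, 8))  # prints 56
```

Both functions return the same answers, but the lookup version pays for its speed with $N^2$ stored values.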
Lastly, when analyzing a program, we must also consider the complexity of the code used to write the program. Code complexity can refer to many things, but in general we use it to describe how many lines of code are included in the program, as well as how easy it is to understand what the program does.
One of the most common ways to measure the size of a program is the number of lines of code, or LOC, of the program. This can be a very rough estimate of the size of the program, since, in general, a longer program with more lines of code may be more complex than a shorter program with fewer lines.
So, if we can find a solution to a problem that requires many fewer lines of code than another solution, that is one factor to consider. Of course, the solution with fewer lines of code may be more complex in terms of time or space!
The other important measure of code complexity deals with how easy it is to understand what the program does. Sometimes we can write a program that only takes a few lines of code, but it is so complex that it is difficult to really understand how it works.
For example, consider this line of pseudocode - can you determine what it does?
return (X > 0) and ((X % 10) + (INT((X % 100) / 10))) < 5
It is pretty difficult to understand. Can we rewrite this program to make it easier to understand? Consider the following code:
if X <= 0
    return FALSE
else
    ONES_PLACE = X % 10
    TENS_PLACE = INT((X % 100) / 10)
    SUM_LAST_TWO_DIGITS = TENS_PLACE + ONES_PLACE
    if SUM_LAST_TWO_DIGITS < 5
        return TRUE
    else
        return FALSE
    end if
end if
This program is pretty easy to follow. First, it will determine if X is 0 or a negative number, and return FALSE if so. Then, it will find the last two digits of the number, corresponding to the ones and the tens place, and sum them. Finally, if the sum of the last two digits is less than $5$, it will return TRUE. Otherwise, it will return FALSE.
As it turns out, these two programs do the exact same thing! However, from the single line of code in the first program, it is very difficult to decipher exactly what it does. The second program, even though it is much longer, is much easier to understand. In addition, when run on a real computer, both of these programs will take nearly the same amount of time to run.
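To check that claim for ourselves, here is one possible Python translation of both versions. The function names are made up for this sketch, and the tens place is computed as the tens digit, matching the one-line version.

```python
def check_compact(x):
    # One-line version, mirroring the first pseudocode example.
    return x > 0 and (x % 10 + (x % 100) // 10) < 5


def check_readable(x):
    # Longer version, mirroring the second pseudocode example.
    if x <= 0:
        return False
    ones_place = x % 10
    tens_place = (x % 100) // 10
    sum_last_two_digits = tens_place + ones_place
    return sum_last_two_digits < 5


print(check_readable(121))  # True, since 2 + 1 = 3
print(check_readable(47))   # False, since 4 + 7 = 11

# Compare the two versions over a range of sample inputs.
print(all(check_compact(x) == check_readable(x) for x in range(-50, 1000)))  # True
```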
As we’ve seen, we can describe the complexity of a program in terms of three values: the time it takes to run, the space it requires in memory, and the size and understandability of the code. However, how do we know which of those measures is the most important?
Unfortunately, that is a difficult question to answer. In general, we want to reduce each of these levels of complexity, but sometimes there are trade-offs. For example, a program that runs quickly may require lots of extra memory, or it could even require really complex code.
The world of business uses a simple triad to understand these trade-offs, as shown in the diagram below:
^[File:Project-triangle.svg. (2020, January 12). Wikimedia Commons, the free media repository. Retrieved 02:41, January 28, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Project-triangle.svg&oldid=386979544.]
Along with this diagram comes the saying: “Fast. Good. Cheap. Pick any two.” It helps express the difficulty of finding a perfect program that does not give up ground on at least one measure of complexity.
Thankfully, most modern computers have a sufficiently fast processor and ample memory that most programs can run easily, even without worrying about reducing the space and time complexity of the program. So, it may seem that the most important measure to focus on is code complexity: we should strive to write readable and understandable code.
However, as we build larger programs and deal with more input data, time and space complexity quickly become more important. So, it is important to consider each measure independently, and do the best we can to build programs that run quickly, use a minimal amount of memory, and aren’t so complex that they are difficult to understand.
In this module, we covered two major topics that will help us understand the data structures and algorithms we’ll learn in this course. First, we learned about the use of preconditions, postconditions, and loop invariants to help describe the exact specifications of how methods and loops should operate in code. We’ll use these as the basis of unit tests in this course to help prove the correctness of data structures and algorithms we are asked to develop throughout the course.
In addition, we were introduced to the concepts of time complexity, space complexity, and code complexity, three ways to measure the performance and usefulness of a computer program. We also learned that it may be difficult to find a perfect program that has very little time, space, and code complexity, so many times we’ll have to consider a trade-off between programs that operate quickly, use less memory, and are easier to understand.
With these tools, we’re in a much better place to understand the specifics of the data structures and algorithms we’ll encounter in this module. Next, we’ll do a short project to explore how we can build programs based on a set of preconditions, postconditions, and invariants instead of the plain language descriptions we’ve used up to this point.
This page is the main page for Data Structures and Algorithms
One way to look at a computer program is to think of it as a list of instructions that the computer should follow. However, in another sense, many computer programs are simply ways to manipulate data to achieve a desired result. We’ve already written many programs that do this, from calculating the minimum and maximum values of a list of numbers, to storing and retrieving data about students and teachers in a school.
As we start to consider our programs as simply ways to manipulate data, we may quickly realize that we are performing the same actions over and over again, or even treating data in many similar ways. Over time, these ideas have become the basis for several common data structures that we may use in our programs.
^[File:Binary tree.svg. (2019, September 14). Wikimedia Commons, the free media repository. Retrieved 22:18, February 7, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Binary_tree.svg&oldid=365739199.]
Broadly speaking, a data structure is any part of our program that stores data using a particular format or method. Typically data structures define how the data is arranged, how it is added to the structure, how it can be removed, and how it can be accessed.
Data structures can give us very useful ways to look at how our data is organized. In addition, a data structure may greatly impact how easy, or difficult, it can be to perform certain actions with the data. Finally, data structures also impose performance limitations on our code. Some structures may be better at performing a particular operation than others, so we may have to consider that as well when choosing a data structure for our program.
In this class, we’ll spend the majority of our time learning about these common data structures, as well as algorithmic techniques that work well with each one. By formalizing these structures and techniques, we are able to build a common set of building blocks that every programmer is familiar with, making it much easier to build programs that others can understand and reuse.
First, let’s review some of these common data structures and see how they could be useful in our programs.
First, we can broadly separate the data structures we’re going to learn about into two types, linear and non-linear data structures.
A linear data structure typically stores data in a single dimension, just like an array. By using a linear data structure, we would know that a particular element in the data structure comes before another element or vice-versa, but that’s about it. A great example is seen in the image above. We have a list of numbers, and each element in the list comes before another element, as indicated by the arrows.
Linear data structures can further be divided into two types: arrays, which typically have a fixed size; and linked lists, which can grow as needed, limited only by the available memory. We’ve already worked with arrays extensively by this point, but linked lists are most likely a new concept. That’s fine! We’ll explore how to build our own later in this course.
Using either arrays or linked lists, we can build the three most commonly used linear data structures: stacks, queues, and sets. However, before we learn about each of those, let’s review a bit more about what the list data structure itself looks like.
The list data structure is the simplest form of a linear data structure. As we can guess from the definition, a list is simply a grouping of data that is presented in a given order. With lists, not only do the elements in the list matter, but the order matters as well. It’s not simply enough to state that elements $8$, $6$ and $7$ are in the list, but generally we also know that $8$ comes before $6$, which comes before $7$.
We’ve already learned about arrays, which are perfect examples of lists in programming. In fact, Python uses the data type list in the same way most other programming languages use arrays. Other programming languages, such as Java, provide a list data structure through their standard libraries.
One important way to classify data structures is by the operations they can perform on the data. Since a list is the simplest version of a linear data structure, it has several important operations it can perform:
For example, let’s look at the insert operation. Assume we have the list shown in the following diagram:
Then, we decide we’d like to add the element $4$ at index $3$ in this list. So, we can think of this like trying to place the element in the list as shown below:
Once we insert that element, we then shift all of the other elements back one position, making the list one element larger. The final version is shown below:
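The insert operation described above can be sketched in Python as follows. The starting list is hypothetical, since the diagram's contents are not reproduced here, and `insert_at` is a helper written only for this illustration; Python's built-in `list.insert` does the same job.

```python
def insert_at(items, index, value):
    """Insert value at index by shifting later elements back one position."""
    items.append(None)                      # grow the list by one slot
    for i in range(len(items) - 1, index, -1):
        items[i] = items[i - 1]             # shift each element back
    items[index] = value


numbers = [8, 6, 7, 5, 3]                   # hypothetical starting list
insert_at(numbers, 3, 4)
print(numbers)                              # [8, 6, 7, 4, 5, 3]

# The built-in list method produces the same result.
other = [8, 6, 7, 5, 3]
other.insert(3, 4)
print(other == numbers)                     # True
```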
Lists are a very powerful data structure, and one of the most commonly used in a variety of programs. While arrays may seem very flexible, their static size and limited operations can sometimes make them more difficult to use than they are worth. Many programmers choose to use the more flexible list data structure instead.
When deciding which data structure to use, lists are best when we might be adding or removing data from anywhere in the list, but we want to maintain the ordering between elements. As we’ll see on the later pages, we can have more specific types of structures for particular ways we intend to add and remove data from our structure, but lists are a great choice if neither of those are a good fit.
The next two data structures we’ll look at are stacks and queues. They are both very similar to lists in most respects, but each one puts a specific limitation on how the data structure operates that make them very useful in certain situations.
^[File:Lifo stack.png. (2017, August 7). Wikimedia Commons, the free media repository. Retrieved 23:14, February 7, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Lifo_stack.png&oldid=254596945.]
A stack is one special version of a list. Specifically, a stack is a Last In, First Out or LIFO data structure.
So, what does that mean? Basically, we can only add elements to the end, or top of the stack. Then, when we want to get an element from the stack, we can only take the one from the top–the one that was most recently added.
A great way to think of a stack is like a stack of plates, such as the one pictured below:
^[File:Tallrik - Ystad-2018.jpg. (2019, December 31). Wikimedia Commons, the free media repository. Retrieved 23:17, February 7, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Tallrik_-_Ystad-2018.jpg&oldid=384552503.]
When we want to add a new plate to the stack, we can just set it on top. Likewise, if we need a plate, we’ll just take the top one off and use it.
A stack supports three major unique operations:
Many stacks also include additional operations such as size and find as well.
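As a small illustration, a Python list can play the role of a stack if we only add and remove items at its end. The operation names push, pop, and peek in the comments are the conventional ones and are used here as an assumption.

```python
stack = []

# "Push": add new plates to the top of the stack.
stack.append("plate 1")
stack.append("plate 2")
stack.append("plate 3")

# "Peek": look at the top plate without removing it.
top = stack[-1]

# "Pop": remove and return the most recently added plate.
removed = stack.pop()

print(top)         # plate 3
print(removed)     # plate 3
print(len(stack))  # 2
```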
^[File:Data Queue.svg. (2014, August 15). Wikimedia Commons, the free media repository. Retrieved 23:21, February 7, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Data_Queue.svg&oldid=131660203.]
A queue is another special version of a list, this time representing a First In, First Out or FIFO data structure.
As seen in the diagram above, new items are added to the back of the queue. But, when we need to take an item from a queue, we’ll take the item that is in the front, which is the one that was added first.
Where have we seen this before? A great example is waiting our turn in line at the train station.
^[File:People waiting a train of Line 13 to come 02.JPG. (2016, November 28). Wikimedia Commons, the free media repository. Retrieved 23:23, February 7, 2020 from https://commons.wikimedia.org/w/index.php?title=File:People_waiting_a_train_of_Line_13_to_come_02.JPG&oldid=223382692.]
In many parts of the world, the term queueing is commonly used to refer to the act of standing in line. So, it makes perfect sense to use that same word to refer to a data structure.
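Here is a minimal sketch of first in, first out behavior using `collections.deque` from Python's standard library. The passenger names are made up for the example.

```python
from collections import deque

line = deque()

# New arrivals join the back of the queue.
line.append("Ana")
line.append("Ben")
line.append("Cho")

# The person who arrived first is served first.
first_served = line.popleft()
second_served = line.popleft()

print(first_served)   # Ana
print(second_served)  # Ben
print(list(line))     # ['Cho']
```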
As we can probably guess, we would definitely want to use a stack if we need our data structure to follow the last in, first out, or LIFO ordering. Similarly, we’d use a queue if we need first in, first out or FIFO ordering.
If we can’t be sure that one or the other of those orderings will work for us, then we can’t really use a stack or a queue in our program.
Of course, one of the biggest questions that comes from this is “why not just use lists for everything?” Indeed, lists can be used as both a queue and a stack, simply by consistently inserting and removing elements from either the beginning or the end of the list as needed. So why do we need to have separate data structures for a queue and a stack?
There are two important reasons. First, if we know that we only need to access the most recently added element, or the element added first, it makes sense to have a special data structure for just that usage. In this way, it is clear to anyone else reading our program that we will only be using the data in that specific way. Behind the scenes, of course, we can just use a list to represent a queue or a stack, but in our design documents and in our code, it might be very helpful to know if we should think of it like a stack or a queue.
The other reason has to do with performance. By knowing exactly how we need to use the data, we can design data structures that are specifically created to perform certain operations very quickly and efficiently. A generic list data structure may not be as fast or memory efficient as a structure specifically designed to be used as a stack, for example.
As we learn about each of these data structures throughout this course, we’ll explore how each data structure works in terms of runtime performance and memory efficiency.
Another linear data structure is known as a set. A set is very similar to a list, but with two major differences:
In fact, the term set comes from mathematics. We’ve probably seen sets already in a math class.
Beyond the typical operations to add and remove elements from a set, there are several operations unique to sets:
Again, many of these operations may be familiar from their use in various math classes.
In addition, we can easily think of set operations as boolean logic operators. For example, the set operation union is very similar to the boolean operator or, as seen in the diagram below.
^[File:Venn0111.svg. (2019, November 15). Wikimedia Commons, the free media repository. Retrieved 02:37, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Venn0111.svg&oldid=375571745.]
As long as an item is contained in one set or the other, it is included in the union of the sets.
Similarly, the same comparison works for the set operation intersection and the boolean and operator.
^[File:Venn0001.svg. (2019, November 15). Wikimedia Commons, the free media repository. Retrieved 02:37, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Venn0001.svg&oldid=375571733.]
Once again, if an item is contained in the first set and the second set, it is contained in the intersection of those sets.
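Python's built-in `set` type makes the comparison with boolean operators easy to see. The two example sets below are arbitrary values chosen for this sketch.

```python
first = {1, 2, 3, 4}
second = {3, 4, 5}

# Union: items that are in the first set OR the second set.
union = first | second

# Intersection: items that are in the first set AND the second set.
intersection = first & second

print(sorted(union))         # [1, 2, 3, 4, 5]
print(sorted(intersection))  # [3, 4]

# Adding a duplicate item leaves the set unchanged.
first.add(2)
print(len(first))            # 4
```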
A set is a great choice when we know that our program should prevent duplicate items from being added to a data structure. Likewise, if we know we’ll be using some of the specific operations that are unique to sets, then a set is an excellent choice.
Of course, if we aren’t sure that our data structure will only store unique items, we won’t be able to use a set.
The last of the linear data structures may seem linear from the outside, but inside it can be quite a bit more complex.
The map data structure is an example of a key-value data structure, also known as a dictionary or associative array. In the simplest case, a map data structure keeps track of a key that uniquely identifies a particular value, and stores that value along with the key in the data structure.
Then, to retrieve that value, the program must simply provide the same key that was used to store it.
In a way, this is very similar to how we use an array, since we provide an array index to store and retrieve items from an array. The only difference is that the key in a map can be any data type! So it is a much more powerful data structure.
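In Python, the built-in `dict` type behaves like a map. The short sketch below uses made-up keys and values; note that Python requires keys to be hashable, so strings, numbers, and tuples work, but lists do not.

```python
grades = {"alice": 92, "bob": 85}

# Store a value under a new key.
grades["carol"] = 78

# Retrieve a value by providing the same key.
print(grades["bob"])        # 85

# Keys do not have to be integers; here a tuple is used as the key.
labels = {(0, 0): "origin", (1, 0): "east"}
print(labels[(0, 0)])       # origin

# Check whether a key is present before using it.
print("dave" in grades)     # False
```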
In fact, this data structure is one of the key ideas behind modern databases, allowing us to store and retrieve database records based on a unique primary key attached to each row in the database.
A map data structure should support the following operations:
Later in this course, we’ll devote an entire module to learning how to build our own map data structures and explore these operations in more detail.
^[File:Hash table 3 1 1 0 1 0 0 SP.svg. (2019, August 21). Wikimedia Commons, the free media repository. Retrieved 02:46, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Hash_table_3_1_1_0_1_0_0_SP.svg&oldid=362787583.]
One of the most common ways to implement the map data structure is through the use of a hash table. A hash table uses an array to store the values in the map, and uses a special function called a hash function to convert the given key to a simple number. This number represents the array index for the value. In that way, the same key will always find the value that was given.
But what if we have two keys that produce the same array index? In that case, we’ll have to add some additional logic to our map to handle that situation.
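To make the idea concrete, here is a toy hash table in Python. It is a sketch, not a definitive implementation: the class name `SimpleHashMap`, the character-sum hash function, and the choice to keep colliding entries together in a small list (often called chaining) are all assumptions made for illustration.

```python
class SimpleHashMap:
    def __init__(self, capacity=8):
        # Each array slot holds a list of (key, value) pairs.
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, key):
        # Toy hash function: sum of character codes, wrapped to the array size.
        return sum(ord(ch) for ch in key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # replace the old value
                return
        bucket.append((key, value))        # new key, possibly a collision

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = SimpleHashMap()
table.put("ab", 1)
table.put("ba", 2)   # same letters, so the toy hash gives the same index

print(table._index("ab") == table._index("ba"))  # True, a collision
print(table.get("ab"), table.get("ba"))          # 1 2
```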
Maps are great data structures when we need to store and retrieve data using a specific key. Just like we would store data in a database or put items in a numbered box to retrieve later, we can use a map as a general purpose storage and retrieval data structure.
Of course, if our data items don’t have unique keys assigned to them, then using a map may not be the best choice of data structure. Likewise, if each key is a sequential integer, we may be able to use an array just as easily.
^[File:6n-graf.svg. (2020, January 12). Wikimedia Commons, the free media repository. Retrieved 02:53, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:6n-graf.svg&oldid=386942400.]
The other type of data structure we can use in our programs is the non-linear data structure.
Broadly speaking, non-linear data structures allow us to store data across multiple dimensions, and there may be multiple paths through the data to get from one item to another. In fact, much of the information stored in the data structure has to do with the paths between elements more than the elements themselves.
Just like linear data structures, there are several different types of non-linear data structures. In this case, each one is a more specialized version of the previous one, hence the hierarchy shown above. On the next few pages, we’ll explore each one just a bit to see what they look like.
^[File:Directed acyclic graph 2.svg. (2016, May 3). Wikimedia Commons, the free media repository. Retrieved 03:05, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Directed_acyclic_graph_2.svg&oldid=195167720.]
The most general version of a non-linear data structure is the graph, as shown in the diagram above. A graph is a set of nodes that contain data, as well as a set of edges that link two nodes together. Edges themselves may also contain data.
Graphs are great for storing and visualizing not just data, but also the relationships between data. For example, each node in the graph could represent a city on the map, with the edges representing the travel time between the two cities. Or we could use the nodes in a graph to represent the people in a social network, and the edges represent connections or friendships between two people. There are many possibilities!
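One simple way to sketch the city example in Python is a dictionary of dictionaries, where each node maps to its neighbors and the data stored on each edge. The city names and travel times are invented for this example.

```python
# Each key is a node (a city); each inner dictionary holds the edges
# leaving that node, along with the travel time in hours on each edge.
travel_hours = {
    "Avon": {"Brook": 2, "Cedar": 5},
    "Brook": {"Avon": 2, "Cedar": 1},
    "Cedar": {"Avon": 5, "Brook": 1},
}

# Which cities are directly connected to Avon?
print(sorted(travel_hours["Avon"]))    # ['Brook', 'Cedar']

# How long does it take to travel from Brook to Cedar?
print(travel_hours["Brook"]["Cedar"])  # 1

# There may be multiple paths between two nodes: going from Avon to Cedar
# through Brook takes 2 + 1 = 3 hours, less than the direct 5-hour edge.
via_brook = travel_hours["Avon"]["Brook"] + travel_hours["Brook"]["Cedar"]
print(via_brook)                       # 3
```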
Graphs are a great choice when we need to store data and relationships between the data, but we aren’t sure exactly what structures or limitations are present in the data. Since a graph is the most general and flexible non-linear data type, it has the most ability to represent data in a wide variety of ways.
^[File:Tree (computer science).svg. (2019, October 20). Wikimedia Commons, the free media repository. Retrieved 03:13, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Tree_(computer_science).svg&oldid=371240902.]
A tree is a more constrained version of a graph data structure. Specifically, a tree is a graph that can be shown as a hierarchical structure, where each node in the tree is itself the root of a smaller tree. Each node in the tree can have zero or more child nodes and exactly one parent node, except for the topmost node or root node, which has no parent node.
A tree is very useful for representing data in a hierarchical or sorted format. For example, one common use of a tree data structure is to represent knowledge and decisions that can be made to find particular items. The popular children’s game 20 Questions can be represented as a tree with 20 levels of nodes. Each node represents a particular question that can be asked, and the children of that node represent the possible answers. If the tree only contains yes and no questions, it can still represent up to $2^{20} = 1,048,576$ items!
Another commonly used tree data structure is the trie, which is a special type of tree used to represent textual data. Ever wonder how a computer can store an entire dictionary and quickly spell-check every single word in the language? It actually uses a trie!
Below is a small example of a trie data structure:
^[File:Trie example.svg. (2014, March 2). Wikimedia Commons, the free media repository. Retrieved 03:22, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Trie_example.svg&oldid=117843653.]
This trie contains the words “to”, “tea”, “ted”, “ten”, “i”, “in”, “inn” and “A” in just a few nodes and edges. Imagine creating a trie that could store the entire English language! While it might be large, we can hopefully see how it would be much more efficient to search and store that data in a trie instead of a linear data structure.
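A trie like this one can be sketched in Python with nested dictionaries, where each dictionary is a node and each key is the letter on an edge. The helper names and the use of `"$"` as an end-of-word marker are assumptions made for this illustration.

```python
END = "$"  # marker showing that a complete word ends at this node


def build_trie(words):
    root = {}
    for word in words:
        node = root
        for letter in word:
            # Follow the edge for this letter, creating it if needed.
            node = node.setdefault(letter, {})
        node[END] = True
    return root


def contains(trie, word):
    node = trie
    for letter in word:
        if letter not in node:
            return False
        node = node[letter]
    return END in node


trie = build_trie(["to", "tea", "ted", "ten", "i", "in", "inn", "A"])

print(contains(trie, "ten"))  # True
print(contains(trie, "te"))   # False, "te" is only a prefix
print(contains(trie, "inn"))  # True
```

Words that share a prefix, such as "tea", "ted", and "ten", share the same nodes for that prefix, which is where the savings come from.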
A tree is a great choice for a data structure when there is an inherent hierarchy in our data, such that some nodes or elements are naturally “parents” of other elements. Likewise, if we know that each element may only have one parent but many children, a tree becomes an excellent choice. Trees contain several limitations that graphs do not, but they are also very powerful data structures.
^[File:Max-Heap.svg. (2014, December 28). Wikimedia Commons, the free media repository. Retrieved 03:25, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Max-Heap.svg&oldid=144372033.]
The last non-linear data structure we’ll talk about is the heap, which is a specialized version of a tree. In a heap, we try to accomplish a few goals:
If we follow those three guidelines, a heap becomes the most efficient data structure for managing a set of data where we always want to get the maximum or minimum value each time we remove an element. These are typically called priority queues, since we remove items based on their priority instead of the order they entered the queue.
Because of this, heaps are very important in creating efficient algorithms that deal with ordered data.
As discussed above, a heap is an excellent data structure for when we need to store elements and then always be able to quickly retrieve either the smallest or largest element in the data structure. Heaps are a very specific version of a tree that specialize in efficiency over everything else, so they are only really good for a few specific uses.
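Python's standard library includes the `heapq` module, which treats a plain list as a heap that always gives back the smallest item first. The priorities and task names below are made up to show the priority queue behavior.

```python
import heapq

tasks = []

# Items are added in arbitrary order; each is a (priority, name) pair.
heapq.heappush(tasks, (3, "low"))
heapq.heappush(tasks, (1, "urgent"))
heapq.heappush(tasks, (2, "normal"))

# Items come back out in priority order, smallest priority number first.
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)  # ['urgent', 'normal', 'low']
```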
^[File:Euclid flowchart.svg. (2019, January 8). Wikimedia Commons, the free media repository. Retrieved 21:43, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Euclid_flowchart.svg&oldid=334007111.]
The other major topic covered in this course is the use of algorithms to manipulate the data stored in our data structures.
An algorithm is best defined as a finite list of specific instructions for performing a task. In the real world, we see algorithms all the time. A recipe for cooking your favorite dish, instructions for how to fix a broken car, or a method for solving a complex mathematical equation can all be considered examples of an algorithm. The flowchart above shows Euclid’s Algorithm for finding the greatest common divisor of two numbers.
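As a brief example, here is one common form of Euclid's Algorithm written in Python. This version uses the remainder operator; a flowchart may instead show repeated subtraction, which reaches the same answer more slowly. The function name `gcd` is our own choice.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers."""
    while b != 0:
        # Replace the pair (a, b) with (b, remainder of a divided by b).
        a, b = b, a % b
    return a


print(gcd(1071, 462))  # 21
print(gcd(12, 18))     # 6
```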
In this course, however, we’re going to look specifically at the algorithms and algorithmic techniques that are most commonly used with data structures in computer programming.
An algorithmic technique, sometimes referred to as a methodology or paradigm, is a particular way to design an algorithm. While there are a few commonly used algorithms across different data structures, many times each program may need a unique algorithm, or at least an adaptation of an existing algorithm, to perform its work.
To make these numerous algorithms easier to understand, we can loosely categorize them based on the techniques they use to solve the problem. On the next few pages, we’ll introduce some of the more commonly used algorithmic techniques in this course. Throughout this course, we will learn how to apply many of these techniques when designing algorithms that work with various data structures to accomplish a goal.
The first algorithmic technique we’ll use is the brute force technique. This is the algorithmic technique that most of us are most familiar with, even if we don’t realize it.
Simply put, a brute force algorithm will try all possible solutions to the problem, only stopping when it finds one that is the actual solution. A great example of a brute force algorithm in action is plugging in a USB cable. Many times, we will try one way, and if that doesn’t work, flip it over and try the other. Likewise, if we have a large number of keys but are unsure which one fits in a particular lock, we can just try each key until one works. That’s the essence of the brute force approach to algorithmic design.
^[File:Closest pair of points.svg. (2018, October 20). Wikimedia Commons, the free media repository. Retrieved 22:29, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Closest_pair_of_points.svg&oldid=324759130.]
A great example of a brute force algorithm is finding the closest pair of points in a multidimensional space. This could be as simple as finding the two closest cities on a map, or the two closest stars in a galaxy.
To find the answer, a brute force approach would be to simply calculate the distance between each individual pair of points, and then keep track of the minimum distance found. A pseudocode version of this algorithm would be similar to the following.
MINIMUM = infinity
POINT1 = none
POINT2 = none
loop each POINTA in POINTS
    loop each POINTB in POINTS
        if POINTA != POINTB
            DISTANCE = COMPUTE_DISTANCE(POINTA, POINTB)
            if DISTANCE < MINIMUM
                MINIMUM = DISTANCE
                POINT1 = POINTA
                POINT2 = POINTB
            end if
        end if
    end loop
end loop
Looking at this code, if we have $N$ points, it would take roughly $N^2$ steps to solve the problem! That’s not very efficient, even for a small data set. However, the code itself is really simple, and it is guaranteed to find exactly the best answer, provided we have enough time and a powerful enough computer to run the program.
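One possible Python version of the brute force pseudocode is sketched below. It assumes points are tuples of coordinates and uses `math.dist` (available in Python 3.8 and later) as the distance function; the sample points are invented.

```python
import math


def closest_pair(points):
    """Check every pair of points and keep the closest one found."""
    minimum = math.inf
    pair = (None, None)
    for i, point_a in enumerate(points):
        for j, point_b in enumerate(points):
            if i != j:  # skip comparing a point with itself
                distance = math.dist(point_a, point_b)
                if distance < minimum:
                    minimum = distance
                    pair = (point_a, point_b)
    return pair, minimum


points = [(0, 0), (5, 5), (1, 1), (9, 0)]
pair, distance = closest_pair(points)
print(pair)                 # ((0, 0), (1, 1))
print(round(distance, 3))   # 1.414
```

With 4 points, the inner comparison runs 12 times, and the count grows with the square of the number of points.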
In the project for this module, we’ll implement a few different brute-force algorithms to solve simple problems. This will help us gain more experience with this particular technique.
The next most common algorithmic technique is divide and conquer. A divide and conquer algorithm works just like it sounds. First, it will divide the problem into two or more smaller problems, and then it will try to solve each of those problems individually. It might even try to subdivide those smaller problems again and again to finally get to a small enough problem that it is easy to solve.
A great real-world example of using a divide and conquer approach to solving a problem is when we need to look for something that we’ve lost around the house. Instead of trying to search the entire house, we can subdivide the problem into smaller parts by looking in each room separately. Then, within each room, we can even further subdivide the problem by looking at each piece of furniture individually. By reducing the problem’s size and complexity, it becomes easier to search through each individual piece of furniture in the house, either finding our lost object or eliminating that area as the likely location it will be found.
One great example of a divide and conquer algorithm is the binary search algorithm. If we have a list of data that has already been sorted, as seen in the figure above, we can easily find any item in the list using a divide and conquer process.
For example, let’s say we want to find the value $19$ in that list. First, we can look at the item in the middle of the list, which is $23$. Is it our desired number? Unfortunately, it is not. So, we need to figure out how we can use it to divide our input into a smaller problem. Thankfully, we know the list is sorted, so we can use that to our advantage. If our desired number is less than the middle number, we know that it must exist in the first half of the list. Likewise, if it is greater than the middle number, it must be in the second half. In this case, since $19$ is less than $23$, we must only look at the first half of the list.
Now we can just repeat that process, this time using only the first half of the original list. This is the powerful feature of a divide and conquer algorithm. Once we’ve figured out how to divide our data, we can usually follow the same steps again to solve the smaller problems as well.
Once again, we ask ourselves if $12$, the centermost number in the list, is the one we are looking for. Once again, it is not, but we know that $19$ is greater than $12$, so we’ll need to look in the second half of the list.
Finally, we have reduced our problem to the simplest, or base case of the problem. Here, we simply need to determine if the single item in the list is the number we are looking for. In this case, it is! So, we can return that our original list did indeed include the number $19$.
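The search just described can be sketched in Python as shown below. The list is hypothetical, chosen so that the search for $19$ visits $23$, then $12$, then $19$, as in the walkthrough; the `visited` list exists only to show which items were examined.

```python
def binary_search(values, target):
    """Search a sorted list; return (found, items examined along the way)."""
    visited = []
    low, high = 0, len(values) - 1
    while low <= high:
        middle = (low + high) // 2
        visited.append(values[middle])
        if values[middle] == target:
            return True, visited
        if target < values[middle]:
            high = middle - 1   # keep only the first half
        else:
            low = middle + 1    # keep only the second half
    return False, visited


numbers = [4, 12, 19, 23, 38, 45, 67]   # hypothetical sorted list
print(binary_search(numbers, 19))       # (True, [23, 12, 19])
print(binary_search(numbers, 20))       # (False, [23, 12, 19])
```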
We’ll explore many ways of using divide and conquer algorithms in this course, especially when we learn to sort and search through lists of values.
Another algorithmic technique that we’ll learn about is the greedy technique. In a greedy algorithm, the program tries to build a solution one piece at a time. At each step, it will act “greedy” by choosing the piece that it thinks is the best choice for the solution based on the available information. Instead of trying every possible solution like a brute force algorithm or dividing the problem into smaller parts like the divide and conquer approach, a greedy algorithm will just try to construct the one best answer it can.
^[File:Greedy algorithm 36 cents.svg. (2019, April 27). Wikimedia Commons, the free media repository. Retrieved 23:19, February 8, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Greedy_algorithm_36_cents.svg&oldid=347456702.]
For example, we can use a greedy algorithm to determine the fewest number of coins needed to give change, as shown in the example above. If the customer is owed $36$ cents, and we have coins worth $20$ cents, $10$ cents, $5$ cents and $1$ cent, how many coins are needed to reach $36$ cents?
In a greedy solution, we could repeatedly choose the coin with the highest value that is less than or equal to the change still owed, give that coin to the customer, and subtract its value from the remaining change. Here, that gives $20 + 10 + 5 + 1 = 36$, using $4$ coins, which is indeed the optimal solution.
In fact, both the United States dollar and the European euro have a system of coins that will always produce the minimum number of coins with a greedy algorithm. So that’s very helpful!
However, does it always work? What if we have a system that has coins worth $30$ cents, $18$ cents, $4$ cents, and $1$ cent? Would a greedy algorithm produce the result with the minimum number of coins when making change for $36$ cents?
Let’s try it and see. First, we see that we can use a $30$ cent coin, leaving us with $6$ cents left. Then, we can use a single $4$ cent coin, as well as two $1$ cent coins for a total of $4$ coins: $30 + 4 + 1 + 1 = 36$.
Is that the minimum number of coins?
It turns out that this system includes a coin worth $18$ cents. So, to make $36$ cents, we really only need $2$ coins: $18 + 18 = 36$!
This is the biggest weakness of the greedy approach to algorithm design. A greedy algorithm will find a possible solution, but it is not guaranteed to be the best possible solution. Sometimes it will work just fine, but other times it may produce solutions that are not very good at all. So we always must consider that when creating an algorithm using a greedy technique.
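The coin example above can be sketched in Python. This is only an illustration: the function names `greedy_change` and `min_coins` are hypothetical, and `min_coins` is a simple table-based checker (a dynamic programming approach, not something introduced in this chapter) included only so we can compare the greedy answer against the true minimum.

```python
def greedy_change(amount, coins):
    """Make change by always taking the largest coin that still fits."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            used.append(coin)
            amount -= coin
    return used


def min_coins(amount, coins):
    """Fewest coins needed for amount, built up from every smaller amount."""
    best = [0] + [None] * amount          # best[v] = fewest coins for v cents
    for value in range(1, amount + 1):
        options = [best[value - c] for c in coins
                   if c <= value and best[value - c] is not None]
        if options:
            best[value] = min(options) + 1
    return best[amount]


print(greedy_change(36, [20, 10, 5, 1]))   # [20, 10, 5, 1]
print(greedy_change(36, [30, 18, 4, 1]))   # [30, 4, 1, 1]
print(min_coins(36, [30, 18, 4, 1]))       # 2
```

With the second coin system, the greedy choice uses four coins even though two coins are enough.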
The next algorithmic technique we’ll discuss is recursion. Recursion is closely related to the divide and conquer method we discussed earlier. However, recursion itself is a very complicated term to understand. It usually presents one of the most difficult challenges for a novice programmer to overcome when learning to write more advanced programs. Don’t worry! We’ll spend an entire module on recursion later in this course.
There are many different ways to define recursion. In one sense, recursion is a problem solving technique where the solution to a problem depends on solutions to smaller versions of the problem, very similar to the divide and conquer approach.
However, to most programmers, the term recursion is used to describe a function or method that calls itself inside of its own code. It may seem strange at first, but there are many instances in programming where a method can actually call itself again to help solve a difficult problem. However, writing recursive programs can be tricky at first, since there are many ways to make simple errors using recursion that cause our programs to break.
Mastering recursion takes quite a bit of time and practice, and nearly every programmer has a strong memory of the first time recursion made sense in their minds. So, it is important to make sure we understand it! In fact, it is so notable that when we search for “recursion” on Google, it helpfully asks whether we meant to search for “recursion” instead, as seen at the top of this page.
A great example of a recursive method is calculating the factorial of a number. We may recall from mathematics that the factorial of a number is the product of each integer from 1 up to and including that number. For example, the factorial of $5$, written as $5!$, is calculated as $5 * 4 * 3 * 2 * 1 = 120$.
We can easily write a traditional method to calculate the factorial of a number as follows.
function ITERATIVE_FACTORIAL(N)
    RESULT = 1
    loop I from 1 to N:
        RESULT = RESULT * I
    end loop
    return RESULT
end function
However, we may also realize that the value of $5!$ is the same as $4! * 5$. If we already know how to find the factorial of $4$, we can just multiply that result by $5$ to find the factorial of $5$. As it turns out, there are many problems in the real world that work just like this, and, in fact, many of the data structures we’ll learn about are built in a similar way.
We can rewrite this iterative function to be a recursive function instead.
function RECURSIVE_FACTORIAL(N)
    if N <= 1 then
        return 1
    else
        return N * RECURSIVE_FACTORIAL(N - 1)
    end if
end function
As we can see, a recursive function includes two important elements, the base case and a recursive case. We need to include the base case so we can stop calling our recursive function over and over again, and actually reach a solution. This is similar to the termination condition of a for loop or while loop. If we forget to include the base case, our program will recurse infinitely!
The second part, the recursive case, is used to reduce the problem to a smaller version of the same problem. In this case, we reduce $N!$ to $N * (N - 1)!$. Then, we can just call our function again to solve the problem $(N - 1)!$, and multiply the result by $N$ to find the solution to $N!$.
So, if we have a problem that can be reduced to a smaller instance of itself, we may be able to use recursion to solve it!
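As a rough illustration, both factorial functions might look like this in Python. The names are our own, and this sketch uses `n <= 1` as the base case so that a call with `0` also terminates; that is a choice made for this example.

```python
def iterative_factorial(n):
    """Multiply 1 * 2 * ... * n using a loop."""
    result = 1
    for i in range(1, n + 1):
        result = result * i
    return result


def recursive_factorial(n):
    """Compute n! as n * (n - 1)!."""
    if n <= 1:
        # Base case: stop recursing and return a known answer.
        return 1
    # Recursive case: reduce the problem to a smaller factorial.
    return n * recursive_factorial(n - 1)


print(iterative_factorial(5))   # 120
print(recursive_factorial(5))   # 120
```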
Beyond the algorithmic techniques we’ve introduced so far, there are a number of techniques that deal specifically with data stored in non-linear data structures based on graphs. Generally speaking, we can group all of these algorithms under the heading graph traversal algorithms.
A graph traversal algorithm constructs an answer to a problem by moving between nodes in a graph using the graph’s edges, thereby traversing the graph. For example, a graph traversal algorithm could be used by a mapping program to construct a route from one city to another on a map, or to determine friends in common on a social networking website.
^[File:Dijkstra Animation.gif. (2018, November 24). Wikimedia Commons, the free media repository. Retrieved 01:45, February 9, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Dijkstra_Animation.gif&oldid=329177321.]
A great example of a graph traversal algorithm is Dijkstra’s Algorithm, which can be used to find the shortest path between two selected nodes in a graph. In the image above, we can see the process of running Dijkstra’s Algorithm on a graph that contains just a few nodes.
^[File:Dijkstras progress animation.gif. (2016, February 11). Wikimedia Commons, the free media repository. Retrieved 01:44, February 9, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Dijkstras_progress_animation.gif&oldid=187363050.]
Of course, we can use the same approach on any open space, as seen in this animation. Starting at the lower left corner, the algorithm slowly works toward the goal node, but it eventually runs into an obstacle. So, it must find a way around the obstacle while still finding the shortest path to the goal.
Algorithms such as Dijkstra’s Algorithm, and a more refined version called the A* Algorithm, are used in many different computer programs to help find a path between two points, especially in video games.
In this chapter, we introduced a number of different data structures that we can use in our programs. In addition, we explored several algorithmic techniques we can use to develop algorithms that manipulate these data structures to allow us to solve complex problems in our code.
Throughout the rest of this course, as well as a subsequent course, we’ll explore many of these data structures and techniques in detail. We hope that introducing them all at the same time here will allow us to compare and contrast each one as we learn more about it, while still keeping in mind that there are many different structures and techniques that will be available to us in the future.
This page is the main page for Stacks
A stack is a data structure with two main operations that are simple in concept. One is the push operation that lets you put data into the data structure and the other is the pop operation that lets you get data out of the structure.
Why do we call it a stack? Think about a stack of boxes. When you stack boxes, you can do one of two things: put boxes onto the stack and take boxes off of the stack. And here is the key. You can only put boxes on the top of the stack, and you can only take boxes off the top of the stack. That’s how stacks work in programming as well!
A stack is what we call a “Last In First Out”, or LIFO, data structure. That means that when we pop a piece of data off the stack, we get the last piece of data we put on the stack.
So, where do we see stacks in the real world? A great example is repairing an automobile. It is much easier to put a car back together if we put the pieces back on in the reverse order we took them off. Thus, as we take parts off a car, it is highly recommended that we lay them out in a line. Then, when we are ready to put things back together, we can just start at the last piece we took off and work our way back. This operation is exactly how a stack works.
Another example is a stack of chairs. Often in schools or in places that hold different types of events, chairs are stacked in large piles to make moving the chairs easier and to make their storage more efficient. Once again, however, if we are going to put chairs onto the stack or remove chairs from the stack, we are going to have to do it from the top.
How do we implement stacks in code? One way would be to use something we already understand, an array. Remember that arrays allow us to store multiple items, where each entry in the array has a unique index number. This is a great way to implement stacks. We can store items directly in the array and use a special top variable to hold the index of the top of the stack.
The following figure shows how we might implement a stack with an array. First, we define our array myStack to be an array that can hold 10 numbers, with an index of 0 to 9. Then we create a top variable that keeps track of the index of the item at the top of the stack.
Notice that since we have not put any items onto the stack, we initialize top to be -1. Although this is not a legal index into the array, we can use it to recognize when the stack is empty, and it makes manipulating items in the array much simpler.
When we want to push an item onto the stack, we follow a simple procedure as shown below. Of course, since our array has a fixed size, we need to make sure that we don’t try to put an item in a full array. Thus, the precondition is that the array cannot be full. Enforcing this precondition is the function of the if statement at the beginning of the function. If the array is already full, then we’ll throw an exception and let the user handle the situation. Next, we increment the top variable to point to the next available location to store our data. Then it is just a matter of storing the item into the array at the index stored in top.
function PUSH(ITEM)
    if MYSTACK is full then
        throw exception
    end if
    TOP = TOP + 1
    MYSTACK[TOP] = ITEM
end function
If we call the function push("a") and follow the pseudocode above, we will get an array with "a" stored in myStack[0] and top will have the value 0 as shown below.
As we push items onto the stack, we continue to increment top and store the items on the stack. The figure below shows how the stack would look if we performed the following push operations.
push("b")
push("c")
push("d")
push("e")
Although we are implementing our stack with an array, we often show stacks vertically instead of horizontally as shown below. In this way the semantics of top makes more sense.
Of course, the next question you might ask is “how do we get items off the stack?”. As discussed above, we have a special operation called pop to take care of that for us. The pseudocode for the pop operation is shown below and is similar in structure to the push operation.
function POP
    if TOP == -1 then
        throw exception
    end if
    TOP = TOP - 1
    return MYSTACK[TOP + 1]
end function
However, instead of checking to see if the stack is full, we need to check if the stack is empty. Thus, our precondition is that the stack is not empty, which we evaluate by checking if top is equal to -1. If it is, we simply throw an exception and let the user handle it. If myStack is not empty, then we can go ahead and perform the pop function. We simply decrement the value of top and return the value stored in myStack[top+1].
Now, if we perform three straight pop operations, we get the following stack.
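The sequence of push and pop operations described above can be traced with a small Python sketch. The class name `ArrayStack` and the exception types are assumptions made for this example; a Python list of fixed length stands in for the array.

```python
class ArrayStack:
    """A fixed-size stack stored in an array, with top as the index of the top item."""

    def __init__(self, capacity=10):
        self.my_stack = [None] * capacity   # the array holding the items
        self.top = -1                       # -1 means the stack is empty

    def push(self, item):
        if self.top + 1 == len(self.my_stack):
            raise OverflowError("stack is full")
        self.top += 1
        self.my_stack[self.top] = item

    def pop(self):
        if self.top == -1:
            raise IndexError("stack is empty")
        self.top -= 1
        return self.my_stack[self.top + 1]


stack = ArrayStack()
for item in ["a", "b", "c", "d", "e"]:
    stack.push(item)
print(stack.top)                          # 4

popped = [stack.pop() for _ in range(3)]
print(popped)                             # ['e', 'd', 'c']
print(stack.top)                          # 1
```

Notice that pop only moves top; the popped values remain in the array until a later push overwrites them.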
We have already seen two basic stack operations: push and pop. However, there are others that make the stack much easier to use. These basic operations are:
push: places an item on top of the stack,
pop: removes the item on the top of the stack and returns it,
peek: returns the item on the top of the stack without removing it from the stack,
isEmpty: returns true if there are no items on the stack, and
isFull: returns true if our stack array is full.
We will discuss each of these operations. But first, let’s talk about the constructor for the stack class and what it must do to properly set up a stack object.
The main responsibility of the constructor is to initialize our attributes in the stack class. As we discussed above, the attributes include the myStack array and the top attribute that keeps track of the top of the stack.
Since we are using an array for our stack, we will need to know how big to make the array in our constructor. There are two options. We could just use a default size for the array. Or, we could allow the user to pass in a positive integer to set the size. In this module we assume the caller must provide a capacity value, which must be greater than 0.
function STACKCONSTRUCTOR(CAPACITY)
    if CAPACITY not an integer then
        throw exception
    else if CAPACITY <= 0 then
        throw exception
    end if
    MYSTACK = new array[CAPACITY]
    TOP = -1
end function
The first thing we do in the code is to check to make sure that capacity is actually an integer that is greater than 0. Essentially, this is our precondition for the method. If our precondition is not met, we throw an exception. (If we are using a typed language such as Java, we can enforce our precondition by requiring that capacity be of type integer instead of explicitly checking it in the code.) Once we’ve validated our precondition, we create a new array of size capacity for the myStack array and set the attribute top to -1.
We have already discussed the push operation and seen it in operation earlier in this module. In the pseudocode below, we see that we must first check to ensure that the stack is not already full. Again, this is our precondition. You may be picking up on the fact that the first thing we do in our methods is to check our precondition and throw an exception if it is not met. This is good coding practice.
function PUSH(ITEM)
    if MYSTACK is full then
        throw exception
    end if
    TOP = TOP + 1
    MYSTACK[TOP] = ITEM
end function
Once our precondition is validated, we simply increment top by 1 and store the item into the array at index top. We do not return anything from the push function. Also, notice that there are no loops in the push operation and thus the time it takes to execute the operation will always be the same regardless of the size of the myStack array. We call this constant time performance, which is typically very fast.
Like push, we have already seen the pop operation. It simply takes the top item off of the stack and returns it. However, once again we need to validate our precondition before getting into the body of the operation. For the pop operation, our precondition is that the stack must not already be empty, which is detected when top equals -1.
function POP
    if TOP == -1 then
        throw exception
    end if
    TOP = TOP - 1
    return MYSTACK[TOP + 1]
end function
Once we have validated our precondition, we simply decrement top by 1 and then return the previous item at the top of the stack (myStack[top + 1]). Like the push operation, the pop operation takes a constant time to execute.
To allow the calling program to detect when the stack is full, we define an isFull operation. Notice that code external to the stack class cannot access the value of top and so it cannot simply check if top + 1 == length of myStack on its own. In this case, the operation is very simple as we only need to return the Boolean value of top + 1 == length of myStack as shown below. There is no precondition for isFull and isFull operates in constant time.
function ISFULL()
    return TOP + 1 == length of MYSTACK
end function
The isEmpty operation is very similar to the isFull operation except that we return the Boolean value of the condition top == -1 instead of top + 1 == length of myStack.
The peek operation returns the top item on the stack, without removing it from the stack. Like the pop operation it has the precondition that the stack must not be empty. The pseudocode for the peek operation is shown below. It is also a constant time operation.
function PEEK()
    if ISEMPTY() then
        throw exception
    end if
    return MYSTACK[TOP]
end function
Notice that we replaced the precondition check of top == -1 with a call to isEmpty, which produces the same result. The real benefit here is the readability of the code and the fact that we only have to code the top == -1 check in the isEmpty operation. This will make it easier to maintain the code in the future if we change the way we implement the stack.
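To see how the constructor, isFull, isEmpty, and peek fit together, here is one possible Python sketch. The class name `Stack`, the leading underscores on the attributes, and the particular exception types are assumptions of this example.

```python
class Stack:
    """Array-based stack whose capacity is supplied by the caller."""

    def __init__(self, capacity):
        # Precondition: capacity is an integer greater than 0.
        # (bool is a subclass of int in Python, so we reject it explicitly.)
        if not isinstance(capacity, int) or isinstance(capacity, bool):
            raise TypeError("capacity must be an integer")
        if capacity <= 0:
            raise ValueError("capacity must be greater than 0")
        self._my_stack = [None] * capacity
        self._top = -1

    def is_full(self):
        return self._top + 1 == len(self._my_stack)

    def is_empty(self):
        return self._top == -1

    def push(self, item):
        if self.is_full():
            raise OverflowError("stack is full")
        self._top += 1
        self._my_stack[self._top] = item

    def peek(self):
        # Reuse is_empty so the top == -1 check lives in one place.
        if self.is_empty():
            raise IndexError("stack is empty")
        return self._my_stack[self._top]


s = Stack(2)
print(s.is_empty(), s.is_full())   # True False
s.push("a")
s.push("b")
print(s.peek())                    # b
print(s.is_empty(), s.is_full())   # False True
```

Because peek does not change `_top`, calling it twice in a row returns the same item.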
The doubleCapacity operation doubles the size of the array holding our stack. So, if we started with an array of size 4, the doubleCapacity operation will result in an array of size 8 with the contents of our original array stored in it. Unfortunately, most programming languages (like Java) do not simply let you double the size of the array. A noted exception to this is Python, whose built-in lists can be extended directly.
In a traditional programming language, the easiest way to accomplish the doubleCapacity operation is to complete the following steps:
1. Create a new array with twice the capacity of the existing myStack array.
2. Copy the items in the stack from the original array into the new array.
3. Point the myStack array at the new array.
The pseudocode for the doubleCapacity operation is shown below.
function DOUBLECAPACITY()
    NEWSTACK = new array[length of MYSTACK * 2]
    loop I from 0 to TOP
        NEWSTACK[I] = MYSTACK[I]
    end loop
    MYSTACK = NEWSTACK
end function
The doubleCapacity operation is not a constant time operation. This is due to the fact that copying the contents of the original array into the new array requires us to copy each item in the stack into the new array individually. This requires $N$ steps, where $N$ is the number of items in the stack. Thus, we would say that doubleCapacity runs in the order of $N$ time.
The halveCapacity operation is like the doubleCapacity operation except that we now have a precondition. We must make sure that when we cut the space for storing the stack in half, we still have enough space to store all the items currently in the stack. For example, if we have 10 items in a stack with a capacity of 16, we can’t successfully perform halveCapacity. Doing so would only leave us a stack with a capacity of 8 and we would not be able to fit all 10 items in the new stack.
function HALVECAPACITY()
    if TOP + 1 > length of MYSTACK / 2 then
        throw exception
    end if
    NEWSTACK = new array[length of MYSTACK / 2]
    loop I from 0 to TOP
        NEWSTACK[I] = MYSTACK[I]
    end loop
    MYSTACK = NEWSTACK
end function
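Both resizing operations can be sketched as standalone Python functions. This is only an illustration: the functions take the array and top as parameters and return the new array, instead of updating attributes of a stack class, and integer division (`//`) is assumed when halving.

```python
def double_capacity(my_stack, top):
    """Return a new array twice as long, holding items 0..top of my_stack."""
    new_stack = [None] * (len(my_stack) * 2)
    for i in range(top + 1):              # one copy per item: order N time
        new_stack[i] = my_stack[i]
    return new_stack


def halve_capacity(my_stack, top):
    """Return a new array half as long, if the current items still fit."""
    if top + 1 > len(my_stack) // 2:
        raise ValueError("items would not fit in half the capacity")
    new_stack = [None] * (len(my_stack) // 2)
    for i in range(top + 1):
        new_stack[i] = my_stack[i]
    return new_stack


print(double_capacity(["a", "b", "c", "d"], 3))
# ['a', 'b', 'c', 'd', None, None, None, None]
print(halve_capacity(["a", "b", None, None], 1))
# ['a', 'b']
```

Calling `halve_capacity` on an array of length 16 that holds 10 items raises the exception, matching the example in the text.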
The toString operation returns a string that concatenates the strings representing all the items stored in the stack. In many object-oriented languages, each class can provide its own toString operation. For instance, in the stack below where each item is a character, if we called myStack.toString(), we would expect to be returned the string "K-State".
Notice that we must read the stack array from top (right) to bottom (left) to get the proper output string.
In the pseudocode below we first create an empty string and then loop through the stack from top to bottom (0) using the item’s own toString operation to create the appropriate output string. Notice that there are no preconditions for the operation. This is because if the stack is empty, the loop is not executed and we simply return an empty string. However, because of the loop the toString operation runs in order $N$ time.
function TOSTRING()
    OUTPUT = ""
    loop I from TOP to 0
        OUTPUT = OUTPUT + MYSTACK[I].TOSTRING()
    end loop
    return OUTPUT
end function
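A short Python sketch of the same idea follows. The function name `stack_to_string` is hypothetical, and the contents of the `letters` array are an assumption about how the characters of "K-State" would be laid out, with the bottom of the stack at index 0.

```python
def stack_to_string(my_stack, top):
    """Concatenate the items from the top of the stack down to index 0."""
    output = ""
    for i in range(top, -1, -1):          # top, top - 1, ..., 0
        output = output + str(my_stack[i])
    return output


# Assumed layout: "e" was pushed first and "K" was pushed last.
letters = ["e", "t", "a", "t", "S", "-", "K", None]
print(stack_to_string(letters, 6))        # K-State
print(repr(stack_to_string(letters, -1))) # '' (empty stack, loop never runs)
```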
The following table shows an example of how to use the above operations to create and manipulate a stack. It assumes the steps are performed sequentially and the result of the operation is shown.
Stacks are useful in many applications. Classic real-world software that uses stacks includes the undo feature in a text editor, or the forward and back features of web browsers. In a text editor, each user action is pushed onto the stack as it is performed. Then, if the user wants to undo an action, the text editor simply pops the stack to get the last action performed, and then undoes the action. The redo command can be implemented as a second stack. In this case, when actions are popped from the stack in order to undo them, they are pushed onto the redo stack.
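The two-stack undo and redo idea can be sketched in Python, using a list's `append` and `pop` methods as push and pop. The class name `EditHistory` and the choice to clear the redo stack whenever a new action is performed are assumptions of this example, not something the text specifies.

```python
class EditHistory:
    """Tracks editor actions with one stack for undo and one for redo."""

    def __init__(self):
        self.undo_stack = []
        self.redo_stack = []

    def perform(self, action):
        self.undo_stack.append(action)    # push the action as it happens
        self.redo_stack.clear()           # assumption: new work discards redo history

    def undo(self):
        if not self.undo_stack:
            return None
        action = self.undo_stack.pop()    # last action performed
        self.redo_stack.append(action)    # remember it so it can be redone
        return action

    def redo(self):
        if not self.redo_stack:
            return None
        action = self.redo_stack.pop()
        self.undo_stack.append(action)
        return action


history = EditHistory()
history.perform("type A")
history.perform("type B")
print(history.undo())   # type B
print(history.redo())   # type B
```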
Another example is a maze exploration application. In this application, we are given a maze, a starting location, and an ending location. Our first goal is to find a path from the start location to the end location. Once we’ve arrived at the end location, our goal becomes returning to the start location in the most direct manner.
We can do this simply with a stack. We will have to search the maze to find the path from the starting location to the ending location. Each time we take a step forward, we push that move onto a stack. If we run into a dead end, we can simply retrace our steps by popping moves off the stack and looking for an alternative path. Once we reach the end state, we will have our path stored in the stack. At this point it becomes easy to follow our path backward by popping each move off the top of the stack and performing it. There is no searching involved.
We start with a maze, a startCell, a goalCell, and a stack as shown below. In this case our startCell is 0,0 and our end goal is 1,1. We will store each location on the stack as we move to that location. We will also keep track of the direction we are headed: up, right, down, or left, which we’ll abbreviate as u, r, d, and l.
In our first step, we will store our location and direction 0,0,u on the stack.
For the second step, we will try to move “up”, or to location 1,0. However, that square in the maze is blocked. So, we change our direction to r as shown below.
After turning right, we attempt to move in that direction to square 0,1, which is successful. Thus, we create a new location 0,1,u and push it on the stack. (Here we always assume we point up when we enter a new square.) The new state of the maze and our stack are shown below.
Next, we try to move to 1,1, which again is successful. We again push our new location 1,1,u onto the stack. And, since our current location matches our goalCell location (ignoring the direction indicator) we recognize that we have reached our goal.
Of course, it’s one thing to find our goal cell, but it’s another thing to get back to our starting position. However, we already know the path back given our wise choice of data structures. Since we stored the path in a stack, we can now simply reverse our path and move back to the start cell. All we need to do is pop the top location off of the stack and move to that location over and over again until the stack is empty. The pseudocode for following the path back home is simple.
loop while !MYSTACK.ISEMPTY()
    NEXTLOCATION = MYSTACK.POP()
    MOVETO(NEXTLOCATION)
end while
The pseudocode for finding the initial path using the stack is shown below. We assume the enclosing class has already defined a stack called myStack and the datatype called Cell, which represents the squares in the maze. The algorithm also uses three helper functions as described below:
getNextCell(maze, topCell): computes the next cell based on our current cell’s location and direction;
incrementDirection(topCell): increments a cell’s direction attribute following the clockwise sequence of up, right, down, left, and then finally done, which means that we’ve tried all directions; and
valid(maze, nextCell): determines if a cell is valid. A cell is invalid if it is “blocked”, is outside the boundaries of the maze, or is in the current path (i.e., if it exists in the stack).
The parameters of findPath are a 2-dimensional array called maze, the startCell and the goalCell. The algorithm begins by pushing the startCell onto myStack. The cell at the top of the stack will always represent our current cell, while the remaining cells in the stack represent the path of cells taken to reach the current cell.
Next, we enter a loop, where we will do the bulk of the work. We peek at the cell on the top of the stack in order to use it in our computations. If the topCell is equal to our goalCell, then we are done and return true indicating that we have found a path to the goal.
If we are not at our goal, we check to see if we have searched all directions from the current cell. If that is the case, then the direction attribute of the topCell will have been set to done. If the direction attribute of topCell is equal to done, then we pop the topCell off the stack, effectively leaving that cell and returning to the next cell in the stack. This is an algorithmic technique called backtracking.
function FINDPATH(MAZE, STARTCELL, GOALCELL)
    MYSTACK.PUSH(STARTCELL)
    loop while !MYSTACK.ISEMPTY()
        TOPCELL = MYSTACK.PEEK()
        if TOPCELL equals GOALCELL then
            return true
        end if
        if TOPCELL.GETDIRECTION() == done then
            MYSTACK.POP()
        else
            NEXTCELL = GETNEXTCELL(MAZE, TOPCELL)
            INCREMENTDIRECTION(TOPCELL)
            if VALID(MAZE, NEXTCELL) then
                if MYSTACK.ISFULL() then
                    MYSTACK.DOUBLECAPACITY()
                end if
                MYSTACK.PUSH(NEXTCELL)
            end if
        end if
    end while
    return false
end function
However, if we have not searched in all directions from topCell, we will try to explore a new cell (nextCell) adjacent to the topCell. Specifically, nextCell will be the adjacent cell in the direction stored by the direction attribute. We then increment the direction attribute of the topCell so if we end up backtracking, we will know which direction to try next.
Before we push the nextCell onto the stack, we must first check to see if it’s a valid cell by calling the helper function valid. A cell is valid if it is open to be explored. A cell is invalid if it is “blocked,” is outside the boundaries of the maze, or is in the current path (i.e., if it exists in the stack). To help us determine if a cell is in the stack, we will need to extend our stack operations to include a find operation that searches the stack while leaving its contents intact. You will get to implement this operation in your project.
If nextCell is valid, we then check to make sure that the stack is not already full. If it is, we simply call doubleCapacity and continue on our way. Then we push nextCell onto myStack so it will become our next topCell on the next pass through the loop.
After we have explored all possible paths through the maze, the loop will eventually end, and the operation will return false indicating no path was found. While this is not the most efficient path finding algorithm, it is a good example of using stacks for backtracking. Also, if we do find a path and return, the path will be saved in the stack. We can then use the previous pseudocode for retracing our steps and going back to the startCell.
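The backtracking search described above can be sketched in Python under a few assumptions: the maze is a list of rows where `True` marks a blocked square, a cell is a `(row, column)` tuple with row 0 at the bottom so that "up" increases the row, and a plain Python list serves as the stack (so no isFull or doubleCapacity call is needed). Unlike the pseudocode, this sketch returns the path itself, or `None` when no path exists.

```python
UP, RIGHT, DOWN, LEFT, DONE = range(5)
# (row, column) offsets for each direction, in clockwise order.
OFFSETS = {UP: (1, 0), RIGHT: (0, 1), DOWN: (-1, 0), LEFT: (0, -1)}


def valid(maze, cell, stack):
    """A cell is valid if it is inside the maze, open, and not already on the path."""
    row, col = cell
    in_bounds = 0 <= row < len(maze) and 0 <= col < len(maze[0])
    return (in_bounds and not maze[row][col]
            and all(entry[0] != cell for entry in stack))


def find_path(maze, start, goal):
    stack = [[start, UP]]                 # each entry: [cell, next direction to try]
    while stack:
        cell, direction = stack[-1]       # peek at the top of the stack
        if cell == goal:
            return [entry[0] for entry in stack]
        if direction == DONE:
            stack.pop()                   # backtrack: every direction was tried
            continue
        stack[-1][1] = direction + 1      # remember which direction to try next
        d_row, d_col = OFFSETS[direction]
        next_cell = (cell[0] + d_row, cell[1] + d_col)
        if valid(maze, next_cell, stack):
            stack.append([next_cell, UP])
    return None


# The 2 x 2 maze from the walkthrough: square 1,0 is blocked.
maze = [[False, False],
        [True, False]]
path = find_path(maze, (0, 0), (1, 1))
print(path)                               # [(0, 0), (0, 1), (1, 1)]

# Retrace our steps by popping locations until the stack is empty.
while path:
    print("move to", path.pop())
```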
In this module we looked at the stack data structure. Stacks are a “last in first out” data structure that use two main operations, push and pop, to put data onto the stack and to remove data off of the stack. Stacks are useful in many applications including text editor “undo” and web browser “back” functions.
This page is the main page for Recursion
We are now used to using functions in our programs that allow us to decompose complex problems into smaller problems that are easier to solve. Now, we will look at a slight wrinkle in how we use functions. Instead of simply having functions call other functions, we now allow for the fact that a function can actually call itself! When a function calls itself, we call it recursion.
Using recursion often allows us to solve complex problems elegantly, with only a few lines of code. Recursion is an alternative to using loops and, theoretically, any problem that can be solved with loops can be solved with recursion and vice versa.
So why would a function want to call itself? When we use recursive functions, we are typically trying to break the problem down into smaller versions of itself. For example, suppose we want to check to see if a word is a palindrome (i.e., it is spelled the same way forwards and backwards). How would we do this recursively? Typically, we would check to see if the first and last characters were the same. If so, we would check the rest of the word between the first and last characters. We would do this over and over until we got down to the 0 or 1 characters in the middle of the word. Let’s look at what this might look like in pseudocode.
function isPalindrome (String S) returns Boolean
    if length of S < 2 then
        return true
    else
        return (first character in S == last character in S) and
            isPalindrome(substring of S without first and last character)
    end if
end function
First, we’ll look at the else part of the if statement. Essentially, this statement determines if the first and last characters of S match, and then calls itself recursively to check the rest of the word S. Of course, if the first and last characters of S match and the rest of the string is a palindrome, the function will return true. However, we can’t keep calling isPalindrome recursively forever. At some point we have to stop. That is what the if part of the statement does. We call this our base case. When we get to the point where the length of the string we are checking is 0 or 1 (i.e., < 2), we know we have reached the middle of the word. Since all strings of length 0 or 1 are, by definition, palindromes, we return true.
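As a minimal sketch, the palindrome check translates almost directly into Python. The function name `is_palindrome` is our own, and the slice `s[1:-1]` is what removes the first and last characters.

```python
def is_palindrome(s):
    """Return True if s reads the same forwards and backwards."""
    if len(s) < 2:
        # Base case: strings of length 0 or 1 are palindromes.
        return True
    # Recursive case: compare the ends, then check what lies between them.
    return s[0] == s[-1] and is_palindrome(s[1:-1])


print(is_palindrome("racecar"))   # True
print(is_palindrome("stack"))     # False
```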
The key idea of recursion is to break the problem into simpler subproblems until you get to the point where the solution to the problem is trivial and can be solved directly; this is the base case. The algorithm design technique is a form of divide-and-conquer called decrease-and-conquer. In decrease-and-conquer, we reduce our problem into smaller versions of the larger problem.
A recursive program is broken into two parts: a base case and a recursive case.
The base case is generally the final case we consider in a recursive function and serves to both end the recursive calls and to start the process of returning the final answer to our problem. To avoid endless cycles of recursive calls, it is imperative that we check to ensure that:
Suppose we must write a program that reads in a sequence of keyboard characters and prints them in reverse order. The user ends the sequence by typing an asterisk character *.
We could solve this problem using an array, but since we do not know how many characters might be entered before the *, we could not be sure the program would actually work. However, we can use a recursive function since its ability to save the input data is not limited by a predefined array size.
Our solution would look something like this. We’ve also numbered the lines to make the following discussion easier to understand.
function REVERSE()                  (1)
    read CHARACTER                  (2)
    if CHARACTER == `*` then        (3)
        return                      (4)
    else                            (5)
        REVERSE()                   (6)
        print CHARACTER             (7)
        return                      (8)
    end if                          (9)
end function                        (10)
The function first reads a single character from the keyboard and stores it in CHARACTER. Then, in line 3 it checks to see if the user typed the * character. If so, we simply return, knowing that we have reached the end of the input and need to start printing out the characters we’ve read in reverse order. This is the base case for this recursive function.
If the CHARACTER we read in was not an *, line 6 will recursively call REVERSE to continue reading characters. Once the function returns (meaning that we have gotten an * character and started the return process) the function prints the CHARACTER in line 7 and then returns itself.
Now let’s look at what happens within the computer when we run REVERSE. Let’s say the program user wants to enter the three characters from the keyboard: n, o, and w followed by the * character. The following figure illustrates the basic concept of what is going on in the computer.
The arrows in the figure represent the order of execution of the statements in the computer. Each time we execute the recursive call to REVERSE in line 6, we create a new instance of the function, which starts its execution back at the beginning of the function (line 2). Then, when the function executes return, control reverts back to the next statement to be executed (line 7) in the calling instance of the function.
It’s important to understand that each instance of the function has its own set of variables whose values are unique to that instance. When we read n into the CHARACTER variable in the first instance of REVERSE it is not affected by anything that happens in the second instance of REVERSE. Therefore, reading the o into CHARACTER in the second instance of REVERSE does not affect the value of CHARACTER in the first instance of REVERSE.
During the execution of the first instance of REVERSE
, the user enters the character n
so the if
condition is false
and we execute the else
part of the statement, which calls the REVERSE
function. (Note that before we actually start the second instance of REVERSE
, the operating system stores the statement where we will pick up execution once the called function returns.) When the second instance of REVERSE
is started, a new copy of all variables is created as well to ensure we do not overwrite the values from the first instance.
The execution of the second instance of REVERSE
runs exactly like the first instance except that the user enters the character o
instead of n
. Again, the else
part of the if
statement is executed, which calls the REVERSE
function. When the third instance of REVERSE
is executed, the user now inputs w
, which again causes a new instance of REVERSE
to be called.
Finally, in the fourth instance of REVERSE
, the user inputs the *
character, which causes the if
part of the statement to execute, which performs our return
statement. Once the return
from the base case of our recursive function is performed, it starts the process of ending all the instances of the REVERSE
function and creating the solution. When instance 4 of the REVERSE
function returns, execution resumes at the print
statement (line 7) of instance 3. Here the character w
is printed, and the function returns to instance 2. The same process is carried out in instance 2, which prints the o
character and returns. Likewise, instance 1 prints its character n
and then returns. The screen should now show the full output of the original call to REVERSE
, which is “won”.
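The walkthrough above can be reproduced with a short Python sketch. It is only an illustration under a few assumptions: the keyboard is simulated by an iterator over a string, the output is collected in a list instead of being printed directly, and the helper `run_reverse` is a hypothetical name introduced here.

```python
def reverse(chars, out):
    """Read one character; recurse until '*' is seen, then emit on the way back."""
    character = next(chars)      # stands in for "read CHARACTER"
    if character == "*":
        return                   # base case: end of input reached
    reverse(chars, out)          # the recursive call comes first (line 6)
    out.append(character)        # "print CHARACTER" runs after the call returns (line 7)


def run_reverse(text):
    # Hypothetical helper: feeds the characters of `text` to reverse()
    # as if they had been typed at the keyboard.
    out = []
    reverse(iter(text), out)
    return "".join(out)


print(run_reverse("now*"))  # prints "won"
```

Each call to `reverse` has its own local `character`, which is why the value read by an earlier instance is still available when control returns to it.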
Recursion has allowed us to create a very simple and elegant solution to the problem of reversing an arbitrary number of characters. While you can do this in a non-recursive way using loops, the solution is not that simple. If you don’t believe us, just try it! (Yes, that is a challenge.)
function REVERSE2() (1)
read CHARACTER (2)
if CHARACTER == `*` then (3)
return (4)
else (5)
print CHARACTER (6a)
REVERSE2() (7a)
return (8)
end if (9)
end function (10)
The REVERSE2
function above actually prints the characters entered by the user in the same order in which they are typed. Notice how this small variation in the instruction order significantly changes the outcome of the function. To get a better understanding of why this occurs, we will delve into the order of execution in a little more depth.
From the output of our original REVERSE
function, we could argue that recursive function calls are carried out in a LIFO (last in, first out) order. Conversely, the output of the second version of the function REVERSE2
, would lead us to believe that recursive function calls are carried out in FIFO (first in, first out) order. However, the ordering of the output is really based on how we structure our code within the recursive function itself, not the order of execution of the recursive functions.
To produce a LIFO ordering, we use a method called head recursion, which causes the function to make a recursive call first, then calculates the results once the recursive call returns. To produce a FIFO ordering, we use a method called tail recursion, which is when the function makes all of its necessary calculations before making a recursive call. With the REVERSE
and REVERSE2
functions, this is simply a matter of swapping lines 6 and 7.
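To see the two orderings side by side, here is a small Python sketch. The function names `head_order` and `tail_order`, the `index` parameter, and the use of a list to collect results are assumptions made for this illustration; the only real difference between the two functions is whether the work happens after or before the recursive call.

```python
def head_order(items, index=0, out=None):
    """Recursive call first, work afterwards: results come out last in, first out."""
    out = [] if out is None else out
    if index < len(items):
        head_order(items, index + 1, out)   # recurse before doing any work
        out.append(items[index])            # work happens during the returns
    return out


def tail_order(items, index=0, out=None):
    """Work first, recursive call last: results come out first in, first out."""
    out = [] if out is None else out
    if index < len(items):
        out.append(items[index])            # do the work on the way down
        tail_order(items, index + 1, out)   # the recursive call is the final step
    return out


letters = ["n", "o", "w"]
print(head_order(letters))  # ['w', 'o', 'n']
print(tail_order(letters))  # ['n', 'o', 'w']
```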
While some functions require the use of either head or tail recursion, many times we have the choice of which one to use. The choice is not necessarily just a matter of style, as we shall see next.
Before we finish our discussion of head and tail recursion, we need to make sure we understand how a recursive function actually works in the computer. To do this, we will use a new example. Let’s assume we want to print all numbers from 0 to $N$, where $N$ is provided as a parameter. A recursive solution to this problem is shown below.
function OUTPUT(integer N) (1)
if N == 0 then (2)
print N (3)
else (4)
print "Calling to OUTPUT " N-1 (5)
OUTPUT(N-1) (6)
print "Returning from OUTPUT " N-1 (7)
print N (8)
end if (9)
return (10)
end function (11)
Notice that we have added some extra print
statements (lines 5 and 7) to the function just to help us keep track of when we have called OUTPUT
and when that call has returned. This function is very similar to the REVERSE
function above; we just don’t have to worry about reading a character each time the function runs. Now, if we call OUTPUT
with an initial parameter of 3
, we get the following output. We’ve also marked these lines with letters to make the following discussion simpler.
Calling to OUTPUT 2 (a)
Calling to OUTPUT 1 (b)
Calling to OUTPUT 0 (c)
0 (d)
Returning from OUTPUT 0 (e)
1 (f)
Returning from OUTPUT 1 (g)
2 (h)
Returning from OUTPUT 2 (i)
3 (j)
Lines a, b, and c show how the function makes all the recursive calls before any output or computation is performed. Thus, this is an example of head recursion which produces a LIFO ordering.
Once we get to the call of OUTPUT(0)
, the function prints out 0
(line d) and we start the return process. When we return from the call to OUTPUT(0)
we print the tracking message (line e) and then print N
, which is 1
and return. We continue this return process from lines g through j and eventually return from the original call to OUTPUT
having completed the task.
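A Python sketch of the traced function is shown here as a way to check the output by hand. As an assumption for this example, the messages are appended to a list named `log` rather than printed directly, so the whole trace can be inspected afterwards.

```python
def output(n, log):
    """Trace version of OUTPUT: records each message in `log`."""
    if n == 0:
        log.append(str(n))                            # base case
    else:
        log.append(f"Calling to OUTPUT {n - 1}")
        output(n - 1, log)                            # recursive call
        log.append(f"Returning from OUTPUT {n - 1}")
        log.append(str(n))                            # work done after the call returns


trace = []
output(3, trace)
print("\n".join(trace))
```

Running this with an initial parameter of 3 produces the same ten lines, a through j, listed above.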
Now that we have seen how recursion works in practice, we will pull back the covers and take a quick look at what is going on underneath. To be able to call the same function over and over, we need to be able to store the appropriate data related to each function call to ensure we can treat it as a unique instance of the function. While we do not make copies of the code, we do need to make copies of other data. Specifically, when function A
calls function B
, we must save the following information:
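The location in A at which execution continues when B returns (called the return address), along with the parameters and local variables associated with the call to B. We call this information the activation record for function A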
. When a call to B
is made, this information is stored in a stack data structure known as the activation stack, and execution begins at the first instruction in function B
. Upon completion of function B
, the saved record is removed from the activation stack and execution resumes at the return address it contains.
Next, we will look at how we use the activation stack to implement recursion. For this we will use a simple MAIN
program that calls our simplified OUTPUT
function (where we have removed all the print statements used to track our progress).
function MAIN()
OUTPUT(3) (1)
print ("Done") (2)
end function
function OUTPUT(integer N)
if N == 0 then (1)
print N (2)
else (3)
OUTPUT(N-1) (4)
print N (5)
end if (6)
return (7)
end function
When we run MAIN
, the only record on the activation stack is the record for MAIN
. Since it has not been “called” from another function, it does not contain a return address. It also has no local variables, so the record is basically empty as shown below.
However, when we execute line 1 in MAIN
, we call the function OUTPUT
with a parameter of 3
. This causes the creation of a new function activation record with the return address of line 2 in the calling MAIN
function and a parameter for N
, which is 3
. Again, there are no local variables in OUTPUT
. The activation stack is shown in figure a below.
[Figure: the activation stack at stages a, b, c, and d]
Following the execution for OUTPUT
, we will eventually make our recursive call to OUTPUT
in line 4, which creates a new activation record on the stack as shown above in b. This time, the return address will be line 5 and the parameter N
is 2
.
Execution of the second instance of OUTPUT
will follow the first instance, eventually resulting in another recursive call to OUTPUT
and a new activation record as shown in c above. Here the return address is again 5
but now the value of parameter N
is 1
. Execution of the third instance of OUTPUT
yields similar results, giving us another activation record on the stack d with the value of parameter N
being 0
.
Finally, the execution of the fourth instance of OUTPUT
will reach our base case of N == 0
. Here we will write 0
in line 2 and then return
. This return will cause us to start execution back in the third instance of OUTPUT
at the line indicated by the return address, or in this case, 5. The activation stack will now look like e in the figure below.
[Figure: the activation stack at stages e, f, g, and h]
When execution begins in the third instance of OUTPUT
at line 5, we again write the current value of N
, which is 1
, and we then return
. We follow this same process, returning to the second instance of OUTPUT
, then the first instance of OUTPUT
. Once the initial instance of OUTPUT
completes, it returns to line 2 in MAIN
, where the print("Done")
statement is executed and MAIN
ends.
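The stack behavior described above can be imitated with an ordinary Python list. This is only a sketch under stated assumptions: each activation record is modeled as a `(return_address, N)` tuple, the return addresses are plain strings, `simulate_output` is a hypothetical name, and the starting value is assumed to be zero or greater. A real runtime manages its call stack in its own way.

```python
def simulate_output(start):
    """Run OUTPUT(start) with an explicit list acting as the activation stack."""
    printed = []                               # what the program would print
    # Each record is (return_address, N); MAIN's record has neither.
    stack = [(None, None)]
    stack.append(("MAIN line 2", start))       # MAIN calls OUTPUT(start)

    # Call phase: each instance reaches line 4 and calls OUTPUT(N - 1).
    while stack[-1][1] != 0:
        stack.append(("OUTPUT line 5", stack[-1][1] - 1))
    max_depth = len(stack)

    # Base case: the top instance prints 0 (line 2) and returns.
    printed.append(stack[-1][1])
    return_address, _ = stack.pop()

    # Return phase: resume each caller at line 5, print its N, and return.
    while return_address == "OUTPUT line 5":
        printed.append(stack[-1][1])
        return_address, _ = stack.pop()

    printed.append("Done")                     # back in MAIN at line 2
    return printed, max_depth


print(simulate_output(3))  # ([0, 1, 2, 3, 'Done'], 5)
```

The maximum depth of 5 corresponds to the record for MAIN plus the four instances of OUTPUT.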
While recursion is a very powerful technique, its expressive power has an associated cost in terms of both time and space. Anytime we call a function, a certain amount of memory space is needed to store information on the activation stack. In addition, the process of calling a function takes extra time since we must store parameter values and the return address, etc. before restarting execution. In the general case, a recursive function will take more time and more memory than a similar function computed using loops.
It is possible to demonstrate that any function with a recursive structure can be transformed into an iterative function that uses loops and vice versa. It is also important to know how to use both mechanisms because there are advantages and disadvantages for both iterative and recursive solutions. While we’ve discussed the fact that loops are typically faster and take less memory than similar recursive solutions, it is also true that recursive solutions are generally more elegant and easier to understand. Recursive functions can also allow us to find solutions to problems that are complex to write using loops.
The most popular example of using recursion is calculating the factorial of a positive integer $N$. The factorial of a positive integer $N$ is just the product of all the integers from $1$ to $N$. For example, the factorial of $5$, written as $5!$, is calculated as $5 * 4 * 3 * 2 * 1 = 120$. The definition of the factorial function itself is recursive. $$ \text{fact}(N) = N * \text{fact}(N - 1) $$ The corresponding pseudocode is shown below.
function FACT(N)
if N == 1
return 1
else
return N * FACT(N-1)
end if
end function
The recursive version of the factorial is slower than the iterative version, especially for high values of $N$. However, the recursive version is simpler to program and more elegant, which typically results in programs that are easier to maintain over their lifetimes.
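For reference, the FACT pseudocode translates almost line for line into Python. This sketch assumes, as the text does, that the argument is a positive integer; it does not guard against zero or negative values.

```python
def fact(n):
    """Recursive factorial for a positive integer n, mirroring FACT above."""
    if n == 1:
        return 1                 # base case
    return n * fact(n - 1)       # recursive case: N * fact(N - 1)


print(fact(5))  # 120
```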
In the previous examples we saw recursive functions that call themselves one time within the code. This type of recursion is called linear recursion, where head and tail recursion are two specific types of linear recursion.
In this section we will investigate another type of recursion called tree recursion, which occurs when a function calls itself two or more times to solve a single problem. To illustrate tree recursion, we will use a simple recursive function MAX
, which finds the maximum of $N$ elements in an array. To calculate the maximum of $N$ elements we will use the following recursive algorithm.
1. Find the maximum of the first $N/2$ elements and store it in MAX1.
2. Find the maximum of the second $N/2$ elements and store it in MAX2.
3. Compare MAX1 and MAX2 to find the maximum of all elements.
Our process recursively decomposes the problem by searching for the maximum in the first $N/2$ elements and the second $N/2$ elements until we reach the base case. In this problem, the base case is when we either have 1 or 2 elements in the array. If we just have 1, we return that value. If we have 2, we return the larger of those two values. An overview of the process is shown below.
The pseudocode for the algorithm is shown below.
function MAX(VALUES, START, END)
print "Called MAX with start = " + START + ", end = " + END
if END - START == 0
return VALUES[START]
else if END - START == 1
if VALUES[START] > VALUES[END]
return VALUES[START]
else
return VALUES[END]
end if
else
MIDDLE = ROUND((END - START) / 2)
MAX1 = MAX(VALUES, START, START + MIDDLE - 1)
MAX2 = MAX(VALUES, START + MIDDLE, END)
if MAX1 > MAX2
return MAX1
else
return MAX2
end if
end if
end function
The following block shows the output from the print
line in the MAX
function above for each recursive call. The initial call to the function, whose own output line is omitted, is MAX(VALUES, 0, 15)
.
Called MAX with start = 0, end = 7
Called MAX with start = 0, end = 3
Called MAX with start = 0, end = 1
Called MAX with start = 2, end = 3
Called MAX with start = 4, end = 7
Called MAX with start = 4, end = 5
Called MAX with start = 6, end = 7
Called MAX with start = 8, end = 15
Called MAX with start = 8, end = 11
Called MAX with start = 8, end = 9
Called MAX with start = 10, end = 11
Called MAX with start = 12, end = 15
Called MAX with start = 12, end = 13
Called MAX with start = 14, end = 15
As you can see, MAX
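decomposes the array each time it is called, resulting in 14 recursive calls in addition to the initial call to the MAX
function. If we had used head or tail recursion to compare each value in the array, we would have had to call MAX
16 times. The difference in the number of calls is small, but the tree-recursive version has another advantage: at most about $\log_2 N$ calls are active on the activation stack at any one time, compared with $N$ for the linear version, and that gap widens as $N$ grows.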
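The following Python sketch mirrors the MAX pseudocode and records every call so the pattern can be checked. The sample data is made up, the name `find_max` and the `calls` list are assumptions for this example, and the midpoint uses integer arithmetic that rounds halves up, which is how ROUND is assumed to behave in the pseudocode.

```python
def find_max(values, start, end, calls):
    """Tree-recursive maximum of values[start..end] (both ends inclusive)."""
    calls.append((start, end))               # record each call for inspection
    if end - start == 0:
        return values[start]
    if end - start == 1:
        return max(values[start], values[end])
    # Integer form of ROUND((END - START) / 2) with halves rounded up.
    middle = (end - start + 1) // 2
    max1 = find_max(values, start, start + middle - 1, calls)
    max2 = find_max(values, start + middle, end, calls)
    return max1 if max1 > max2 else max2


data = [7, 3, 9, 1, 12, 5, 8, 2, 15, 4, 6, 11, 0, 14, 10, 13]  # made-up sample values
calls = []
print(find_max(data, 0, 15, calls))  # 15
print(len(calls))                    # 15 calls, counting the initial one
```

The entries in `calls` after the first one correspond to the lines of output listed above.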
Next, we will look at calculating Fibonacci numbers using a tree recursive algorithm. Fibonacci numbers are given by the following recursive formula. $$ f_n = f_{n-1} + f_{n-2} $$ Notice that Fibonacci numbers are defined recursively, so they should be a perfect application of tree recursion! However, there are cases where recursive functions are too inefficient compared to an iterative version to be of practical use. This typically happens when the recursive solutions to a problem end up solving the same subproblems multiple times. Fibonacci numbers are a great example of this phenomenon.
To complete the definition, we need to specify the base case, which includes two values for the first two Fibonacci numbers: FIB(0) = 0
and FIB(1) = 1
. The first Fibonacci numbers are $0, 1, 1, 2, 3, 5, 8, 13, 21 …$.
Producing the code for finding Fibonacci numbers is very easy from its definition. The extremely simple and elegant solution to computing Fibonacci numbers recursively is shown below.
function FIB(N)
if N == 0
return 0
else if N == 1
return 1
else
return FIB(N-1) + FIB(N-2)
end if
end function
The following pseudocode performs the same calculations for the iterative version.
function FIBIT(N)
if N == 0
return 0
end if
FIB1 = 1
FIB2 = 0
for (I = 2 to N)
FIB = FIB1 + FIB2
FIB2 = FIB1
FIB1 = FIB
end loop
return FIB1
end function
While this function is not terribly difficult to understand, there is still quite a bit of mental gymnastics required to see how this implements the computation of Fibonacci numbers and even more to prove that it does so correctly. However, as we will see later, the performance improvements of the iterative solution are worth it.
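As a rough Python counterpart to the iterative pseudocode, the sketch below keeps only the two most recent Fibonacci numbers. It assumes the argument is a non-negative integer and returns the result explicitly; the name `fib_iterative` is chosen for this example.

```python
def fib_iterative(n):
    """Iterative Fibonacci; assumes n is a non-negative integer."""
    if n == 0:
        return 0
    fib1, fib2 = 1, 0                        # FIB(1) and FIB(0)
    for _ in range(2, n + 1):
        fib1, fib2 = fib1 + fib2, fib1       # slide the two-value window forward
    return fib1


print([fib_iterative(i) for i in range(9)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21]
```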
If we analyze the computation required for even a small Fibonacci number in both the iterative and recursive algorithms, the truth becomes evident. For example, the recursive algorithm calculates the 5th Fibonacci number by recursively calling FIB(4)
and FIB(3)
. In turn, FIB(4)
calls FIB(3)
and FIB(2)
. Notice that FIB(3)
is actually calculated twice! This is a problem. If we calculate the 36th Fibonacci number, the values of many Fibonacci numbers are calculated repeatedly, over and over.
To clarify our ideas further, we can consider the recursive tree resulting from the trace of the program to calculate the 6th Fibonacci number. Each of the computations highlighted in the diagram will have been computed previously.
If we count the recomputations, we can see how we calculate the 4th Fibonacci number twice, the 3rd Fibonacci number three times, and the 2nd Fibonacci number five times. All of this is due to the fact that we do not consider the work done by other recursive calls. Furthermore, the higher our initial number, the worse the situation grows, and at a very rapid pace.
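These counts can be confirmed with a small Python experiment. The sketch below is the plain recursive FIB function with one addition that is an assumption of this example: a `Counter` passed in as `counts` that tallies how many times each value of N is requested.

```python
from collections import Counter


def fib(n, counts):
    """Plain tree-recursive Fibonacci that tallies how often each n is requested."""
    counts[n] += 1
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 1, counts) + fib(n - 2, counts)


counts = Counter()
print(fib(6, counts))                    # 8
print(sum(counts.values()))              # 25 calls in total
print(counts[4], counts[3], counts[2])   # 2 3 5
```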
To avoid recomputing the same Fibonacci number multiple times, we can save the results of various calculations and reuse them directly instead of recomputing them. This technique is called memoization, which can be used to optimize some functions that use tree recursion.
To implement memoization, we simply store the values the first time we compute them in an array. The following pseudocode shows an efficient algorithm that uses an array, called FA
, to store and reuse Fibonacci numbers.
function FIBOPT(N)
if N == 0
return 0
else if N == 1
return 1
else if FA[N] == -1
FA[N] = FIBOPT(N-1) + FIBOPT(N-2)
return FA[N]
else
return FA[N]
end if
end function
We assume that each element in FA
has been initialized to -1
. We also assume that N
is at least 0
and that the length of FA
is larger than the index N
that we are trying to compute. (Of course, we would normally put these assumptions in our precondition; however, since we are focusing on the recursive nature of the function, we will not explicitly show this for now.) The cases where N == 0
and N == 1
are the same as we saw in our previous FIB
function. There is no need to store these values in the array when we can return them directly, since storing them in the array takes additional time. The interesting cases are the last two. First, we check to see if FA[N] == -1
, which would indicate that we have not computed the Fibonacci number for N
yet. If we have not yet computed N
’s Fibonacci number, we recursively call FIBOPT(N-1)
and FIBOPT(N-2)
to compute its value and then store it in the array and return it. If, however, we have already computed the Fibonacci for N
(i.e., if FA[N]
is not equal to -1
), then we simply return the value stored in the array, FA[N]
.
As shown in our original call tree below, using the FIBOPT
function, none of the function calls in red will be made at all. While the function calls in yellow will be made, they will simply return a precomputed value from the FA
array. Notice that for N = 6
, we save 14 of the original 25 function calls required for the FIB
function, or a $56\%$ savings. As N
increases, the savings grow even more.
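A Python sketch of the memoized function is given below so the call counts can be verified. The array `fa` is passed in as a parameter and sized to hold indexes 0 through N, and the `calls` list exists only to count calls; both choices are assumptions made for this example rather than part of the pseudocode.

```python
def fib_opt(n, fa, calls):
    """Memoized Fibonacci; fa[i] == -1 marks a value not yet computed."""
    calls.append(n)                      # tally calls for comparison with FIB
    if n == 0:
        return 0
    if n == 1:
        return 1
    if fa[n] == -1:
        fa[n] = fib_opt(n - 1, fa, calls) + fib_opt(n - 2, fa, calls)
    return fa[n]


n = 6
fa = [-1] * (n + 1)      # assumption: one slot per index from 0 to n
calls = []
print(fib_opt(n, fa, calls))  # 8
print(len(calls))             # 11 calls instead of 25
```

Of those 11 calls, three (for N equal to 2, 3, and 4) simply return a value that was already stored in the array.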
There are some problems where an iterative solution is difficult to implement and is not always immediately intuitive, while a recursive solution is simple, concise and easy to understand. A classic example is the problem of the Tower of Hanoi.
The Tower of Hanoi is a game that lends itself to a recursive solution. Suppose we have three towers on which we can put discs. The three towers are indicated by a letter, A, B, or C.
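Now, suppose we have $N$ discs all of different sizes. The discs are stacked on tower A based on their size, smaller discs on top. The problem is to move all the discs from one tower to another while observing the standard rules of the puzzle: only one disc may be moved at a time, only the top disc of a tower may be moved, and a larger disc may never be placed on top of a smaller one.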
To try to solve the problem let’s start by considering a simple case: we want to move two discs from tower A to tower C. As a convenience, suppose we number the discs in ascending order by assigning the number 1 to the larger disc. The solution in this case is simple and consists of three steps: move disc 2 from tower A to tower B, move disc 1 from tower A to tower C, and finally move disc 2 from tower B to tower C.
The following figure shows how the algorithm works.
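It is a little more difficult with three discs, but after a few tries the proper algorithm emerges. With our knowledge of recursion, we can come up with a simple and concise solution. Since we already know how to move two discs from one place to another, we can solve the problem recursively.
1. Move the top two discs from tower A to tower B.
2. Move the remaining (largest) disc from tower A to tower C.
3. Move the two discs from tower B to tower C.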
In formulating our solution, we assumed that we could move two discs from one tower to another, since we have already solved that part of the problem above. In step 1, we use this solution to move the top two discs from tower A to B. Then, in step 3, we again use that solution to move two discs from tower B to C. This process can now easily be generalized to the case of N discs as described below.
The algorithm is captured in the following pseudocode. Here N
is the total number of discs, ORIGIN
is the tower where the discs are currently located, and DESTINATION
is the tower where they need to be moved. Finally, TEMP
is a temporary tower we can use to help with the move. All the parameters are integers.
function HANOI(N, ORIGIN, DESTINATION, TEMP)
if N > 0
HANOI(N-1, ORIGIN, TEMP, DESTINATION)
Move disc N from ORIGIN to DESTINATION
HANOI(N-1, TEMP, DESTINATION, ORIGIN)
end if
return
end function
The function moves the $N$ discs from the source tower to the destination tower using a temporary tower. To do this, it calls itself to move the first $N-1$ discs from the source tower to the temporary tower. It then moves the bottom disc from the source tower to the destination tower. The function then moves the $N-1$ discs present in the temporary tower into the destination tower.
The list of movements to solve the three-disc problem is shown below.
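One way to generate such a list of moves is the following Python sketch of the HANOI function. A few assumptions apply: towers are named with letters rather than integers, moves are collected in a list instead of being performed, the guard is `n > 0` so that no move is attempted for a nonexistent disc 0, and discs are numbered so that disc 1 is the smallest and disc N is the bottom disc of the stack being moved.

```python
def hanoi(n, origin, destination, temp, moves):
    """Append the moves needed to shift n discs from origin to destination."""
    if n > 0:
        hanoi(n - 1, origin, temp, destination, moves)   # clear the way
        moves.append((n, origin, destination))           # move the bottom disc
        hanoi(n - 1, temp, destination, origin, moves)   # restack on top of it
    return moves


# Disc 1 is the smallest here, so disc n is the bottom disc of the stack being moved.
for disc, src, dst in hanoi(3, "A", "C", "B", []):
    print(f"Move disc {disc} from {src} to {dst}")
```

For three discs this prints seven moves, with the largest disc moving exactly once, from A to C, in the middle of the sequence.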
Iterative solutions to the Tower of Hanoi problem do exist, but it took many researchers several years to find an efficient solution. The simplicity of finding the recursive solution presented here should convince you that recursion is an approach you should definitely keep in your bag of tricks!
Iteration and recursion have the same expressive power, which means that any problem that has a recursive solution also has an iterative solution and vice versa. There are also standard techniques that allow you to transform a recursive program into an equivalent iterative version. The simplest case is for tail recursion, where the recursive call is the last step in the function. There are two cases of tail recursion to consider when converting to an iterative version.
If the last statement that f(x) executes is a call to itself, f(y) with parameter y, the recursive call can be replaced by an assignment statement, x = y, and by looping back to the beginning of function f.
The approach above only solves the conversion problem in the case of tail recursion. However, as an example, consider our original FACT
function and its iterative version FACT2
. Notice that in FACT2
we had to add a variable fact
to keep track of the actual computation.
function FACT(N)
if N == 1
return 1
else
return N * FACT(N-1)
end if
end function
function FACT2(N)
fact = 1
while N > 0
fact = fact * N
N = N - 1
end while
return fact
end function
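To make the tail-recursion rule concrete, here is a Python sketch that applies it step by step. The accumulator parameter `acc` is an assumption introduced for this example; it plays the same role as the extra variable in FACT2 and turns the factorial into a function whose recursive call really is its last step. The second function is obtained by replacing that call with assignments and a loop back to the top.

```python
def fact_tail(n, acc=1):
    """Tail-recursive factorial: the recursive call is the very last step."""
    if n <= 1:
        return acc
    return fact_tail(n - 1, acc * n)


def fact_loop(n, acc=1):
    """Same function after replacing the tail call with assignments and a loop."""
    while True:
        if n <= 1:
            return acc
        n, acc = n - 1, acc * n   # assign the new parameter values, then loop back


print(fact_tail(5), fact_loop(5))  # 120 120
```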
The conversion of non-tail recursive functions typically uses two loops to iterate through the process, effectively replacing recursive calls. The first loop executes statements before the original recursive call, while the second loop executes the statements after the original recursive call. The process also requires that we use a stack to save the parameter and local variable values each time through the loop. Within the first loop, all the statements that precede the recursive call are executed, and then, before the loop terminates, the values of interest are pushed onto the stack. The second loop starts by popping the values saved on the stack and then executing the remaining statements that come after the original recursive call. This is typically much more difficult than the conversion process for tail recursion.
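As an illustration of the two-loop approach, the sketch below converts the earlier REVERSE function, which is not tail recursive, into an iterative Python function. The assumptions are the same as before: a string stands in for keyboard input, output is collected in a list, and `reverse_iterative` is a name chosen for this example.

```python
def reverse_iterative(text):
    """Loop-and-stack version of REVERSE; `text` stands in for keyboard input."""
    stack = []
    out = []
    chars = iter(text)

    # First loop: the statements that came before the recursive call.
    while True:
        character = next(chars)      # read CHARACTER
        if character == "*":
            break                    # base case reached
        stack.append(character)      # save the local variable, as a call would

    # Second loop: the statements that came after the recursive call.
    while stack:
        character = stack.pop()      # restore the most recently saved value
        out.append(character)        # print CHARACTER

    return "".join(out)


print(reverse_iterative("now*"))  # won
```

The explicit stack does the job that the activation stack did for us automatically in the recursive version.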
In this module, we explored the use of recursion to write concise solutions for a variety of problems. Recursion allows us to call a function from within itself, using either head recursion, tail recursion or tree recursion to solve smaller instances of the original problem.
Recursion requires a base case, which tells our function when to stop calling itself and start returning values, and a recursive case to handle reducing the problem’s size and calling the function again, sometimes multiple times.
We can use recursion in many different ways, and any problem that can be solved iteratively can also be solved recursively. The power in recursion comes from its simplicity in code—some problems are much easier to solve recursively than iteratively.
Unfortunately, in general a recursive solution requires more computation time and memory than an iterative solution. We can use techniques such as memoization to greatly improve the time it takes for a recursive function to execute, especially in cases such as calculating Fibonacci numbers, where subproblems overlap.
This page is the main page for Searching and Sorting
In this course, we are learning about many different ways we can store data in our programs, using arrays, queues, stacks, lists, maps, and more. We’ve already covered a few of these data structures, and we’ll learn about the others in upcoming modules. Before we get there, we should also look at a couple of the most important operations we can perform on those data structures.
Consider the classic example of a data structure containing information about students in a school. In the simplest case, we could use an array to store objects created from a Student
class, each one representing a single student in the school.
As we’ve seen before, we can easily create those objects and store them in our array. We can even iterate through the array to find the maximum age or minimum GPA of all the students in our school. However, what if we want to just find a single student’s information? To do that, we’ll need to discuss one of the most commonly used data structure operations: searching.
Searching typically involves finding a piece of information, or a value stored in a data structure or calculated within a specific domain. For example, we might want to find out if a specific word is found in an array of character strings. We might also want to find an integer that meets a specific criterion, such as finding an integer that is the sum of the two preceding integers. For this module, we will focus on finding values in data structures.
In general, we can search for
The data structure can be thought of more generally as a container, which can be
For the examples in this module, we’ll generally use a simple finite array as our container. However, it shouldn’t be too difficult to figure out how to expand these examples to work with a larger variety of data structures. In fact, as we introduce more complex data structures in this course, we’ll keep revisiting the concept of searching and see how it applies to the new structure.
In general, containers can be either ordered or unordered. In many cases, we may also use the term sorted to refer to an ordered container, but technically an ordered container just enforces an ordering on values, but they may not be in a sorted order. As long as we understand what the ordering is, we can use that to our advantage, as we’ll see later.
Searches in an unordered container generally require a linear search, where every value in the container must be compared against our search value. On the other hand, search algorithms on ordered containers can take advantage of this ordering to make their searches more efficient. A good example of this is binary search. Let’s begin by looking at the simplest case, linear search.
When searching for a number in an unordered array, our search algorithms are typically designed as functions that take two parameters: the array to search and the number to find.
Our search functions then return an index to the number within the array.
In this module, we will develop a couple of examples of searching an array for a specific number.
Finding the first occurrence of a number in an unordered array is a fairly straightforward process. A black box depiction of this function is shown below. There are two inputs–array
and number
–and a single output, the index
of the first occurrence of the number
in array
.
We can also include the search function as a method inside of the container itself. In that case, we don’t have to accept the container as a parameter, since the method will know to refer to the object it is part of.
Of course, when we begin designing an algorithm for our function we should think about two items immediately: the preconditions and the postconditions of the function. For this function, they are fairly simple.
The precondition for find
is that the number
provided as input is compatible with the type of data held by the provided array
. In this case, we have no real stipulations on array
. It does not need to actually have any data in it, nor does it have to be ordered or unordered.
Our postcondition is also straightforward. The function should return the index of number
if it exists in the array. However, if number
is not found in the array, then -1
is returned. Depending on how we wish to implement this function, it could also return another default index or throw an exception if the desired value is not found. However, most searching algorithms follow the convention of returning an invalid index of -1
when the value is not found in the array, so that’s what we’ll use in our examples.
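Preconditions: the number provided as input is compatible with the type of data held by the array.
Postconditions: the function returns the index of the first occurrence of the number in the array, or -1 if the number is not found.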
To search for a single number in our array, we will use a loop to search each location in the array until we find the number. The general idea is to iterate over all the elements in the array until we either find the number we are searching for or there are no other elements in the array.
function FIND(NUMBER, ARRAY) (1)
loop INDEX from 0 to size of ARRAY - 1 (2)
if ARRAY[INDEX] == NUMBER (3)
return INDEX (4)
end if (5)
end for (6)
return -1 (7)
end function (8)
As we can see in line 1, the function takes both a number
and array
parameter. We then enter a for
loop in line 2 to loop through each location in the array. We keep track of the current location in the array using the index
variable. For each location, we compare number
against the value in the array
at location index
. If we find the number, we simply return the value of index
in line 4. If we do not find the number anywhere in the array, the loop will exit, and the function will return -1
in line 7.
Below is an example of how to execute this algorithm on example data. Step 1 shows the initial state of the variables in the function the first time through the loop. Both array
and number
are passed to the function but we do not modify either of them in the function. The index
variable is the for
loop variable, which is initially set to 0
the first time through the loop. In line 3, the function compares the number in array[index]
against the value of number
. In this step, since index
is 0
, we use array[0]
, which is 8
. Since 8
is not equal to the value of number
, which is 3
, we do nothing in the if
statement and fall to the end for
statement in line 6. Of course, this just sends us back to the for
statement in line 2.
The second time through the for
loop is shown as Step 2 in the figure. We follow the same logic as above and compare array[1]
, or 4, against number
, which is still 3. Since these values are not equal, we skip the rest of the if
statement and move on to Step 3.
In Step 3, index
is incremented to 2
, thus pointing at array[2]
, whose value is 3
. Since this value is equal to the value of number
, we carry out the if
part of the statement. Line 4 returns the value of 2
, which is the first location in array
that holds the value of number
.
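The FIND pseudocode maps directly onto a short Python function, sketched here. The sample array is partly an assumption: the values at positions 0, 1, 2, and 5 follow the walkthroughs in this module, while the remaining values are made up for illustration.

```python
def find(number, array):
    """Return the index of the first occurrence of number, or -1 if absent."""
    for index in range(len(array)):
        if array[index] == number:
            return index
    return -1


# Positions 0, 1, 2, and 5 follow the walkthrough; the other values are made up.
array = [8, 4, 3, 7, 1, 3, 9, 6]
print(find(3, array))   # 2
print(find(5, array))   # -1
```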
Our find
algorithm above will find the first instance of number
in the array
and return the index of that instance. However, we might also be interested in finding the last instance of number
in array
. Looking at our original find
algorithm, it should be easy to find the last value by simply searching the array in reverse order, as shown in the following figure.
We will use the same example as above, except we will start searching backwards from the end of the array. In Step 1, we see that index
is initialized to 7 and we compare array[7]
against number
, which are not the same. Thus, we continue to Step 2, where we decrement index
to 6. Here array[6]
is still not equal to number
, so we continue in the loop. Finally, in Step 3, we decrement index
to 5. Now array[5]
contains the number 3
, which is equal to our number
and we return the current index
value.
Luckily for us, we can change our for
loop index to decrement from the end of the array (size of array - 1
) to the beginning (0
). Thus, by simply changing line 2 in our original function, we can create a new function that searches for the last instance of number
in array
. The new function is shown below.
function REVERSEFIND(NUMBER, ARRAY) (1)
loop INDEX from size of ARRAY - 1 to 0 step -1 (2)
if ARRAY[INDEX] == NUMBER (3)
return INDEX (4)
end if (5)
end for (6)
return -1 (7)
end function (8)
Obviously, the for
loop in line 2 holds the key to searching our array in reverse order. We start at the end of the array by using the index size of array - 1
and then decrement the value of index
(via the step -1
qualifier) each time through the loop until we reach 0. The remainder of the function works exactly like the find
function.
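A minimal Python sketch of the reverse search follows. The three-argument form of `range` plays the role of the `step -1` qualifier. As before, the value 12 at index 6 is a placeholder assumption that the text does not state.

```python
def reverse_find(number, array):
    """Return the index of the last occurrence of number, or -1 if absent."""
    # range(start, stop, step) stops before reaching stop, so -1 lets us visit 0.
    for index in range(len(array) - 1, -1, -1):
        if array[index] == number:
            return index
    return -1


# Index 6 (the value 12) is a placeholder; the text does not state it.
data = [8, 4, 3, 55, -3, 3, 12, 0]

last = reverse_find(3, data)  # checks indexes 7, 6, then 5: returns 5
```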
We looked at an iterative version of the find
function above. But what would it take to turn that function into a recursive function? While for this particular function, there is not a lot to be gained from the recursive version, it is still instructive to see how we would do it. We will find recursive functions more useful later on in the module.
In this case, to implement a recursive version of the function, we need to add a third parameter, index
, to tell us where to check in the array. We assume that at the beginning of a search, index
begins at 0. Then, if number
is not in location index
in the array
, index
will be incremented before making another recursive call. Of course, if number
is in location index
, we will return the value of index
. The pseudocode for the findR
function is shown below.
function FINDR (NUMBER, ARRAY, INDEX) (1)
if INDEX >= size of ARRAY then (2)
return -1 (3)
else if ARRAY[INDEX] == NUMBER (4)
return INDEX (5)
else (6)
return FINDR (NUMBER, ARRAY, INDEX + 1) (7)
end if (8)
end function (9)
First, we check to see if index
has moved beyond the bounds of the array, which would indicate that we have searched all the locations in array
for number
. If that is the case, then we return -1
in line 3 indicating that we did not find number
in array
. Next, we check to see if number
is found in array[index]
in line 4. If it is, then we are successful and return the index. However, if we are not finished searching and we have not found number
, then we recursively call findR
and increment index
by 1 to search the next location.
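The recursive version might look like this in Python. This is a sketch under a couple of assumptions: the default value `index=0` is a convenience added here so callers can omit the third argument, and the value 12 at index 6 of the sample array is a placeholder. Python also limits recursion depth (about 1000 calls by default), so this form only suits small arrays.

```python
def find_r(number, array, index=0):
    """Recursively search array for number, starting at index."""
    if index >= len(array):
        # We have moved past the end of the array without finding number.
        return -1
    elif array[index] == number:
        return index
    else:
        # Not found here, so search the next location.
        return find_r(number, array, index + 1)


# Index 6 (the value 12) is a placeholder; the text does not state it.
data = [8, 4, 3, 55, -3, 3, 12, 0]

result = find_r(3, data)  # three instances of the function run: returns 2
```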
An example of using the findR
function is shown below. The top half of the figure shows the state of the data in the initial call to the findR
function (instance 1). The bottom half of the figure shows the recursive path through the function. The beginning of instance 1 shows the if
statement in line 2. In instance 1, since we have not searched the entire array (line 2) and array[0]
is not equal to number
(line 4), we fall down to the else
part of the if statement and execute line 7, the recursive call. Since index
is 0
in instance 1, we call instance 2 of the function with an index
of 1.
In instance 2, the same thing happens as in instance 1 and we fall down to the else
part of the if
statement. Once again, we call a new instance of findR
, this time with index
set at 2. Now, in instance 3, array[index]
is equal to number
in line 4 and so we execute the return index
statement in line 5. The value of index
(2) is returned to instance 2, which, in line 7, simply returns the value of 2 to instance 1. Again, in line 7, instance 1 returns that same value (2) to the original calling function.
Notice that the actual process of searching the array is the same for both the iterative and recursive functions. It is only the implementation of that process that is different between the two.
We may also want to search through a data structure to find an item with a specific property. For example, we could search for the student with the maximum age, or the minimum GPA. For this example, let’s consider the case where we’d like to find the minimum value in an array of integers.
Searching for the minimum number in an unordered array is a different problem than searching for a specific number. First of all, we do not know what number we are searching for. And, since the array is not ordered, we will have to check each and every number in the array.
The input parameters of our new function will be different from the find
function above, since we do not have a number to search for. In this case, we only have an array of numbers as an input parameter. The output parameter, however, is the same. We still want to return the index of the minimum number in the array. In this case, we will return -1
if there is no minimum number, which can only happen if there is no data in the array when we begin.
Preconditions:
Postconditions:
Our preconditions and postconditions are also simple. Our precondition is simply that we have an array whose data can be sorted. This is important, because it means that we can compare two elements in the array and determine which one has a smaller value. Otherwise, we couldn’t determine the minimum value at all!
Our postcondition is that we return the index of the minimum number in the array, or -1
if the array is empty.
The function findMin
is shown below. First, we check to see if the array is empty. If it is, we simply return -1
in line 3. If not, we assume the location 0
contains the minimum number in the array, and set min
equal to 0 in line 5. Then we loop through each location in the array (starting with 1) in line 6 and, in lines 7 through 9, update min
to the current index
whenever the data at that index is smaller than the data at min
. (Note: if the array only has a single number in it, the for loop will not actually execute since index
will be initialized to 1, which is already larger than the size of the array – 1
, which is 0.) Once we complete the loop, we will be guaranteed that we have the index of the minimum number in the array.
function FINDMIN(ARRAY) (1)
if ARRAY is empty then (2)
return -1 (3)
end if (4)
MIN = 0 (5)
loop INDEX from 1 to size of ARRAY - 1 (6)
if ARRAY[INDEX] < ARRAY[MIN] (7)
MIN = INDEX (8)
end if (9)
end for (10)
return MIN (11)
end function (12)
Next, we will walk through the algorithm using our example array in the figure below. Step 1 shows the initial time through the loop. In line 5, min
is set to 0
by default and in line 6, index
is set equal to 1
. Line 7 then computes whether array[1] < array[0]
. In this case, it is and we set min = 1
(which is reflected in the next step where min
has the value 1
).
Step 2 will end up comparing array[2] < array[1]
, since min
is now 1 and index
has been incremented to 2 via the for
loop. In this case, array[2]
is less than array[1]
so we update min
again, this time to 2.
Step 3 follows the same process; however, this time the value in array[3]
is 55, which is greater than the current minimum of 3 in array[2]
. Therefore, min
is not updated. Step 4 finds the minimum value in the array of -3
at index 4 and so updates min
to 4. However, steps 5, 6, and 7 do not find new minimum values. Thus, when the loop exits after Step 7, min
is set to 4 and this value is returned to the calling program in line 11.
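The same logic can be sketched in Python. The name `min_index` is used here instead of `min` only to avoid shadowing Python's built-in `min` function; that naming choice, and the placeholder value 12 at index 6, are assumptions of this sketch.

```python
def find_min(array):
    """Return the index of the minimum value in array, or -1 if it is empty."""
    if len(array) == 0:
        return -1
    min_index = 0
    for index in range(1, len(array)):
        if array[index] < array[min_index]:
            min_index = index
    return min_index


# Index 6 (the value 12) is a placeholder; the text does not state it.
data = [8, 4, 3, 55, -3, 3, 12, 0]

lowest = find_min(data)  # min_index moves 0 -> 1 -> 2 -> 4: returns 4
```

For a single-element array, `range(1, 1)` is empty, so the loop body never runs and the function returns 0.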
We’ve examined many different versions of a linear search algorithm. We can find the first occurrence of a number in an array, the last occurrence of that number, or a value with a particular property, such as the minimum value. Each of these is an example of a linear search, since we look at each element in the container sequentially until we find what we are looking for.
So, what would be the time complexity of this process? To understand that, we must consider what the worst-case input would be. For this discussion, we’ll just look at the find
function, but the results are similar for many other forms of linear search. The pseudocode for find
is included below for reference.
function FIND(NUMBER, ARRAY) (1)
loop INDEX from 0 to size of ARRAY - 1 (2)
if ARRAY[INDEX] == NUMBER (3)
return INDEX (4)
end if (5)
end for (6)
return -1 (7)
end function (8)
How would we determine what the worst-case input for this function would be? In this case, we want to come up with the input that would require the most steps to find the answer, regardless of the size of the container. Obviously, it would take more steps to find a value in a larger container, but that doesn’t really tell us what the worst-case input would be.
In general, the time complexity for a linear search algorithm is proportional to the number of items that we need to search through, in this case the size of our array. Whether we use an iterative algorithm or a recursive algorithm, we still need to search the array one item at a time. We’ll refer to the size of the array as $N$.
Here’s the key: when searching for minimum or maximum values, the search will always examine all $N$ values, since we have to check each one. However, if we are searching for a specific value, the actual number of comparisons required may be fewer than $N$.
To build a worst-case input for the find
function, we would search for the situation where the value to find is either the last value in the array, or it is not present at all. For example, consider the array we’ve been using to explore each linear search algorithm so far.
What if we are trying to find the value 55 in this array? In that case, we’ll end up looking at 4 of the 8 elements in the array. This would take $N/2$ steps. Can we think of another input that would be worse?
Consider the case where we try to find 0 instead. Will that be worse? In that case, we’ll need to look at all 8 elements in the array before we find it. That requires $N$ steps!
What if we are asked to find 1 in the array? Since 1 is not in the array, we’ll have to look at every single element before we know that it can’t be found. Once again, that requires $N$ steps.
We could say that in the worst-case, a linear search algorithm requires “on the order of $N$” time to find an answer. Put another way, if we double the size of the array, we would also need to double the expected number of steps required to find an item in the worst case. We sometimes call this linear time, since the number of steps required grows at the same rate as the size of the input.
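To make the counts above concrete, here is a small Python sketch that tallies comparisons. The function `find_with_count` is a hypothetical helper written for this illustration, and the value 12 at index 6 is a placeholder assumption.

```python
def find_with_count(number, array):
    """Linear search that also reports how many elements were compared."""
    comparisons = 0
    for index in range(len(array)):
        comparisons += 1
        if array[index] == number:
            return index, comparisons
    return -1, comparisons


# Index 6 (the value 12) is a placeholder; the text does not state it.
data = [8, 4, 3, 55, -3, 3, 12, 0]

_, steps_for_55 = find_with_count(55, data)  # 4 comparisons, about N/2
_, steps_for_0 = find_with_count(0, data)    # 8 comparisons, the last element
_, steps_for_1 = find_with_count(1, data)    # 8 comparisons, value not present
```

Doubling the length of the array would double the last two counts, which is what "linear time" describes.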
Our question now becomes, “Is a search that takes on the order of $N$ time really all that bad?” Actually, it depends. If $N$ is a small number (less than 1000 or so), it may not be a big deal, especially if we only do a single search. However, what if we need to do many searches? Is there something we can do to make the process of searching for elements even easier?
^[File:FileStack retouched.jpg. (2019, January 17). Wikimedia Commons, the free media repository. Retrieved 22:12, March 23, 2020 from https://commons.wikimedia.org/w/index.php?title=File:FileStack_retouched.jpg&oldid=335159723.]
Let’s consider the real world once again for some insights. For example, think of a pile of loose papers on the floor. If we wanted to find a specific paper, how would we do it?
In most cases, we would simply have to perform a linear search, picking up each paper one at a time and seeing if it is the one we need. This is pretty inefficient, especially if the pile of papers is large.
^[File:Istituto agronomico per l’oltremare, int., biblioteca, schedario 05.JPG. (2016, May 1). Wikimedia Commons, the free media repository. Retrieved 22:11, March 23, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Istituto_agronomico_per_l%27oltremare,_int.,_biblioteca,_schedario_05.JPG&oldid=194959053.]
What if we stored the papers in a filing cabinet and organized them somehow? For example, could we sort the papers by title in alphabetical order? Then, when we want to find a particular paper, we can just skip to the section that contains files with the desired first letter and go from there. In fact, we could even do this for the second and third letter, continuing to jump forward in the filing cabinet until we found the paper we need.
This seems like a much more efficient way to go about searching for things. In fact, we do this naturally without even realizing it. Most computers have a way to sort files alphabetically when viewing the file system, and anyone who has a collection of items has probably spent time organizing and alphabetizing the collection to make it easier to find specific items.
Therefore, if we can come up with a way to organize the elements in our array, we may be able to make the process of finding a particular item much more efficient. In the next section, we’ll look at how we can use various sorting algorithms to do just that.
Sorting is the process of arranging the values in an ordered container so that their order follows a rule we understand, such as smallest to largest. Recall that an ordered container just enforces an ordering between values, but that ordering may appear to be random. By sorting an ordered container, we can enforce a specific ordering on the elements in the container, allowing us to more quickly find specific elements as we’ll see later in this chapter.
In most cases, we sort values in either ascending or descending order. Ascending order means that the smallest value will be first, and then each value will get progressively larger until the largest value, which is at the end of the container. Descending order is the opposite—the largest value will be first, and then values will get progressively smaller until the smallest value is last.
We can also define this mathematically. Assume that we have a container called array
and two indexes in that container, a
and b
. If the container is sorted in ascending order, we would say that if a
is less than b
(that is, the element at index a
comes before the element at index b
), then the element at index a
is less than or equal to the element at index b
. More succinctly:
$$ a < b \implies \text{array}[a] \leq \text{array}[b] $$
Likewise, if the container is sorted in descending order, we would know that if a
is less than b
, then the element at index a
would be greater than or equal to the element at index b
. Or:
$$ a < b \implies \text{array}[a] \geq \text{array}[b] $$
These facts will be important later when we discuss the precondition, postconditions, and loop invariants of algorithms in this section.
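These two definitions can be checked directly in code. The sketch below uses hypothetical helper names, `is_ascending` and `is_descending`. It only compares adjacent elements, which is enough because if every neighboring pair is in order, every pair of indexes `a < b` is in order as well.

```python
def is_ascending(array):
    """True if array[a] <= array[b] whenever a < b."""
    return all(array[a] <= array[a + 1] for a in range(len(array) - 1))


def is_descending(array):
    """True if array[a] >= array[b] whenever a < b."""
    return all(array[a] >= array[a + 1] for a in range(len(array) - 1))


print(is_ascending([1, 2, 2, 5]))   # True: equal neighbors are allowed
print(is_ascending([1, 3, 2]))      # False: 3 comes before 2
print(is_descending([9, 4, 4, 0]))  # True
```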
To sort a collection of data, we can use one of many sorting algorithms to perform that action. While there are many different algorithms out there for sorting, there are a few commonly used algorithms for this process, each one with its own pros, cons, and time complexity. These algorithms are studied extensively by programmers, and nearly every programmer learns how to write and use these algorithms as part of their learning process. In this module, we’ll introduce you to the 4 most commonly used sorting algorithms:
The first sorting algorithm we’ll learn about is selection sort. The basic idea behind selection sort is to search for the minimum value in the whole container, and place it in the first index. Then, repeat the process for the second smallest value and the second index, and so on until the container is sorted.
Wikipedia includes a great animation that shows this process:
^[File:Selection-Sort-Animation.gif. (2016, February 12). Wikimedia Commons, the free media repository. Retrieved 22:22, March 23, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Selection-Sort-Animation.gif&oldid=187411773.]
In this animation, the element highlighted in blue is the element currently being considered. The red element shows the value that is the minimum value considered, and the yellow elements are the sorted portion of the list.
Let’s look at a few steps in this process to see how it works. First, the algorithm will search through the array to find the minimum value. It will start by looking at index 0 as shown in the figure below.
Once it reaches the end of the array, it will find that the smallest value 0 is at index 8.
Then, it will swap the minimum item with the item at index 0 of the array, placing the smallest item first. That item will now be part of the sorted array, so we’ll shade it in since we don’t want to move it again.
Next, it will reset index to 1, and start searching for the next smallest element in the array. Notice that this time it will not look at the element at index 0, which is part of the sorted array. Each time the algorithm resets, it will start looking at the element directly after the sorted portion of the array.
Once again, it will search through the array to find the smallest value, which will be the value 1 at index 6.
Then, it will swap the element at index 1 with the minimum element, this time at index 6. Just like before, we’ll shade in the first element since it is now part of the sorted list, and reset the index to begin at index 2.
This process will repeat until the entire array is sorted in ascending order.
To describe our selection sort algorithm, we can start with these basic preconditions and postconditions.
Preconditions:
Postconditions:
We can then represent this algorithm using the following pseudocode.
function SELECTIONSORT(ARRAY) (1)
loop INDEX from 0 to size of ARRAY - 2 (2)
MININDEX = INDEX (3)
# find minimum index
loop INDEX2 from INDEX to size of ARRAY - 1 (4)
if ARRAY[INDEX2] < ARRAY[MININDEX] then (5)
MININDEX = INDEX2 (6)
end if (7)
end loop (8)
# swap elements
TEMP = ARRAY[MININDEX] (9)
ARRAY[MININDEX] = ARRAY[INDEX] (10)
ARRAY[INDEX] = TEMP (11)
end loop (12)
end function (13)
In this code, we begin by looping through every element in the array except the last one, as seen on line 2. We don’t include this one because if the rest of the array is sorted properly, then the last element must be the maximum value.
Lines 3 through 8 are basically the same as what we saw in our findMin
function earlier. It will find the index of the minimum value starting at INDEX
through the end of the array. Notice that we are starting at INDEX
instead of the beginning. As the outer loop moves through the array, the inner loop will consider fewer and fewer elements. This is because the front of the array contains our sorted elements, and we don’t want to change them once they are in place.
Lines 9 through 11 will then swap the elements at INDEX
and MININDEX
, putting the smallest element left in the array at the position pointed to by index.
We can describe the invariant of our outer loop as follows:
The portion of the array before index
is sorted in ascending order.
The array still contains the same elements as it did originally; only their positions may have changed.
The second part of the loop invariant is very important. Without that distinction, we could simply place new values into the array before index
and satisfy the first part of the invariant. It is always important to specify that the array itself still contains the same elements as before.
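Here is a minimal Python sketch of selection sort as described above. The minimum index starts at the current outer index and is updated to the inner loop's index whenever a smaller value is found. The `assert` inside the loop is an addition for illustration that checks the sorted-prefix part of the invariant after each swap. The sample values at indexes 7 and 9 (4 and 7) are placeholder assumptions, since the text does not state them.

```python
def selection_sort(array):
    """Sort array in place in ascending order and return it."""
    for index in range(len(array) - 1):
        # Find the index of the smallest value from index to the end.
        min_index = index
        for index2 in range(index, len(array)):
            if array[index2] < array[min_index]:
                min_index = index2
        # Swap the smallest remaining value into position index.
        array[index], array[min_index] = array[min_index], array[index]
        # Illustration only: the front of the array is now sorted.
        assert array[:index + 1] == sorted(array[:index + 1])
    return array


# Indexes 7 and 9 (the values 4 and 7) are placeholders not stated in the text.
data = [8, 5, 2, 6, 9, 3, 1, 4, 0, 7]
original = list(data)
selection_sort(data)
```

Comparing `sorted(original)` with `data` afterwards is one way to confirm the second part of the invariant, that no elements were added or lost.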
Let’s look at the time complexity of the selection sort algorithm, just so we can get a feel for how much time this operation takes.
First, we must determine if there is a worst-case input for selection sort. Can we think of any particular input which would require more steps to complete?
In this case, each iteration of selection sort will look at the same number of elements, no matter what they are. So there isn’t a particular input that would be considered worst-case. We can proceed with just the general case.
In each iteration of the algorithm we need to search for the minimum value of the remaining elements in the container. If the container has $N$ elements, we would follow the steps below.
This process continues until we have sorted all of the elements in the array. The number of steps will be:
$$ N + (N - 1) + (N - 2) + \dots + 2 + 1 $$
While it takes a bit of math to figure out exactly what that means, we can use some intuition to determine an approximate value. For example we could pair up the values like this:
$$ N + [(N - 1) + 1] + [(N - 2) + 2] + \dots $$
When we do that, we’ll see that we can create around $N / 2$ pairs, each one with the value of $N$. So a rough approximation of this value is $N * (N / 2)$, which is $N^2 / 2$. When analyzing time complexity, we would say that this is “on the order of $N^2$” time. Put another way, if the size of $N$ doubles, we would expect the number of steps to go up by a factor of $4$, since $(2 * N)^2 = 4N^2$.
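For reference, the pairing intuition agrees with the standard closed form for the sum of the first $N$ positive integers. This identity is a well-known result added here as a check, not something derived in this chapter:

$$ N + (N - 1) + \dots + 2 + 1 = \frac{N(N + 1)}{2} = \frac{N^2}{2} + \frac{N}{2} $$

For example, with $N = 10$ the exact sum is $55$, while the rough approximation $N^2 / 2$ gives $50$. Doubling to $N = 20$ gives $210$ steps, which is a little under four times as many, and the ratio gets closer to $4$ as $N$ grows.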
Later on, we’ll come back to this and compare the time complexity of each sorting algorithm and searching algorithm to see how they stack up against each other.
Next, let’s look at another sorting algorithm, bubble sort. The basic idea behind bubble sort is to continuously iterate through the array and swap adjacent elements that are out of order. As a side effect of this process, the largest element in the array will be “bubbled” to the end of the array after the first iteration. Subsequent iterations will do the same for each of the next largest elements, until eventually the entire list is sorted.
Wikipedia includes a great animation that shows this process:
^[File:Bubble-sort-example-300px.gif. (2019, June 12). Wikimedia Commons, the free media repository. Retrieved 22:36, March 23, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Bubble-sort-example-300px.gif&oldid=354097364.]
In this animation, the two red boxes slowly move through the array, comparing adjacent elements. If the elements are not in the correct order (that is, the first element is larger than the second element), then it will swap them. Once it reaches the end, the largest element, 8, will be placed at the end and locked in place.
Let’s walk through a few steps of this process and see how it works. We’ll use the array we used previously for selection sort, just to keep things simple. At first, the array will look like the diagram below.
We’ll begin with the index
variable pointing at index 0. Our algorithm should compare the values at index 0 and index 1 and see if they need to be swapped. We’ll put a bold border around the elements we are currently comparing in the figure below.
Since the element at index 0 is 8, and the element at index 1 is 5, we know that they must be swapped since 8 is greater than 5. We need to swap those two elements in the array, as shown below.
Once those two elements have been swapped, the index variable will be incremented by 1, and we’ll look at the elements at indexes 1 and 2 next.
Since 8 is greater than 2, we’ll swap these two elements before incrementing index to 2 and comparing the next two elements.
Again, we’ll find that 8 is greater than 6, so we’ll swap these two elements and move on to index 3.
Now we are looking at the element at index 3, which is 8, and the element at index 4, which is 9. In this case, 8 is less than 9, so we don’t need to swap anything. We’ll just increment index by 1 and look at the elements at indexes 4 and 5.
As we’ve done before, we’ll find that 9 is greater than 3, so we’ll need to swap those two items. In fact, as we continue to move through the array, we’ll find that 9 is the largest item in the entire array, so we’ll end up swapping it with every element down to the end of the array. At that point, it will be in its final position, so we’ll lock it and restart the process again.
After making a second pass through the array, swapping elements that must be swapped as we find them, we’ll eventually get to the end and find that 8 should be placed at index 8 since it is the next largest value in the array.
We can then continue this process until we have locked each element in place at the end of the array.
To describe our bubble sort algorithm, we can start with these basic preconditions and postconditions.
Preconditions:
Postconditions:
We can then represent this algorithm using the following pseudocode.
function BUBBLESORT(ARRAY) (1)
# loop through the array multiple times
loop INDEX from 0 to size of ARRAY - 1 (2)
# consider every pair of elements except the sorted ones
loop INDEX2 from 0 to size of ARRAY - 2 - INDEX (3)
if ARRAY[INDEX2] > ARRAY[INDEX2 + 1] then (4)
# swap elements if they are out of order
TEMP = ARRAY[INDEX2] (5)
ARRAY[INDEX2] = ARRAY[INDEX2 + 1] (6)
ARRAY[INDEX2 + 1] = TEMP (7)
end if
end loop
end loop
end function
In this code, we begin by looping through every element in the array, as seen on line 2. Each time we run this outer loop, we’ll lock one additional element in place at the end of the array. Therefore, we need to run it once for each element in the array.
On line 3, we’ll start at the beginning of the array and loop to the place where the sorted portion of the array begins. We know that after each iteration of the outer loop, the value index
will represent the number of locked elements at the end of the array. We can subtract that value from the end of the array to find where we want to stop.
Line 4 is a comparison between two adjacent elements in the array starting at the index index2
. If they are out of order, we use lines 5 through 7 to swap them. That’s really all it takes to do a bubble sort!
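A minimal Python sketch of the same algorithm follows. The inner `range(n - 1 - index)` covers indexes 0 through `n - 2 - index`, matching the bounds described above. The sample values at indexes 7 and 9 (4 and 7) are placeholder assumptions, since the text does not state them.

```python
def bubble_sort(array):
    """Sort array in place in ascending order and return it."""
    n = len(array)
    for index in range(n):
        # After each outer pass, one more element is locked at the end,
        # so the inner loop stops index positions earlier.
        for index2 in range(n - 1 - index):
            if array[index2] > array[index2 + 1]:
                # Swap adjacent elements that are out of order.
                array[index2], array[index2 + 1] = array[index2 + 1], array[index2]
    return array


# Indexes 7 and 9 (the values 4 and 7) are placeholders not stated in the text.
data = [8, 5, 2, 6, 9, 3, 1, 4, 0, 7]
bubble_sort(data)
```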
Looking at this code, we can describe the invariant of our outer loop as follows:
The last index
elements in the array are in sorted order, and
the array still contains the same elements as it did originally.
Notice how this differs from selection sort, which places the sorted elements at the beginning of the array instead of the end. However, the result is the same, and by the end of the program we can show that each algorithm has fully sorted the array.
Once again, let’s look at the time complexity of the bubble sort algorithm and see how it compares to selection sort.
Bubble sort is a bit trickier to analyze than selection sort, because there are really two parts to the algorithm:
Let’s look at each one individually. First, is there a way to reduce the number of comparisons made by this algorithm just by changing the input? As it turns out, there isn’t anything we can do to change that based on how it is written. The number of comparisons only depends on the size of the array. In fact, the analysis is exactly the same as selection sort, since each iteration of the outer loop does one fewer comparison. Therefore, we can say that bubble sort has time complexity on the order of $N^2$ time when it comes to comparisons.
What about swaps? This is where it gets a bit tricky. What would be the worst-case input for the bubble sort algorithm, which would result in the largest number of swaps made?
Consider a case where the input is sorted in descending order. The largest element will be first, and the smallest element will be last. If we want the result to be sorted in ascending order, we would end up making $N - 1$ swaps to get the first element to the end of the array, then $N - 2$ swaps for the second element, and so on. So, once again we end up with the same series as before:
$$ (N - 1) + (N - 2) + \dots + 2 + 1. $$
In the worst-case, we’ll also end up doing on the order of $N^2$ swaps, so bubble sort has a time complexity on the order of $N^2$ time when it comes to swaps as well.
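The difference between comparisons and swaps can be observed with a small experiment. The function `bubble_sort_counts` is a hypothetical helper written for this illustration. It sorts a copy of its input and reports both counts.

```python
def bubble_sort_counts(array):
    """Bubble sort a copy of array; return (comparisons, swaps)."""
    items = list(array)
    comparisons = 0
    swaps = 0
    n = len(items)
    for index in range(n):
        for index2 in range(n - 1 - index):
            comparisons += 1
            if items[index2] > items[index2 + 1]:
                items[index2], items[index2 + 1] = items[index2 + 1], items[index2]
                swaps += 1
    return comparisons, swaps


n = 10
descending = list(range(n, 0, -1))  # 10, 9, ..., 1: the worst case for swaps
ascending = list(range(1, n + 1))   # 1, 2, ..., 10: already sorted

worst = bubble_sort_counts(descending)  # (45, 45)
best = bubble_sort_counts(ascending)    # (45, 0)
```

With $N = 10$, both inputs need $9 + 8 + \dots + 1 = 45$ comparisons, but only the descending input needs 45 swaps. The sorted input needs none.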
It seems that both bubble sort and selection sort are in the same order of time complexity, meaning that the number of steps each one takes grows at roughly the same rate as the array gets larger. Does that tell us anything about the process of sorting an array?
Here’s one way to think about it: what if we decided to compare each element in an array to every other element? How many comparisons would that take? We can use our intuition to know that each element in an array of $N$ elements would require $N - 1$ comparisons, so the total number of comparisons would be $N * (N - 1)$, which is very close to $N^2$.
Of course, once we’ve compared each element to every other element, we’d know exactly where to place them in a sorted array. One possible conclusion we could make is that there isn’t any way to sort an array that runs much faster than an algorithm that runs in the order of $N^2$ time.
Thankfully, that conclusion is incorrect! There are several other sorting algorithms we can use that allow us to sort an array much more quickly than $N^2$ time. Let’s take a look at those algorithms and see how they work!
Another commonly used sorting algorithm is merge sort. Merge sort uses a recursive, divide and conquer approach to sorting, which makes it very powerful. It was actually developed to handle sorting data sets that were so large that they couldn’t fit on a single memory device, way back in the early days of computing.
The basic idea of the merge sort algorithm is as follows:
Once again, Wikipedia has a great animation showing this process:
^[File:Merge-sort-example-300px.gif. (2020, February 22). Wikimedia Commons, the free media repository. Retrieved 00:06, March 24, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Merge-sort-example-300px.gif&oldid=397192885.]
Let’s walk through a simple example and see how it works. First, we’ll start with the same initial array as before, shown in the figure below. To help us keep track, we’ll refer to this function call using the array indexes it covers. It will be mergeSort(0, 9)
.
Since this array contains more than 2 elements, we won’t be able to sort it quickly. Instead, we’ll divide it in half, and sort each half using merge sort again. Let’s continue the process with the first half of the array. We’ll use a thick outline to show the current portion of the array we are sorting, but we’ll retain the original array indexes to help keep track of everything.
Now we are in the first recursive call, mergeSort(0, 4)
, which is looking at the first half of the original array. Once again, we have more than 2 elements, so we’ll split it in half and recursively call mergeSort(0, 1)
first.
At this point, we now have an array with just 2 elements. We can use one of our base cases to sort that array by swapping the two elements, if needed. In this case, we should swap them, so we’ll get the result shown below.
Now that the first half of the smaller array has been sorted, our recursive call mergeSort(0, 1)
will return and we’ll look at the second half of the smaller array in the second recursive call, mergeSort(2, 4)
, as highlighted below.
As we’ve seen before, this array has more than 2 elements, so we’ll need to divide it in half and recursively call the function again. First, we’ll call mergeSort(2, 2)
.
In this case, the current array we are considering contains a single element, so it is already sorted. Therefore, the recursive call to mergeSort(2, 2)
will return quickly, and we’ll consider the second part of the smaller array in mergeSort(3, 4)
, highlighted below.
Here, we have 2 elements, and this time they are already sorted. So, we don’t need to do anything, and our recursive call to mergeSort(3, 4)
will return. At this point, we will be back in our call to mergeSort(2, 4)
, and both halves of that array have been sorted. We’re back to looking at the highlighted elements below.
Now we have to merge these two arrays together. Thankfully, since they are sorted, we can follow this simple process:
Let’s take a look at what that process would look like. First, we’ll create a new temporary array to store the result.
Next, we will look at the first element in each of the two sorted halves of the original array. In this case, we’ll compare 2 and 6, which are highlighted below.
Now we should pick the smaller of those two values, which will be 2. That value will be placed in the new temporary array at the very beginning.
Next, we should look at the remaining halves of the array. Since the first half is empty, we can just place the remaining elements from the second half into the temporary array.
Finally, we should replace the portion of the original array that we are looking at in this recursive call with the temporary array. In most cases, we’ll just copy these elements into the correct places in the original array. In the diagram, we’ll just replace them.
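The merge step on its own can be sketched in Python as follows. The function `merge` here is a hypothetical standalone helper that takes the two sorted halves as separate lists and returns the temporary array. A full merge sort would typically work on index ranges of one array and copy the result back, as described above.

```python
def merge(left, right):
    """Merge two lists that are already sorted into one sorted list."""
    result = []  # the temporary array
    i = 0        # next unused position in left
    j = 0        # next unused position in right
    while i < len(left) and j < len(right):
        # Take the smaller of the two front elements.
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One half is now empty, so copy whatever remains of the other.
    result.extend(left[i:])
    result.extend(right[j:])
    return result


merged = merge([2], [6, 9])  # the two halves from the example: [2, 6, 9]
```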
There we go! We’ve now completed the recursive call mergeSort(2, 4). We can return from that recursive call and go back to mergeSort(0, 4).
Since both halves of the array in mergeSort(0, 4) are sorted, we must do the merge process again. We’ll start with a new temporary array and compare the first element in each half.
At this point, we’ll see that 2 is the smaller of those elements, so we’ll place it in the first slot in the temporary array and consider the next element in the second half.
Next, we’ll compare the values 5 and 6, and see that 5 is smaller. It should be placed in the next available element in our temporary array and we should continue onward.
We’ll repeat this process again, placing the 6 in the temporary array, then the 8, then finally the 9. After completing the merge process, we’ll have the following temporary array.
Finally, we’ll replace the original elements with the now merged elements in the temporary array.
There we go! We’ve now completed the process in the mergeSort(0, 4) recursive call. Once that returns, we’ll be back in our original call to mergeSort(0, 9). In that function, we’ll recursively call the process again on the second half of the array using mergeSort(5, 9).
Hopefully by now we understand that it will work just like we intended, so by the time that recursive call returns, we’ll now have the second half of the array sorted as well.
The last step in the original mergeSort(0, 9) function call is to merge these two halves together. So, once again, we’ll follow that same process as before, creating a new temporary array and moving through the elements in each half, placing the smaller of the two in the new array. Once we are done, we’ll end up with a temporary array that has been populated as shown below.
Finally, we’ll replace the elements in the original array with the ones in the temporary array, resulting in a completely sorted result.
Now that we’ve seen how merge sort works by going through an example, let’s look at the pseudocode of a merge sort function.
function MERGESORT(ARRAY, START, END) (1)
    # base case size == 1
    if END - START + 1 == 1 then (2)
        return (3)
    end if (4)
    # base case size == 2
    if END - START + 1 == 2 then (5)
        # check if elements are out of order
        if ARRAY[START] > ARRAY[END] then (6)
            # swap if so
            TEMP = ARRAY[START] (7)
            ARRAY[START] = ARRAY[END] (8)
            ARRAY[END] = TEMP (9)
        end if (10)
        return (11)
    end if (12)
    # find midpoint
    HALF = int((START + END) / 2) (13)
    # sort first half
    MERGESORT(ARRAY, START, HALF) (14)
    # sort second half
    MERGESORT(ARRAY, HALF + 1, END) (15)
    # merge halves
    MERGE(ARRAY, START, HALF, END) (16)
end function (17)
This function is a recursive function which has two base cases. The first base case is shown in lines 2 through 4, where the size of the array is exactly 1. In that case, the array is already sorted, so we just return on line 3 without doing anything.
The other base case is shown in lines 5 through 12. In this case, the array contains just two elements. We can use the if statement on line 6 to check if those two elements are in the correct order. If not, we can use lines 7 through 9 to swap them, before returning on line 11.
If neither of the base cases occurs, then we reach the recursive case starting on line 13. First, we’ll need to determine the midpoint of the array, which is just the average of the start and end variables. We’ll need to remember to make sure that value is an integer by truncating it if needed.
Then, on lines 14 and 15 we make two recursive calls, each one focusing on a different half of the array. Once each of those calls returns, we can assume that each half of the array is now sorted.
Finally, in line 16 we call a helper function known as merge to merge the two halves together. The pseudocode for that process is below.
function MERGE(ARRAY, START, HALF, END) (1)
    TEMPARRAY = new array[END - START + 1] (2)
    INDEX1 = START (3)
    INDEX2 = HALF + 1 (4)
    NEWINDEX = 0 (5)
    loop while INDEX1 <= HALF and INDEX2 <= END (6)
        if ARRAY[INDEX1] < ARRAY[INDEX2] then (7)
            TEMPARRAY[NEWINDEX] = ARRAY[INDEX1] (8)
            INDEX1 = INDEX1 + 1 (9)
        else (10)
            TEMPARRAY[NEWINDEX] = ARRAY[INDEX2] (11)
            INDEX2 = INDEX2 + 1 (12)
        end if (13)
        NEWINDEX = NEWINDEX + 1 (14)
    end loop (15)
    loop while INDEX1 <= HALF (16)
        TEMPARRAY[NEWINDEX] = ARRAY[INDEX1] (17)
        INDEX1 = INDEX1 + 1 (18)
        NEWINDEX = NEWINDEX + 1 (19)
    end loop (20)
    loop while INDEX2 <= END (21)
        TEMPARRAY[NEWINDEX] = ARRAY[INDEX2] (22)
        INDEX2 = INDEX2 + 1 (23)
        NEWINDEX = NEWINDEX + 1 (24)
    end loop (25)
    loop INDEX from 0 to size of TEMPARRAY - 1 (26)
        ARRAY[START + INDEX] = TEMPARRAY[INDEX] (27)
    end loop (28)
end function (29)
The merge function begins by creating some variables. The tempArray variable will hold the newly merged array. The variable index1 refers to the element in the first half that is being considered, while index2 refers to the element in the second half. Finally, newIndex keeps track of our position in the new array.
The first loop starting on line 6 will continue operating until one half or the other has been completely added to the temporary array. It starts by comparing the first element in each half of the array. Then, depending on which one is smaller, it will place the smaller of the two in the new array and increment the indexes.
Once the first loop has completed, there are two more loops starting on lines 16 and 21. However, only one of those loops will actually execute, since only one half of the array will have any elements left in it to be considered. These loops will simply copy the remaining elements to the end of the temporary array.
Finally, the last loop starting on line 26 will copy the elements from the temporary array back into the source array. At this point, they will be properly merged in sorted order.
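To connect the pseudocode to a real language, here is a minimal Python sketch of the same two functions. It is one possible translation rather than a definitive implementation: the names merge_sort and merge, the use of a Python list for the temporary array, and the sample data are all assumptions made for illustration. The base case is also widened slightly to size <= 1 so that an empty range is handled.

```python
def merge(array, start, half, end):
    """Merge the sorted runs array[start..half] and array[half+1..end] in place."""
    temp = []  # plays the role of TEMPARRAY
    index1, index2 = start, half + 1
    # Take the smaller front element until one run is used up.
    while index1 <= half and index2 <= end:
        if array[index1] < array[index2]:
            temp.append(array[index1])
            index1 += 1
        else:
            temp.append(array[index2])
            index2 += 1
    # At most one of these two slices still has elements in it.
    temp.extend(array[index1:half + 1])
    temp.extend(array[index2:end + 1])
    # Copy the merged values back over the original range.
    array[start:end + 1] = temp


def merge_sort(array, start, end):
    """Sort array[start..end] (inclusive indexes) in place."""
    size = end - start + 1
    if size <= 1:  # base case: nothing to sort
        return
    if size == 2:  # base case: swap the pair if it is out of order
        if array[start] > array[end]:
            array[start], array[end] = array[end], array[start]
        return
    half = (start + end) // 2  # integer midpoint
    merge_sort(array, start, half)
    merge_sort(array, half + 1, end)
    merge(array, start, half, end)


data = [8, 5, 2, 6, 9, 3, 1, 4, 0, 7]  # assumed sample data
merge_sort(data, 0, len(data) - 1)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Just like the pseudocode, the first call passes the first and last valid indexes, not the length of the list.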
Now that we’ve reviewed the pseudocode for the merge sort algorithm, let’s see if we can analyze the time it takes to complete. Analyzing a recursive algorithm requires quite a bit of math and understanding to do it properly, but we can get a pretty close answer using a bit of intuition about what it does.
For starters, let’s consider a diagram that shows all of the different recursive calls made by merge sort, as shown below.
The first thing we should do is consider the worst-case input for merge sort. What would that look like? Put another way, would the values or the ordering of those values change anything about how merge sort operates?
The only real impact that the input would have is on the number of swaps made by merge sort. If we had an input that caused each of the base cases with exactly two elements to swap them, that would be a few more steps than any other input. Consider the highlighted entries below.
If each of those pairs were reversed, we’d end up doing that many swaps. So, how many swaps would that be? As it turns out, a good estimate would be $N / 2$ swaps. If we have an array with exactly 16 elements, there are at most 8 swaps we could make. With 10 elements, we can make at most 4. So, the number of swaps is on the order of $N$.
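We can check those two counts with a short Python sketch that follows the same splitting rule as the pseudocode and simply counts how many times the two-element base case is reached. The function name count_pairs is hypothetical and only used for this illustration.

```python
def count_pairs(start, end):
    """Count the two-element base cases reached when sorting indexes start..end."""
    size = end - start + 1
    if size <= 1:
        return 0
    if size == 2:
        return 1  # one possible swap
    half = (start + end) // 2
    return count_pairs(start, half) + count_pairs(half + 1, end)


for n in (10, 16):
    print(n, count_pairs(0, n - 1))
# 10 4
# 16 8
```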
What about the merge operation? How many steps does that take? This is a bit trickier to answer, but let’s look at each row of the diagram above. Across all of the calls to merge sort on each row, we’ll end up merging all $N$ elements in the original array at least once. Therefore, we know that it would take around $N$ steps for each row in the diagram. We’ll just need to figure out how many rows there are.
A better way to phrase that question might be “how many times can we recursively divide an array of $N$ elements in half?” As it turns out, the answer to that question lies in the use of the logarithm.
The logarithm is the inverse of exponentiation. For example, we could have the exponentiation formula:
$$ \text{base}^{\text{exponent}} = \text{power} $$
The inverse of that would be the logarithm:
$$ \text{log}_{\text{base}}(\text{power}) = \text{exponent} $$
So, if we know a value and base, we can determine the exponent required to raise that base to the given value.
In this case, we would need to use the logarithm with base $2$, since we are dividing the array in half each time. So, we would say that the number of rows in that diagram, or the number of levels in our tree would be on the order of $\text{log}_2(N)$. In computer science, we typically write $\text{log}_2$ as $\text{lg}$, so we’ll say it is on the order of $\text{lg}(N)$.
To get an idea of how that works, consider the case where the array contains exactly $16$ elements. In that case, the value $\text{lg}(16)$ is $4$, since $2^4 = 16$. If we use the diagram above as a model, we can draw a similar diagram for an array containing $16$ elements and find that it indeed has $4$ levels.
If we double the size of the array, we’ll now have $32$ elements. However, even by doubling the size of the array, the value of $\text{lg}(32)$ is just $5$, so it has only increased by $1$. In fact, each time we double the size of the array, the value of $\text{lg}(N)$ will only go up by $1$.
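A quick way to convince ourselves of this is to count the halvings directly. The sketch below is only an illustration; the helper name halvings is an assumption, and math.log2 comes from the Python standard library.

```python
import math


def halvings(n):
    """Count how many times n can be halved (integer division) before reaching 1."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count


for n in (16, 32, 64, 128):
    print(n, halvings(n), math.log2(n))
# 16 4 4.0
# 32 5 5.0
# 64 6 6.0
# 128 7 7.0
```

Each doubling of the input adds exactly one more halving, which is why the number of levels grows so slowly.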
With that in mind, we can say that the merge operation runs on the order of $N * \text{lg}(N)$ time. That is because there are ${\text{lg}(N)}$ levels in the tree, and each level of the tree performs $N$ operations to merge various parts of the array together. The diagram below gives a good graphical representation of how we can come to that conclusion.
Putting it all together, we have $N/2$ swaps, and $N * \text{lg}(N)$ steps for the merge. Since the value $N * \text{lg}(N)$ is larger than $N$, we would say that the total running time of merge sort is on the order of $N * \text{lg}(N)$.
Later on in this chapter we’ll discuss how that compares to the running time of selection sort and bubble sort and how that impacts our programs.
The last sorting algorithm we will review in this module is quicksort. Quicksort is another example of a recursive, divide and conquer sorting algorithm, and at first glance it may look very similar to merge sort. However, quicksort uses a different process for dividing the array, and that can produce some very interesting results.
The basic idea of quicksort is as follows:
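1. Choose a pivotValue. This value could be any random value in the array. In our implementation, we’ll simply use the last value.
2. Partition the values in the array into two parts: the first part contains all values less than or equal to the pivotValue, and the second part contains all values greater than the pivotValue.
3. Place the pivotValue in between those two parts. We’ll call the index of pivotValue the pivotIndex.
4. Recursively sort the two parts: the first part from the start to pivotIndex - 1, and the second part from pivotIndex + 1 to the end.
As with all of the other examples we’ve looked at in this module, Wikipedia provides yet another excellent animation showing this process.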
^[File:Sorting quicksort anim.gif. (2019, July 30). Wikimedia Commons, the free media repository. Retrieved 01:14, March 24, 2020 from https://commons.wikimedia.org/w/index.php?title=File:Sorting_quicksort_anim.gif&oldid=359998181.]
Let’s look at an example of the quicksort algorithm in action to see how it works. Unlike the other sorting algorithms we’ve seen, this one may appear to be just randomly swapping elements around at first glance. However, as we move through the example, we should start to see how it achieves a sorted result, usually very quickly!
We can start with our same initial array, shown below.
The first step is to choose a pivot value. As we discussed above, we can choose any random value in the array. However, to make it simple, we’ll just use the last value. We will create two variables, pivotValue and pivotIndex, to help us keep track of things. We’ll set pivotValue to the last value in the array, and pivotIndex will initially be set to 0. We’ll see why in a few steps.
Now, the algorithm will iterate across each element in the array, comparing it with the value in pivotValue. If that value is less than or equal to the pivotValue, we should swap the element at pivotIndex with the value we are looking at in the array. Let’s see how this would work.
We’d start by looking at the value at index 0 of the array, which is 8. Since that value is greater than the pivotValue, we do nothing and just look at the next item.
Here, we are considering the value 5, which is at index 1 in the array. In this case, that value is less than or equal to the pivotValue. So, we want to swap the current element with the element at our pivotIndex, which is currently 0. Once we do that, we’ll also increment our pivotIndex by 1. The diagram below shows these changes before they happen.
Once we make those changes, our array should look like the following diagram, and we’ll be ready to examine the value at index 2.
Once again, the value 2 at index 2 of the array is less than or equal to the pivot value. So, we’ll swap it with the element at pivotIndex, increment pivotIndex, and move to the next element.
We’ll continue this process, comparing the next element in the array with the pivotValue, and then swapping that element and the element at the pivotIndex if needed, incrementing the pivotIndex after each swap. The diagrams below show the next few steps. First, since 6 is less than or equal to our pivotValue, we’ll swap it with the element at the pivotIndex and increment.
However, since 9 is greater than the pivot value, we’ll just leave it as is for now and move to the next element.
3 is less than or equal to the pivot value, so we’ll swap the element at index 3 with the 3 at index 5.
We’ll see that the elements at indexes 6, 7 and 8 are all less than or equal to the pivot value. So, we’ll end up making some swaps until we reach the end of the list.
Finally, we have reached the end of the array, which contains our pivotValue in the last element. Thankfully, we can just continue our process one more step. Since the pivotValue is less than or equal to itself, we swap it with the element at the pivotIndex, and increment that index one last time.
At this point, we have partitioned the initial array into two sections. The first section contains all of the values which are less than or equal to the pivot value, and the second section contains all values greater than the pivot value.
This demonstrates the powerful way that quicksort can quickly partition an array based on a pivot value! With just a single pass through the array, we have created our two halves and done at least some preliminary sorting. The last step is to make two recursive calls to quicksort, one that sorts the items from the beginning of the array through the element right before the pivotValue. The other will sort the elements starting after the pivotValue through the end of the array.
Once each of those recursive calls is complete, the entire array will be sorted!
Now that we’ve seen an example of how quicksort works, let’s walk through the pseudocode of a quicksort function. The function itself is very simple, as shown below.
function QUICKSORT(ARRAY, START, END) (1)
    # base case size <= 1
    if START >= END then (2)
        return (3)
    end if (4)
    PIVOTINDEX = PARTITION(ARRAY, START, END) (5)
    QUICKSORT(ARRAY, START, PIVOTINDEX - 1) (6)
    QUICKSORT(ARRAY, PIVOTINDEX + 1, END) (7)
end function (8)
This implementation of quicksort uses a simple base case on lines 2 through 4 to check if the array is either empty, or contains one element. It does so by checking if the START index is greater than or equal to the END index. If so, it can assume the array is sorted and just return without making any changes.
The recursive case is shown on lines 5 through 7. It simply uses a helper function called partition on line 5 to partition the array based on a pivot value. That function returns the location of the pivot value, which is stored in pivotIndex. Then, on lines 6 and 7, the quicksort function is called recursively on the two partitions of the array, before and after the pivotIndex. That’s really all there is to it!
Let’s look at one way we could implement the partition function, shown below in pseudocode.
function PARTITION(ARRAY, START, END) (1)
    PIVOTVALUE = ARRAY[END] (2)
    PIVOTINDEX = START (3)
    loop INDEX from START to END (4)
        if ARRAY[INDEX] <= PIVOTVALUE then (5)
            TEMP = ARRAY[INDEX] (6)
            ARRAY[INDEX] = ARRAY[PIVOTINDEX] (7)
            ARRAY[PIVOTINDEX] = TEMP (8)
            PIVOTINDEX = PIVOTINDEX + 1 (9)
        end if (10)
    end loop (11)
    return PIVOTINDEX - 1 (12)
end function (13)
This function begins on lines 2 and 3 by setting pivotValue to the last element in the portion of the array being partitioned, and pivotIndex to the START index. Then, the loop on lines 4 through 11 will look at each element in that portion of the array, determine if it is less than or equal to pivotValue, and swap that element with the element at pivotIndex if so, incrementing pivotIndex after each swap.
At the end, the value that was originally at the end of the array will be at location pivotIndex - 1, so we will return that value back to the quicksort function so it can split the array into two parts based on that value.
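The two pseudocode functions above translate almost line for line into Python. The sketch below is one possible version, not a definitive implementation; the snake_case names and the sample data are assumptions for illustration, chosen so that the first few steps match the walkthrough (8 at index 0, 5 at index 1, and so on).

```python
def partition(array, start, end):
    """Partition array[start..end] around its last element; return the pivot's final index."""
    pivot_value = array[end]
    pivot_index = start
    for index in range(start, end + 1):  # inclusive of end, like the pseudocode
        if array[index] <= pivot_value:
            # Move the small value into the front section.
            array[index], array[pivot_index] = array[pivot_index], array[index]
            pivot_index += 1
    return pivot_index - 1


def quicksort(array, start, end):
    """Sort array[start..end] (inclusive indexes) in place."""
    if start >= end:  # zero or one element
        return
    pivot_index = partition(array, start, end)
    quicksort(array, start, pivot_index - 1)
    quicksort(array, pivot_index + 1, end)


data = [8, 5, 2, 6, 9, 3, 1, 4, 0, 7]  # assumed sample data
print(partition(data, 0, len(data) - 1))  # 7
print(data)  # [5, 2, 6, 3, 1, 4, 0, 7, 9, 8]
quicksort(data, 0, len(data) - 1)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

After the single call to partition, everything before index 7 is less than or equal to the pivot value 7, and everything after it is greater, even though neither side is sorted yet.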
To wrap up our analysis of the quicksort algorithm, let’s take a look at the time complexity of the algorithm. Quicksort is a very difficult algorithm to analyze, especially since the selection of the pivot value is random and can greatly affect the performance of the algorithm. So, we’ll talk about quicksort’s time complexity in terms of two cases, the worst case and the average case. Let’s look at the average case first.
What would the average case of quicksort look like? This is a difficult question to answer and requires a bit of intuition and making a few assumptions. The key really lies in how we choose our pivot value.
First, let’s assume that the data in our array is equally distributed. This means that the values are evenly spread between the lowest value and the highest value, with no large clusters of similar values anywhere. While this may not always be the case in the real world, often we can assume that our data is somewhat equally distributed.
Second, we can also assume that our chosen pivot value is close to the average value in the array. If the array is equally distributed and we choose a value at random, we have a $50\%$ chance of that value being closer to the average than either the minimum or the maximum value, so this is a pretty safe assumption.
With those two assumptions in hand, we see that something interesting happens. If we choose the average value as our pivot value, quicksort will perfectly partition the array into two equal sized halves! This is a great result, because it means that each recursive call to the function will be working with data that is half the initial array.
If we consider an array that initially contains $15$ elements, and make sure that we always choose the average element as our pivot point, we’d end up with a tree of recursive calls that resembles the diagram below.
In this diagram, we see that each level of the tree looks at around $N$ elements. (It is actually fewer, but not by a significant amount so we can just round up to $N$ each time). We also notice that there are 4 levels to the tree, which is closely approximated by $\text{lg}(N)$. This is the same result we observed when analyzing the merge sort algorithm earlier in this module.
So, in the average case, we’d say that quicksort runs in the order of $N * \text{lg}(N)$ time.
To consider the worst-case situation for quicksort, we must come up with a way to define what the worst-case input would be. It turns out that the selection of our pivot value is the key here.
Consider the situation where the pivot value is chosen to be the maximum value in the array. What would happen in that case?
Looking at the code, we would see that each recursive call would contain one empty partition, and the other partition would be just one less than the size of the original array. So, if our original array only contained 8 elements, our tree recursion diagram would look similar to the following.
This is an entirely different result! In this case, since we are only reducing the size of our array by 1 at each level, it would take $N$ recursive calls to complete. However, at each level, we are looking at one fewer element. Is this better or worse than the average case?
It turns out that it is much worse. As we learned in our analysis of selection sort and bubble sort, the series
$$ N + (N - 1) + (N - 2) + \dots + 2 + 1 $$
adds up to $N * (N + 1) / 2$, which is on the order of $N^2$. So, we would say that quicksort runs in the order of $N^2$ time in the worst case. This is just as slow as selection sort and bubble sort! Why would we ever call it “quicksort” if it isn’t any faster?
Thankfully, in practice, it is very rare to run into this worst-case performance with quicksort, and in fact most research shows that quicksort is often the fastest of the four sorting algorithms we’ve discussed so far. In the next section, we’ll discuss these performance characteristics a bit more.
This result highlights why it is important to consider both the worst case and average case performance of our algorithms. Many times we’ll write an algorithm that runs well most of the time, but is susceptible to poor performance when given a particular worst-case input.
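To see the worst case concretely, we can count how many elements are compared against a pivot. The sketch below is an illustration under stated assumptions: it folds the last-element partition scheme into a single hypothetical function, count_comparisons, and feeds it an already sorted list, which makes the last element the maximum value at every level.

```python
def count_comparisons(array, start, end):
    """Quicksort array[start..end] in place; return how many pivot comparisons were made."""
    if start >= end:
        return 0
    pivot_value = array[end]
    pivot_index = start
    comparisons = 0
    for index in range(start, end + 1):
        comparisons += 1  # one comparison against the pivot value
        if array[index] <= pivot_value:
            array[index], array[pivot_index] = array[pivot_index], array[index]
            pivot_index += 1
    pivot_index -= 1  # final location of the pivot value
    return (comparisons
            + count_comparisons(array, start, pivot_index - 1)
            + count_comparisons(array, pivot_index + 1, end))


n = 8
already_sorted = list(range(n))
print(count_comparisons(already_sorted, 0, n - 1))  # 35
print(n * (n + 1) // 2 - 1)                         # 35
```

The count is $8 + 7 + \dots + 2 = 35$, one less than the full series because a single remaining element is handled by the base case without any comparisons.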
We introduced four sorting algorithms in this chapter: selection sort, bubble sort, merge sort, and quicksort. In addition, we performed a basic analysis of the time complexity of each algorithm. In this section, we’ll revisit that topic and compare sorting algorithms based on their performance, helping us understand what algorithm to choose based on the situation.
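The list below shows the overall result of our time complexity analysis for each algorithm.
Selection sort: $N^2$
Bubble sort: $N^2$
Merge sort: $N * \text{lg}(N)$
Quicksort: $N * \text{lg}(N)$ in the average case, $N^2$ in the worst case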
We have expressed the amount of time each algorithm takes to complete in terms of the size of the original input $N$. But how does $N^2$ compare to $N * \text{lg}(N)$?
One of the easiest ways to compare two functions is to graph them, just like we’ve learned to do in our math classes. The diagram below shows a graph containing the functions $N$, $N^2$, and $N * \text{lg}(N)$.
First, notice that the scale along the X axis (representing values of $N$) goes from 0 to 10, while the Y axis (representing the function outputs) goes from 0 to 30. This graph has been adjusted a bit to better show the relationship between these functions, but in actuality they have a much steeper slope than is shown here.
As we can see, the value of $N^2$ at any particular place on the X axis is almost always larger than $N * \text{lg}(N)$, while that function’s output is almost always larger than $N$ itself. We can infer from this that functions which run in the order of $N^2$ time will take much longer to complete than functions which run in the order of $N * \text{lg}(N)$ time. Likewise, the functions which run in the order of $N * \text{lg}(N)$ time themselves are much slower than functions which run in linear time, or in the order of $N$ time.
Based on that assessment alone, we might conclude that we should always use merge sort! It is guaranteed to run in $N * \text{lg}(N)$ time, with no troublesome worst-case scenarios to consider, right? Unfortunately, as with many things in the real world, it isn’t that simple.
The choice of which sorting algorithm to use in our programs largely comes down to what we know about the data we have, and how that information can impact the performance of the algorithm. This is true for many other algorithms we will write in this class. Many times there are multiple methods to perform a task, such as sorting, and the choice of which method we use largely depends on what we expect our input data to be.
For example, consider the case where our input data is nearly sorted. In that instance, most of the items are in the correct order, but a few of them, maybe less than $10\%$, are slightly out of order. In that case, what if we used a version of bubble sort that was optimized to stop sorting as soon as it makes a pass through the array without swapping any elements? Since only a few elements are out of order, it may only take a few passes with bubble sort to get them back in the correct places. So even though bubble sort runs in $N^2$ time, the actual time may be much quicker.
Likewise, if we know that our data is random and uniformly distributed, we might want to choose quicksort. Even though quicksort has very slow performance in the worst case, if our data is properly random and distributed, research shows that it will have better real-world performance than most other sorting algorithms in that instance.
Finally, what if we know nothing about our input data? In that case, we might want to choose merge sort as the safe bet. It is guaranteed to be no worse than $N * \text{lg}(N)$ time, even if the input is truly bad. While it might not be as fast as quicksort if the input is random, it won’t run the risk of being slow, either.
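The nearly sorted case mentioned above is easy to try out. Below is a minimal sketch of a bubble sort that stops after a pass with no swaps; the function name, the choice to return the number of passes, and both sample lists are assumptions made for this illustration.

```python
def bubble_sort(array):
    """Sort array in place and return the number of passes made over it."""
    passes = 0
    while True:
        swapped = False
        passes += 1
        for i in range(len(array) - 1):
            if array[i] > array[i + 1]:
                array[i], array[i + 1] = array[i + 1], array[i]
                swapped = True
        if not swapped:  # a full pass with no swaps means we are done
            return passes


nearly_sorted = [0, 1, 3, 2, 4, 5, 7, 6, 8, 9]
reversed_data = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(bubble_sort(nearly_sorted))  # 2
print(bubble_sort(reversed_data))  # 10
```

With only two adjacent pairs out of place, the first pass fixes everything and the second pass confirms it, while the reversed list needs a pass for every element.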
Now that we’ve learned how to sort the data in our container, let’s go back and revisit the concept of searching once again. Does our approach change when we know the data has been sorted?
Our intuition tells us that it should. Recall that we discussed how much easier it would be to find a particular paper in a sorted filing cabinet rather than just searching through a random pile of papers on the floor. The same concept applies to data in our programs.
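The most commonly used searching algorithm when dealing with sorted data is binary search. The idea of the algorithm is to compare the value in the middle of the container with the value we are looking for. In this case, let’s assume the container is sorted in ascending order, so the smaller elements are before the larger ones. If we compare our desired value with the middle value, there are three possible outcomes:
1. The middle value is equal to our desired value. We have found it, so we can return the middle index.
2. The middle value is greater than our desired value. If the desired value is in the container, it must be in the first half, so we search there next.
3. The middle value is less than our desired value. If the desired value is in the container, it must be in the second half, so we search there next.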
Once an occurrence of the desired value is found, we can also look at the values before it to see if there are any more of the desired values in the container. Since it is sorted, they should all be grouped together. If we want our algorithm to return the index of the first occurrence of the desired value, we can simply move toward the front of the array until we find that first occurrence.
Let’s work through a quick example of the binary search algorithm to see how it works in practice. Let’s assume we have the array shown in the diagram below, which is already sorted in ascending order. We wish to find out if the array contains the value 5. So, we’ll store that in our value variable. We also have variables start and end representing the first and last index in the array that we are considering.
First, we must calculate the middle index of the array. To do that, we can use the following formula.
$$ \text{int}((\text{start} + \text{end}) / 2) $$
In this case, we’ll find that the middle index is 5.
Next, we’ll compare our desired value with the element at the middle index, which is 2. Since our desired value 5 is greater than 2, we know that if 5 is present, it must be in the second half of the array. We will then update our starting value to be one greater than the middle element and start over. In practice, this could be done either iteratively or recursively. We’ll see both implementations later in this section. The portion of the array we are ignoring has been given a grey background in the diagram below.
Once again, we’ll start by calculating a new middle index. In this case, it will be 8.
The value at index 8 is 7, which is greater than our desired value 5. So we know that 5 should be in the first half of the array from index 6 through 10. We need to update the end variable to be one less than middle and try once again.
We’ll first calculate the middle index, which will be 6. This is because (6 + 7) / 2 is 6.5, but when we convert it to an integer it will be truncated, resulting in just 6.
Since the value at index 6 is 4, which is less than our desired value 5, we know that we should be looking at the portion of the array which comes after our middle element. Once again, we’ll update our start index to be one greater than the middle and start over.
In this case, since both start and end are the same, we know that the middle index will also be 7. We can compare the value at index 7 to our desired value. As it turns out, they are a match, so we’ve found our value! We can just return middle as the index for this value. Of course, if we want to make sure it is the first instance of our desired value, we can quickly check the elements before it until we find one that isn’t our desired value. We won’t worry about that for now, but it is something that can easily be added to our code later.
The binary search algorithm is easily implemented in both an iterative and recursive function. We’ll look at both versions and see how they compare.
The pseudocode for an iterative version of binary search is shown below.
function BINARYSEARCH(ARRAY, VALUE) (1)
    START = 0 (2)
    END = size of ARRAY - 1 (3)
    loop while START <= END (4)
        MIDDLE = INT((START + END) / 2) (5)
        if ARRAY[MIDDLE] == VALUE then (6)
            return MIDDLE (7)
        else if ARRAY[MIDDLE] > VALUE then (8)
            END = MIDDLE - 1 (9)
        else if ARRAY[MIDDLE] < VALUE then (10)
            START = MIDDLE + 1 (11)
        end if (12)
    end loop (13)
    return -1 (14)
end function (15)
This function starts by setting the initial values of start and end on lines 2 and 3 to the first and last indexes in the array, respectively. Then, the loop starting on line 4 will repeat while the start index is less than or equal to the end index. If we reach an instance where start is greater than end, then we have searched the entire array and haven’t found our desired value. At that point the loop will end and we will return -1 on line 14.
Inside of the loop, we first calculate the middle index on line 5. Then on line 6 we check to see if the middle element is our desired value. If so, we should just return the middle index and stop. It is important to note that this function will return the index to an instance of value in the array, but it may not be the first instance. If we wanted to find the first instance, we’d add a loop before the return on line 7 to move toward the front of the array until we were sure we were at the first instance of value before returning.
If we didn’t find our element, then the if statements on lines 8 and 10 determine which half of the array we should look at. Those statements update either end or start as needed, and then the loop repeats.
The recursive implementation of binary search is very similar to the iterative approach. However, this time we also include both start and end as parameters, which we update at each recursive call. The pseudocode for a recursive binary search is shown below.
function BINARYSEARCHRECURSE(ARRAY, VALUE, START, END) (1)
    # base case
    if START > END then (2)
        return -1 (3)
    end if (4)
    MIDDLE = INT((START + END) / 2) (5)
    if ARRAY[MIDDLE] == VALUE then (6)
        return MIDDLE (7)
    else if ARRAY[MIDDLE] > VALUE then (8)
        return BINARYSEARCHRECURSE(ARRAY, VALUE, START, MIDDLE - 1) (9)
    else if ARRAY[MIDDLE] < VALUE then (10)
        return BINARYSEARCHRECURSE(ARRAY, VALUE, MIDDLE + 1, END) (11)
    end if (12)
end function (13)
The recursive version moves the loop’s termination condition to the base case, ensuring that it returns -1 if the start index is greater than the end index. Otherwise, it performs the same process of calculating the middle index and checking to see if it contains the desired value. If not, it uses the recursive calls on lines 9 and 11 to search the first half or second half of the array, whichever is appropriate.
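The recursive pseudocode can be sketched in Python in much the same way. As before, the function name and the sample array are assumptions made for illustration.

```python
def binary_search_recurse(array, value, start, end):
    """Search sorted array[start..end] for value; return an index or -1."""
    if start > end:  # base case: nothing left to search
        return -1
    middle = (start + end) // 2
    if array[middle] == value:
        return middle
    if array[middle] > value:
        return binary_search_recurse(array, value, start, middle - 1)
    return binary_search_recurse(array, value, middle + 1, end)


data = [0, 1, 1, 2, 2, 2, 4, 5, 7, 8, 9]  # hypothetical sorted array
print(binary_search_recurse(data, 5, 0, len(data) - 1))  # 7
print(binary_search_recurse(data, 3, 0, len(data) - 1))  # -1
```

The caller is responsible for passing the first and last valid indexes on the initial call, which is the main difference in how the two versions are used.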
Analyzing the time complexity of binary search is similar to the analysis done with merge sort. In essence, we must determine how many times it must check the middle element of the array.
In the worst case, it will continue to do this until it has determined that the value is not present in the array at all. Any time that our array doesn’t contain our desired value would be our worst-case input.
In that instance, how many times do we look at the middle element in the array? That is hard to measure. However, it might be easier to measure how many elements are in the array each time and go from there.
Consider the situation where we start with 15 elements in the array. How many times can we divide the array in half before we are down to just a single element? The diagram below shows what this might look like.
As it turns out, this is similar to the analysis we did on merge sort and quicksort. If we divide the array in half each time, we will do this $\text{lg}(N)$ times. The only difference is that we are only looking at a single element, the shaded element, at each level. So the overall time complexity of binary search is on the order of $\text{lg}(N)$. That’s pretty fast!
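We can count the middle elements checked in an unsuccessful search with a small sketch. It assumes the value we want is larger than everything in the array, so the search always moves to the second half, which is never the smaller half; the function name count_checks is hypothetical.

```python
def count_checks(n):
    """Count the middle elements checked when searching n sorted items for a value larger than all of them."""
    start, end = 0, n - 1
    checks = 0
    while start <= end:
        middle = (start + end) // 2
        checks += 1
        start = middle + 1  # the value is larger, so always move right
    return checks


for n in (15, 16, 1_000_000):
    print(n, count_checks(n))
# 15 4
# 16 5
# 1000000 20
```

Even with a million elements, only 20 elements are examined, which matches the $\text{lg}(N)$ growth described above (rounded down, plus one for the final single-element check).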
Let’s go back and look at the performance of our sorting algorithms, now that we know how quickly binary search can find a particular value in an array. Let’s add the function $\text{lg}(N)$ to our graph from earlier, shown below.
As we can see, the function $\text{lg}(N)$ is even smaller than $N$. So performing a binary search is much faster than a linear search, which we already know runs in the order of $N$ time.
However, performing a single linear search is still faster than any of the sorting algorithms we’ve reviewed. So when does it become advantageous to sort our data?
This is a difficult question to answer since it depends on many factors. However, a good rule of thumb is to remember that the larger the data set, or the more times we need to search for a value, the better off we are to sort the data before we search.
In the graph below, the topmost line colored in red shows the approximate running time of $10$ linear search operations, while the bottom line in black shows the running time of performing a merge sort before $10$ binary search operations.
As we can see, it is more efficient to perform a merge sort, which runs in $N * \text{lg}(N)$ time, then perform $10$ binary searches running in $\text{lg}(N)$ time, than it is to perform $10$ linear searches running in $N$ time. The savings become more pronounced as the size of the input gets larger, as indicated by the X axis on the graph.
In fact, this analysis suggests that it may only take as few as 7 searches to see this benefit, even on smaller data sets. So, if we are writing a program that needs to search for a specific value in an array more than about 7 times, it is probably a good idea to sort the array before doing our searches, at least from a performance standpoint.
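One way to explore this rule of thumb is to compare the two cost estimates directly. The sketch below assumes a deliberately simplified cost model that ignores constant factors: $N$ steps per linear search, $N * \text{lg}(N)$ steps for merge sort, and $\text{lg}(N)$ steps per binary search. The function name is hypothetical, and real running times will differ from this model.

```python
import math


def break_even_searches(n):
    """Smallest number of searches k for which sorting first is cheaper,
    under the simplified cost model (constant factors ignored)."""
    lg = math.log2(n)
    k = 1
    # k linear searches cost k * n steps; sorting once and then doing
    # k binary searches costs n * lg(n) + k * lg(n) steps.
    while k * n <= n * lg + k * lg:
        k += 1
    return k
```

Under this model, break_even_searches(64) gives 7 and break_even_searches(1024) gives 11, so the break-even point grows slowly with the size of the array.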
So far we’ve looked at sorting algorithms that run in $N * \text{lg}(N)$ time. However, what if we try to sort the data as we add it to the array? In a later course, we’ll learn how we can use an advanced data structure known as a heap to create a sorted array in nearly linear time (with some important caveats, of course)!
In this chapter, we learned how to search for values in an array using a linear search method. Then, we explored four different sorting algorithms, and compared them based on their time complexity. Finally, we learned how we can use a sorted array to perform a faster binary search and saw how we can increase our performance by sorting our array before searching in certain situations.
Searching and sorting are two of the most common operations performed in computer programs, and it is very important to have a deep understanding of how they work. Many times the performance of a program can be improved simply by using the correct searching and sorting algorithms to fit the program’s needs, and understanding when you might run into a particularly bad worst-case input.
The project in this module will involve implementing several of these algorithms in the language of your choice. As we learn about more data structures, we’ll revisit these algorithms again to discuss how they can be improved or adapted to take advantage of different structures.
Mergesort iterative without a stack
Quicksort iterative with a stack
Bubble sort recursive
This page is the main page for Queues
A queue (pronounced like the letter “q”) data structure organizes data in a First In, First Out (FIFO) order: the first piece of data put into the queue is the first piece of data available to remove from the queue. A queue functions just like the line you would get into to go into a ballgame, movie, or concert: the person that arrives first is the first to get into the venue.
^[https://commons.wikimedia.org/w/index.php?title=Special:CiteThisPage&page=File%3AReichstag_queue_2.JPG&id=379395710&wpFormIdentifier=titleform]
You might be thinking that this sounds a lot like the stack structure we studied a few modules back, with the exception that the stack was a Last in, First Out (LIFO) structure. If so, you are correct. The real difference between a stack and a queue is how we take data out of the data structure. In a stack, we put data onto the top of the stack and removed it from the top as well. With a queue, we put data onto the end (or rear) of the queue and remove it from the start (or front) of the queue.
The name for queues comes from the word used in British English to describe a line of people. Instead of forming lines to wait for some service, the British form queues. Thus, when we think of queues, often the first picture to come to mind is a group of people standing in a line. Of course, this is exactly how a computer queue operates as well. The first person in line gets served first. If I get into line before you do, then I will be served before you are.
^[File:BNSF GE Dash-9 C44-9W Kennewick - Wishram WA.jpg. (2019, July 1). Wikimedia Commons, the free media repository. Retrieved 19:30, March 30, 2020 from https://commons.wikimedia.org/w/index.php?title=File:BNSF_GE_Dash-9_C44-9W_Kennewick_-_Wishram_WA.jpg&oldid=356754103.]
Of course, there are other examples of queues besides lines of people. You can think of a train as a long line of railway cars. They are all connected and move together as the train engine pulls them. A line of cars waiting to go through a toll booth or to cross a border is another good example of a queue. The first car in line will be the first car to get through the toll booth. In the picture below, there are actually several lines.
^[File:El Paso Ysleta Port of Entry.jpg. (2018, April 9). Wikimedia Commons, the free media repository. Retrieved 19:30, March 30, 2020 from https://commons.wikimedia.org/w/index.php?title=File:El_Paso_Ysleta_Port_of_Entry.jpg&oldid=296388002.]
How do we implement queues in code? Like we did with stacks, we will use an array, which is an easily understandable way to implement queues. We will store data directly in the array and use special start and end variables to keep track of the start of the queue and the end of the queue.
The following figure shows how we might implement a queue with an array. First, we define our array myQueue to be an array that can hold 10 numbers, with an index of 0 to 9. Then we create a start variable to keep track of the index at the start of the queue and an end variable to keep track of the end of the queue, which is the index where the next item will be stored.
Notice that since we have not put any items into the queue, we initialize start to be -1. Although this is not a legal index into the array, we can use it like we did with stacks to recognize when we have not yet put anything into the queue. As we will see, this also makes manipulating items in the array much simpler. However, to make our use of the array more efficient, the queue will not always begin at index 0. We will allow the queue to wrap around the end of the array, so that the start index can be greater than the end index. We’ll see an example of this behavior later.
When we want to enqueue an item into the queue, we follow the simple procedure shown below. Of course, since our array has a fixed size, we need to make sure that we don’t try to put an item in a full array. Thus, the precondition is that the array cannot be full. Enforcing this precondition is the function of the if statement at line 1. If the array is already full, then we’ll throw an exception in line 2 and let the caller handle the situation. Next, we store item at the end location in line 3 and then compute the new value of end in line 4. Line 4 uses the modulo operator % to return the remainder of the division $(\text{end} + 1) / \text{length of myQueue}$. In our example, this is helpful when we get to the end of our ten-element array. If end == 9 before enqueue was called, the function would store item in myQueue[9] and then line 4 would cause end to be $(9 + 1) \% 10$ or $10 \% 10$, which is simply $0$, essentially wrapping the queue around the end of the array and continuing it at the beginning of the array.
function ENQUEUE(ITEM)
    if ISFULL() then (1)
        raise exception (2)
    end if
    MYQUEUE[END] = ITEM (3)
    END = (END + 1) % length of MYQUEUE (4)
    if START == -1 then (5)
        START = 0 (6)
    end if
end function
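The wrap-around behavior of line 4 can be seen in isolation with a few lines of Python. This is only an illustration of the modulo arithmetic, assuming the ten-element array from the example and a hypothetical starting value of 7 for end.

```python
capacity = 10   # length of the example array
end = 7         # hypothetical current value of end
visited = []
for _ in range(5):
    visited.append(end)           # index where the next item would be stored
    end = (end + 1) % capacity    # same update as line 4 of the pseudocode
```

After the loop, visited holds 7, 8, 9, 0, 1: once end passes index 9, it continues at index 0.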
Given our initial configuration above, if we performed an enqueue(7) function call, the result would look like the following.
Notice that the value 7 was stored at myQueue[0] in line 3, end was updated to 1 in line 4, and start was set to 0 in line 6. Now, let’s assume we continue to perform enqueue operations until myQueue is almost filled as shown below.
If at this point, we enqueue another number, say -35, the modulo operator in line 4 would help us wrap the end of the queue around the array and back to the beginning as expected. The result of this function call is shown below.
Now we have a problem! The array is full of numbers and if we try to enqueue another number, the enqueue function will raise an exception in line 2. However, this example also gives us insight into what the isFull condition should be. Notice that both start and end are pointing at the same array index. You may want to think about this a little, but you should be able to convince yourself that whenever start == end we will be in a situation like the one above where the array is full, and we cannot safely enqueue another number.
To rectify our situation, we need to have a function to take things out of the queue. We call this function dequeue, which returns the item at the beginning of the queue (pointed at by start) and updates the value of start to point to the next location in the queue. The pseudocode for the dequeue function is shown below.
function DEQUEUE()
    if ISEMPTY() then (1)
        raise exception (2)
    end if
    ITEM = MYQUEUE[START] (3)
    START = (START + 1) % length of MYQUEUE (4)
    if START == END then (5)
        START = -1 (6)
        END = 0 (7)
    end if
    return ITEM (8)
end function
Line 1 checks if the queue is empty and raises an exception in line 2 if it is. Otherwise, we copy the item at the start location into item at line 3 and then increment the value of start by 1, using the modulo operator to wrap to the beginning of the array if needed in line 4. However, if we dequeue the last item in the queue, we will actually run into the same situation that we ran into in enqueue when we filled the array: start == end. Since we need to differentiate between being full or empty (it’s kind of important!), we reset the start and end values back to their initial state when we dequeue the last item in the queue. That is, we set start = -1 and end = 0. This way, we will always be able to tell the difference between the queue being empty or full. Finally, we return the item to the calling function in line 8.
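Putting the two operations together, the following Python sketch mirrors the enqueue and dequeue pseudocode. The class name ArrayQueue, the attribute names, and the exception types are assumptions made for illustration, and validation of the capacity is left out to keep the sketch short.

```python
class ArrayQueue:
    """Fixed-capacity circular queue following the pseudocode above."""

    def __init__(self, capacity):
        self._items = [None] * capacity
        self._start = -1   # -1 marks an empty queue
        self._end = 0      # index where the next item will be stored

    def is_empty(self):
        return self._start == -1

    def is_full(self):
        return self._start == self._end

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        self._items[self._end] = item
        self._end = (self._end + 1) % len(self._items)
        if self._start == -1:
            self._start = 0

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self._items[self._start]
        self._start = (self._start + 1) % len(self._items)
        if self._start == self._end:
            # That was the last item: reset so empty is not mistaken for full.
            self._start = -1
            self._end = 0
        return item


q = ArrayQueue(3)
for n in (7, 8, 9):
    q.enqueue(n)
full_after_three = q.is_full()
first = q.dequeue()
q.enqueue(10)                              # wraps around into index 0
drained = [q.dequeue() for _ in range(3)]  # items leave in FIFO order
```

In this run the queue reports full after three items, the first item removed is 7, and the remaining items come out as 8, 9, 10 even though 10 was stored at the front of the array.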
We have already seen the pseudocode for the two key operations for queues: enqueue and dequeue. However, there are several others that make the queue data structure much easier to use: peek, isFull, isEmpty, size, doubleCapacity, halveCapacity, and toString.
We will discuss each of these operations. But first, let’s talk about the constructor for the queue class and what it must do to properly set up a queue object.
The main responsibility of the constructor is to initialize all the attributes in the queue class. As we discussed above, the attributes include the myQueue array and the start and end variables that hold indexes into myQueue.
Since we are using an array for our queue, we will need to know how big to make the array in our constructor. There are two options. We could just use a default size for the array. Or, we could allow the user to pass in a positive integer to set the size. In this module we assume the caller must provide a capacity value, which must be greater than 0.
function QUEUE(CAPACITY)
    if CAPACITY is not an integer then (1)
        throw exception (2)
    else if CAPACITY <= 0 then (3)
        throw exception (4)
    end if
    MYQUEUE = new array of size CAPACITY (5)
    START = -1 (6)
    END = 0 (7)
end function
The first thing we do in the code is to check to make sure that capacity is actually an integer that is greater than 0. Essentially, this is our precondition for the method. If our precondition is not met, we throw an exception. (If we are using a typed language such as Java, we can enforce our precondition by requiring that capacity be of type integer instead of explicitly checking it in the code.) Once we’ve validated our precondition, we create a new array of size capacity for the myQueue array and set the attribute start to -1 and end to 0.
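In a dynamically typed language such as Python, both checks have to be written out. The sketch below shows one possible constructor; the class name, attribute names, and the choice of TypeError and ValueError are assumptions, not part of the pseudocode.

```python
class ArrayQueue:
    def __init__(self, capacity):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if not isinstance(capacity, int) or isinstance(capacity, bool):
            raise TypeError("capacity must be an integer")
        if capacity <= 0:
            raise ValueError("capacity must be greater than 0")
        self._items = [None] * capacity   # the myQueue array
        self._start = -1                  # -1 marks an empty queue
        self._end = 0                     # next free index


q = ArrayQueue(4)
```

Using two different exception types lets the caller distinguish a value of the wrong type from an integer that is out of range.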
We have already discussed the enqueue operation and seen it in operation above. In the pseudocode below, we see that we must first check to ensure that the queue is not already full. Again, this is our precondition.
function ENQUEUE(ITEM)
    if ISFULL() then (1)
        raise exception (2)
    end if
    MYQUEUE[END] = ITEM (3)
    END = (END + 1) % length of MYQUEUE (4)
    if START == -1 then (5)
        START = 0 (6)
    end if
end function
Once our precondition is validated, we simply store the item into the array at index end. Then we increment end, using the modulo operator to wrap end to point to the beginning of the array if warranted. Next, we check for the condition of an empty queue. If start == -1, then we know the queue was empty before this call, so we set start = 0. The enqueue function does not return a value.
Because there are no loops in the enqueue function, the function operates in constant time regardless of the size of the array or the number of items in it.
Like enqueue, we have already seen the dequeue operation. It simply takes the first item from the start of the queue and returns it. However, before we can do that, we need to validate our precondition. For the dequeue operation, our precondition is that the queue must not already be empty, which is detected by the isEmpty function in line 1.
function DEQUEUE()
    if ISEMPTY() then (1)
        raise exception (2)
    end if
    ITEM = MYQUEUE[START] (3)
    START = (START + 1) % length of MYQUEUE (4)
    if START == END then (5)
        START = -1 (6)
        END = 0 (7)
    end if
    return ITEM (8)
end function
Once we have validated our precondition, we simply copy the item from myQueue[start] in line 3 and increment start. Again, we use the modulo operator in line 4 to wrap start back to 0 if it is needed. Next, we check to see if myQueue is empty, and, if it is, reset the values of start and end back to their initial values. Finally, we return the item to the calling function in line 8. Like the enqueue operation, the dequeue function operates in constant time.
The peek operation returns the item at the start of the queue, without removing it from the array. Like the dequeue operation, it has the precondition that the queue must not be empty. The pseudocode for the peek operation is shown below. It is also a constant time operation.
function PEEK() (1)
    if ISEMPTY() then (2)
        raise exception (3)
    else
        return MYQUEUE[START] (4)
    end if
end function
To allow the calling program to detect when the queue is full, we define an isFull operation. Notice that code external to the queue class cannot access the value of start or end, so it cannot simply check if start == end on its own. In this case, the operation is very simple as we only need to return the Boolean value of start == end as shown below. There is no precondition for isFull, and isFull operates in constant time.
function ISFULL()
    return START == END (1)
end function
The isEmpty operation is very similar to the isFull operation except that we return the Boolean value of the condition start == -1 instead of start == end.
function ISEMPTY()
    return START == -1 (1)
end function
The size method returns the number of items in the queue. However, it is not as straightforward as it might sound. Actually, there are several cases that we must consider, based on the fact that both start and end can “wrap around” the end of the array:
1. start == -1: the queue is empty, and size = 0,
2. start == end: the queue is full, and size equals the capacity of the array,
3. start < end: size = end - start, and
4. start > end: size = capacity of array - start + end.
Thus, in our function, we simply need to check four conditions.
function SIZE()
    if START == -1 then (1)
        return 0 (2)
    else if START == END then (3)
        return capacity of MYQUEUE (4)
    else if START < END then (5)
        return END - START (6)
    else
        return capacity of MYQUEUE - START + END (7)
    end if
end function
Notice that the conditions checked in lines 1, 3, and 5 ensure that start must be greater than end by the time we reach the final case. Therefore, we can simply use an else statement to capture the last case in line 7. Once again, this is a constant time function.
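The four cases can be checked quickly with a small standalone function. This is a sketch that takes start, end, and the capacity as plain arguments (an assumption made so the example runs on its own) rather than reading them from a queue object.

```python
def queue_size(start, end, capacity):
    """Number of items in a circular queue with the given indexes."""
    if start == -1:
        return 0                     # empty queue
    if start == end:
        return capacity              # full queue
    if start < end:
        return end - start           # items sit in one contiguous run
    return capacity - start + end    # items wrap past the end of the array
```

For example, in a ten-element array with start = 8 and end = 2, the items occupy indexes 8, 9, 0, and 1, and queue_size(8, 2, 10) returns 4.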
The doubleCapacity operation doubles the size of the array holding our queue. If we started with an array of size 4, the doubleCapacity operation will result in an array of size 8 with the contents of our original array stored in it. Unfortunately, most programming languages (like Java) do not simply let you double the size of the array. A notable exception to this is Python, which does allow you to directly extend an array.
In a traditional programming language, the easiest way to accomplish the doubleCapacity operation is to complete the following steps:
1. Create a new array with twice the capacity of the existing array,
2. Copy the contents of the original array into the new array,
3. Update the start and end variables to point at the correct elements, then
4. Point the myQueue array at the new array.
The pseudocode for the doubleCapacity operation is shown below.
function DOUBLECAPACITY()
    NEWQUEUE = new array of MYQUEUE capacity * 2 (1)
    LENGTH = SIZE() (2)
    for I = 0 to LENGTH - 1 (3)
        NEWQUEUE[I] = DEQUEUE() (4)
    end for
    START = 0 (5)
    END = LENGTH (6)
    MYQUEUE = NEWQUEUE (7)
end function
In the function, we create the new array in line 1 and then save the total number of items in the array for use later in line 2. Next, we use a for loop in lines 3 and 4 to copy the contents from myQueue into newQueue. Since the contents of myQueue are not necessarily stored neatly in the array (i.e., from $0$ to $n$), it is easier for us to use the existing size and dequeue functions to get access to each item in the queue in order. Once we have copied the items from myQueue to newQueue, we simply need to set the start and end variables in lines 5 and 6, and then set myQueue = newQueue in line 7 to complete the process.
The doubleCapacity operation is not a constant time operation since copying the contents of the original array into the new array requires us to copy each item via a loop. This requires $N$ steps. Thus, we would say that doubleCapacity runs in “order $N$” time.
The halveCapacity operation is similar to the doubleCapacity operation except that we now have a precondition. We must make sure that when we reduce the space for storing the queue that we still have enough space to store all the items currently in the queue. For example, if we have 10 items in a queue with a capacity of 16, we can’t successfully perform halveCapacity. Doing so would only leave us a queue with a capacity of 8 and we would not be able to fit all 10 items in the new queue.
The pseudocode for the halveCapacity function is shown below, with the precondition being checked in line 2. Once we create newQueue to be half the capacity of myQueue in line 4, the remainder of the function is exactly the same as the doubleCapacity function, since lines 5-10 are just concerned with copying the items from myQueue to newQueue and setting the associated variables.
function HALVECAPACITY() (1)
    if SIZE() > MYQUEUE capacity / 2 then (2)
        throw exception (3)
    end if
    NEWQUEUE = new array of MYQUEUE capacity / 2 (4)
    LENGTH = SIZE() (5)
    for I = 0 to LENGTH - 1 (6)
        NEWQUEUE[I] = DEQUEUE() (7)
    end for
    START = 0 (8)
    END = LENGTH % length of NEWQUEUE (9)
    MYQUEUE = NEWQUEUE (10)
end function
Like the doubleCapacity operation, halveCapacity is not a constant time operation since copying the contents of the original array requires us to loop $N$ times. So, halveCapacity runs in “order $N$” time.
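Both resizing operations share the same copying step, so they can be sketched with one helper. The Python below is a variation on the pseudocode, not a transcription: it copies items by index arithmetic instead of calling dequeue, works on plain arguments rather than a queue object, and all function names are hypothetical. It also keeps the empty markers (start = -1, end = 0) when there is nothing to copy, since setting start = 0 and end = 0 for an empty queue would make it look full to the isFull check.

```python
def copy_in_order(items, start, count, new_capacity):
    """Copy count queued items, beginning at index start, into a new array.

    Returns (new_items, new_start, new_end) using this module's convention:
    start == -1 means empty and start == end means full.
    """
    if count > new_capacity:
        raise ValueError("new capacity is too small for the queued items")
    new_items = [None] * new_capacity
    for i in range(count):
        # Walk the old array in queue order, wrapping past its last index.
        new_items[i] = items[(start + i) % len(items)]
    if count == 0:
        return new_items, -1, 0          # keep the empty markers
    return new_items, 0, count % new_capacity


def double_capacity(items, start, count):
    return copy_in_order(items, start, count, len(items) * 2)


def halve_capacity(items, start, count):
    return copy_in_order(items, start, count, len(items) // 2)


# Hypothetical queue holding "a", "b", "c" that wraps around a 4-slot array.
old_items = ["c", None, "a", "b"]
new_items, new_start, new_end = double_capacity(old_items, 2, 3)
```

After the call, the three items sit at indexes 0 through 2 of an eight-slot array, with new_start equal to 0 and new_end equal to 3.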
The toString function returns a string that concatenates the strings representing all the items stored in the queue. In many programming languages, each class is expected to provide a toString operation. For instance, in the queue below where each item is a character, if we called myQueue.toString(), we would expect it to return the string "Wildcats".
Notice that we must read the queue array from start to end to get the proper output string.
In the pseudocode below we first create an empty string called output in line 1. Then, we create a loop in line 2 that uses i to count through the items in the queue, from 0 to size() - 1. However, we can’t use this counter i to directly index into the array, since start and end may be almost anywhere in the array. Thus, we use i to compute next in line 3, which we will use as our index into the array. The index next should begin at start and finish at the index of the last item in the queue, which can be computed as (start + size() - 1) modulo the capacity of myQueue. We then use the index next in line 4 to select the appropriate element of myQueue to append to our output string. Once the loop ends, we simply return our output string.
function TOSTRING()
    OUTPUT = "" (1)
    for I = 0 to SIZE() - 1 (2)
        NEXT = (START + I) % MYQUEUE capacity (3)
        OUTPUT = OUTPUT + MYQUEUE[NEXT].TOSTRING() (4)
    end for (5)
    return OUTPUT (6)
end function
The toString function includes a loop that, at most, looks at each element in myQueue; therefore, toString executes in order $N$ time.
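The index calculation is easiest to follow with a queue that wraps around the array. The following Python sketch takes the array, start, and the number of items as plain arguments; the function name and the particular layout of "Wildcats" in a ten-slot array are assumptions made for illustration.

```python
def queue_to_string(items, start, size):
    """Concatenate the queued items in order, starting at index start."""
    output = ""
    for i in range(size):
        nxt = (start + i) % len(items)   # wrap past the end of the array
        output += str(items[nxt])
    return output


# Hypothetical layout: the queue starts at index 6 and wraps around to index 3.
items = ["c", "a", "t", "s", None, None, "W", "i", "l", "d"]
text = queue_to_string(items, 6, 8)
```

Reading the array from index 0 would give the letters in the wrong order, while stepping from start with the modulo operator yields "Wildcats".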
The following table shows an example of how to use the above operations to create and manipulate a queue. It assumes the steps are performed sequentially and the result of the operation is shown.
Queues are useful in many applications. Classic real-world software which uses queues includes the scheduling of tasks, sharing of resources, and processing of messages in the proper order. A common need is to schedule tasks to be executed based on their priority. This type of scheduling can be done in a computer or on an assembly line floor, but the basic concept is the same.
Let’s assume that we are putting windshields onto new cars in a production line. In addition, there are some cars that we want to rush through production faster than others. There are actually three different priorities: high, medium, and low.
Ideally, as cars come to the windshield station, we would be able to put their windshields in and send them to the next station before we received the next car. However, this is rarely the case. Since putting on windshields often requires special attention, cars tend to line up to get their windows installed. This is when their priority comes into account. Instead of using a simple queue to line cars up first-come, first-served in FIFO order, we would like to jump high priority cars to the head of the line.
While we could build a sophisticated queueing mechanism that would automatically insert cars in the appropriate order based on priority and then arrival time, we could also use a queue to handle each set of car priorities. A figure illustrating this situation is shown below. As cars come in, they are put in one of three queues: high priority, medium priority, or low priority. When the windshield station completes a car it then takes the next car from the highest priority queue.
The interesting part of the problem is the controller at the windshield station that determines which car will be selected to be worked on next. The controller will need to have the following interface:
function receiveCar(car, priority)
    // receives a car from the previous station and places it into a specific queue
function isEmpty() returns bool
    // returns true if there are no more cars in any of the queues
function getCar() returns car
    // retrieves the next car based on priority
Using this simple interface, we will define a class to act as the windshield station controller. It will receive cars from the previous station and get cars for the windshield station.
We start by defining the internal attributes and constructor for the Controller class as follows, using the Queue functions defined earlier in this module. We first declare three separate queues, one each for high, medium, and low priority cars. Next, we create the constructor for the Controller class. The constructor simply initializes our three queues with varying capacities based on the expected usage of each of the queues. Notice that the high priority queue has the smallest capacity while the low priority queue has the largest capacity.
class Controller
    declare HIGH as a Queue
    declare MEDIUM as a Queue
    declare LOW as a Queue
    function Controller()
        HIGH = new Queue(4)
        MEDIUM = new Queue(6)
        LOW = new Queue(8)
    end function
Next, we need to define the interface functions described above. We start with the receiveCar function. There are three cases based on the priority of the car. If we look at the first case for priority == high, we check to see if the high queue is full before calling the enqueue function to place the car into the high queue. If the queue is full, we raise an exception. We follow the exact same logic for the medium and low priority cars as well. Finally, there is a final else that captures the case where the user did not specify either high, medium, or low priority. In this case, an exception is raised.
function receiveCar(CAR, PRIORITY)
    if PRIORITY == high
        if HIGH.isFull()
            raise exception
        else
            HIGH.enqueue(CAR)
        end if
    else if PRIORITY == medium
        if MEDIUM.isFull()
            raise exception
        else
            MEDIUM.enqueue(CAR)
        end if
    else if PRIORITY == low
        if LOW.isFull()
            raise exception
        else
            LOW.enqueue(CAR)
        end if
    else
        raise exception
    end if
end function
Now we will define the isEmpty function. While we do not include an isFull function due to the ambiguity of what that would mean and how it might be useful, the isEmpty function will be useful for the windshield station to check before it requests another car via the getCar function.
As you can see below, the isEmpty function simply returns the logical AND of each of the individual queue’s isEmpty status. Thus, the function will return true if, and only if, each of the high, medium, and low queues is empty.
function isEmpty()
    return HIGH.isEmpty() and MEDIUM.isEmpty() and LOW.isEmpty()
end function
Finally, we are able to define the getCar function. It is similar in structure to the receiveCar function in that it checks each queue individually. In the case of getCar, the key to the priority mechanism we are developing is in the order we check the queues. In this case, we check them in the expected order from high to low. If the high queue is not empty, we get the car from that queue and return it to the calling function. If the high queue is empty, then we check the medium queue. Likewise, if the medium queue is empty, we check the low queue. Finally, if all of the queues are empty, we raise an exception.
function getCar()
    if not HIGH.isEmpty()
        return HIGH.dequeue()
    else if not MEDIUM.isEmpty()
        return MEDIUM.dequeue()
    else if not LOW.isEmpty()
        return LOW.dequeue()
    else
        raise exception
    end if
end function
The following example shows how the Controller class would work, given specific calls to receiveCar and getCar.
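A short Python sketch can also show the controller in action. To keep it self-contained, it substitutes Python's collections.deque for the array-based Queue class from this module and enforces the capacities by hand; the method names, exception types, and car names are assumptions made for illustration.

```python
from collections import deque


class Controller:
    """Windshield station controller with one bounded queue per priority."""

    # Capacities taken from the constructor pseudocode above.
    CAPACITIES = {"high": 4, "medium": 6, "low": 8}

    def __init__(self):
        # Dictionaries keep insertion order, so iteration runs high to low.
        self._queues = {name: deque() for name in self.CAPACITIES}

    def receive_car(self, car, priority):
        if priority not in self._queues:
            raise ValueError(f"unknown priority: {priority!r}")
        queue = self._queues[priority]
        if len(queue) >= self.CAPACITIES[priority]:
            raise OverflowError(f"{priority} priority queue is full")
        queue.append(car)

    def is_empty(self):
        return all(len(queue) == 0 for queue in self._queues.values())

    def get_car(self):
        # Check the queues from highest to lowest priority.
        for queue in self._queues.values():
            if queue:
                return queue.popleft()
        raise IndexError("no cars are waiting")


station = Controller()
station.receive_car("sedan", "low")
station.receive_car("truck", "medium")
station.receive_car("ambulance", "high")
station.receive_car("van", "medium")
order = [station.get_car() for _ in range(4)]
```

Even though the sedan arrived first, the cars leave in the order ambulance, truck, van, sedan: priority decides between queues, and arrival order decides within a queue.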
In this module we looked at the queue data structure. Queues are a “first in first out” data structure that use two main operations, enqueue and dequeue, to put data into the queue and to remove data from the queue. Queues are useful in many applications including the scheduling of tasks, sharing of resources, and processing of messages in the proper order.
This page is the main page for Lists
A list is a data structure that holds a sequence of data, such as the shopping list shown below. Each list has a head item and a tail item, with all other items placed linearly between the head and the tail. As we pick up items in the store, we will remove them, or cross them off the list. Likewise, if we get a text from our housemate to get some cookies, we can add them to the list as well.
^[Source: https://www.agclassroom.org/teacher/matrix/lessonplan.cfm?lpid=367]
Lists are actually very general structures that we can use for a variety of purposes. One common example is the history section of a web browser. The web browser actually creates a list of past web pages we have visited, and each time we visit a new web page it is added to the list. That way, when we check our history, we can see all the web pages we have visited recently in the order we visited them. The list also allows us to scroll through the list and select one to revisit or select another one to remove from the history altogether.
Of course, we have already seen several instances of lists so far in programming, including arrays, stacks, and queues. However, lists are much more flexible than the arrays, stacks, and queues we have studied so far. Lists allow us to add or remove items from the head, tail, or anywhere in between. We will see how we can actually implement stacks and queues using lists later in this module.
Most of us see and use lists every day. We have a list for shopping as we saw above, but we may also have a “to do” list, a list of homework assignments, or a list of movies we want to watch. Some of us are list-makers and some are not, but we all know a list when we see it.
^[Source: https://wiki.videolan.org/index.php?title=File:Basic_playlist_default.png&oldid=59730]
However, there are other lists in the real world that we might not even think of as a list. For instance, a playlist on our favorite music app is an example of a list. A music app lets us move forward or backward in a list or choose a song randomly from a list. We can even reorder our list whenever we want.
All the examples we’ve seen for stacks and queues can be thought of as lists as well. Stacks of chairs or moving boxes, railroad trains, and cars going through a tollbooth are all examples of special types of lists.
To this point, we have been using arrays as our underlying data structures for implementing linear data structures such as stacks and queues. Given that with stacks and queues we only put items into the array and remove from either the start or end of the data structure, we have been able to make arrays work. However, there are some drawbacks to using arrays for stacks and queues as well as for more general data structures.
While drawbacks 1 and 2 above can be overcome (albeit rather awkwardly) when using arrays for stacks and queues, drawback 3 becomes a real problem when trying to use more general list structures. If we insert an item into the middle of an array, we must move several other items “down” the array to make room.
For example, if we want to insert the number 5 into the sorted array shown below, we have to carry out several steps:
1. Find the index i where the new item should be inserted,
2. Move each item from index i to the end of the list down one location in the array,
3. Insert the new item into the array at index i, and
4. Increment tail to account for the new item.
In our example, step 1 will loop through each item of the array until we find the first number in the array greater than 5. As shown below, the number 7 is found in index 3.
Next, we will use another loop to move each item from index i to the end of the array down by one index number as shown below.
Finally, we will insert our new number, 5, into the array at index 3 and increment tail to 8.
In this operation, if we have $N$ items, we either compare or move all of them, which requires $N$ operations. Thus, this operation runs in order $N$ time.
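The insertion procedure can be sketched in Python as follows. The sketch assumes that tail counts the items currently in use and that the array has at least one free slot; the function name and the sample values other than 5 and 7 are hypothetical.

```python
def insert_sorted(array, tail, value):
    """Insert value into the sorted items array[0..tail-1]; return new tail."""
    if tail >= len(array):
        raise OverflowError("array is full")
    # Find the first index i whose item is greater than value.
    i = 0
    while i < tail and array[i] <= value:
        i += 1
    # Move items i..tail-1 down one place, working backward from the end
    # so that no item is overwritten before it has been copied.
    for j in range(tail, i, -1):
        array[j] = array[j - 1]
    # Store the new value and account for the extra item.
    array[i] = value
    return tail + 1


numbers = [-2, 1, 3, 7, 8, 10, 12, None, None, None]
tail = insert_sorted(numbers, 7, 5)
```

In this run the first number greater than 5 is the 7 at index 3, four items are moved down, and tail grows from 7 to 8. Every item is either compared or moved, which is where the order $N$ cost comes from.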
The same problem occurs when we remove an item from the array. In this case we must perform the following steps:
Instead of using arrays to try to hold our lists, a more flexible approach is to build our own list data structure that relies on a set of objects that are all linked together through references to each other. In the figure below we have created a list of numbers that are linked to each other. Each object contains both the number as well as a reference to the next number in the list. Using this structure, we can search through each item in the list by starting sequentially from the beginning and performing a linear search much like we did with arrays. However, instead of explicitly keeping track of the end of the list, we use the convention that the reference in the last item of the list is set to 0, which we call null. If a reference is set to null we interpret this to mean that there is no next item in the list. This “linked list” structure also makes inserting items into the middle of the list easier. All we need to do is find the location in the list where we want to insert the item and then adjust the references to include the new item into the list.
The following figure shows a slightly more complex version of a linked list, called a “doubly linked list”. Instead of just having each item in the list reference the next item, it references the previous item in the list as well. The main advantage of doubly linked lists is that we can easily traverse the list in either the forward or backward direction. Doubly linked lists are useful in applications to implement undo and redo functions, and in web browser histories where we want the ability to go forward and backward in the history.
We will investigate each of these approaches in more detail below and will reimplement both our stack and queue operations using linked lists instead of arrays.
To solve the disadvantages of arrays, we need a data structure that allows us to insert and remove items in an ordered collection in constant time, independently from the number of items in the data structure.
The solution lies in creating our own specialized data structure where each node contains the data of interest as well as a reference, or pointer, to the next node in the list. Of course, we would also need to have a pointer to the first node in the list, which we call the head.
The figure below shows how we can construct a linked list data structure. The head entity shown in the figure is a variable that contains a pointer to the first node in the list, in this case the node containing -2. Each node in the list is an object that has two main parts: the data that it holds, and a pointer to the next item in the list.
The class representation of a singly linked list Node is shown below. As discussed above, we have two attributes: data, which holds the data of the node, and next, which is a reference or pointer to the next node. We also use a constructor and a standard toString operation to create a string representation of the data stored in the node.
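As a minimal sketch of such a node in Python (one possible rendering, not the course's official class), the constructor sets both attributes and `__str__` plays the role of toString. The values -2 and 3 are only sample data.

```python
class Node:
    """A singly linked list node: some data plus a reference to the next node."""

    def __init__(self, data):
        self.data = data
        self.next = None  # None plays the role of null: no next node yet

    def __str__(self):
        # Python's counterpart to toString: describe the stored data.
        return str(self.data)


first = Node(-2)
second = Node(3)
first.next = second  # link the two nodes together
print(first, "->", first.next)
```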
A list is represented by a special variable head that contains a pointer to the first item in the list. If the head is null (equal to 0), then we have an empty list, which is a list with no items in it.
However, if we have items in the list, head will point to a node as shown in the figure below. This node has some data (in this case -2) and its own pointer that points to the next node in the list. As we can see in our example, head points to a sequence of five nodes that makes up our list. The node with the data 67 in it is the last item in the list since its pointer is null. We often refer to this condition as having a null pointer.
While we will not show them explicitly in this module, each pointer is actually an address in memory. If we have a pointer to node X in our node, that means that we actually store the address of X in memory in our node.
To capture the necessary details for a singly linked list, we put everything into a class. The singly linked list class has two attributes:
- head: the pointer to the first node in the list, and
- size: an integer to keep track of the number of items in the list.

Class SingleLinkedList
    Node head
    Integer size = 0
While we would normally create getter and setter methods for each attribute in the class, to simplify and clarify our pseudocode below we use “dot notation” to refer directly to the attributes in the node. The following table illustrates our usage in this module.
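Use | Meaning |
---|---|
node | the node that the variable node points to |
node.next | the node that follows node in the list |
node.next.next | the node two positions after node in the list |
head | the first node in the list |
head.next | the second node in the list |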
Given the structure of our linked list, we can easily insert a new node at any location in the list. However, for our purposes we are generally interested in inserting new nodes at the beginning of the list, at some specific location in the list, or in the appropriate order if the list is sorted.
Inserting a node at the beginning of a list is straightforward. We just have to be careful about the order we use when swapping pointers. In the prepend code below, line 1 creates the new node to be stored in the list. Next, line 2 sets the next pointer in the new node to the node that head currently points to. If there was an item already in the list, head will point to the previous first item in the list. If the list was empty, head will have been null and thus node.next will become null as well. Line 3 assigns head to point to the new node and line 4 increments our size variable.
function prepend(data)
    node = new Node(data)    (1)
    node.next = head         (2)
    head = node              (3)
    size = size + 1          (4)
end function
We show this process in the following example. The figure below shows the initial state as we enter the prepend operation. Our list has three items in it, “a”, “W”, and “Q”, and we want to add the new node “M” in front of item “a”.
The figure below shows the effect of the first two lines of the operation. They create a new node for “M” and set its next pointer to the same node as the pointer held by head, which is the address of the first item in the list, “a”.
The result of performing line 3 in the operation is shown below. In line 3 we simply change head to point to our new node, instead of node “a”. Notice now that the new node has been fully inserted into the list.
And, if we redraw our diagram a bit, we get a nice neat list!
Since there are no loops in the prepend operation, prepend runs in constant time.
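The prepend walkthrough can be tried out with a short Python sketch. The class and method names follow the pseudocode, while to_list is a hypothetical helper added here only so that we can inspect the list.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class SingleLinkedList:
    def __init__(self):
        self.head = None  # an empty list has a null head
        self.size = 0

    def prepend(self, data):
        node = Node(data)      # line 1: create the new node
        node.next = self.head  # line 2: new node points at the old first node
        self.head = node       # line 3: head now points at the new node
        self.size += 1         # line 4: keep size consistent

    def to_list(self):
        """Hypothetical helper: collect the data in list order."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out


letters = SingleLinkedList()
for item in ["Q", "W", "a"]:  # prepending in this order builds a, W, Q
    letters.prepend(item)
letters.prepend("M")
print(letters.to_list())
```

Swapping lines 2 and 3 would lose the rest of the list, since head would already point at the new node when we copy it into node.next.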
Inserting a node at a given index in the linked list is a little more difficult than inserting a node at the beginning of the list. First, we have to find the proper location to insert the new node before we can actually insert it. However, since we are given an index number, we simply need to follow the linked list to the appropriate index and then perform the insertion.
We do have a precondition to meet before we proceed, however. We need to make sure that the index provided to the operation is not less than 0 and that it is not greater than the size of the list, which is checked in line 2. If the precondition is not satisfied, we raise an exception in line 3.
function insertAt(data, index)            (1)
    if index < 0 or index > size          (2)
        raise exception                   (3)
    else if index == 0                    (4)
        prepend(data)                     (5)
    else
        curr = head.next                  (6)
        prev = head                       (7)
        node = new Node(data)             (8)
        for i = 1 to index - 1            (9)
            prev = curr                   (10)
            curr = curr.next              (11)
        end for                           (12)
        prev.next = node                  (13)
        node.next = curr                  (14)
        size = size + 1                   (15)
    end if
end function
Lines 4 and 5 check to see if the index is 0, which means that we want to insert the new node as the first item in the list. Since this is the same as the prepend operation we’ve already defined, we simply call that operation. While this may not seem like a big deal, it is actually more efficient and helps us to simplify the code in the rest of the operation.
The operation uses curr to keep track of which node in the list we are currently looking at. Because index 0 has already been handled, we initialize curr to point at the second node in the list, head.next, in line 6. To allow us to swap pointers once we find the appropriate place in the list, we keep track of the node previous to curr as well by using the variable prev. This variable is initialized to head, the first node in the list, in line 7, and line 8 creates the new node we will insert into our list. After line 8, our curr, node, and prev pointers would look like the following (assuming the index passed in was 2).
At this point we start our walk through the list using the for loop in lines 9 - 12. Specifically, with an index of 2 we will actually go through the loop exactly one time, from 1 to 1. Each time through the loop, lines 10 and 11 will cause curr and prev to point at the next nodes in the list. At the end of one time through our loop, our example will be as shown below.
Now, the only thing left to do is update the next pointer of node “3” to point at node (line 13), and node.next to point at the curr node (line 14), while line 15 increments the size attribute. The updated list is shown below.
The insertAt operation, while being quite flexible and useful, can potentially loop through each node in the list. Therefore, it runs in order $N$ time.
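A Python sketch of insertAt, following the pseudocode line by line, might look like the following. The list values are hypothetical sample data, to_list is an assumed helper for inspection, and IndexError is just one reasonable choice for the exception.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class SingleLinkedList:
    def __init__(self):
        self.head = None
        self.size = 0

    def prepend(self, data):
        node = Node(data)
        node.next = self.head
        self.head = node
        self.size += 1

    def insert_at(self, data, index):
        if index < 0 or index > self.size:  # precondition
            raise IndexError("index out of range")
        if index == 0:
            self.prepend(data)
            return
        curr = self.head.next               # second node (None if size is 1)
        prev = self.head                    # first node
        node = Node(data)
        for _ in range(1, index):           # same as "for i = 1 to index - 1"
            prev = curr
            curr = curr.next
        prev.next = node                    # splice the new node in
        node.next = curr
        self.size += 1

    def to_list(self):
        """Hypothetical helper: collect the data in list order."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out


numbers = SingleLinkedList()
for value in [18, 3, -1, -2]:               # builds -2, -1, 3, 18
    numbers.prepend(value)
numbers.insert_at(2, 2)                     # loop body runs exactly once
numbers.insert_at(67, numbers.size)         # inserting at the very end
print(numbers.to_list())
```

Inserting at index equal to size works without a special case because curr simply ends up as None and the new node becomes the last node.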
When we want to insert an item into an ordered list, we need to find the right place in the list to actually insert the new node. Essentially, we need to search the list to find two adjacent nodes where the first node’s data is less than data and the second node’s data is greater than or equal to data. This process requires a linear search of the list.
function insertOrdered(data)
    curr = head                                    (1)
    index = 0                                      (2)
    while curr != null AND curr.data < data        (3)
        index = index + 1                          (4)
        curr = curr.next                           (5)
    end while
    insertAt(data, index)                          (6)
end function
Notice that we do not have a precondition since we will search the list for the appropriate place to insert the new node, even if the list is currently empty. In line 1, we create a curr variable to point to the current node we are checking in the list, while in line 2 we initialize an index variable to keep track of the index of curr in the list.
Next, lines 3 - 5 implement a loop that searches through the list to find a node where the data in that node is greater than or equal to the data we are trying to put into the list. We also check to see if we are at the end of the list. Inside the loop, we increment index and point curr to the next node in the list.
Once we find the appropriate place in the list, we simply call the insertAt operation to perform the actual insertion. Using the insertAt operation keeps insertOrdered short and easy to understand. We do lose a little efficiency, since both operations loop through the list to the location where we want to insert the new node. However, since the insertAt call is not embedded within the loop, our insertOrdered operation still runs in order $N$ time.
Since the previous example inserts the number 2 into the list (which falls between -1 and 3), the results of running the insertOrdered operation will be the same output as the result of the insertAt operation as shown above.
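For comparison, here is a Python sketch of a variant that avoids the second walk. Instead of computing an index and calling insertAt, it keeps a prev pointer during the search and splices the new node in directly. This is not the pseudocode's approach, only an alternative under the same ordering rule (stop at the first node whose data is greater than or equal to the new data). The helper to_list and the sample values are assumptions.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class SingleLinkedList:
    def __init__(self):
        self.head = None
        self.size = 0

    def insert_ordered(self, data):
        node = Node(data)
        prev, curr = None, self.head
        # Walk until curr is null or holds data >= the new data.
        while curr is not None and curr.data < data:
            prev, curr = curr, curr.next
        node.next = curr
        if prev is None:
            self.head = node   # new smallest item, or the list was empty
        else:
            prev.next = node   # splice between prev and curr
        self.size += 1

    def to_list(self):
        """Hypothetical helper: collect the data in list order."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out


ordered = SingleLinkedList()
for value in [3, -2, 18, -1, 2]:
    ordered.insert_ordered(value)
print(ordered.to_list())
```

Both versions run in order $N$ time. The single-walk version saves a constant factor at the cost of duplicating the pointer updates that insertAt already provides.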
The process of removing a node from a linked list is fairly straightforward. First, we find the node we want to remove from the list and then change the next pointer from the previous node in the list to the next node in the list. This effectively bypasses the node we want to remove. For instance, if we want to remove node “3” from the following list,
we simply change the next pointer in the “-2” node to point to node “18” instead of node “3”. Since no other nodes are pointing at node “3” it is effectively removed from our list as shown below. We then return the data in that node to the requesting function. Eventually, the garbage collector will come along and realize that nothing is referencing node “3” and put it back into available memory.
Many programming languages, including Java and Python, automatically manage memory for us. So, as we create or delete objects in memory, a special subroutine called the garbage collector will find and remove any objects that we are no longer using. This will help free up memory so we can use it again.
Other languages, such as C, require us to do that manually. So, whenever we stop using objects, we would have to also remember to free the memory used by that object. Thankfully, we don’t have to worry about that in this course!
Removing an item at the beginning of a list is extremely simple. After checking our precondition in line 1, which ensures that the list is not empty, we create a temporary copy of the data in the first node in line 3 so we can return it later in line 6. However, the actual removal of the first node simply requires us to point head to the second node in the list (line 4), which is found at head.next. This effectively skips over the first node in the list. Finally, we decrement our size variable in line 5 to keep it consistent with the number of nodes now in the list. Since there are no loops, removeFirst runs in constant time.
function removeFirst() returns data
    if size == 0             (1)
        raise exception      (2)
    end if
    temp = head.data         (3)
    head = head.next         (4)
    size = size - 1          (5)
    return temp              (6)
end function
Removing a node at a specific index in the list is more difficult than simply removing the first node in the list since we have to walk through the list to find the node we want to remove before we can actually remove it. In addition, while walking through the list, we must keep track of the current node as well as the previous node, since removing a node requires us to change the previous node in the list.
In our removeAt operation below, we first check our precondition in line 1 to ensure that the index provided is a valid index in the list. If it is, we check to see if index is 0 in line 3 and call the removeFirst operation in line 4 if it is.
function removeAt(index) returns data
    if index < 0 OR index > size - 1      (1)
        raise exception                   (2)
    else if index == 0                    (3)
        return removeFirst()              (4)
    else
        curr = head.next                  (5)
        prev = head                       (6)
        for i = 1 to index - 1            (7)
            prev = curr                   (8)
            curr = curr.next              (9)
        end for
        prev.next = curr.next             (10)
        size = size - 1                   (11)
        return curr.data                  (12)
    end if
end function
Before we start our walk through the list using the for loop in lines 7 - 9, we declare two variables in lines 5 and 6:
- curr points to the current node in our walk, and
- prev points to the node before curr in the list.

Lines 7 - 9 are the for loop that we use to walk through the list to find the node at index. We simply update the values of prev and curr each time through the loop to point to the next node in the list.
Once we complete the for loop, curr is pointing at the node we want to remove and prev points at the previous node. Thus, we simply set prev.next = curr.next to bypass the curr node, decrement our size attribute by 1 to retain consistency, and return the data associated with the curr node.
Like the insertAt operation, the removeAt operation uses a loop and thus runs in order $N$ time.
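The two removal operations can be sketched in Python as follows. The method names mirror the pseudocode, while to_list, the sample values, and the use of IndexError are assumptions made for this illustration.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class SingleLinkedList:
    def __init__(self):
        self.head = None
        self.size = 0

    def prepend(self, data):
        node = Node(data)
        node.next = self.head
        self.head = node
        self.size += 1

    def remove_first(self):
        if self.size == 0:
            raise IndexError("list is empty")
        temp = self.head.data        # save the data to return
        self.head = self.head.next   # skip over the first node
        self.size -= 1
        return temp

    def remove_at(self, index):
        if index < 0 or index > self.size - 1:
            raise IndexError("index out of range")
        if index == 0:
            return self.remove_first()
        curr = self.head.next
        prev = self.head
        for _ in range(1, index):    # same as "for i = 1 to index - 1"
            prev = curr
            curr = curr.next
        prev.next = curr.next        # bypass the node being removed
        self.size -= 1
        return curr.data

    def to_list(self):
        """Hypothetical helper: collect the data in list order."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out


numbers = SingleLinkedList()
for value in [67, 23, 18, 3, -2]:    # builds -2, 3, 18, 23, 67
    numbers.prepend(value)
removed = numbers.remove_at(1)       # removes node "3"
print(removed, numbers.to_list())
```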
If we want to remove all occurrences of a specific data value from the list, we take the data we want to remove and then search all nodes in the list, removing any whose data matches the input data.
function removeData(data)
    curr = head                      (1)
    index = 0                        (2)
    while (curr != null)             (3)
        if (curr.data == data)       (4)
            removeAt(index)          (5)
        else
            index = index + 1        (6)
        end if
        curr = curr.next             (7)
    end while
end function
To simplify this operation, we will call the removeAt operation to actually remove the node from the list, leaving this operation to simply find the nodes whose data match the input data. We will use two variables in this operation:
- curr will point to the current node we are checking, and
- index will keep track of the index of the curr node so we can use the removeAt operation.

The main part of the operation is a while loop (lines 3 - 7) that walks through the list, node by node. For each node in the list, we check if its data matches the input data in line 4, and call removeAt in line 5 to remove it from the list if it does. After a removal, every node that follows moves up one position, so index already refers to the next node and must not change. Otherwise, we increment index in line 6. In either case, line 7 points curr to the next node in the list; this still works after a removal because the removed node keeps its own next pointer. When our loop exits, we have removed all the nodes whose data matched the input data.
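Although the loop visits each node only once, each call to removeAt walks from head to the given index. When only a constant number of nodes match, the removeData operation runs in order $N$ time. If many matching nodes sit near the end of the list, however, the repeated walks add up and the operation can take on the order of $N^2$ time in the worst case.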
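As an alternative sketch, the following Python version removes matching nodes during a single walk by keeping a prev pointer, rather than calling removeAt for each match. It is a variant, not the pseudocode's method: it always runs in order $N$ time, and it returns the number of nodes removed. The names to_list and removed, and the sample data, are assumptions.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class SingleLinkedList:
    def __init__(self):
        self.head = None
        self.size = 0

    def prepend(self, data):
        node = Node(data)
        node.next = self.head
        self.head = node
        self.size += 1

    def remove_data(self, data):
        removed = 0
        prev, curr = None, self.head
        while curr is not None:
            if curr.data == data:
                if prev is None:
                    self.head = curr.next   # removing the first node
                else:
                    prev.next = curr.next   # bypass the matching node
                self.size -= 1
                removed += 1
                # prev stays put: it is still the last node we kept
            else:
                prev = curr
            curr = curr.next
        return removed

    def to_list(self):
        """Hypothetical helper: collect the data in list order."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out


values = SingleLinkedList()
for value in [5, 2, 5, 5, 1, 5]:            # builds 5, 1, 5, 5, 2, 5
    values.prepend(value)
count = values.remove_data(5)
print(count, values.to_list())
```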
The list isEmpty operation is rather straightforward. We simply need to return whether head is a null pointer. Since there are no loops, isEmpty runs in constant time.
function isEmpty() returns boolean
    return head == null    (1)
end function
The peek operation is designed to return the data from the node pointed at by head, which is the last node inserted into the list when items are added with prepend. This is easy to do; however, we must ensure that we check to see if the list is empty in line 1 before we return the head.data in line 3. Due to its simple structure, the run time of the peek operation is constant.
function peek() returns data
    if isEmpty()             (1)
        raise exception      (2)
    end if
    return head.data         (3)
end function
The peekEnd operation is designed to return the data from the last node in the list, which is the first node inserted when items are added with prepend. Like the peek operation, we must ensure the list is not empty in line 1 before actually searching for the end of the list. Lines 3 - 5 walk through the list using a while statement until curr.next is null, signifying that curr is pointing at the last node in the list. Finally, line 6 simply returns the data in the last node. Since peekEnd must walk through the entire list to find the last node, it runs in order $N$ time.
function peekEnd() returns data
    if isEmpty()                  (1)
        raise exception           (2)
    end if
    curr = head                   (3)
    while curr.next != null       (4)
        curr = curr.next          (5)
    end while
    return curr.data              (6)
end function
If we implement a stack using a singly linked list, we can simplify many things about the implementation. First of all, we can totally remove the isFull, doubleCapacity, and halveCapacity operations since we can grow and shrink our list-based stack as needed. The rest of the operations can be implemented directly with list operations. The front of the list will be the top of the stack since the operations to insert and remove items from the front of the list are very efficient.
To implement our stack, we assume we have declared a linked list object named list.
As expected, the push operation is almost trivial. We simply call the list prepend operation to insert the data into the front of the list.
function push(data)
    list.prepend(data)
end function
Like push, the pop operation is also easily implemented using the removeFirst operation of our linked list. As long as the list is not empty, we simply return the data that removeFirst hands back when it removes the first item from the list.
function pop() returns data
    if list.isEmpty() then
        raise exception
    end if
    return list.removeFirst()
end function
The isEmpty operation is even easier. It is implemented by simply returning the results of the list isEmpty operation.
function isEmpty() returns boolean
    return list.isEmpty()
end function
The stack peek operation is also straightforward. To implement the peek operation we simply return the results from the list peek operation, which returns the data from the first node in the list.
function peek() returns data
    return list.peek()
end function
As we can see, each of the major operations for a stack is implemented easily using list operations that run in constant time. This makes list-based stacks extremely efficient data structures to use.
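To see the pieces working together, here is a compact Python sketch of a list-based stack. The Stack class wraps a minimal SingleLinkedList that contains only the four list operations the stack needs. The attribute name _list and the use of IndexError are choices made for this example.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class SingleLinkedList:
    def __init__(self):
        self.head = None
        self.size = 0

    def prepend(self, data):
        node = Node(data)
        node.next = self.head
        self.head = node
        self.size += 1

    def remove_first(self):
        if self.size == 0:
            raise IndexError("list is empty")
        temp = self.head.data
        self.head = self.head.next
        self.size -= 1
        return temp          # note: the data itself, not the node

    def is_empty(self):
        return self.head is None

    def peek(self):
        if self.is_empty():
            raise IndexError("list is empty")
        return self.head.data


class Stack:
    """Stack whose top is the front of a singly linked list."""

    def __init__(self):
        self._list = SingleLinkedList()

    def push(self, data):
        self._list.prepend(data)

    def pop(self):
        return self._list.remove_first()  # raises IndexError when empty

    def peek(self):
        return self._list.peek()

    def is_empty(self):
        return self._list.is_empty()


stack = Stack()
for ch in "abc":
    stack.push(ch)
top = stack.peek()
popped = [stack.pop() for _ in range(3)]
print(top, popped, stack.is_empty())
```

Every stack operation delegates to a constant time list operation, so no capacity management is needed.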
With singly linked lists, each node in the list had a pointer to the next node in the list. This structure allowed us to grow and shrink the list as needed and gave us the ability to insert and delete nodes at the front, middle, or end of the list. However, we often had to use two pointers when manipulating the list to allow us to access the previous node in the list as well as the current node. One way to solve this problem and make our list even more flexible is to allow a node to point at both the previous node in the list as well as the next node in the list. We call this a doubly linked list.
The concept of a doubly linked list is shown below. Here, each node in the list has a link to the next node and a link to the previous node. If there is no previous or next node, we set the pointers to null.
A doubly linked list node is the same as a singly linked list node with the addition of the previous attribute that points to the previous node in the list as shown below.
The class representation of a doubly linked list Node is shown below. As discussed above, we have three attributes:
- data, which holds the data of the node,
- next, which is a pointer to the next node, and
- previous, which is a pointer to the previous node.

We also use a constructor and the standard toString operation to create a string for the data stored in the node.
As with our singly linked list, we start off a doubly linked list with a pointer to the first node in the list, which we call head. However, if we also store the pointer to the last node in the list, we can simplify some of our insertion and removal operations as well as reduce the time complexity of operations that insert, remove, or peek at the last node in the list.
The figure below shows a doubly linked list with five nodes. The variable head points to the first node in the list, while the variable tail points to the last node in the list. Each node in the list now has two pointers, next and previous, which point to the appropriate node in the list. Notice that the first node’s previous pointer is null, while the last node’s next pointer is also null.
Like we did for our singly linked list, we capture the necessary details for our doubly linked list in a class. The doubly linked list class has four attributes:
- head: the pointer to the first node in the list,
- tail: the pointer to the last node in the list,
- current: the pointer to the current node used by the iterator, and
- size: an integer to keep track of the number of items in the list.

Class DoubleLinkedList
    Node head
    Node tail
    Node current
    Integer size = 0
Insertion in doubly linked lists is similar to what we saw in the singly linked list with two exceptions:
- we must update both the previous and next pointers in all affected nodes, and
- we can use the tail pointer to make the insertion of data at the end of the list very efficient.

Inserting at the beginning of a doubly linked list is almost as straightforward as in a singly linked list. We just need to make sure that we update the previous pointer in each affected node. After creating the new node in line 1, we check to see if the list is empty in line 2. If it is empty, then we only have to worry about updating the head and tail pointers to both point at node in lines 3 and 4. If the list is not empty, we have the situation shown below.
To insert a node at the beginning of the list, we set head.previous (the previous pointer in the first node in the list) to point to the new node in line 5.
Next, we set the next pointer in the new node to point to where head is currently pointing in line 6, which is the first node in the list.
Finally, we update head to point to the new node in line 7 and then increment the size in line 8.
With a little bit of reformatting, we can see that we’ve successfully inserted our new node in the list.
The pseudocode for this operation is given below.
function prepend(data)
    node = new Node(data)        (1)
    if size == 0                 (2)
        head = node              (3)
        tail = node              (4)
    else
        head.previous = node     (5)
        node.next = head         (6)
        head = node              (7)
    end if
    size = size + 1              (8)
end function
Since there are no loops in the prepend code, the code runs in constant time.
Inserting a new node at some arbitrary index in a doubly linked list is similar to the same operation in a singly linked list with a couple of changes:
- if the index is equal to size, we can use the append operation (defined below) to insert the node at the end of the list, and
- we must update both the previous and next pointers in all affected nodes.

Lines 1 and 2 in the code check to ensure that the index is a valid number, then we check to see if we are inserting at the beginning or end of the list in lines 3 and 5. If we are, we simply call the appropriate method, either prepend or append.
If none of those conditions exist, then we start the process of walking through the list to find the node just before index. To do this, we create the new node we want to insert in line 8 and then create a temporary pointer curr in line 9 that we will use to point to the current node on our walk.
Lines 10 and 11 form the loop that walks through the list until curr reaches the node at position index - 1. When the loop ends, we will want to insert the new node between curr and curr.next. Thus, we set the appropriate values for the new node’s next and previous pointers in lines 12 and 13. Then, we set the previous pointer in node.next to point back to node in line 14 and set curr.next to point at the new node in line 15. Finally, we increment size by 1 in line 16.
function insertAt(data, index)
    if index < 0 OR index > size          (1)
        raise exception                   (2)
    else if index == 0                    (3)
        prepend(data)                     (4)
    else if index == size                 (5)
        append(data)                      (6)
    else                                  (7)
        node = new Node(data)             (8)
        curr = head                       (9)
        for i = 1 to index - 1            (10)
            curr = curr.next              (11)
        end for
        node.next = curr.next             (12)
        node.previous = curr              (13)
        node.next.previous = node         (14)
        curr.next = node                  (15)
        size = size + 1                   (16)
    end if
end function
Although prepend and append run in constant time, the general case will cause us to walk through the list using a for loop. Therefore, the insertAt operation runs in order $N$ time.
Since we have added the tail pointer to the doubly linked list class, we can make adding a node at the end of the list run in constant time instead of order $N$ time. In fact, if you look at the code below for the append operation, it is the same as the constant time prepend operation except that, in lines 5 - 7, we have replaced the head pointer with the tail pointer and swapped the roles of next and previous.
function append(data)
    node = new Node(data)        (1)
    if size == 0                 (2)
        tail = node              (3)
        head = node              (4)
    else
        tail.next = node         (5)
        node.previous = tail     (6)
        tail = node              (7)
    end if
    size = size + 1              (8)
end function
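The three insertion operations for a doubly linked list can be sketched together in Python. This is an illustration only: it omits the current attribute used by the iterator, and the forward and backward helpers are hypothetical additions for checking that both sets of pointers are consistent.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.previous = None


class DoubleLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def prepend(self, data):
        node = Node(data)
        if self.size == 0:
            self.head = node
            self.tail = node
        else:
            self.head.previous = node
            node.next = self.head
            self.head = node
        self.size += 1

    def append(self, data):
        node = Node(data)
        if self.size == 0:
            self.tail = node
            self.head = node
        else:
            self.tail.next = node
            node.previous = self.tail
            self.tail = node
        self.size += 1

    def insert_at(self, data, index):
        if index < 0 or index > self.size:
            raise IndexError("index out of range")
        if index == 0:
            self.prepend(data)
        elif index == self.size:
            self.append(data)
        else:
            node = Node(data)
            curr = self.head
            for _ in range(1, index):     # stop at the node before index
                curr = curr.next
            node.next = curr.next
            node.previous = curr
            node.next.previous = node
            curr.next = node
            self.size += 1

    def forward(self):
        """Hypothetical helper: data from head to tail via next."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out

    def backward(self):
        """Hypothetical helper: data from tail to head via previous."""
        out, curr = [], self.tail
        while curr is not None:
            out.append(curr.data)
            curr = curr.previous
        return out


dll = DoubleLinkedList()
dll.append(3)
dll.append(23)
dll.prepend(-2)
dll.insert_at(18, 2)                      # between 3 and 23
print(dll.forward(), dll.backward())
```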
The process of removing a node from a doubly linked list is really no more difficult than from a singly linked list. The only difference is that instead of changing just one pointer, we now also need to modify the previous pointer in the node following the node we want to remove. For instance, if we want to remove node “3” from the following list,
we simply modify the next pointer in node “-2” to point to node “23”. Then, we modify the previous pointer in node “23” to point to node “-2”. We then return the data in that node to the requesting function.
The remove operation removes the first node in the list. First, we check to ensure that there is at least one node in the list in line 1 and raise an exception if there is not. Now, the process is simple. We create a temporary pointer temp that points to the node we are going to delete in line 3 and then, in line 4, point head to head.next, which is the second node in the list. Then, in line 5, we check to see if the list is empty (head == null) and set tail to null if it is (it was pointing at the node we just removed). If the list is not empty, we do not need to worry about updating tail; however, we do need to set the previous pointer of the first node in the list to null in line 7 (it was also pointing at the node we just removed). Finally, we decrement size in line 8 and then return the data in the node we just removed in line 9. The operation runs in constant time since there are no loops.
function remove() returns data
    if size == 0                 (1)
        raise exception          (2)
    end if
    temp = head                  (3)
    head = head.next             (4)
    if head == null              (5)
        tail = null              (6)
    else
        head.previous = null     (7)
    end if
    size = size - 1              (8)
    return temp.data             (9)
end function
Removing a node at a specific index is very similar to the way we did it in singly linked lists. First, if we have an invalid index number, we raise an exception in line 2. Otherwise we check for the special cases of removing the first or last node in the list and call the appropriate operations in lines 3 - 6.
If we have no special conditions, we create a temporary pointer curr and then walk through our list in lines 7 - 9. Once we reach the node we want to remove, we simply update the next node’s previous pointer (line 10) and the previous node’s next pointer (line 11) and we have effectively removed the node from the list. We then decrement size in line 12 and return the data from the removed node in line 13.
Since the operation relies on a loop to walk through the list, the operation runs in order $N$ time.
function removeAt(index) returns data
    if index < 0 OR index > size - 1              (1)
        raise exception                           (2)
    else if index == 0                            (3)
        return remove()                           (4)
    else if index == size - 1                     (5)
        return removeLast()                       (6)
    else
        curr = head.next                          (7)
        for i = 1 to index - 1                    (8)
            curr = curr.next                      (9)
        end for
        curr.next.previous = curr.previous        (10)
        curr.previous.next = curr.next            (11)
        size = size - 1                           (12)
        return curr.data                          (13)
    end if
end function
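A Python sketch of removal in a doubly linked list is given below. The remove and remove_at methods follow the pseudocode above. The remove_last method is written here as the mirror image of remove (tail in place of head, previous in place of next) and should be read as an assumption rather than the text's exact code. The forward helper and the sample values are also hypothetical.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.previous = None


class DoubleLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def append(self, data):
        node = Node(data)
        if self.size == 0:
            self.head = self.tail = node
        else:
            self.tail.next = node
            node.previous = self.tail
            self.tail = node
        self.size += 1

    def remove(self):
        if self.size == 0:
            raise IndexError("list is empty")
        temp = self.head
        self.head = self.head.next
        if self.head is None:
            self.tail = None              # the list is now empty
        else:
            self.head.previous = None
        self.size -= 1
        return temp.data

    def remove_last(self):
        # Assumed mirror image of remove().
        if self.size == 0:
            raise IndexError("list is empty")
        temp = self.tail
        self.tail = self.tail.previous
        if self.tail is None:
            self.head = None
        else:
            self.tail.next = None
        self.size -= 1
        return temp.data

    def remove_at(self, index):
        if index < 0 or index > self.size - 1:
            raise IndexError("index out of range")
        if index == 0:
            return self.remove()
        if index == self.size - 1:
            return self.remove_last()
        curr = self.head.next
        for _ in range(1, index):         # walk to the node at index
            curr = curr.next
        curr.next.previous = curr.previous
        curr.previous.next = curr.next
        self.size -= 1
        return curr.data

    def forward(self):
        """Hypothetical helper: data from head to tail."""
        out, curr = [], self.head
        while curr is not None:
            out.append(curr.data)
            curr = curr.next
        return out


dll = DoubleLinkedList()
for value in [-2, 3, 23, 67]:
    dll.append(value)
removed = dll.remove_at(1)                # removes node "3"
print(removed, dll.forward())
```

Unlike the singly linked version, no prev variable is needed during the walk, because the node being removed already knows its own previous node.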
Since we have added the tail pointer to the doubly linked list class, we can make removing a node at the end of the list run in constant time instead of running in order $N$ time. In fact, if you look at the code below for the removeLast operation, it is almost exactly the same as the constant time remove operation. The only difference is that we have replaced the head pointer with the tail pointer and head.next with tail.previous in lines 3 - 7.
function removeLast() returns data
if size == 0 (1)
raise exception (2)