Introduction
Setting the Stage
Before we delve too deeply into how to reason about Object-Orientation and how to utilize it in your programming efforts, it is useful to understand why object-orientation came to exist. This initial chapter explores the origins of object-oriented programming.
Some key terms to learn in this chapter are: the software crisis, programming paradigms, imperative programming, functional programming, structured programming, and object-orientation.
By this point, you should be familiar enough with the history of computers to be aware of the evolution from the massive room-filling vacuum tube implementations of ENIAC, UNIVAC, and other first-generation computers to transistor-based mainframes like the PDP-1, and the eventual introduction of the microcomputer (the desktop computers that are the basis of the modern PC) in the late 1970s. Along with shrinking in size, each generation of these machines also cost less:
Machine | Release Year | Cost at Release | Adjusted for Inflation |
---|---|---|---|
ENIAC | 1945 | $400,000 | $5,288,143 |
UNIVAC | 1951 | $159,000 | $1,576,527 |
PDP-1 | 1963 | $120,000 | $1,010,968 |
Commodore PET | 1977 | $795 | $5,282 |
Apple II (4K RAM model) | 1977 | $1,298 | $8,624 |
IBM PC | 1981 | $1,565 | $4,438 |
Commodore 64 | 1982 | $595 | $1,589 |
This increase in affordability was also coupled with an increase in computational power. Consider the ENIAC, which computed at 100,000 cycles per second. In contrast, the relatively inexpensive Commodore 64 ran at 1,000,000 cycles per second, while the more pricey IBM PC ran at 4,770,000 cycles per second.
Not surprisingly, governments, corporations, schools, and even individuals purchased computers in larger and larger quantities, and the demand for software to run on these platforms and meet these customers’ needs likewise grew. Moreover, the sophistication expected from this software also grew. Edsger Dijkstra described it in these terms:
The major cause of the software crisis is that the machines have become several orders of magnitude more powerful! To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem.

Edsger Dijkstra, The Humble Programmer (EWD340), Communications of the ACM
Coupled with this rising demand for programs was a demand for skilled software developers, as reflected in the following chart of graduation rates in programming-centric degrees (the dashed line represents the growth of all bachelor's degrees, not just computer-related ones):
Unfortunately, this graduation rate often lagged far behind the demand for skilled graduates, and was marked by several periods of intense growth (1965-1985, 1995-2003, and the current surge beginning around 2010). During these surges, it was not uncommon to see students hired directly into the industry after only a course or two of programming (coding boot camps are a modern equivalent of this trend).
All of these trends contributed to what we now call the Software Crisis.
At the 1968 NATO Software Engineering Conference held in Garmisch, Germany, the term “Software Crisis” was coined to describe the state of the software development industry, where projects routinely ran over budget and behind schedule, produced inefficient, low-quality software that often failed to meet requirements, and yielded code that was difficult to maintain - or software that was never delivered at all.
The software development industry sought to counter these problems through a variety of efforts, including new programming languages and paradigms, better development practices, and the formalization of software engineering as a discipline.
This course will seek to instill many of these ideas and approaches into your programming practice through adopting them in our everyday work. It is important to understand that unless these practices are used, the same problems that defined the software crisis continue to occur!
In fact, some software engineering experts suggest the software crisis isn’t over, pointing to recent failures like the Denver Airport Baggage System in 1995, the Ariane 5 Rocket Explosion in 1996, the German Toll Collect system cancelled in 2003, the rocky healthcare.gov launch in 2013, and the massive vulnerabilities known as the Meltdown and Spectre exploits discovered in 2018.
One of the strategies that computer scientists employed to counter the software crisis was the development of new programming languages. These new languages would often 1) adopt new techniques intended to make errors harder to make while programming, and 2) remove problematic features that had existed in earlier languages.
Let’s take a look at a working (and in current use) program built using Fortran, one of the most popular programming languages at the onset of the software crisis. This software is the Environmental Policy Integrated Climate (EPIC) Model, created by researchers at Texas A&M:
Environmental Policy Integrated Climate (EPIC) model is a cropping systems model that was developed to estimate soil productivity as affected by erosion as part of the Soil and Water Resources Conservation Act analysis for 1980, which revealed a significant need for improving technology for evaluating the impacts of soil erosion on soil productivity. EPIC simulates approximately eighty crops with one crop growth model using unique parameter values for each crop. It predicts effects of management decisions on soil, water, nutrient and pesticide movements, and their combined impact on soil loss, water quality, and crop yields for areas with homogeneous soils and management.

EPIC Homepage
You can download the raw source code here (click “EPIC v.1102” under “Source Code”). Open and unzip the source code, and open a file at random using your favorite code editor. See if you can determine what it does, and how it fits into the overall application.
Try this with a few other files. What do you think of the organization? Would you be comfortable adding a new feature to this program?
You probably found the Fortran code in the example difficult to wrap your mind around - and that’s not surprising, as more recent languages have moved away from many of the practices employed in Fortran. Additionally, our computing environment has dramatically changed since this time.
One clear example is symbol names for variables and procedures (functions) - notice that in the Fortran code they are typically short and cryptic: RT, HU, IEVI, HUSE, and NFALL, for example. You’ve been told since your first programming class that variable and function names should clearly express what the variable represents or what the function does. Would rainFall, dailyHeatUnits, cropLeafAreaIndexDevelopment, CalculateWaterAndNutrientUse(), and CalculateConversionOfStandingDeadCropResidueToFlatResidue() be easier to decipher? (Hint: the documentation lists some of the variable notations starting on page 70, and some in-code documentation of global variables appears in MAIN_1102.f90.)
Believe it or not, there was an actual reason for short names in these early programs. A six-character name (six 6-bit characters) would fit into a single 36-bit register, allowing for fast dictionary lookups - accordingly, early versions of FORTRAN enforced a limit of six characters for variable names. However, it is easy to replace a symbol name with an automatically generated symbol during compilation, allowing for both fast lookup and human readability at the cost of some extra computation during compilation. This step is built into the compilation process of most current programming languages, allowing for arbitrary-length symbol names with no runtime performance penalty.
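To illustrate the idea, here is a conceptual sketch in C# (the SymbolTable class below is a hypothetical illustration, not how any particular compiler actually stores its symbols). Each descriptive name the programmer writes is mapped to a short generated identifier, so the compact form can be used internally while the source code stays readable:

using System;
using System.Collections.Generic;

public class SymbolTable
{
    // Maps each descriptive source-code name to a short generated symbol.
    private Dictionary<string, string> symbols = new Dictionary<string, string>();
    private int nextId = 0;

    public string Lookup(string name)
    {
        // The first time a name is seen, generate a compact identifier for it.
        if (!symbols.ContainsKey(name))
        {
            symbols[name] = "v" + nextId;
            nextId = nextId + 1;
        }
        return symbols[name];
    }
}

public class Program
{
    public static void Main()
    {
        SymbolTable table = new SymbolTable();
        Console.WriteLine(table.Lookup("dailyHeatUnits"));  // v0
        Console.WriteLine(table.Lookup("rainFall"));        // v1
        Console.WriteLine(table.Lookup("dailyHeatUnits"));  // v0 again - same name, same symbol
    }
}

Because this mapping happens once at compile time, descriptive names add nothing to the cost of running the program.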
In addition to these less drastic changes, some evolutionary language changes had sweeping effects, changing the way we approach and think about how programs should be written and executed. These “big ideas” of how programming languages should work are often called paradigms. In the early days of computing, we had two common ones: imperative and functional.
At its core, imperative programming means writing a program as a sequence of commands; for example, this Python script uses a sequence of commands to write to a file:
f = open("example.txt", "w")
f.write("Hello from a file!")
f.close()
An imperative program would start executing the first line of code, and then continue executing line-by-line until it reached the end of the file or a command to stop execution. In addition to moving through the program code one line at a time, imperative programs could jump to a specific spot in the code and continue execution from there, using a GOTO statement. We’ll revisit that aspect shortly.
In contrast, functional programming consisted primarily of functions. One function was designated as the ‘main’ function that would start the execution of the program. It would then call one or more functions, which would in turn call more functions. Thus, the entire program consisted of function definitions. Consider this Python program:
def concatenateList(str, list):
    if(len(list) == 0):
        return str
    elif(len(list) == 1):
        head = list.pop(0)
        return concatenateList(str + head, list)
    else:
        head = list.pop(0)
        return concatenateList(str + head + ", ", list)

def printToFile(filename, body):
    f = open(filename, "w")
    f.write(body)
    f.close()

def printListToFile(filename, list):
    body = concatenateList("", list)
    printToFile(filename, body)

def main():
    printListToFile("list.txt", ["Dog", "Cat", "Mouse"])

main()
You probably see elements of your favorite higher-order programming language in both of these descriptions. That’s not surprising as modern languages often draw from multiple programming paradigms (after all, both the above examples were written in Python). This, too, is part of language evolution - language developers borrow good ideas as they find them.
But as languages continued to evolve and language creators sought ways to make programming easier, more reliable, and more secure to address the software crisis, new ideas emerged that were large enough to be considered new paradigms. Two of the most impactful of these new paradigms are structured programming and object-orientation. We’ll talk about each next.
Another common change to programming languages was the removal of the GOTO statement, which allowed the program execution to jump to an arbitrary point in the code (much like a choose-your-own-adventure book will direct you to jump to a page). The GOTO came to be considered too primitive, and too easy for a programmer to misuse 1.
While the GOTO statement is absent from most modern programming languages, the actual functionality remains, abstracted into control-flow structures like conditionals, loops, and switch statements. This is the basis of structured programming, a paradigm adopted by all modern higher-order programming languages.
Each of these control-flow structures can be represented by careful use of GOTO statements (and, in fact, the assembly code resulting from compiling these languages does just that). The benefit of using structured programming is that it promotes "reliability, correctness, and organizational clarity" by clearly defining the circumstances and effects of code jumps 2.
You probably aren’t very familiar with GOTO statements because the structured programming paradigm has become so dominant. Before we move on, let’s see how some familiar structured programming patterns were originally implemented using GOTOs:
In C#, you are probably used to writing if statements with a true branch:
int x = 4;
if(x < 5)
{
    x = x * 2;
}
Console.WriteLine("The value is:" + x);
With GOTOs, it would look something like:
int x = 4;
if(x < 5) goto TrueBranch;
AfterElse:
Console.WriteLine("The value is:" + x);
Environment.Exit(0);
TrueBranch:
x = x * 2;
goto AfterElse;
Similarly, a C# if statement with an else branch:
int x = 4;
if(x < 5)
{
    x = x * 2;
}
else
{
    x = 7;
}
Console.WriteLine("The value is:" + x);
And using GOTOs:
int x = 4;
if(x < 5) goto TrueBranch;
goto FalseBranch;
AfterElse:
Console.WriteLine("The value is:" + x);
Environment.Exit(0);
TrueBranch:
x = x * 2;
goto AfterElse;
FalseBranch:
x = 7;
goto AfterElse;
Note that with the goto, we must tell the program to stop running explicitly with Environment.Exit(0), or it will continue on to execute the labeled code (we could also place the TrueBranch and FalseBranch before the main program, and use a goto to jump to the main program, as sketched below).
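For example, a minimal sketch of that alternative arrangement might look like the following (the Start label is just an arbitrary name chosen for this illustration). Because the branch code now comes before the main flow, execution simply falls off the end after AfterElse and no explicit Environment.Exit(0) is needed:

int x = 4;
goto Start;
TrueBranch:
x = x * 2;
goto AfterElse;
FalseBranch:
x = 7;
goto AfterElse;
Start:
if(x < 5) goto TrueBranch;
goto FalseBranch;
AfterElse:
Console.WriteLine("The value is:" + x);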
Loops were also originally constructed entirely from GOTOs, so the familiar while loop:
int times = 5;
while(times > 0)
{
    Console.WriteLine("Counting Down: " + times);
    times = times - 1;
}
Can be written:
int times = 5;
Test:
if(times > 0) goto Loop;
Environment.Exit(0);
Loop:
Console.WriteLine("Counting Down: " + times);
times = times - 1;
goto Test;
The do while and for loops are implemented similarly. As you can probably imagine, as more control flow is added to a program, using GOTOs and their corresponding labels becomes very hard to follow.
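For instance, here is one possible sketch of how a counting for loop could be expressed with GOTOs, following the same pattern as the while loop above (the structured version first, then the GOTO version):

for(int i = 0; i < 5; i++)
{
    Console.WriteLine("Counting Up: " + i);
}

int i = 0;
Test:
if(i < 5) goto Loop;
Environment.Exit(0);
Loop:
Console.WriteLine("Counting Up: " + i);
i = i + 1;
goto Test;

The initialization runs once, the condition is checked before each pass, and the increment happens at the end of each pass - exactly the three pieces packed into the for statement's header.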
Interestingly, the C# language does have a goto statement (Java does not). This is likely because C# was designed to compile to the same intermediate language as Visual Basic, which is an evolution of BASIC, a language old enough to have a goto.
Accordingly, the above examples with the goto statements are valid C# code. You can even compile and run them. However, you should avoid using goto statements in your code.
Dijkstra, Edsger (1968). “Go To Statement Considered Harmful” ↩︎
Wirth, Niklaus (1974). “On the Composition of Well-Structured Programs” ↩︎
The object-orientation paradigm was similarly developed to make programming large projects easier and less error-prone.
The term “Object Orientation” was coined by Alan Kay while he was a graduate student in the late 1960s. Alan Kay, Dan Ingalls, Adele Goldberg, and others created the first object-oriented language, Smalltalk, which became a very influential language from which many ideas were borrowed. To Alan, the essential core of object-orientation was three properties a language could possess: encapsulation and information hiding, message passing, and dynamic binding. 1
Let’s break down each of these ideas, and see how they helped address some of the problems we’ve identified in this chapter.
Encapsulation refers to breaking programs into smaller units that are easier to read and reason about. In an object-oriented language these units are classes and objects, and the data contained in these units is protected from being changed by code outside the unit through information hiding.
Message Passing allows us to send well-defined messages between objects. This gives us a controlled way to access, and potentially change, the data contained in an encapsulated unit. In an object-oriented language, calling a method on an object is a form of message passing, as are events.
Dynamic Binding means we can have more than one possible way to handle messages and the appropriate one can be determined at run-time. This is the basis for polymorphism, an important idea in many object-oriented languages.
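As a small, simplified sketch of how these three ideas typically appear in C# (the GreetingCard and BirthdayCard classes here are hypothetical examples invented for this illustration):

using System;

public class GreetingCard
{
    // Encapsulation & information hiding: the message text is private,
    // so code outside this class cannot change it directly.
    private string message = "Hello!";

    // Message passing: other code interacts with a card only by calling
    // (sending a message to) its public methods.
    public virtual string ReadCard()
    {
        return message;
    }
}

public class BirthdayCard : GreetingCard
{
    // Dynamic binding: this override is selected at run-time based on the
    // actual object, not the declared type of the variable holding it.
    public override string ReadCard()
    {
        return "Happy Birthday!";
    }
}

public class Program
{
    public static void Main()
    {
        GreetingCard card = new BirthdayCard();
        Console.WriteLine(card.ReadCard());  // prints "Happy Birthday!"
    }
}

Even though card is declared as a GreetingCard, the BirthdayCard version of ReadCard() runs - that run-time choice is dynamic binding, and it is the mechanism behind the polymorphism mentioned above.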
Remember these terms and pay attention to how they are implemented in the languages you are learning. They can help you understand the ideas that inspired the features of these languages.
We’ll take a deeper look at each of these in the next few chapters. But before we do, you might want to see how language popularity has fared since the onset of the software crisis, and how new languages have appeared and grown in popularity in this animated chart from Data is Beautiful:
Interestingly, the top four languages in 2019 (Python, JavaScript, Java, and C#) all adopt the object-oriented paradigm - though the exact details of how they implement it vary dramatically.
Eric Elliott, “The Forgotten History of Object-Oriented Programming,” Medium, Oct. 31, 2018. ↩︎
In this chapter, we’ve discussed the environment in which object-orientation emerged. Early computers were limited in their computational power, and languages and programming techniques had to work around these limitations. Similarly, these computers were very expensive, so their purchasers were very concerned about getting the largest possible return on their investment. In the words of Niklaus Wirth:
Tricks were necessary at this time, simply because machines were built with limitations imposed by a technology in its early development stage, and because even problems that would be termed "simple" nowadays could not be handled in a straightforward way. It was the programmers' very task to push computers to their limits by whatever means available.
As computers became more powerful and less expensive, the demand for programs (and therefore programmers) grew faster than universities could train new programmers. Unskilled programmers, unwieldy programming languages, and programming approaches developed to address the problems of older technology led to what became known as the “software crisis” where many projects failed or floundered.
This led to the development of new programming techniques, languages, and paradigms to make the process of programming easier and less error-prone. Among the many new programming paradigms was the structured programming paradigm, which introduced control-flow structures into programming languages to help programmers reason about the order of program execution in a clear and consistent manner.
Also developed during this time was the object-oriented paradigm, which brings together three big ideas: encapsulation & information hiding, message passing, and dynamic binding. We will be studying this paradigm, its ideas, and implementation in the C# language throughout this course.