Chapter 16

Data-Driven Websites

From desktop GUIs to the World Wide Web!

Subsections of Data-Driven Websites

Introduction

Up to this point, we’ve mainly focused on developing an application that can be executed locally on a computer. To use an application like this, users would have to download it and possibly install it on their system. Likewise, as developers, we’ll have to create a release that they can install, and we may have to make sure that the release is compatible with various different operating systems and computer architectures.

For decades, this was really the state of the art of computer programming. However, starting in the 2000s, things began to drastically change with the rise of Web 2.0 and interactive website. Soon, a whole new type of application, the web application, became commonplace.

Today, outside of video games and a few specialized applications, many computer users primarily interact with web applications instead of applications installed locally on their computer. Some great examples are the various social media sites such as Facebook and YouTube, productivity tools such as Microsoft Office 365 or Google Drive, and even communication platforms such as Slack and Discord.

To make things even more complicated, many of those web applications include versions that you can install and run locally on your computer or smartphone, but in many cases they are simply a lightweight wrapper around the web application. In that way, it appears to be running as a local application, but it is really just a version of the same web application that is stored locally.

In this chapter, we’re going to pivot our focus to building a web application. To do that, we’ll have to introduce many new concepts to lay the foundation for working in the web, so there will be lots of new content and ideas in this chapter.

Some key terms that we’ll cover:

  • Hypertext Markup Language (HTML)
  • HTML Tags
  • Cascading Style Sheets (CSS)
  • CSS Selectors
  • JavaScript (JS)
  • Hypertext Transfer Protocol (HTTP)
  • Static Web Servers, such as Apache, IIS, and nginx
  • Dynamic Web Pages
  • Templates
  • Template Rendering
  • Web Frameworks, such as Spring (Java) and Flask (Python)
  • Web Requests
  • Web Responses
  • Routing
  • Uniform Resource Locator (URL) and Uniform Resource Identifier (URI)

At the end of this chapter, you’ll be able to generate your own data driven web pages using a web framework and its built-in templating engine!

HTML

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

YouTube Video

Video Materials

The World Wide Web was the brainchild of Sir Tim Berners-Lee. It was conceived as a way to share information across the Internet; in Sir Berners-Lee’s own words describing the idea as he first conceived it:

This project is experimental and of course comes without any warranty whatsoever. However, it could start a revolution in information access.

Clearly that revolution has come to pass. The web has become part of our daily lives.

There were three key technologies that Sir Tim Berners-Lee proposed and developed. These remain the foundations upon which the web runs even today. Two are client-side, and determine how web pages are interpreted by browsers. These are:

  • Hypertext Markup Language
  • Cascading Style Sheets

HTML

Hypertext Markup Language (HTML), is one of the three core technologies of the world wide web, along with Cascading Style Sheets (CSS) and Javascript (JS). Each of these technologies has a specific role to play in delivering a website. HTML defines the structure and contents of the web page. It is a markup language, similar to XML (indeed, HTML is based on the SGML, or Standardized General Markup Language, standard, which XML is also based on).

HTML Elements

The structure of HTML consists of various tags. For example, a button in HTML looks like this:

<button onclick="doSomething">
    Do Something
</button>

HTML elements have and opening and closing tag, and can have additional HTML content nested inside these tags. HTML tags can also be self-closing, as is the case with the line break tag:

<br />

Let’s explore the parts of an HTML element in more detail.

HTML Element Structure HTML Element Structure1

The Start Tag

The start tag is enclosed in angle brackets (< and >). The angle brackets differentiate the text inside them as being HTML elements, rather than text. This guides the browser to interpret them correctly.

Angle Brackets in HTML

Because angle brackets are interpreted as defining HTML tags, you cannot use those characters to represent greater than and less than signs. Instead, HTML defines escape character sequences to represent these and other special characters. Greater than is &gt;, less than is &lt;. A full list can be found on mdn.

The Tag Name

Immediately after the < is the tag name. In HTML, tag names like button should be expressed in lowercase letters. This is a convention (as most browsers will happily accept any mixture of uppercase and lowercase letters), but is very important when using popular modern web technologies like Razor and React, as these use Camel case tag names to differentiate between HTML and components they inject into the web page.

The Attributes

After the tag name come optional attributes, which are key-value pairs expressed as key="value". Attributes should be separated from each other and the tag name by whitespace characters (any whitespace will do, but traditionally spaces are used). Different elements have different attributes available - and you can read up on what these are by visiting the MDN article about the specific element.

However, several attributes bear special mention:

  • The id attribute is used to assign a unique id to an element, i.e. <button id="that-one-button">. The element can thereafter be referenced by that id in both CSS and JavaScript code. An element ID must be unique in an HTML page, or unexpected behavior may result!

  • The class attribute is also used to assign an identifier used by CSS and JavaScript. However, classes don’t need to be unique; many elements can have the same class. Further, each element can be assigned multiple classes, as a space-delimited string, i.e. <button class="large warning"> assigns both the classes “large” and “warning” to the button.

Also, some web technologies (like Angular) introduce new attributes specific to their framework, taking advantage of the fact that a browser will ignore any attributes it does not recognize.

The Tag Content

The content nested inside the tag can be plain text, or another HTML element (or collection of elements). HTML elements can have multiple child elements. Indentation should be used to keep your code legible by indenting any nested content, i.e.:

<div>
    <h1>A Title</h1>
    <p>This is a paragraph of text that is nested inside the div</p>
    <p>And this is another paragraph of text</p>
</div>

The End Tag

The end tag is also enclosed in angle brackets (< and >). Immediately after the < is a forward slash /, and then the tag name. You do not include attributes in a end tag.

If the element has no content, the end tag can be combined with the start tag in a self-closing tag, i.e. the <input> tag is typically written as self-closing:

<input id="first-name" type="text" placeholder="Your first name" />

Text in HTML

Text in HTML works a bit differently than you might expect. Most notably, all white space is converted into a single space. Thus, the lines:

<blockquote>
    If you can keep your head when all about you   
        Are losing theirs and blaming it on you,   
    If you can trust yourself when all men doubt you,
        But make allowance for their doubting too;   
    If you can wait and not be tired by waiting,
        Or being lied about, don’t deal in lies,
    Or being hated, don’t give way to hating,
        And yet don’t look too good, nor talk too wise:
    <i>-Rudyard Kipling, exerpt from "If"</i>
</blockquote>

Would be rendered:

If you can keep your head when all about you Are losing theirs and blaming it on you, If you can trust yourself when all men doubt you, But make allowance for their doubting too; If you can wait and not be tired by waiting, Or being lied about, don’t deal in lies, Or being hated, don’t give way to hating, And yet don’t look too good, nor talk too wise: -Rudyard Kipling, exerpt from "If"

If, for some reason you need to maintain formatting of the included text, you can use the <pre> element (which indicates the text is preformatted):

<blockquote>
    <pre>
If you can keep your head when all about you   
    Are losing theirs and blaming it on you,   
If you can trust yourself when all men doubt you,
    But make allowance for their doubting too;   
If you can wait and not be tired by waiting,
    Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
    And yet don’t look too good, nor talk too wise:
    </pre>
    <i>-Rudyard Kipling, exerpt from "If"</i>
</blockquote>

Which would be rendered:

If you can keep your head when all about you   
    Are losing theirs and blaming it on you,   
If you can trust yourself when all men doubt you,
    But make allowance for their doubting too;   
If you can wait and not be tired by waiting,
    Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
    And yet don’t look too good, nor talk too wise:
    
-Rudyard Kipling, exerpt from "If"

Note that the <pre> preserves all formatting, so it is necessary not to indent its contents.

Alternatively, you can denote line breaks with <br/>, and non-breaking spaces with &nbsp;:

<blockquote>        
    If you can keep your head when all about you<br/>
    &nbsp;&nbsp;&nbsp;&nbsp;Are losing theirs and blaming it on you,<br/>   
    If you can trust yourself when all men doubt you,<br/>
    &nbsp;&nbsp;&nbsp;&nbsp;But make allowance for their doubting too;<br/>
    If you can wait and not be tired by waiting,<br/>
    &nbsp;&nbsp;&nbsp;&nbsp;Or being lied about, don’t deal in lies,<br/>
    Or being hated, don’t give way to hating,<br/>
    &nbsp;&nbsp;&nbsp;&nbsp;And yet don’t look too good, nor talk too wise:<br/>    
    <i>-Rudyard Kipling, exerpt from "If"</i>
</blockquote>

Which renders:

If you can keep your head when all about you
    Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
    But make allowance for their doubting too;
If you can wait and not be tired by waiting,
    Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
    And yet don’t look too good, nor talk too wise:

-Rudyard Kipling, exerpt from "If"

Additionally, as a program you may want to use the the <code> element in conjunction with the <pre> element to display preformatted code snippets in your pages. There are even some JavaScript libraries available to automatically add syntax colors to your code.

HTML Comments

HTML comments are identical to XML comments (as both inherited from SGML). Comments start with the sequence <!-- and end with the sequence -->, i.e.:

<!-- This is an example of a HTML comment -->

Basic Page Structure

HTML5 (the current HTML standard) pages have an expected structure that you should follow. This is:

<!DOCTYPE html>
<html>
    <head>
        <title><!-- The title of your page goes here --></title>
        <!-- other metadata about your page goes here -->
    </head>
    <body>
        <!-- The contents of your page go here -->
    </body>
</html>

HTML Elements

Rather than include an exhaustive list of HTML elements, I will direct you to the list provided by MDN. However, it is useful to recognize that elements can serve different purposes:

There are more tags than this, but these are the most commonly employed, and the ones you should be familiar with.

Learning More

The MDN HTML Docs are recommended reading for learning more about HTML.

Subsections of HTML

CSS

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

Cascading Style Sheets (CSS) is the second core web technology of the web. It defines the appearance of web pages by applying stylistic rules to matching HTML elements. CSS is normally declared in a file with the .css extension, separate from the HTML files it is modifying, though it can also be declared within the page using the <style> element, or directly on an element using the style attribute.

CSS Rules

A CSS rule consists of a selector and a definition block, i.e.:

h1
{
    color: red;
    font-weight: bold;
}

CSS Selectors

A CSS selector determines which elements the associated definition block apply to. In the above example, the h1 selector indicates that the style definition supplied applies to all <h1> elements. The selectors can be:

  • By element type, indicated by the name of the element. I.e. the selector p applies to all <p> elements.
  • By the element id, indicated by the id prefixed with a #. I.e. the selector #foo applies to the element <span id="foo">.
  • By the element class, indicated by the class prefixed with a .. I.e. the selector .bar applies to the elements <div class="bar">, <span class="bar none">, and <p class="alert bar warning">.

CSS selectors can also be combined in a number of ways, and psuedo-selectors can be applied under certain circumstances, like the :hover psudo-selector which applies only when the mouse cursor is over the element.

You can read more on MDN’s CSS Selectors Page.

CSS Definition Block

A CSS definition block is bracketed by curly braces and contains a series of key-value pairs in the format key=value;. Each key is a property that defines how an HTML Element should be displayed, and the value needs to be a valid value for that property.

Measurements can be expressed in a number of units, from pixels (px), points (pt), the font size of the parent (em), the font size of the root element (rem), a percentage of the available space (%), or a percentage of the viewport width (vw) or height (vh). See MDN’s CSS values and units for more details.

Other values are specific to the property. For example, the cursor property has possible values help, wait, crosshair, not-allowed, zoom-in, and grab. You should use the MDN documentation for a reference.

Styling Text

One common use for CSS is to change properties about how the text in an element is rendered. This can include changing attributes of the font (font-style, font-weight, font-size, font-family), the color, and the text (text-align, line-break, word-wrap, text-indent, text-justify). These are just a sampling of some of the most commonly used properties.

Styling Elements

A second common use for CSS is to change properties of the element itself. This can include setting dimensions (width, height), adding margins, borders, and padding.

These values provide additional space around the content of the element, following the CSS Box Model:

CSS Box Model CSS Box Model1

Providing Layout

The third common use for CSS is to change how elements are laid out on the page. By default HTML elements follow the flow model, where each element appears on the page after the one before it. Some elements are block level elements, which stretch across the entire page (so the next element appears below it), and others are inline and are only as wide as they need to be to hold their contents, so the next element can appear to the right, if there is room.

The float property can make an element float to the left or right of its container, allowing the rest of the page to flow around it.

Or you can swap out the layout model entirely by changing the display property to flex or grid. For learning about these two display models, the CSS-Tricks A Complete Guide to Flexbox and A Complete Guide to Grid are recommended reading. These can provide quite powerful layout tools to the developer.

Learning More

This is just the tip of the iceberg of what is possible with CSS. Using CSS media queries can change the rules applied to elements based on the size of the device it is viewed on, allowing for responsive design. CSS Animation can allow properties to change over time, making stunning visual animations easy to implement. And CSS can also carry out calculations and store values, leading some computer scientists to argue that it is a Turing Complete language.

The MDN Cascading Stylesheets Docs and CSS Tricks are recommended reading to learn more about CSS and its uses.

JavaScript

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

JavaScript (or ECMAScript, which is the standard JavaScript is derived from), was originally developed for Netscape Navigator by Brendan Eich. The original version was completed in just 10 days. The name “JavaScript” was a marketing move by Netscape as they had just secured the rights to use Java Applets in their browser, and wanted to tie the two languages together. Similarly, they pushed for a Java-like syntax, which Brendan accommodated. However, he also incorporated functional behaviors based on the Scheme language, and drew upon Self’s implementation of object-orientation. The result is a language that may look familiar to you, but often works in unexpected ways.

JavaScript is a Dynamically Typed Language

Unlike the statically-typed Java language, JavaScript has dynamic types like Python. This means that we always declare variables using the var keyword, i.e.:

var i = 0;
var story = "Jack and Jill went up a hill...";
var pi = 3.14;

The type of the variable is inferred when it is set, and the type can change with a new assignment, i.e.:

var i = 0; // i is an integer
i = "The sky is blue"; // now i is a string
i = true; // now i is a boolean

This would cause an error in C#, but is perfectly legal in JavaScript. Because JavaScript is dynamically typed, it is impossible to determine type errors until the program is run.

In addition to var, variables can be declared with the const keyword (for constants that cannot be re-assigned), or the let keyword (discussed below).

JavaScript Types

While the type of a variable is inferred, JavaScript still supports types. You can determine the type of a variable with the typeof() function. The available types in JavaScript are:

  • integers (declared as numbers without a decimal point)
  • floats (declared as numbers with a decimal point)
  • booleans (the constants true or false)
  • strings (declared using double quotes ("I'm a string"), single quotes 'Me too!', or backticks `I'm a template string ${2 + 3}`) which indicate a template string and can execute and concatenate embedded JavaScript expressions.
  • lists (declared using square brackets, i.e. ["I am", 2, "listy", 4, "u"]), which are a generic catch-all data structure, which can be treated as an array, list, queue, or stack.
  • objects (declared using curly braces or constructed with the new keyword, discussed later)

In JavaScript, there are two keywords that represent a null value, undefined and null. These have a different meaning: undefined refers to values that have not yet been initialized, while null must be explicitly set by the programmer (and thus intentionally meaning nothing).

JavaScript is a Functional Language

As suggested in the description, JavaScript is a functional language incorporating many ideas from Scheme. In JavaScript we declare functions using the function keyword, i.e.:

function add(a, b) {
  return a + b;
}

We can also declare an anonymous function (one without a name):

function (a, b) {
  return a + b;
}

or with the lambda syntax:

(a,b) => {
  return a + b;
}

In JavaScript, functions are first-class objects, which means they can be stored as variables, i.e.:

var add = function(a,b) {
  return a + b;
}

Added to arrays:

var math = [
  add,
  (a,b) => {return a - b;},
  function(a,b) { a * b; },
]

Or passed as function arguments.

JavaScript has Function Scope

Variable scope in JavaScript is bound to functions. Blocks like the body of an if or for loop do not declare a new scope. Thus, this code:

for(var i = 0; i < 3; i++;)
{
  console.log("Counting i=" + i);
}
console.log("Final value of i is: " + i);

Will print:

Counting i=0
Counting i=1
Counting i=2
Final value of i is: 3

Because the i variable is not scoped to the block of the for loop, but rather, the function that contains it.

The keyword let was introduced in ECMAScript version 6 as an alternative for var that enforces block scope. Using let in the example above would result in a reference error being thrown, as i is not defined outside of the for loop block.

JavaScript is Event-Driven

JavaScript was written to run within the browser, and was therefore event-driven from the start. It uses the event loop and queue pattern we saw in C#. For example, we can set an event to occur in the future with setTimeout():

setTimeout(function(){console.log("Hello, future!")}, 2000);

This will cause “Hello, Future!” to be printed 2 seconds (2000 milliseconds) in the future (notice too that we can pass a function to a function).

JavaScript is Object-Oriented

As suggested above, JavaScript is object-oriented, but in a manner more similar to Self than to C#. For example, we can declare objects literally:

var student = {
  first: "Mark",
  last: "Delaney"
}

Or we can write a constructor, which in JavaScript is simply a function we capitalize by convention:

function Student(first, last){
  this.first = first;
  this.last = last;
}

And invoke with the new keyword:

var js = new Student("Jack", "Sprat");

Objects constructed from classes have a prototype, which can be used to attach methods:

Student.prototype.greet = function(){
  console.log(`Hello, my name is ${this.first} ${this.last}`);
}

Thus, js.greet() would print Hello, my name is Jack Sprat;

ECMAScript 6 introduced a more familiar form of class definition:

class Student{
  constructor(first, last) {
    this.first = first;
    this.last = last;
    this.greet = this.greet.bind(this);
  }
  greet(){
    console.log(`Hello, my name is ${this.first} ${this.last}`);
  }
}

However, because JavaScript uses function scope, the this in the method greet would not refer to the student constructed in the constructor, but the greet() method itself. The constructor line this.greet = this.greet.bind(this); fixes that issue by binding the greet() method to the this of the constructor.

The Document Object Model

The Document Object Model (DOM) is a tree-like structure that the browser constructs from parsed HTML to determine size, placement, and appearance of the elements on-screen. In this, it is much like the elements tree we used with Windows Presentation Foundation (which was most likely inspired by the DOM). The DOM is also accessible to JavaScript - in fact, one of the most important uses of JavaScript is to manipulate the DOM.

You can learn more about the DOM from MDN’s Document Object Model documentation entry.

HTTP

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

YouTube Video

Video Materials

At the heart of the world wide web is the Hypertext Transfer Protocol (HTTP). This is a protocol defining how HTTP servers (which host web pages) interact with HTTP clients (which display web pages).

It starts with a request initiated from the web browser (the client). This request is sent over the Internet using the TCP protocol to a web server. Once the web server receives the request, it must decide the appropriate response - ideally sending the requested resource back to the browser to be displayed. The following diagram displays this typical request-response pattern.

HTTP’s request-response pattern HTTP’s request-response pattern

This HTTP request-response pattern is at the core of how all web applications communicate. Even those that use websockets begin with an HTTP request.

The HTTP Request

A HTTP Request is just text that follows a specific format and sent from a client to a server. It consists of one or more lines terminated by a CRLF (a carriage return and a line feed character, typically written \r\n in most programming languages).

  1. A request-line describing the request
  2. Additional optional lines containing HTTP headers. These specify details of the request or describe the body of the request
  3. A blank line, which indicates the end of the request headers
  4. An optional body, containing any data belonging of the request, like a file upload or form submission. The exact nature of the body is described by the headers.

The HTTP Response

Similar to an HTTP Request, an HTTP response consists of one or more lines of text, terminated by a CRLF (sequential carriage return and line feed characters):

  1. A status-line indicating the HTTP protocol, the status code, and a textual status
  2. Optional lines containing the Response Headers. These specify the details of the response or describe the response body
  3. A blank line, indicating the end of the response metadata
  4. An optional response body. This will typically be the text of an HTML file, or binary data for an image or other file type, or a block of bytes for streaming data.

Making a Request

With our new understanding of HTTP requests and responses as consisting of streams of text that match a well-defined format, we can try manually making our own requests, using a Linux command line tool netcat.

In Codio, we can open a terminal window and type the following command:

nc google.com 80

The nc portion is the netcat executable - we’re asking Linux to run netcat for us, and providing two command-line arguments, google.com and 80, which are the webserver we want to talk to and the port we want to connect to (port 80 is the default port for HTTP requests).

Now that a connection is established, we can stream our request to Google’s server:

GET / HTTP/1.1

The GET indicates we are making a GET request, i.e. requesting a resource from the server. The / indicates the resource on the server we are requesting (at this point, just the top-level page). Finally, the HTTP/1.1 indicates the version of HTTP we are using.

Note that you need to press the return key twice after the GET line, once to end the line, and the second time to end the HTTP request. Pressing the return key in the terminal enters the CRLF character sequence (Carriage Return & Line Feed) the HTTP protocol uses to separate lines

Once the second return is pressed, a whole bunch of text will appear in the terminal. This is the HTTP Response from Google’s server. We’ll take a look at that next.

Reading the Response

Scroll up to the top of the request, and you should see something like:

HTTP/1.1 200 OK
Date: Wed, 16 Jan 2019 15:39:33 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2019-01-16-15; expires=Fri, 15-Feb-2019 15:39:33 GMT; path=/; domain=.google.com
Set-Cookie: NID=154=XyALfeRzT9rj_55NNa006-Mmszh7T4rIp9Pgr4AVk4zZuQMZIDAj2hWYoYkKU6Etbmjkft5YPW8Fens07MvfxRSw1D9mKZckUiQ--RZJWZyurfJUyRtoJyTfSOMSaniZTtffEBNK7hY2M23GAMyFIRpyQYQtMpCv2D6xHqpKjb4; expires=Thu, 18-Jul-2019 15:39:33 GMT; path=/; domain=.google.com; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding

<!doctype html>...

The first line indicates that the server responded using the HTTP 1.1 protocol, the status of the response is a 200 code, which corresponds to the human meaning “OK”. In other words, the request worked. The remaining lines are headers describing aspects of the request - the Date, for example, indicates when the request was made, and the path indicates what was requested. Most important of these headers, though is the Content-Type header, which indicates what the body of the response consists of. The content type text/html means the body consists of text, which is formatted as HTML – in other words, a webpage.

Everything after the blank line is the body of the response - in this case, the page content as HTML text. If you scroll far enough through it, you should be able to locate all of the HTML elements in Google’s search page.

That’s really all there is with a HTTP request and response. They’re just streams of data. A webserver just receives a request, processes it, and sends a response.

You can learn a bit more about HTTP and see a similar example in the HTTP lecture from the CIS 527 - Enterprise Systems Administration course.

Subsections of HTTP

Static Web Servers

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

Now that we’ve learned about all of the core technologies used to create and deliver webpages, let’s take a deeper look at the software that runs on the servers that are responsible for receiving HTTP requests and responding to them. We typically call these programs web servers.

The earliest web servers simply served files held in a directory, and in fact many web servers today are still capable of doing exactly that. For example, K-State Computer Science provides a basic web server that can be used to host a personal web page for any faculty, staff, or students in the department. According to the instructions, all you have to do is place the files in a special folder named public_html on the central cslinux.cs.ksu.edu server, and then they can be accessed at the address http://people.cs.ksu.edu/~[eid]/ where [eid] is your K-State eID.

Apache is one of the oldest and most popular open-source web servers in the world. Microsoft introduced their own web server, Internet Information Services (IIS) around the same time. Unlike Apache, which can be installed on most operating systems, IIS only runs on the Windows Server operating system. More recently, the nginx server has become very popular due to its focus on high performance.

As the web grew in popularity, there was tremendous demand to supplement these static pages with pages created on the fly in response to requests - allowing pages to be customized for a particular user, or displaying the most up-to-date information from a database. In other words, dynamic pages. We’ll take a look at these next.

Dynamic Web Pages

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

Modern websites are more often full-fledged applications than collections of static files. These applications remain built upon the foundations of the core web technologies of HTML, CSS, and JavaScript. In fact, the client-side application is typically built of exactly these three kinds of files! So how can we create a dynamic web application?

One of the earliest approaches was to write a program to dynamically create the HTML file that was being served. Consider this method:

public String GeneratePage() {
    StringBuilder sb = new StringBuilder();
    sb.append("<!DOCTYPE html>\n");
    sb.append("<html>\n");
    sb.append("<head>\n");
    sb.append("<title>My Dynamic Page</title>\n");
    sb.append("</head>\n");
    sb.append("<body>\n");
    sb.append("<h1>Hello, world!</h1>\n");
    sb.append("<p>Time on the server is ");
    SimpleDateFormat formatter= new SimpleDateFormat("yyyy-MM-dd 'at' HH:mm:ss z");
    Date date = new Date(System.currentTimeMillis());
    sb.append(formatter.format(date) + "\n");
    sb.append("</p>\n");
    sb.append("</body>\n");
    sb.append("</html>\n");
    return sb.toString();
}
def generate_page(self) -> str:
    sb: List[str] = list()
    sb.append("<!DOCTYPE html>")
    sb.append("<html>")
    sb.append("<head>")
    sb.append("<title>My Dynamic Page</title>")
    sb.append("</head>")
    sb.append("<body>")
    sb.append("<h1>Hello, world!</h1>")
    sb.append("<p>Time on the server is ")
    now = datetime.now()
    sb.append(now.strftime("%d/%m/%Y %H:%M:%S"))
    sb.append("</p>")
    sb.append("</body>")
    sb.append("</html>")
    return "\n".join(sb)

It generates the HTML of a page showing the current date and time. Remember too that HTTP responses are simply text, so we can generate a response as a string as well:

public String generateResponse() {
    String page = generatePage();
    StringBuilder sb = new StringBuilder();
    sb.append("HTTP/1.1 200\n");
    sb.append("Content-Type: text/html; charset=utf-8\n");
    sb.append("ContentLength:" + page.length() + "\n");
    sb.append("\n");
    sb.append(page);
    return sb.toString();
}
def generate_response(self) -> str:
    page: str = generate_page()
    sb: List[str] = list()
    sb.append("HTTP/1.1 200");
    sb.append("Content-Type: text/html; charset=utf-8");
    sb.append("ContentLength:" + page.length());
    sb.append("");
    sb.append(page);
    return "\n".join(sb)

The resulting string could then be streamed back to the requesting web browser. This is the basic technique used in all server-side web frameworks: they dynamically assemble the response to a request by assembling strings into an HTML page. Where they differ is what language they use to do so, and how much of the process they’ve abstracted.

For example, this approach was adopted by Microsoft and implemented as Active Server Pages (ASP). By placing files with the .asp extension among those served by an IIS server, C# or Visual Basic code written on that page would be executed, and the resulting string would be served as a file. This would happen on each request - so a request for http://somesite.com/somepage.asp would execute the code in the somepage.asp file, and the resulting text would be served.

You might have looked at the above examples and shuddered. After all, who wants to assemble text like that? And when you assemble HTML using raw string concatenation, you don’t have the benefit of syntax highlighting, code completion, or any of the other modern development tools we’ve grown to rely on. Thankfully, most web development frameworks provide some abstraction around this process, and by and large have adopted some form of template syntax to make the process of writing a page easier.

Template Rendering

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

It was not long before new technologies sprang up to replace the ad-hoc string concatenation approach to creating dynamic pages. These template approaches allow you to write a page using primarily HTML, but embed snippets of another language to execute and concatenate into the final page. This is very similar to the formatted strings we’ve used in Java and Python, i.e.:

String output = String.format("%s, %d", "Computer", 410)
output: str = "{}, {}".format("Computer, 410)

The example above concatenates the string "Computer" and the number 410 with a comma between them. While the template strings above use either format specifiers like %s or curly braces {} to call out the script snippets, most HTML template libraries initially used some variation of angle brackets + additional characters. As browsers interpret anything within angle brackets (<>) as HTML tags, these would not be rendered if the template was accidentally served as HTML without executing and concatenating scripts. Two early examples are:

  • <?php echo "This is a PHP example" ?>
  • <% Response.Write("This is a classic ASP example) %>

And abbreviated versions:

  • <?= "This is the short form for PHP" ?>
  • <%= "This is the short form for classic ASP" %>

Template rendering proved such a popular and powerful tool that rendering libraries were written for most programming languages, and could be used for more than just HTML files - really any kind of text file can be rendered with a template. Thus, you can find template rendering libraries for JavaScript, Python, Ruby, and pretty much any language you care to (and they aren’t that hard to write either).

Classic PHP, Classic ASP, and ASP.NET web pages all use a single-page model, where the client (the browser) requests a specific file, and as that file is interpreted, the dynamic page is generated. This approach worked well in the early days of the world-wide-web, where web sites were essentially a collection of pages. However, as the web grew increasingly interactive, many web sites grew into full-fledged web applications, or full-blown programs that didn’t lend themselves to a page-based structure. This new need resulted in new technologies to fill the void - web frameworks. We’ll talk about these next.

Web Frameworks

Content Note

Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.

YouTube Video

Video Materials

As web sites became web applications, developers began looking to use ideas and techniques drawn from traditional software development. These included architectural patterns like Model-View-Controller (MVC) and Pipeline that simply were not possible with the server page model. The result was the development of a host of web frameworks across multiple programming languages, including:

  • Ruby on Rails, which uses the Ruby programming language and adopts a MVC architecture
  • Laravel, which uses the PHP programming language and adopts a MVC architecture
  • Django, which uses the Python programming language and adopts a MVC architecture
  • Express, which uses the Node implementation of the JavaScript programming language and adopts the Pipeline architecture
  • Revel, which uses the Go programming language and adopts a Pipeline architecture
  • Cowboy, which uses the erlang programming language and adopts a Pipeline architecture
  • Phoenix, which uses the elixir programming language, and adopts a Pipeline architecture

Spring and Flask

In this course, we’re going to explore a lightweight web framework that was built for our chosen language:

Both of these frameworks are very powerful, but most importantly, they are extremely flexible and allow us to structure our web application in a way that makes sense for our needs.

On the next pages, we’ll dive a bit deeper into how these web frameworks handle web requests and generate appropriate responses for them.

Subsections of Web Frameworks

Request & Response

Earlier in this chapter, we discussed how HTTP is a request-response protocol, as shown in this diagram:

HTTP’s request-response pattern HTTP’s request-response pattern

We also discussed how we could write a simple dynamic program to generate a response by concatenating strings together. It was definitely not efficient, but it demonstrated that it is possible to dynamically generate a response to a web request.

Web frameworks simplify this process greatly by handling most of the work for us. As developers, all we really need to do is collect the data that should be contained in the response, and create the template used to generate the web page.

Requests & Responses

Let’s look at the diagram below, which shows the process that a MVC-based web framework, such as Spring, might follow:

Model-View-Controller Model-View-Controller1

First, the application will receive an incoming web request from a client, which will include a path and possibly some additional data. The framework will examine that request, and determine which part of the application should respond to it. This is a process known as routing, which we’ll cover on the next page.

At that point, the framework will delegate the request to a piece of code, usually called a controller, that can respond to it. In most web applications, the controllers are the main portion of the code written by the developer that isn’t part of the framework itself. So, in the controller, we can look at the request as well as the data that comes along with it, and we can collect the data needed to respond to it.

For example, the request might include information about the user that sent the request, as well as a search term used in a search box. So, our code might collect information about that user and the search term, and use it to populate a model that contains all of the data that is requested.

Once we have completed that task, we can return the model back to the framework, as well as specify a particular template that should be used to create the response. So, the next thing the framework will do is find the requested template and render it, substituting data from the model into the template based on the special markers included in the template. In most cases, the template is the other major part of the web application that is written by the developer.

Finally, the rendered template is placed into an HTTP response, and the response is sent back to the client.

Routing

Of course, one major question that we still need to resolve is “how does the web framework look at a web request and determine what code to execute?” To do that, most web frameworks introduce the concept of routing.

In a web framework, a route is usually a mapping from a path to a particular function in a controller.

For example, a simple web framework might match the path /, representing the top level page on the server, to a function called getIndex() in one of the controllers in the web application itself.

So, when an incoming HTTP GET request asks for the page at path /, the web framework will call the code in our getIndex() function, which will usually return a model and a template to render. Then, the framework will render that template using the data in the model, and send that as a response back to the user’s client web browser.

Complex Routes

Routes in our web application can be much more complex than mapping simple paths to functions. For example, the route could specify one function when the path is requested using an HTTP GET request, and an entirely different function when the path is requested using an HTTP POST request, which includes some additional data.

A great example is logging in to a website. If the user sends an HTTP GET request to the /login path, it could call a function named getLoginPage() to render a login page that asks the user for a username and password.

When the user enters that information on the page and clicks the “submit” button, it will send an HTTP POST request to the same /login path, along with the username and password that the user entered. In that case, the web framework can be configured to send that request to a different function, postLogin() that will determine if the username and password match an existing user account. If so, the user will be logged in and the website will send an appropriate response. If not, it can even direct the web framework to render the same login template as before, including an extra message to let the user know that the information submitted was invalid.

Data in Routes

Finally, routes can also include placeholders for data, similar to wildcards. This is most commonly used in RESTful routing, short for “Representation State Transfer,” which we’ll cover in a later chapter.

For example, a web application might be configured with a route that matches the path /title/<id>, where <id> is a placeholder for some data that is provided as part of the path. So, if the user requests the item at path /title/123, the web framework will know that the user is requesting information about the title with the ID of 123.

In fact, if you look closely at many websites today, you’ll see this pattern all over! A great example is IMDb (the “Internet Movie Database”), where the url https://www.imdb.com/title/tt0076759/ takes you to this page about the original Star Wars movie. In that URL, we see the RESTful route /title/tt0076759, where tt0076759 is the identifier for Star Wars.

We can even explore this by changing the identifier a bit and seeing where that takes us. If we increment the identifier by 1, we get https://www.imdb.com/title/tt0076760/, which takes us to the page about the movie Starship Invasions, released in the same year as Star Wars. In fact, by trying several similar identifiers, we can quickly guess that some of the data on IMDb from movies was loaded alphabetically by year of release!

Leaking Data via Routes

While RESTful routes using sequential identifiers such as this one are really useful, they can also cause issues. One common cause of this is attaching sequential identifiers to user uploaded data. In this way, any user who uses the platform can upload a piece of data, and then use the identifier attached to that piece of data to guess the identifier of data from other users. This is referred to as an Insecure Direct Object Reference or IDOR. If the website doesn’t properly limit access to this data, it could result in private data being publicly available.

This was most recently in the news when it was revealed that the data from the Parler social network was downloaded by exploiting this bug, among others. Wired does a good job describing how it happened.

Template Inheritance

One thing you may have also noticed is that many web applications use the same layout across many different pages. Since each page in a web framework requires a different template, it could be very difficult to make sure that each of those pages includes the same information, and updating them would be a major hassle if there were several hundred or thousands of pages in the application.

Thankfully, most web frameworks also include the ability for templates to be composed of other templates. In that way, we can create a hierarchical structure of templates, and even create smaller templates that we can reuse over and over again in our code.

Layout Template

One of the most common ways to accomplish this is through the use of a top-level layout template, which defines the overall layout of the pages used by the web application. This could include specific CSS and JavaScript files, metadata, and even a common header, navigation bar, and footer for each page.

For example, here is a short layout template for the Jinja2 template engine used with Flask in Python, which would be stored in the file layout.html:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>{% block title %}{% endblock %} - Web Application</title>
  </head>
    
  <body>
      
    <header>
      <nav>
        <a href="/">Homepage</a>
      </nav>
    </header>

    <main>
        {% block content %}{% endblock %}
    </main>

    <footer>
      <div>
        <span>&copy; 2021 Web Application</span>
      </div>
    </footer>

  </body>
</html>

This layout includes a header with a title and some metadata. In addition, the body includes both a header and a footer with some information that should be included on every page in the application. Between those, we see a main section.

In Jinja2, the sections surrounded by curly braces and percent signs {% %} define blocks that can be replaced by other content. So, when we use this layout template, we can replace the title and content block with information specific to that page.

Using a Template

To use this layout template, we can just specify it as part of another template. For example, here is a template for a home page, titled index.html:

{% extends "layout.html" %}

{% block title %}Home Page{% endblock %}

{% block content %}

<p>Hello World!</p>

{% endblock %}

Pretty simple, isn’t it? This template basically defines the content to be placed in the title and content blocks, and at the top it specifies that it will use the template in layout.html as it’s layout template. So, when the template is rendered, we receive the following HTML:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Home Page - Web Application</title>
  </head>
    
  <body>
      
    <header>
      <nav>
        <a href="/">Homepage</a>
      </nav>
    </header>

    <main>
        <p>Hello World!</p>
    </main>

    <footer>
      <div>
        <span>&copy; 2021 Web Application</span>
      </div>
    </footer>

  </body>
</html>

This use of template inheritance can be done in most template engines used in web frameworks.

In addition, we can place other templates inside of our page template. We’ll see how to do that in the example project for this chapter.

Summary

In this chapter, we covered the background content for working with web applications. We learned about HTML, CSS and JavaScript, the three core technologies used on the World Wide Web today. We also learned about HTTP, the protocol used to request a website from a web server and then receive a response from that server.

We then explored static web pages, which made up the majority of the World Wide Web in the early days. However, as the web became more commonplace, the need for dynamic web pages increased. Initially, that process was very rudimentary, but eventually many web frameworks were created to simplify that process.

A web framework follows the same request-response model used by HTML. However, it uses the path of the web request, along with any additional data included in the request, to determine what page to render. This is a process called routing.

Finally, we saw how many template engines today support template inheritance, allowing us to define a hierarchical set of templates that make each page in our web application include the same basic information and structure.

With this information in hand, we can start building a web application as part of our semester project.

Review Quiz

Check your understanding of the new content introduced in this chapter below - this quiz is not graded and you can retake it as many times as you want.

Quizdown quiz omitted from print view.