Chapter 2

Hyper-Text Transfer Protocol

Etiquette for Web Servers and Clients

Subsections of Hyper-Text Transfer Protocol

Introduction

At the heart of the world wide web is the Hyper-Text Transfer Protocol (HTTP). This is a protocol defining how HTTP servers (which host web pages) interact with HTTP clients (which display web pages).

It starts with a request initiated from the web browser (the client). This request is sent over the Internet using the TCP protocol to a web server. Once the web server receives the request, it must decide the appropriate response - ideally sending the requested resource back to the browser to be displayed. The following diagram displays this typical request-response pattern.

HTTP’s request-response pattern HTTP’s request-response pattern

This HTTP request-response pattern is at the core of how all web applications communicate. Even those that use websockets begin with an HTTP request.

Info

The HTTP standard, along with many other web technologies, is maintained by the World-Wide-Web Consortium (abbreviated W3C), stakeholders who create and maintain web standards. The full description of the Hyper-Text Transfer Protocol can be found here w3c’s protocols page.

Browser Requests

Before we get too deep in the details of what a request is, and how it works, let’s explore the primary kind of request you’re already used to making - requests originating from a browser. Every time you use a browser to browse the Internet, you are creating a series of HTTP (or HTTPS) requests that travel across the networks between you and a web server, which responds to your requests.

To help illustrate how these requests are made, we’ll once again turn to our developer tools. Open the example page this link. On that tab, open your developer tools with CTRL + SHIFT + i or by right-clicking the page and selecting “Inspect” from the context menu. Then choose the “Network” tab:

Selecting the network tab in the developer tools Selecting the network tab in the developer tools

The network tab displays details about each request the browser makes. Initially it will probably be empty, as it does not log requests while not open. Try refreshing the page - you should see it populate with information:

The populated network tab The populated network tab

The first entry is the page itself - the HTML file. But then you should see entries for site.css, brazil.gif, fiber-4814456_960_720.jpg, jquery-3.5.1.slim.min.js, and site.js. Each of these entries represents an additional resource the browser fetched from the web in order to display the page.

Take, for example, the two images brazil.gif and fiber-4814456_960_720.jpg. These correspond to <img> tags in the HTML file:

<img alt="Paper scene from the film Brazil" src="brazil.gif"/>
<img alt="Fiber optic cables" src="https://cdn.pixabay.com/photo/2020/02/03/00/12/fiber-4814456_960_720.jpg"/>
        

The important takeaway here is that the image is requested separately from the HTML file. As the browser reads the page and encounters the <img> tag, it makes an additional request for the resource supplied in its src attribute. When that second request finishes, the downloaded image is added to the web page.

Notice too that while one image was on our webserver, the other is retrieved from Pixabay.com’s server.

Similarly, we have two JavaScript files:

<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js" integrity="sha256-4+XzXVhsDmqanXGHaHvgh1gMQKX40OUvDEBTu8JcmNs=" crossorigin="anonymous"></script>
<script src="site.js"></script>

As with the images, one is hosted on our website, site.js, and one is hosted on another server, jquery.com.

Both the Pixabay image and the jQuery library are hosted by a Content Delivery Network- a network of proxy servers that are distributed geographically in such a way that a request for a resource they hold can be processed from a nearby server. Remember that the theoretical maximum speed for internet transmissions is the speed of light (for fiber optics) or electrons in copper wiring. Communication is further slowed at each network switch encountered. Serving files from a nearby server can prove very efficient at speeding up page loads because of the shorter distance and smaller number of switches involved.

A second benefit of using a CDN to request the JQuery library is that if the browser has previously downloaded the library when visiting another site it will have cached it. Using the cached version instead of making a new request is much faster. Your app will benefit by faster page loads that use less bandwidth.

Notice too that the jQuery <script> element also uses the integrity attribute to allow the browser to determine if the library downloaded was tampered with by comparing cryptographic tokens. This is an application of Subresource Integrity, a feature that helps protect your users. As JavaScript can transform the DOM, there are incentives for malicious agents to supplant real libraries with fakes that abuse this power. As a web developer you should be aware of this, and use all the tools at your disposal to keep your users safe.

You can use the network tab to help debug issues with resources. Click on one of the requested resources, and it will open up details about the request:

Request details in the Network Tab Request details in the Network Tab

Notice that it reports the status code along with details about the request and response, and provides a preview of the requested resource. We’ll cover what these all are over the remainder of this chapter. As you learn about each topic, you may want to revisit the tab with the example to see how these details correspond to what you are learning.

Request Format

So now that we’ve seen HTTP Requests in action, let’s examine what they are. A HTTP Request is just a stream of text that follows a specific format and sent from a client to a server.

Http Request as a stream of text Http Request as a stream of text

It consists of one or more lines terminated by a CRLF (a carriage return and a line feed character, typically written \r\n in most programming languages).

  1. A request-line describing the request
  2. Additional optional lines containing HTTP headers. These specify details of the request or describe the body of the request
  3. A blank line, which indicates the end of the request headers
  4. An optional body, containing any data belonging of the request, like a file upload or form submission. The exact nature of the body is described by the headers.

The Request-Line

The request-line follows the format

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

The Method refers to the HTTP Request Method (often GET or POST).

SP refers to the space character.

The Request-URI is a Universal Request Indicator, and is typically a URL or can be the asterisk character (*), which refers to the server instead of a specific resource.

HTTP-Version refers to the version of the HTTP protocol that will be used for the request. Currently three versions of the protocol exist: HTTP/1.0, HTTP/1.1, and HTTP/2.0. Most websites currently use HTTP/1.1 (HTTP/2.0 introduces many changes to make HTTP more efficient, at the cost of human readability. Currently it is primarily used by high-traffic websites).

Finally, the CRLF indicates a carriage return followed by a line feed.

For example, if we were requesting the about.html page of a server, the request-line string would be:

GET /about.html HTTP/1.1\r\n

The Headers

Header lines consist of key-value pairs, typically in the form

Header = Key: Value CRLF

Headers provide details about the request, for example, if we wanted to specify we can handle the about.html page data compressed with the gzip algorithm, we would add the header:

Accept-Encoding: compress, gzip\r\n

The server would then know it could send us a zipped version of the file, resulting in less data being sent from the server to the client.

If our request includes a body (often form data or a file upload), we need to specify what that upload data is with a Content-Type header and its size with a Content-Length header, i.e.:

Content-Length: 26012 Content-Type: image/gif

A Blank Line

The header section is followed by a blank line (a CRLF with no characters before it). This helps separate the request metadata from the request body.

The Request Body

The body of the request can be text (as is the case for most forms) or binary data (as in an image upload). This is why the Content-Type header is so important for requests with a body; it lets the server know how to process the data. Similarly, the Content-Length header lets us know how many bytes to expect the body to consist of.

It is also acceptable to have no body - this is commonly the case with a GET request. If there is no body, then there are also no required headers. A simple get request can therefore consist of only the request-line and blank line, i.e.:

GET /about.html HTTP/1.1\r\n\r\n

Info

The HTTP/1.1 request definition can be found in W3C RFC 2616 Section 5

Request Methods

The first line of the HTTP request includes the request method, which indicates what kind of action the request is making of the web server (these methods are also known as HTTP Verbs). The two most common are GET and POST, as these are supported by most browsers.

Commonly Used HTTP Methods

The following requests are those most commonly used in web development. As noted before GET and POST requests are the most commonly used by web browsers, while GET, PUT, PATCH, and DELETE are used by RESTful APIs. Finally, HEAD can be used to optimize web communications by minimizing unnecessary data transfers.

GET

A GET request seeks to retrieve a specific resource from the web server - often an HTML document or binary file. GET requests typically have no body and are simply used to retrieve data. If the request is successful, the response will typically provide the requested resource in its body.

A HEAD request is similar to a GET request, except the response is not expected to provide a body. This can be used to verify the type of content of the resource, the size of the resource, or other metadata provided in the response header, without downloading the full data of the resource.

POST

The POST request submits an entity to the resource, i.e. uploading a file or form data. It typically will have a body, which is the upload or form.

PUT

The PUT request is similar to a POST request, in that it submits an entity as its body. It has a more strict semantic meaning though; a PUT request is intended to replace the specified resource in its entirety.

PATCH

The PATCH request is also similar to POST and PUT requests - it submits an entity as its body. However, its semantic meaning is to only apply partial modifications to the specified entity.

DELETE

As you might expect, the DELETE method is used to delete the specified resource from the server.

Info

Additional methods include CONNECT, which establishes a tunnel to a server; OPTIONS, which identifies communications options with a resource, and TRACE, which performs a message loop-back test to the target resource. HTTP Methods are defined in W3C’s RFC2616.

URIs and URLs

Before a web request can be made, the browser needs to know where the resource requested can be found. This is the role that a Universal Resource Locator (a URL) plays. A URL is a specific kind of Universal Resource Indicator (URI) that specifies how a specific resource can be retrieved.

Info

URLs and URIs The terms URL and URI are often used interchangeably in practice. However, a URL is a specific subset of URIs that indicate how to retrieve a resource over a network; while a URI identifies a unique resource, it does not necessarily indicate how to retrieve it. For example, a book’s ISBN can be represented as a URI in the form urn:isbn:0130224189. But this URI cannot be put into a browser’s Location to retrieve the associated book.

A URI consists of several parts, according to the definition (elements in brackets are optional):

URI = scheme:[//[userinfo@]host[:port]]path[?query][#fragment]

Let’s break this down into individual parts:

scheme: The scheme refers to the resource is identified and (potentially) accessed. For web development, the primary schemes we deal with are http (hyper-text transfer protocol), https (secure hyper-text transfer protocol), and file (indicating a file opened from the local computer).

userinfo: The userinfo is used to identify a specific user. It consists of a username optionally followed by a colon (:) and password. We will discuss its use in the section on HTTP authentication, but note that this approach is rarely used today, and carries potential security risks.

host: The host consists of either a fully quantified domain name (i.e. google.com, cs.ksu.edu, or gdc.ksu.edu) or an ip address (i.e. 172.217.1.142 or [2607:f8b0:4004:803::200e]). IPv4 addresses must be separated by periods, and IPv6 addresses must be closed in brackets. Additionally, web developers will often use the loopback host, localhost, which refers to the local machine rather than a location on the network.

port: The port refers to the port number on the host machine. If it is not specified (which is typical), the default port for the scheme is assumed: port 80 for HTTP, and port 443 for HTTPS.

path: The path refers to the path to the desired resource on the server. It consists of segments separated by forward slashes (/).

query: The query consists of optional collection of key-value pairs (expressed as key:value), separated by ampersands (&), and preceded by a question mark (?). The query string is used to supply modifiers to the requested resource (for example, applying a filter or searching for a term).

fragment: The fragment is an optional string proceeded by a hashtag (#). It identifies a portion of the resource to retrieve. It is most often used to auto-scroll to a section of an HTML document, and also for navigation in some single-page web applications.

Thus, the URL https://google.com indicates we want to use the secure HTTP scheme to access the server at google.com using its port 443. This should retrieve Google’s main page.

Similarly, the url https://google.com/search?q=HTML will open a Google search result page for the term HTML (Google uses the key q to identify search terms).

Request Headers

Request headers take the form of key-value pairs, separated by colons : and terminated with a CRLF (a carriage return and line feed character). For example:

Accept-Encoding: gzip

Indicates that the browser knows how to accept content compressed in the Gzip format.

Note that request headers are a subset of message headers that apply specifically to requests. There are also message headers that apply only to HTTP responses, and some that apply to both.

As HTTP is intended as an extensible protocol, there are a lot of potential headers. IANA maintains the official list of message headers as well as a list of proposed message headers. You can also find a categorized list in the MDN Documentation

While there are many possible request headers, some of the more commonly used are:

Accept Specifies the types a server can send back, its value is a MIME type.

Accept-Charset Specifies the character set a browser understands.

Accept-Encoding Informs the server about encoding algorithms the client can process (most typically compression types)

Accept-Language Hints to the server what language content should be sent in.

Authorization Supplies credentials to authenticate the user to the server. Will be covered in the authentication chapter.

Content-Length The length of the request body sent, in octets

Content-Type The MIME type of the request body

Content-Encoding The encoding method of the request body

Cookie Sends a site cookie - see the section on cookies later

User-Agent A string identifying the agent making the request (typically a browser name and version)

Request Body

After the request headers and an extra CRLF (carriage return and line feed) is the request body.

For GET and DELETE requests, there is no body. For POST, PUT, and PATCH, however, this section should contain the data being sent to the server. If there is a body, the headers should include Content-Type and Content-Length. The Content-Length is always provided as a count of octets (a set of eight bits). Thus, binary data is sent as an octet stream. Text data is typically sent in UTF-8 encoding.

Two body formats bear special mention: application/x-www-form-urlencoded and multipart/form-data. These encodings are commonly used for submitting HTML forms, and will be covered in more detail in the Form Data chapter.

Response Format

Similar to an HTTP Request, an HTTP response is typically a stream of text and possibly data:

HTTP Response as a stream of text and data HTTP Response as a stream of text and data

It consists of one or more lines of text, terminated by a CRLF (sequential carriage return and line feed characters):

  1. A status-line indicating the HTTP protocol, the status code, and a textual status
  2. Optional lines containing the Response Headers. These specify the details of the response or describe the response body
  3. A blank line, indicating the end of the response metadata
  4. An optional response body. This will typically be the text of an HTML file, or binary data for an image or other file type, or a block of bytes for streaming data.

The Status-Line

The status-line follows the format

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

The HTTP-Version indicates the version of the HTTP protocol that is being used (HTTP/1.0, HTTP/1.1, or HTTP/2.0).

SP refers to a space character.

The Status-Code is a three-digit numeric representation of the response status. Common codes include 200 (OK), 404 (Not Found), and 500 (Server Error).

The Reason-Phrase is a plain-text explanation of the status code.

Response Headers

Just like HTTP Requests, a HTTP response can contain headers describing the response. If the response has a body, a Content-Type and Content-Length header would be expected.

A Blank Line

The header section is followed by a blank line (a CRLF with no characters before it). This helps separate the response metadata from the response body.

Response Body

The response body contains the data of the response. it might be text (as is typically the case with HTML, CSS, or JavaScript), or a binary file (an image, video, or executable). Additionally, it might only be a sequence of bytes, as is the case for streaming media.

Info

The full HTTP/1.1 response definition can be found in W3C RFC 2616 Section 6.

Status Codes

The status-line consists of a numeric code, followed by a space, and then a human-readable status message that goes with the code. The codes themselves are 3-digit numbers, with the first number indicating a general category the response status falls into. Essentially, the status code indicates that the request is being fulfilled, or the reason it cannot be.

1XX Status Codes

Codes falling in the 100’s provide some kind of information, often in response to a HEAD or upgrade request. See the MDN Documentation for a full list.

2XX Status Codes

Codes in the 200’s indicate success in some form. These include:

200 OK A status of 200 indicates the request was successful. This is by far the most common response.

201 Created Issued in response to a successful POST request, indicates the resource POSTed to the server has been created.

202 Accepted Indicates the request was received but not yet acted upon. This is used for batch-style processes. An example you may be familiar with is submitting a DARS report request - the DARS server, upon receiving one, adds it to a list of reports to process and then sends a 202 response indicating it was added to the list, and should be available at some future point.

There are additional 200 status codes. See the MDN Documentation for a full list.

3XX Status Codes

Codes in the 300’s indicate redirects. These should be used in conjunction with a Location response header to notify the user-agent where to redirect. The three most common are:

301 Moved Permanently Indicates the requested resource is now permanently available at a different URI. The new URI should be provided in the response, and the user-agent may want to update bookmarks and caches.

302 Found Also redirects the user to a different URI, but this redirect should be considered temporary and the original URI used for further requests.

304 Not Modified Indicates the requested resource has not changed, and therefore the user-agent can use its cached version. By sending a 304, the server does not need to send a potentially large resource and consume unnecessary bandwidth.

There are additional 300 status codes. See the MDN Documentation for a full list.

4XX Status Codes

Codes in the 400’s indicate client errors. These center around badly formatted requests and authentication status.

400 Bad Request is a request that is poorly formatted and cannot be understood.

401 Unauthorized means the user has not been authenticated, and needs to log in.

403 Forbidden means the user does not have permissions to access the requested resource.

404 Not Found means the requested resource is not found on the server.

There are many additional 400 status codes. See the MDN Documentation for a full list.

5XX Status Codes

Status codes in the 500’s indicate server errors.

500 Server Error is a generic code for “something went wrong in the server.”

501 Not Implemented indicates the server does not know how to handle the request method.

503 Service Unavailable indicates the server is not able to handle the request at the moment due to being down, overloaded, or some other temporary condition.

There are additional 500 status codes. See the MDN Documentation for a full list.

Response Headers

Response headers take the form of key-value pairs, separated by colons : and terminated with a CRLF (a carriage return and line feed character), just like Request Headers (both are types of Message Headers). For example, this header:

Expires: Wed, 12 Jun 2019 08:00:00 CST

indicates to the browser that this content will expire June 12, 2019 at 8AM Central Standard Time. The browser can use this value when populating its cache, allowing it to use the cached version until the expiration time, reducing the need to make requests.

Note that response headers are a subset of message headers that apply specifically to requests. As we’ve seen there are also message headers that apply only to HTTP requests, and some that apply to both.

As HTTP is intended as an extensible protocol, there are a lot of potential headers. IANA maintains the official list of message headers as well as a list of proposed message headers. You can also find a categorized list in the MDN Documentation

While there are many possible response headers, some of the more commonly used are:

Allow Lists the HTTP Methods that can be used with the server

Content-Length The length of the response body sent, in octets

Content-Type The MIME type of the response body

Content-Encoding The encoding method of the response body

Location Used in conjunction with redirects (a 301 or 302 status code) to indicate where the user-agent should be redirected to.

Server Contains information about the server handling the request.

Set-Cookie Sets a cookie for this server on the client. The client will send back the cookie on subsequent requests using the Cookie header.

We’ll make use of these headers as we start writing web servers.

Response Body

After the response headers and an extra CRLF (carriage return and line feed) is the response body.

The body is typically text (for HTML, CSS, JavaScript, and other text files) or binary data (for images, video, and other file types).

Setting the Content-Type and Content-Length headers lets the web client know what kind of data, and how much of it, should be expected. If these headers are not supplied in the response, the browser may treat the body as a blob of binary data, and only offer to save it.

HTTP 2.0

You may have noticed that the earlier parts of this chapter focused on HTTP version 1.1, while mentioning version 2.0. You might wonder why we didn’t instead look at version 2.0 - and it’s a valid question.

In short, HTTP 2.0 was created to make the request-response pattern of the web more efficient. One method it uses to do so is switching from text-based representations of requests and responses to binary-based representations. As you probably remember from working with File I/O, binary files are much smaller than the equivalent text file. The same is true of HTTP requests and responses. But the structure of HTTP 2.0 Requests and Responses are identical to HTTP 1.1 - they are simply binary (an hence, harder to read).

But this is not the only improvement in HTTP 2.0. Consider the request-response pattern we discussed earlier in the chapter:

HTTP 2.0’s request-response pattern HTTP 2.0’s request-response pattern

To display a webpage, the browser must first request the page’s HTML data. As it processes the returned HTML, it will likely encounter HTML <img>, <link>, and <src> tags that refer to other resources on the server. To get these resources, it will need to make additional requests. For a modern webpage, this can add up quickly! Consider a page with 20 images, 3 CSS files, and 2 JavaScript files. That’s 25 separate requests!

One of the big improvements between HTTP/1.0 and HTTP/1.1 was that HTTP/1.1 does not close its connection to the server immediately - it leaves a channel open for a few seconds. This allows it to request these additional files without needing to re-open the connection between the browser and the server.

HTTP/2.0 takes this thought a step farther, by trying to anticipate the browser’s additional requests. In the HTTP/2.0 protocol, when a browser requests an HTML page, the server can push the additional, linked files as part of the response. Thus, the entire page content can be retrieved with a single request instead of multiple, separate requests. This minimizes network traffic, allowing the server to handle more requests, and speeds up the process of rendering pages on a client’s machine.

Statelessness and Scaling

One final point to make about Hyper-Text Transfer Protocol. It is a stateless protocol. What this means is that the server does not need to keep track of previous requests - each request is treated as though it was the first time a request has been made.

This approach is important for several reasons. One, if a server must keep track of previous requests, the amount of memory required would grow quickly, especially for popular web sites. We could very quickly grow past the memory capacity of the server hardware, causing the webserver to crash.

Scaling strategies Scaling strategies

A second important reason is how we scale web applications to handle more visitors. We can do vertical scaling - increasing the power of the server hardware - or horizontal scaling - adding additional servers to handle incoming requests. Not surprisingly, the second option is the most cost-effective. But for it to work, requests need to be handed quickly to the first available server - there is no room for making sure a subsequent request is routed to the same server.

Thus, HTTP was designed as a stateless protocol, in that each new HTTP request is treated as being completely independent from all previous requests from the same client. This means that when using horizontal scaling, if the first request from client BOB is processed by server NANCY, and the second request from BOB is processed by server MARGE, there is no need for MARGE to know how NANCY responded. This stateless property was critical to making the world-wide-web even possible with the earliest internet technologies.

Of course, you probably have experience with websites that do seem to keep track of state - i.e. online stores, Canvas, etc. One of the challenges of the modern web was building state on top of a stateless protocol; we’ll discuss the strategies used to do so in later chapters.

Summary

In this chapter, we explored the Hyper-Text Transfer Protocol (HTTP), the technology used to exchange data across the web. We saw how requests and responses are simply well-structured streams of text or data exchanged across a packet-switched network. We examined the structure of these requests, and saw how they all use a similar pattern of a request or response line, a series of headers carrying metadata about the request or response, and an optional body.

We also discussed HTTP request methods like GET and POST, and HTTP response codes like a 404 and what these terms mean. We explored how HTTP has evolved from HTTP 1.0 to 1.1 to 2.0. And we discussed how HTTP is a stateless protocol, and how that statelessness is key to making websites that can serve thousands if not millions of users.

We’ll continue to reference the content of this chapter as we move forward, as HTTP is one of the core web technologies along with HTML, CSS, and JavaScript.