Much of the content in this page was adapted from Nathan Bean’s CIS 400 course at K-State, with the author’s permission. That content is licensed under a Creative Commons BY-NC-SA license.
The World Wide Web was the brainchild of Sir Tim Berners-Lee. It was conceived as a way to share information across the Internet; in Berners-Lee’s own words, describing the idea as he first conceived it:
This project is experimental and of course comes without any warranty whatsoever. However, it could start a revolution in information access.
Clearly that revolution has come to pass. The web has become part of our daily lives.
There were three key technologies that Sir Tim Berners-Lee proposed and developed. These remain the foundations upon which the web runs even today. Two are client-side, and determine how web pages are interpreted by browsers. These are:
- Hypertext Markup Language
- Cascading Style Sheets
HTML
Hypertext Markup Language (HTML) is one of the three core technologies of the World Wide Web, along with Cascading Style Sheets (CSS) and JavaScript (JS). Each of these technologies has a specific role to play in delivering a website. HTML defines the structure and contents of the web page. It is a markup language, similar to XML (indeed, HTML is based on SGML, the Standard Generalized Markup Language, which XML is also based on).
HTML Elements
The structure of HTML consists of various tags. For example, a button in HTML looks like this:
<button onclick="doSomething()">
Do Something
</button>
HTML elements have an opening and a closing tag, and can have additional HTML content nested between these tags. Some HTML tags can also be self-closing, as is the case with the line break tag:
<br />
Let’s explore the parts of an HTML element in more detail.
The Start Tag
The start tag is enclosed in angle brackets (< and >). The angle brackets mark the text inside them as an HTML tag, rather than text to display. This guides the browser to interpret them correctly.
Because angle brackets are interpreted as defining HTML tags, you cannot use those characters directly to represent greater than and less than signs. Instead, HTML defines escape character sequences to represent these and other special characters. Greater than is &gt;, less than is &lt;. A full list can be found on MDN.
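As a small illustration of these escape sequences, the sketch below shows a paragraph whose text contains comparison operators. The sentence itself is made up for this example; &amp; is included because the ampersand that begins every escape sequence must itself be escaped when it appears in text.

```html
<!-- The browser displays: 3 < 5 && 5 > 3 -->
<p>3 &lt; 5 &amp;&amp; 5 &gt; 3</p>
```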
The Tag Name
Immediately after the < is the tag name. In HTML, tag names like button should be expressed in lowercase letters. This is a convention (as most browsers will happily accept any mixture of uppercase and lowercase letters), but it is very important when using popular modern web technologies like Razor and React, as these use capitalized (Pascal case) tag names to differentiate the components they inject into the web page from HTML elements.
The Attributes
After the tag name come optional attributes, which are key-value pairs expressed as key="value". Attributes should be separated from each other and the tag name by whitespace characters (any whitespace will do, but traditionally spaces are used). Different elements have different attributes available, and you can read up on what these are by visiting the MDN article about the specific element.
However, several attributes bear special mention:
- The id attribute is used to assign a unique id to an element, e.g. <button id="that-one-button">. The element can thereafter be referenced by that id in both CSS and JavaScript code. An element ID must be unique in an HTML page, or unexpected behavior may result!
- The class attribute is also used to assign an identifier used by CSS and JavaScript. However, classes don’t need to be unique; many elements can have the same class. Further, each element can be assigned multiple classes, as a space-delimited string, e.g. <button class="large warning"> assigns both the classes “large” and “warning” to the button.
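To make the roles of id and class more concrete, here is a minimal sketch of a page fragment that refers to both from CSS and JavaScript. The id that-one-button and the classes large and warning come from the examples above; the style rules and the click handler are illustrative assumptions.

```html
<style>
  /* A class selector (leading dot) matches every element with that class */
  .large { font-size: 2rem; }
  .warning { background-color: orange; }
  /* An id selector (leading hash) matches the single element with that id */
  #that-one-button { border: 2px solid black; }
</style>

<button id="that-one-button" class="large warning">Do Something</button>

<script>
  // Look the element up by its unique id and react to clicks
  const button = document.getElementById("that-one-button");
  button.addEventListener("click", () => {
    console.log("Clicked; classes are: " + button.className);
  });
</script>
```

The script is placed after the button so that the element already exists when getElementById runs.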
Also, some web technologies (like Angular) introduce new attributes specific to their framework, taking advantage of the fact that a browser will ignore any attributes it does not recognize.
The Tag Content
The content nested inside the tag can be plain text, or another HTML element (or collection of elements). HTML elements can have multiple child elements. Indentation should be used to keep your code legible by indenting any nested content, e.g.:
<div>
  <h1>A Title</h1>
  <p>This is a paragraph of text that is nested inside the div</p>
  <p>And this is another paragraph of text</p>
</div>
The End Tag
The end tag is also enclosed in angle brackets (< and >). Immediately after the < is a forward slash /, and then the tag name. You do not include attributes in an end tag.
If the element can never have content (HTML calls these void elements), it has no end tag at all and may be written in self-closing form, e.g. the <input> tag is typically written as self-closing:
<input id="first-name" type="text" placeholder="Your first name" />
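As a hedged sketch of where the self-closing form does and does not apply in HTML5, the fragment below contrasts elements that never have content with elements that merely happen to be empty. The ids, class name, and image path are made up for illustration.

```html
<!-- Elements that never have content: both spellings are accepted -->
<br>
<br />
<img src="logo.png" alt="Company logo" />
<input id="last-name" type="text" placeholder="Your last name" />

<!-- Elements that may have content still need an explicit end tag,
     even when they are empty -->
<textarea id="comments"></textarea>
<div class="spacer"></div>
```

Writing <div /> in an HTML document does not close the div; the browser ignores the slash and treats it as an ordinary start tag.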
Text in HTML
Text in HTML works a bit differently than you might expect. Most notably, any run of whitespace (spaces, tabs, and line breaks) is collapsed into a single space. Thus, the lines:
<blockquote>
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too;
If you can wait and not be tired by waiting,
Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
And yet don’t look too good, nor talk too wise:
<i>-Rudyard Kipling, excerpt from "If"</i>
</blockquote>
Would be rendered:
If you can keep your head when all about you Are losing theirs and blaming it on you, If you can trust yourself when all men doubt you, But make allowance for their doubting too; If you can wait and not be tired by waiting, Or being lied about, don’t deal in lies, Or being hated, don’t give way to hating, And yet don’t look too good, nor talk too wise: -Rudyard Kipling, excerpt from "If"
If, for some reason, you need to maintain the formatting of the included text, you can use the <pre> element (which indicates the text is preformatted):
<blockquote>
<pre>
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too;
If you can wait and not be tired by waiting,
Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
And yet don’t look too good, nor talk too wise:
</pre>
<i>-Rudyard Kipling, excerpt from "If"</i>
</blockquote>
Which would be rendered:
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too;
If you can wait and not be tired by waiting,
Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
And yet don’t look too good, nor talk too wise:
-Rudyard Kipling, excerpt from "If"
Note that the <pre> element preserves all whitespace, including indentation, so its contents should not be indented unless you want that indentation to appear on the page.
Alternatively, you can denote line breaks with <br/>, and non-breaking spaces with &nbsp;:
<blockquote>
If you can keep your head when all about you<br/>
Are losing theirs and blaming it on you,<br/>
If you can trust yourself when all men doubt you,<br/>
But make allowance for their doubting too;<br/>
If you can wait and not be tired by waiting,<br/>
Or being lied about, don’t deal in lies,<br/>
Or being hated, don’t give way to hating,<br/>
And yet don’t look too good, nor talk too wise:<br/>
<i>-Rudyard Kipling, excerpt from "If"</i>
</blockquote>
Which renders:
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too;
If you can wait and not be tired by waiting,
Or being lied about, don’t deal in lies,
Or being hated, don’t give way to hating,
And yet don’t look too good, nor talk too wise:
-Rudyard Kipling, excerpt from "If"
Additionally, as a programmer you may want to use the <code> element in conjunction with the <pre> element to display preformatted code snippets in your pages. There are even some JavaScript libraries available to automatically add syntax colors to your code.
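A minimal sketch of combining <pre> and <code> is shown below. The snippet being displayed is an arbitrary JavaScript function chosen for illustration. Note that the less-than sign inside it has to be escaped, and that the code starts on the same line as the opening tags so that no stray blank line or indentation from the page source ends up in the output.

```html
<pre><code>function isSmaller(a, b) {
  return a &lt; b;
}</code></pre>
```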
HTML Comments
HTML comments are identical to XML comments (as both inherited them from SGML). Comments start with the sequence <!-- and end with the sequence -->, e.g.:
<!-- This is an example of an HTML comment -->
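As a brief sketch, a comment can also span several lines and can wrap markup that should temporarily be ignored. The paragraph text here is made up for illustration.

```html
<!--
  Comments may span several lines, and can be used to
  temporarily disable markup:
  <p>This paragraph is not rendered.</p>
-->
<p>This paragraph is rendered.</p>
```

Comments are not displayed on the page, but they are still sent to the browser, so anyone who views the page source can read them.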
Basic Page Structure
HTML5 (the current HTML standard) pages have an expected structure that you should follow. This is:
<!DOCTYPE html>
<html>
  <head>
    <title>The title of your page goes here</title>
    <!-- other metadata about your page goes here -->
  </head>
  <body>
    <!-- The contents of your page go here -->
  </body>
</html>
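Filling in that skeleton, a minimal complete page might look like the sketch below. The title, heading, and paragraph text are placeholders. The lang attribute and the charset and viewport meta elements are not part of the skeleton above, but they are commonly included.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>My First Page</title>
  </head>
  <body>
    <h1>Hello, web!</h1>
    <p>This paragraph is the only content on the page.</p>
  </body>
</html>
```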
HTML Elements
Rather than include an exhaustive list of HTML elements, I will direct you to the list provided by MDN. However, it is useful to recognize that elements can serve different purposes:
- Some organize the page into sections like the header and footer - MDN calls these the Content sectioning elements
- Some define the meaning, structure or style of text - MDN calls these the Inline text semantics elements
- Some present images, audio, video, or other embedded multimedia content - MDN calls these the Image and multimedia elements and Embedded content elements
- Tables are composed of Table content elements
- User input is collected with Forms
There are more elements than these, but these are the most commonly employed, and the ones you should be familiar with.
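To see how these categories fit together, here is a hedged sketch of a page body that draws one or two elements from each group. All of the text, the image path, and the form's action URL are hypothetical.

```html
<body>
  <!-- Content sectioning -->
  <header>
    <h1>Course Catalog</h1>
  </header>
  <main>
    <!-- Inline text semantics -->
    <p>Enrollment is <strong>open</strong> until <em>Friday</em>.</p>

    <!-- Image and multimedia -->
    <img src="campus.jpg" alt="Aerial view of campus">

    <!-- Table content -->
    <table>
      <thead>
        <tr><th>Course</th><th>Credits</th></tr>
      </thead>
      <tbody>
        <tr><td>Intro to HTML</td><td>3</td></tr>
      </tbody>
    </table>

    <!-- Forms -->
    <form action="/search" method="get">
      <label for="query">Search courses</label>
      <input id="query" name="query" type="search">
      <button type="submit">Search</button>
    </form>
  </main>
  <footer>
    <p>Contact the registrar with questions.</p>
  </footer>
</body>
```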
Learning More
The MDN HTML Docs are recommended reading for learning more about HTML.