Last week, we posted about the value of understanding HTML, the language of the web. While there are many tools that make website creation possible without requiring you to know any “code,” we feel that having a working knowledge of how the code behind a web page works is tremendously valuable.
Today, we’re going to take a basic look at HTML. We’ll expose the simple but powerful structure, and discuss how various elements of the language work. After reading through this article, you will be able to decipher much of the content in existing web pages, and you will be able to create your own!
What is HTML? Is it programming code? Is it complicated?
HTML is an acronym for “Hypertext Markup Language,” which in plain English means that it is a system for defining the appearance and behavior of sections of text. Just as you might write a text document in a word processor, and then highlight a section of the text and make it bold or italic, HTML serves as a flexible way to “mark up” portions of a document.
It’s not complicated, although it often appears to be. Websites that have complex formatting usually contain many HTML “tags,” making it initially difficult to read through a document. But once we understand the basics, it becomes much easier to figure out what’s going on.
HTML is not a programming language. There’s no computer logic to figure out: no conditions, loops, variables, or anything else that comes with programming languages. These kinds of things can be used to supplement HTML, but they are not part of the language. HTML simply divides content up for formatting purposes.
The structure of HTML
In HTML, we define areas of content by using HTML “tags.” The tag concept is surprisingly simple, but powerful. Every tag definition is enclosed within chevron characters (< and >), and is simply a predefined name or abbreviation (<tagname>). When we reach the end of a content section, we indicate this by using the same tag definition again, but with a forward slash added (</tagname>). Let’s look at a paragraph as an example.
In a word processor, you would probably start a new paragraph by pressing the Return (or Enter) key. You’d type out the content of your paragraph, and then you would press Return again when the paragraph is completed. For the web, we explicitly define the start and end of a paragraph with tags instead. The HTML tag to start a paragraph is simply the letter P. So, for example, I could write
<p>This is a new paragraph</p>
The <p> defines the start of a paragraph, and the </p> defines its end. In a web browser, that code would be displayed as a single paragraph reading “This is a new paragraph,” with the tags themselves hidden from view.
It’s a simple example, but it’s one that you’ll see everywhere you go on the internet, including this page! HTML starts to look more confusing as additional tags are added, but the basic thing to remember is that there are usually pairs of tags—one that opens a section of text, and one that closes it.
One of the powerful properties of HTML is that we can define hierarchical levels of markup instruction. For example, consider a sample paragraph in which two words, “bold text,” appear in bold.
The tag definition for bold text is “strong.” Sometimes, b (for bold) is used, also. So, to create the above paragraph using HTML, we simply add tags in the appropriate areas of the text, like so:
<p>This is a sample paragraph that also has <strong>bold text</strong> defined within it</p>
There is no limit to the number of tags within tags that you can have. There are, however, practical concerns. You would not, for example, nest a paragraph within a paragraph. You would end the current paragraph first, and start a new one.
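To make the nesting rules concrete, here is a small sketch (the sentences are made-up placeholder text, and “em,” the tag for emphasis, is described further down). Inner tags are closed before the tags that contain them, and a second paragraph begins only after the first one has been closed:

```html
<!-- Tags nested inside a paragraph: each inner tag closes before its outer tag -->
<p>This paragraph has <strong>bold text with <em>emphasized words</em> inside it</strong>.</p>

<!-- Paragraphs are not nested: close the first, then open the next -->
<p>This is the first paragraph.</p>
<p>This is the second paragraph.</p>
```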
A web page, with most of the typical HTML elements
Once we understand that a web page is simply text with tags inserted to modify its appearance, we can consider a much more complex example, and it should start to be a lot easier to read. Below is the code for a sample web page:
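As a minimal sketch of such a page (the title and wording are placeholders chosen for illustration), the code might look like this:

```html
<html>
  <head>
    <!-- The title appears in the browser's title bar, not on the page itself -->
    <title>My Sample Page</title>
  </head>
  <body>
    <!-- Everything visible on the page goes inside the body -->
    <p>This is a <strong>sample</strong> web page with some <em>emphasized</em> text.</p>
    <p>Here is a short list:</p>
    <ul>
      <li>First item</li>
      <li>Second item</li>
      <li>Third item</li>
    </ul>
  </body>
</html>
```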
There are many tags used in the example above. We start off by defining the start of an HTML document, and then its header. The header contains information that’s not actually part of the page (like the “title,” which is the text that will appear up in your web browser’s title bar at the top of the window). We close the head section (</head>), but do not close the HTML tag yet.
Still working within the “HTML” section, we create a “body” section, which will contain everything that is visible on the page. This is where you see familiar tags like “p” and “strong,” and some additional tags like “em” (for emphasis, usually displayed as italics). There is also an unordered list (“ul”) and its list items (“li”). Once all of the content has been divided into appropriate content sections, we end the body section and, finally, end the html section. When the page is opened in a web browser, the tags themselves are not displayed; only the formatted content appears.
An HTML tag can have more than just <tagname>. Sometimes, additional data is needed as well. For example, to make a hyperlink to another web page, we use the anchor tag: <a>text for hyperlink</a>. But this alone does not give the browser enough information to know where to jump to. We also need to specify the hyperlink reference, that is, the address of the destination page.
Any time that we need to provide additional attributes, we just include them inside the chevrons (“< >”) with the tag. In this case, we’ll specify a web address like so: <a href="http://www.wpi.edu">Link to the WPI home page</a>. The text inside the <a></a> tags becomes a hyperlink, which leads to our home page. Note that the quotation marks around the address must be plain, straight quotes; the curly quotes that word processors insert will not work.
HTML attributes often confuse people who are learning HTML because they add more complexity, and are not always easy to remember. Thankfully, you don’t have to remember them all; you only need to know how to use them. W3C, the organization that standardized HTML, provides a list of HTML elements on their website: http://www.w3.org/TR/REC-html40/index/elements.html.
You can click on any of the elements listed there to see a list of attributes associated with that particular element, and decide which attributes (if any) you need to use.
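As a small illustration of attributes in use, the sketch below places a link inside a paragraph. The href value is the address from the earlier example; the title attribute and the surrounding sentence are additions made here for illustration.

```html
<p>
  Visit the
  <!-- href says where the link goes; title adds optional descriptive text for the link -->
  <a href="http://www.wpi.edu" title="Link to the WPI home page">WPI home page</a>
  for more information.
</p>
```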
Images and video
Images are an interesting exception to the <tag>Content</tag> approach we’ve discussed so far. Since images are not textual in nature, there is no content to put between the opening and ending tags. Because of this, the HTML “img” element does not use a traditional tag pair to define the start and end of the content. Instead, we simply use one set of chevrons and appropriate attributes: <img src="http://media.wpi.edu/Academics/ATC/TTL_Logo-Smll_size.png" />
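A slightly fuller sketch of an image on a page is shown below. The src address is the one given above; the alt attribute, which supplies text to use when the image cannot be displayed, and its wording are assumptions added for illustration.

```html
<p>Our logo appears below this sentence.</p>
<!-- img has no closing tag; src points to the image file, alt describes it in text -->
<img src="http://media.wpi.edu/Academics/ATC/TTL_Logo-Smll_size.png" alt="Logo image" />
```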
Because of the evolving nature of multimedia formats, adding video content to the web can be a bit complicated. We generally recommend using “embed code” from whatever site you are uploading your video to. YouTube, for example, provides an “object” element with all of the attributes already filled out for you. You can simply paste this code snippet into the body section of your web page.
Try it out!
Getting started with HTML is easy. You can open up any plain text editor like Notepad and paste any of the code above into it. Save the document, but change the file type from .txt to .html, and then open the file in a web browser. To edit any HTML file you have already created, right-click on it and choose “Open With.” You can select Notepad from a list of available programs. (On Mac, use the Open With contextual menu.)
If you want to expand on your HTML experience, try using an HTML editing program like PageBreeze (Free–http://www.pagebreeze.com/) or Adobe Dreamweaver (30-day trial–http://www.adobe.com/cfusion/tdrc/index.cfm?product=dreamweaver). These tools allow you to input and format text as you would in a word processor, and then see the code that’s generated as a result. It is a great way to learn new tips and tricks for building web pages!
Next time, we’ll look at CSS, the language for complex formatting and styling of web elements. Until then, have fun experimenting.