HTML documents use tags to indicate formatting or structural information. A tag is simply a left angle bracket ( < ) followed by a directive and zero or more parameters, followed by a right angle bracket ( > ). The remainder of this document explains the various HTML directives.
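As a small illustration of that anatomy, the two tags below use directives and parameters that appear later in this guide. The file name myimage.gif is only a placeholder, and the comments are annotations rather than required markup.

```html
<!-- Directive "b" with no parameters, followed by its ending tag. -->
<b>bold text</b>

<!-- Directive "img" with two parameters, align and src. -->
<img align=top src="myimage.gif">
```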
Here is a simple example of an HTML document:

<title>Simple example of an HTML document.</title>
<h1>A simple example.</h1>
This is a simple HTML document. This is the first paragraph. <p>
This is the second paragraph. This is a word in <i>italics</i>.
This is a word in <b>bold</b>. Here is an inlined GIF image:
<img src="myimage.gif">. <p>
This is the third paragraph. Here is a hypertext link from the word
<a href="subdir/myfile.html">foo</a> to a document called
"subdir/myfile.html". <p>
<h2>A second-level header.</h2>
Here is a section of text that should show up in a fixed-width font
(as if it were a computer listing or a verse of poetry): <p>
<pre>
    The cat in the hat
    fell to the ground and went splat.
</pre>
This is a bulleted list with two items: <p>
<ul>
<li> First item goes here.
<li> Second item goes here.
</ul>
This is the end of my example document. <p>
<address>John Bigbooty</address>

Note that any HTML document from anywhere on the net that you access with Mosaic can easily be used as an example: just use the Document Source option in Mosaic's File menu to call up a window showing the HTML for the document currently being viewed.
The title generally goes on the first line of the document. Here is an example title:
<title>This is my document's title.</title>
Notice that the directive for the title tag is, appropriately enough, title. Note also that there are both starting and ending title tags, and that the ending tag looks just like the starting tag except that a slash ( / ) precedes the directive. (This is also a good time to note that HTML is not case sensitive: both <title> and <TITLE> mean the same thing.)
Headers are displayed within the document, generally using larger and/or bolder fonts than normal document text. There are six levels of headers (numbered 1 through 6), with 1 being the largest. (Usually only levels 1 through 3 are used with any frequency.)
Here is an example level 1 header:
<h1>This is a level 1 header.</h1>
Here is an example level 2 header:
<h2>This is a level 2 header.</h2>
Most documents use the same five or six words both for the title and for the initial (level 1) header; for example, the first two lines of the HTML source for this document are:
<title>A Beginner's Guide to HTML</title>
<h1>A Beginner's Guide to HTML</h1>
A paragraph is terminated with the paragraph tag, <p>. Here is an example paragraph, complete with terminating paragraph tag:
This is my first sentence. This is my second sentence. This is my third sentence. This is the end of the paragraph. <p>
Three characters cannot be used as-is in the text of an HTML document: left angle bracket ( < ), right angle bracket ( > ), and ampersand ( & ). Why is this? The angle brackets are used to specify HTML tags (as shown above), while ampersand is used as the escape mechanism for these and other characters:
&lt; is the escape sequence for <
&gt; is the escape sequence for >
&amp; is the escape sequence for &
Additional escape sequences are also available; notably, there is a whole set of such sequences to support 8-bit character sets (namely, ISO 8859-1). For example:
&ouml; is the escape sequence for a lowercase o with an umlaut: ö
&ntilde; is the escape sequence for a lowercase n with a tilde: ñ
&Egrave; is the escape sequence for an uppercase E with a grave accent: È
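To see these sequences in context, here is a sketch of a paragraph as it might be typed in the HTML source. The sentences are made up for illustration. A viewer would display the escape sequences as the characters <, >, &, ö, and ñ.

```html
To start a title, type &lt;title&gt;. The ampersand in "R&amp;D" has to be
escaped as well. Accented letters can be written with escape sequences
too, as in sch&ouml;n and ma&ntilde;ana. <p>
```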
Individual words or sentences can be put in italic, bold, or fixed-width styles. Correspondingly, you should know about the following three directives:
<i>text</i> puts text in italics.
<b>text</b> puts text in bold.
<code>text</code> puts text in a fixed-width font.
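A minimal sketch of the three directives used together in running text follows; the sentence and the file name index.html are invented for the example.

```html
Save the file as <code>index.html</code>, and <b>do not</b> rename it
unless the <i>entire</i> directory is moved. <p>
```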
Here's how that image was inlined into the document text above:
<img align=top src="elvis-small.gif">
Note in particular the align=top parameter: it directs the document viewer to align adjacent text with the top of the image (rather than the bottom, as is the default). So if you just say <img src="elvis-small.gif">, adjacent text is aligned with the bottom of the image. This default behavior is especially suited for using an image at the beginning of a paragraph.
Multiple instances of the img tag can be scattered through the document, but note that each such image takes time to process and thus slows down the initial display of the document. (Using a particular image multiple times in a document causes no performance hit compared to using the image only once, though.)
(Note that the img tag is an HTML extension that is currently only understood by NCSA Mosaic and not by most other World Wide Web browsers.)
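The directive for hypertext links is a, which stands for anchor (a common term for one end of a hypertext link). An anchor is commonly used to point to somewhere from the current document. Here's how that works:
Start the anchor with <a followed by a space.
Specify the document being pointed to with the parameter href="document.html", and follow that with the closing angle bracket: >.
Enter the text that will serve as the hyperlink in the current document.
End the anchor with the ending tag: </a>.
Here is a sample hypertext reference:
<a href="subdir/document.html">some text</a>
...which causes "some text" to be the hyperlink to the document named "subdir/document.html".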
Note that inlined images (explained above) can serve as the contents of anchors. For example, a picture of Elvis can be made a hyperlink to the NCSA Mosaic documentation, so that when you click on Elvis, you get the Mosaic docs. The HTML for that is:
<a href="http://machine.name/subdir/file.html"> <img src="elvis-small.gif"></a>
Here's an example. In document A, I have a traditional hyperlink, but the hypertext reference (href) gives not only the filename ("document-b.html") but also the name of a named anchor in the referenced document ("foobar"), with those two things separated by a hash mark ("#"):
This is my <a href="document-b.html#foobar">link</a>.
Meanwhile, in document B, I have a lot of other text, and then the following:
Here's <a name="foobar">some random text</a>.
Therefore, the link in document A points directly at the words "some random text" in document B, and following the link from document A will not only jump the reader to document B but will position document B in the window such that "some random text" is immediately visible no matter where in document B it's located. (In Mosaic, the window will be scrolled far enough down so "some random text" will be on the top line of the viewable region of the window, if possible.)
An offshoot of this technique is that you can have hyperlink cross-references within a single document: to point to a named anchor with name "blargh" in the current document, just give "#blargh" as the href for the hyperlink (omitting a filename):
I'm pointing to the named anchor "blargh" in this document with this <a href="#blargh">link</a>.
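For completeness, here is a sketch showing both ends of such a cross-reference in one file. The anchor name "blargh" comes from the text above; the header and the surrounding sentences are invented for the example.

```html
See the <a href="#blargh">troubleshooting notes</a> further down. <p>

<h2><a name="blargh">Troubleshooting notes</a></h2>
Following the link above scrolls the viewer to this header. <p>
```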
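To make a bulleted list, start with an opening <ul> tag. Begin each item with an <li> tag. (There is no closing tag for list items.) End the list with a closing </ul> tag. Here is an example two-item list: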
<ul>
<li> First item goes here.
<li> Second item goes here.
</ul>
For a numbered list, do the same thing except use the ol directive rather than the ul directive. For example:
<ol>
<li> First item goes here.
<li> Second item goes here.
</ol>
Lists can be arbitrarily nested: any list item can itself contain lists. Also note that no paragraph separator (or anything else) is necessary at the end of a list item; the subsequent <li> tag (or list end tag) serves that role. (One can also have a number of paragraphs, each themselves containing nested lists, in a single list item, and so on.) An example nested list follows:
<ul>
<li> This item includes a nested list.
    <ul>
    <li> First item of nested list.
    <li> Second item of nested list.
    </ul>
<li> Second item goes here.
    <ul>
    <li> Only item of second nested list.
    </ul>
</ul>
This is displayed as a two-item bulleted list, with each nested list indented beneath the item that contains it.
Here's an example description list:
<dl>
<dt> This is the first "title".
<dd> This is the first "description", followed by a lot of completely
meaningless text intended to make sure that at least one line wrap will
occur for a reasonable window width, and if you don't have a window
width wide enough to cause at least a single line wrap, you should
narrow your window at this point, otherwise this example is pretty much
pointless and here I sit getting carpal tunnel syndrome typing in all
this verbiage all for nothing.
<dt> This is the second "title".
<dd> This is the second "description".
</dl>
...which comes out with each title on its own line and its description indented beneath it.
To display text exactly as typed, in a fixed-width font with spacing and line breaks preserved, use the pre tag ("pre" stands for preformatted). For example, the following HTML:

<pre>
  column 1   column 2   column 3
  --------   --------   --------
     133.0      115.0      332.5
  +  556.0   +  332.6   +  229.3
  =  689.0   =  447.6   =  561.8
</pre>

...will result in exactly this:

  column 1   column 2   column 3
  --------   --------   --------
     133.0      115.0      332.5
  +  556.0   +  332.6   +  229.3
  =  689.0   =  447.6   =  561.8

No surprises there. (You should be aware that you can also embed hypertext references inside pre sections without losing the formatted effects. This capability is used, for example, in the manual page interfaces provided through Mosaic.)
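As a sketch of hypertext references inside a pre section, the fragment below lays out a small index in the style of a manual page listing. The file names ls.html and cp.html are hypothetical. The columns stay aligned because the tags themselves take up no space in the displayed text.

```html
<pre>
<a href="ls.html">ls</a>      list directory contents
<a href="cp.html">cp</a>      copy files
</pre>
```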
In general, you should avoid using pre whenever possible, on the principle that the final results will be much less flexible and attractive than full HTML. (Most people seem to think that preformatted, fixed-width text, an artifact of the typewriter and primitive computer era, looks pretty baroque compared to formatted text.)
Here is an example of overlapping constructs, in which the anchor starts inside the header but ends outside it:
<h1>This is <a name="foo">invalid HTML.</h1></a>
Since many HTML parsers aren't very good at handling invalid HTML, it is always best to avoid overlapping constructs.
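One way to repair that example, shown here as a sketch, is to close the anchor before closing the header, so that the anchor sits entirely inside the header:

```html
<h1>This is <a name="foo">valid HTML.</a></h1>
```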
If an img tag points at an image that does not exist or cannot otherwise be obtained from whatever server is supposed to be serving it, the NCSA logo will be substituted in its place. For example, doing <img src="doesNotExist.gif"> (where "doesNotExist.gif" does not exist) causes the NCSA logo to be displayed. If this happens to you, first make sure that the referenced image does in fact exist, then make sure the remote server (if any) can actually serve it, then make sure the image file is uncorrupted (and that your server is not corrupting it; the NCSA httpd doesn't corrupt images, but certain other common http servers do).
The in-development HTML RFC is here.
A description of SGML, the Standard Generalized Markup Language on which HTML is based, is here.
A simple overview of Universal Resource Locators (the extended filename references used in hypertext links and in the src part of an img tag) is here; this overview is still incomplete and will improve in the future.
The URL specification itself is here.
A style guide for online hypertext document structures can be found here.
Copyright 1993 Marc Andreessen (marca@ncsa.uiuc.edu)