Controlling a web browser

To control a web browser properly, you need to understand the following technologies.

SGML

SGML defines the rules for creating a markup language for documents. A marked-up document consists of text and tags. Tags define the structure of the document; text is everything else. An element in the document is enclosed by a start tag and an end tag.

Documents encoded using SGML can be modelled as a tree. The root of the tree is the outermost element. Elements must nest properly inside other elements (i.e., improperly nested tags are illegal). The content between a start tag and an end tag forms the children of the element. Attributes can be attached to the start tag. An attribute is a name/value pair.
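The tree model can be sketched in a few lines of Python. This is only an illustration, not an SGML parser: it assumes Python's standard html.parser module as a stand-in tokenizer, and the Node class, the TreeBuilder class, the build_tree function, and the memo sample document are all hypothetical names invented for this example.

```python
from html.parser import HTMLParser


class Node:
    """One element in the document tree (hypothetical helper class)."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])  # attributes are name/value pairs
        self.children = []              # child elements and text strings


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects and rejects improperly nested tags."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []                 # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node            # the outermost element is the root
        self.stack.append(node)

    def handle_endtag(self, tag):
        # An end tag must close the innermost element that is still open.
        if not self.stack or self.stack[-1].tag != tag:
            raise ValueError("improperly nested end tag: " + tag)
        self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.stack[-1].children.append(text)


def build_tree(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


# Made-up sample document: one root, two children, one attribute.
doc = '<memo id="m1"><to>Ann</to><body>See <b>me</b></body></memo>'
root = build_tree(doc)
print(root.tag, root.attrs, [child.tag for child in root.children])
```

The stack holds the path from the root to the element currently being read, which is why an end tag that does not match the top of the stack signals improper nesting.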

The SGML specification describes the syntax of tags, attributes, and character encodings. A document type definition (DTD) defines the allowed tags and how they may be organized. It is essentially a context-free grammar.

HTML was defined as an application of SGML.

XML

XML is a simplified version of SGML. Like SGML, it provides a framework for creating documents, and it is also used for representing data.

The XML specification is described in under 50 pages. One of the goals for XML is to allow for simple software implementations.
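As a rough illustration of how little code is needed to work with XML once a parser is available, the sketch below uses Python's standard xml.etree.ElementTree module. The sample document and the is_well_formed helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document: a root element with one attribute and two children.
well_formed = '<list kind="todo"><item>one</item><item>two</item></list>'

root = ET.fromstring(well_formed)
items = [item.text for item in root.findall("item")]
print(root.tag, root.get("kind"), items)


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Improperly nested tags are rejected outright instead of being repaired.
print(is_well_formed("<a><b></a></b>"))
print(is_well_formed("<a><b/></a>"))
```

An XML parser stops at the first well-formedness error, unlike a browser reading HTML, which tries to recover from malformed markup.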

The XML standard is used to:

SGML is more focused on documents.

HTML and XHTML

HTML started at CERN as a language for researchers to document information and to link documents together. Originally it was concerned mostly with the structure of the information, not its presentation.

During the browser wars, many presentation tags were added.

XHTML is a return to a purely structural representation. CSS provides the presentation information. Separating the structure from the presentation allows the information in the document to be presented in multiple ways.

XHTML syntax restrictions - a summary

Although XHTML inherits all the elements and attributes of HTML, it is more rigorous in the following ways:

In this class, try to stick to these rules as much as you can.

first

A document type declaration is not required in an HTML document; however, it is essential for controlling the behaviour of different web browsers.

first.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <title>Hello World</title>
    </head>
    <body>
        <p>
            Hello World
        </p>
        <ul>
            <li> one </li>
            <li> two </li>
        </ul>
    </body>
</html>

The DOCTYPE declaration specifies which document type definition the document follows. The content of the title element specifies the name of the document; the browser normally displays this name in the title bar.
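To see these two pieces of information from a program's point of view, the sketch below reads the DOCTYPE declaration and the title out of a shortened copy of first.html. It assumes Python's standard html.parser module; the HeadInfo class and the read_head function are hypothetical names chosen for this example, and this is not how a browser decides on its rendering mode.

```python
from html.parser import HTMLParser


class HeadInfo(HTMLParser):
    """Records the DOCTYPE declaration and the title text of a document."""

    def __init__(self):
        super().__init__()
        self.doctype = None         # stays None when no DOCTYPE is present
        self.title = ""
        self._in_title = False

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">" of a declaration.
        if decl.lower().startswith("doctype"):
            self.doctype = decl

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def read_head(markup):
    info = HeadInfo()
    info.feed(markup)
    info.close()
    return info


# Shortened copy of first.html from above.
FIRST_HTML = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <title>Hello World</title>
    </head>
    <body>
        <p>Hello World</p>
    </body>
</html>"""

info = read_head(FIRST_HTML)
print(info.title)
print(info.doctype)
```

For a document that has no DOCTYPE declaration at all, the doctype attribute stays None.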

first dom

HTML 4.01 DTD

HTML 4.01 defines the following DTDs:

first with loose

first-loose.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
        <title>Hello World</title>
    </head>
    <body>
        <p>
            Hello World
        </p>
        <ul>
            <li> one </li>
            <li> two </li>
        </ul>
    </body>
</html>

first with quirks

first-quirks.html
<html>
    <head>
        <title>Hello World</title>
    </head>
    <body>
        <p>
            Hello World
        </p>
        <ul>
            <li> one </li>
            <li> two </li>
        </ul>
    </body>
</html>

XHTML DTD

XHTML defines the following DTDs: