The Web Standards

The W3C website, http://www.w3.org, has a huge number of standards in varying stages of creation. Not all of these standards concern us, and not all of the ones that concern us can be found at this website. However, the vast majority of standards that do concern us can be found there.

We're going to take a brief look now at the technologies and standards that have an impact on JavaScript and give a little background information about each. Some of the technologies may be unfamiliar, but we need to be aware of their existence at the very least.

HTML

The HTML standard is maintained by the W3C. This standard might seem fairly straightforward, given that each version should have introduced just a few new tags, but in reality the life of the standards body was vastly complicated by the browser wars. Versions 1.0 and 2.0 of HTML were simple, small documents, but when the W3C came to debate HTML version 3.0, they found that much of the new functionality they were discussing had already been superseded by new additions to the version 3.0 browsers, such as the <applet> and <style> tags. Version 3.0 was discarded, and a new version 3.2 became the standard.

However, a lot of the features that went into HTML 3.2 had been introduced at the behest of the browser manufacturers and ran contrary to the spirit of HTML, which was intended solely to define structure. The new features, stemming right back to the <font> tag, just confused the issue and added unnecessary presentational features to HTML. These features really became redundant with the introduction of style sheets. So suddenly, in the version 3 browsers, there were three distinct ways to define the style of an item of text. Which was the correct way? And if all three ways were used, which style did the text ultimately assume? Version 4.0 of the HTML standard was left with the job of untangling this mess, and it marked many tags as deprecated (that is, scheduled for removal in a later version of the standard). It was the largest version of the standard and included features that linked it to style sheets and the Document Object Model. It also added accessibility features for visually impaired users and for other previously neglected groups. The current version of the HTML standard is 4.01.

ECMAScript

JavaScript itself followed a trajectory similar to that of HTML. It was first used in Netscape Navigator and then added to Internet Explorer at a later date. The Internet Explorer version of JavaScript was christened JScript and wasn't far removed from the version of JavaScript found in Netscape Navigator. However, once again there were differences between the two implementations, and so a lot of care had to be taken when writing script for both browsers.

Oddly enough, it was left to the European Computer Manufacturers Association (ECMA) to propose a standard specification for JavaScript. This didn't appear until a few versions of JavaScript had already been released. Unlike HTML, which had been developed from the start with the W3C, JavaScript was a proprietary creation. This is the reason that it is governed by a different standards body. Microsoft and Netscape both agreed to use ECMA as the standards vehicle and debating forum, because of its reputation for fast-tracking standards and perhaps also because of its perceived neutrality. The name ECMAScript was chosen so as not to be biased toward either vendor's creation, and also because the "Java" in JavaScript was a trademark of Sun licensed to Netscape. The standard, named ECMA-262, laid down a specification that was roughly equivalent to the JavaScript 1.1 specification.

Having said that, the ECMAScript standard covers only core JavaScript features, such as the primitive data types of numbers, strings, and Booleans; native objects like the Date, Array, and Math objects; and procedural statements like for and while loops and if and else conditionals. It makes no reference to client-side objects and collections, such as window, document, forms, links, and images. So, while the standard helps to make core programming tasks compatible when both JavaScript and JScript comply with it, it is of no use in making the scripting of client-side objects compatible between the main browsers. Some incompatibilities remain.
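To make the distinction concrete, the following sketch separates the two kinds of code. The first part uses only features that the ECMAScript standard covers (primitives, the native Array, Math, and Date objects, and a for loop). The last part touches the client-side document object, which is supplied by the browser rather than by the language. The variable names and values are illustrative assumptions and don't come from any specification.

```javascript
// Core language features covered by ECMAScript.
var count = 3;                       // number primitive
var label = "item";                  // string primitive
var ready = true;                    // Boolean primitive

var squares = [];                    // native Array object
for (var i = 1; i <= count; i++) {   // procedural for loop
    squares.push(i * i);
}

var largest = Math.max.apply(null, squares);  // native Math object
var epoch = new Date(0);                      // native Date object

// Client-side objects are NOT part of ECMAScript. They exist only when
// the host environment (a browser) provides them, so we test first.
var hasDocument = typeof document !== "undefined";
if (ready && hasDocument) {
    document.title = label + " " + largest;
}
```

The first part should behave identically in any conforming engine. Only the final block depends on the environment, and that is where the browser differences described above tend to surface.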

All current implementations of JavaScript are expected to conform to the current ECMAScript standard, which is ECMAScript edition 3, published in December 1999. As of January 2004, ECMAScript edition 4 is under development.

While in the version 3 browsers there were quite a few irregularities between the Microsoft and Netscape dialects of JavaScript, they're now similar enough to be considered the same language. The Opera browser offers the same level of support for the standard. This is a good example of how standards have provided a uniform language across browser implementations, although a feature war similar to the one that took place with HTML still applies to a lesser degree for JavaScript.

XML

Extensible Markup Language, or XML, is a standard for creating markup languages (such as HTML). XML itself has been designed to look as much like HTML as possible, but that's where the similarities end.

HTML is actually an application of the meta-language SGML, which is also a standard for generating markup languages. SGML has been used to create many markup languages, but HTML is the only one that enjoys universal familiarity and popularity. XML, on the other hand, is a subset of SGML. SGML is generally considered too complex to implement fully and reliably in software, so XML was designed as a simplified subset that is easier both to process and to read.

XML's main purpose is the creation of customized markup languages that are very similar in look and structure to HTML. One main use of XML is the representation of data. Whereas a normal database can store information, it doesn't allow individual stored items to carry information about their own structure. XML can use the tag structure of markup languages to represent any kind of data, from mathematical and chemical notations to the entire works of Shakespeare, where information contained in the structure of the data might otherwise be lost. For instance, an XML document could be used to record that Mark Antony doesn't appear until Act I, Scene II of Shakespeare's play Julius Caesar, while a relational database would struggle to do this without a lot of extra fields, as the following example shows:

<play>
   <act1>
      <scene1>
         ...
      </scene1>
      <scene2>
         <mark_antony>
            Caesar, my lord?
         </mark_antony>
      </scene2>
      <scene3>
         ...
      </scene3>
   </act1>
   <act2>
      ...
   </act2>
   <act3>
      ...
   </act3>
   <act4>
      ...
   </act4>
   <act5>
      ...
   </act5>
</play>

XML is also completely cross-platform, because it contains just text. This means that an application on Windows can package up the data in this format, and a completely different application on Unix should be able to unravel it and read the data.

XML is more complex than HTML. Whereas a browser will take HTML code, interpret the relevant details, and display the corresponding web page without any intervention, interpreting XML requires several extra steps.

Because we're creating the markup language ourselves, we first need to define the set of rules that documents written in the language must follow. This can be done in one of two ways: with an XML schema or with a Document Type Definition (DTD). Both are used to draw up rules, such as which tags we can use in our markup language, which attributes these tags take, and what kind of data these attributes expect.

Second, once we've written our XML document in our new language, it must be checked against both the syntax rules laid down for all XML documents (a document that passes is called well-formed) and the rules in the schema or the DTD (a document that passes is called valid). We'll be taking an in-depth look at XML in the next chapter.
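As a rough illustration of the first kind of check, the sketch below tests just one of XML's syntax rules: every opening tag must be closed, and tags must be closed in the reverse of the order in which they were opened. The function name checkNesting is a hypothetical helper, not part of any library. It is a deliberately simplified sketch and not a real XML parser: it ignores comments, CDATA sections, and attribute values that contain a > character, and it does nothing to check a document against a DTD or schema.

```javascript
// Simplified nesting check for XML-style tags (NOT a full XML parser).
function checkNesting(xml) {
    // Captures: 1 = "/" for a closing tag, 2 = tag name,
    // 3 = "/" for an empty-element tag such as <br/>.
    var tagPattern = /<(\/?)([A-Za-z_:][\w.\-:]*)[^>]*?(\/?)>/g;
    var open = [];   // stack of element names that are currently open
    var match;

    while ((match = tagPattern.exec(xml)) !== null) {
        var isClosing = match[1] === "/";
        var name = match[2];
        var isEmpty = match[3] === "/";

        if (isClosing) {
            // A closing tag must match the most recently opened element.
            // Names are compared exactly, because XML is case-sensitive.
            if (open.pop() !== name) {
                return false;
            }
        } else if (!isEmpty) {
            open.push(name);
        }
    }
    // Anything left on the stack was opened but never closed.
    return open.length === 0;
}

var wellNested = checkNesting("<play><act1><scene1>...</scene1></act1></play>");
var badlyNested = checkNesting("<play><act1></play></act1>");
```

Here wellNested is true and badlyNested is false. In practice this job belongs to an XML parser, which checks many more rules than this one.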

XHTML

XHTML 1.0 is where the XML and HTML standards meet. XHTML is just a respecification of the HTML 4.01 standard as an XML application. This allows XHTML to get around some of the problems caused by a browser's particular interpretation of HTML and, more importantly, provides a specification that allows the Web to be used by clients other than browsers, such as those on handheld computers, mobile phones, or any software device that might be connected to the Internet (perhaps even your refrigerator).

It also offers a common method for specifying our own tags, instead of just adding them randomly: we define the new tags in an XML DTD and identify them with an XML namespace, which is a way of distinguishing one set of tags uniquely from any other set. This is particularly useful for the new markup languages, such as Wireless Markup Language (WML), which are geared toward mobile technology and require a different set of tags to display on reduced interfaces.

Having said that, anyone familiar with HTML should be able to look at an XHTML page and understand what's going on. There are differences, but not ones that add new tags or attributes.

The following list points out the main differences between XHTML and HTML:

XHTML documents should begin with an XML declaration at the top of the file: <?xml version='1.0'?>. (The declaration is strictly required only when the document's character encoding is something other than UTF-8 or UTF-16.)
We also have to provide a DOCTYPE declaration at the top of the file, referencing the XHTML DTD we are using.
We have to include a reference to the XHTML namespace within the <html> tag.
We need to supply all XHTML tag and attribute names in lowercase, because XML is case-sensitive.
The <head> and <body> elements must always be included in an XHTML document.
Tags must always be closed and nested correctly. When an element has no content and so needs only one tag, such as a line break, the tag is closed with a /, for example <br/>.
Attribute values must always be enclosed in quotation marks.

This set of rules makes it possible to keep a strict hierarchical structure to the tags, which in turn makes it possible for the Document Object Model to work correctly. This is the route that HTML is currently taking, and all future HTML standards will be XHTML standards. This also provides a way of standardizing markup languages across all device types, so that the next version of WML (the markup language of mobile devices) will also be compliant with the XHTML standard. We should now be creating our HTML documents according to the rules listed above. If we do so, writing JavaScript that manipulates the page via the DOM, and that works as intended, becomes much, much easier.
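The rules in the list above can be seen together in a minimal XHTML 1.0 Strict page. In the sketch below the page is held in a JavaScript string purely for illustration, and the hypothetical helper checkXhtmlBasics spot-checks three of the rules with regular expressions. It is an assumption-laden sketch and no substitute for a validating parser, which is what actually enforces the DTD.

```javascript
// A minimal XHTML 1.0 Strict document, held as a string for illustration.
var xhtmlPage = [
    "<?xml version='1.0'?>",
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"',
    '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
    '<html xmlns="http://www.w3.org/1999/xhtml">',
    "<head><title>Example</title></head>",
    "<body><p>First line<br/>Second line</p></body>",
    "</html>"
].join("\n");

// Hypothetical helper: rough spot checks only, not real validation.
function checkXhtmlBasics(page) {
    return {
        // A DOCTYPE declaration is present.
        hasDoctype: /<!DOCTYPE\s+html\s+PUBLIC/.test(page),
        // The <html> tag references the XHTML namespace.
        hasNamespace: /<html[^>]*\sxmlns="http:\/\/www\.w3\.org\/1999\/xhtml"/.test(page),
        // No tag name contains an uppercase letter.
        tagsLowercase: !/<\/?[a-z0-9]*[A-Z]/.test(page)
    };
}

var report = checkXhtmlBasics(xhtmlPage);
```

For the page above, all three properties of report are true. An old-style page such as <HTML><BODY></BODY></HTML> fails all three checks.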

It's now time for us to consider the Document Object Model itself.