3.1. The Difference Between HTML and XHTML
From its very beginning, Hypertext Markup Language has been what makes the World Wide Web possible. It both conveys the thoughts of the person who created the page and defines nearly every aspect of what we see on each web page we visit. Like English, French, Spanish, Japanese, Russian, or any other language in use today, it is a living language, evolving and growing.
Early on, this growth was fast and sudden, with "features" often doing an end-run around the World Wide Web Consortium. Add to that the fact that many designers of web pages played fast and loose in an effort to have more content than the next guy. So what if some corners were cut? It was all about content, and content was king.
Enter XHTML, considered by some to be an effort to rein in the Wild West approach to web development by making HTML a dialect of XML. XHTML came in three flavors: transitional, strict, and frameset, with each flavor offering either different capabilities or different degrees of conformance to the XML standard.
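For reference, here is how the three flavors are typically selected, assuming XHTML 1.0 is the version in question. A document declares exactly one of these DOCTYPEs ahead of its html tag; they are listed together here only for comparison, and the comments are explanatory additions rather than part of any real page.

```html
<!-- Strict: presentational elements and attributes are left out -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- Transitional: also permits deprecated presentational markup -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- Frameset: for pages that divide the window into frames -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
```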
3.1.1. Not Well Formed
Probably the biggest single difference between HTML and XHTML is that XHTML must be well formed. "Not a big deal," you say. Well, it could be. The part of the document that isn't well formed doesn't have to be glaring, like a foot being attached to the forehead. Because an XHTML document is essentially XML, simply following the HTML practices that we've followed for years is enough to get us into trouble. Consider the following two HTML input statements:
<input type="text" name="bad" id="bad" value="Not well-formed">
<input type="text" name="alsobad" id="alsobad" value="Not well-formed" disabled>
Both statements are perfectly acceptable HTML, but as XHTML, they don't make the grade because neither is well formed. The problem with the first statement is that the tag isn't closed, which is perfectly acceptable in HTML but verboten in XHTML. Fortunately, correcting it is a simple matter; just close the tag in the manner of self-closing tags (end it with />) or treat it as a container tag with an explicit closing tag. The second statement has the same unclosed tag, plus a problem that might be a little harder to spot. I'll give you a hint: attributes. Yes, in XML, attributes must always have values, so give the disabled attribute one. disabled="disabled" might look goofy, but it works.
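The corrections just described might look something like the following sketch. The name, id, and value settings are hypothetical placeholders chosen for illustration. Both ways of closing the tag are well formed as XML, although the self-closing form with a space before the slash is generally the safer choice when older HTML browsers have to read the same page.

```html
<!-- Closed in the manner of a self-closing tag -->
<input type="text" name="good" id="good" value="Well formed" />

<!-- Closed as a container tag: well formed, but less friendly to HTML parsers -->
<input type="text" name="alsogood" id="alsogood" value="Well formed"></input>

<!-- Closed, and the disabled attribute given an explicit value -->
<input type="text" name="stillgood" id="stillgood" value="Well formed" disabled="disabled" />
```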
3.1.2. Well Formed
At first glance, it might appear that all that is required to convert HTML into XHTML is to slap a DTD before the HTML tag, close some tags, and clean up some attributes. Voilà, instant XHTML! Well, maybe, sometimes, occasionally, except on Tuesdays or at night during a full moon. You see, unfortunately, there is still a potential source of problems.
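Before getting to that problem, here is a rough sketch of what the "slap a DTD on it and tidy up" recipe produces. It assumes the XHTML 1.0 Strict DTD; the title, form, and field are placeholders invented for illustration. It shows only the mechanical steps of the conversion and does nothing about the remaining source of trouble.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- The xmlns attribute places the document in the XHTML namespace -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Converted page</title>
  </head>
  <body>
    <form action="#">
      <p>
        <!-- Every tag is closed and every attribute has a value -->
        <input type="text" name="sample" id="sample" value="Well formed" />
      </p>
    </form>
  </body>
</html>
```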
3.1.3. A Well-Formed Example
Thankfully, my despair didn't last very long. It wasn't like there was a death in the family, or Stargate SG-1 had been cancelled, or anything important like that. It was merely a technical speed bump (or white-tailed deer, to those of you in Pennsylvania) on the road of life. I wasn't worried because I knew a trick that would make anything well formed.
If you're unfamiliar with CDATA, it is the XML equivalent of saying "Pay no attention to that man behind the curtain." Basically, anything inside a CDATA section is treated as plain character data rather than parsed as markup, which is quite convenient for this case. There is, however, one problem with using CDATA: certain web browsers have issues with it, so it is necessary to hide it from the browser in the manner shown in Listing 3-1.
Listing 3-1. Hiding CDATA
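A minimal sketch of one widely used form of the technique, assuming an inline script block. The function name and body are hypothetical placeholders, and the details may differ from the approach the text has in mind. The idea is to tuck the CDATA markers inside JavaScript comments: a browser treating the page as HTML sees two harmless comments, while an XML parser sees a CDATA section and leaves the < and && inside it alone.

```html
<script type="text/javascript">
// <![CDATA[
// Hypothetical function: the < and && below would not be well formed
// as XML if they appeared outside a CDATA section.
function isSmallAndPositive(value) {
  return value > 0 && value < 10;
}
// ]]>
</script>
```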