6.3. Handling Verboten Characters

Occasionally when dealing with XML documents, you will encounter characters that cause a document not to be well formed. For example, imagine an element that contains a JavaScript function, such as the one shown in Listing 6-6. Examined from a JavaScript perspective, the function looks like it works, but examined from an XML point of view, there is one big, glaring error. Here is a hint: Look at the for loop.

Listing 6-6. A Script Element That Is Not Well Formed
XML interprets the less-than (<) operator as the beginning of a new tag, and from an XML viewpoint, that new tag is not well formed. Fortunately, you can use one of two methods to get around this issue: entities or CDATA sections. Each of these methods is suited to a different purpose, so let's examine each to determine which better suits our problem.

6.3.1. Entities

Entities. A part of me just likes to say the word entities. It's just a fun word to say, especially to a manager who is unfamiliar with XML. Just imagine someone's reaction when being told that the XML contains entities. Talk about your flashbacks to late-night horror movies! Of course, there is always the alternative: being fitted for a jacket with wraparound sleeves. Either way, you've gotten the manager's attention.

XML has five predefined entities whose purpose is to avoid well-formedness issues when encountering select common characters. Table 6-1 defines these five entities, and later topics cover how to define additional entities.
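For reference, the five predefined entities are &lt; (less than), &gt; (greater than), &amp; (ampersand), &apos; (apostrophe), and &quot; (quotation mark). The following is a minimal sketch of how they might be applied when XML text is assembled by hand. The helper name escapeXml and the sample loop header are assumptions made for illustration; they do not come from the chapter's listings.

```javascript
// The five entities predefined by XML, keyed by the character each replaces.
const PREDEFINED_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  "'": '&apos;',
  '"': '&quot;'
};

// Hypothetical helper: replace each special character with its entity.
// A single pass over the text avoids escaping an ampersand twice.
function escapeXml(text) {
  return text.replace(/[&<>'"]/g, (ch) => PREDEFINED_ENTITIES[ch]);
}

// A loop header of the kind that breaks well-formedness (illustrative only).
const loopHeader = 'for (var i = 0; i < 10; i++)';
console.log(escapeXml(loopHeader));
// prints: for (var i = 0; i &lt; 10; i++)
```

Escaping in one pass matters: running separate replacements with the ampersand handled last would turn &lt; into &amp;lt;.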
The JavaScript in Listing 6-6 can be made well formed by replacing the character < with its corresponding entity, &lt;. Unfortunately, although the use of entities would correct the issue from an XML point of view, from a JavaScript perspective, there is a world of difference between < and &lt;. To make both XML and JavaScript happy, it is necessary to use a CDATA section.

6.3.2. CDATA Sections

A CDATA section is the XML equivalent of "Pay no attention to that man behind the curtain," from The Wizard of Oz. However, there is no pesky little girl with a little dog to mess things up. Because of this, an XML parser treats whatever lies between a CDATA section's delimiters, <![CDATA[ and ]]>, as plain character data rather than markup, as shown in Listing 6-7.

Listing 6-7. A Well-Formed Script Element
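As a rough sketch of the same idea, assume the script text is assembled as a string before being placed in the document. The helper name wrapInCdata and the sample loop are hypothetical, chosen only for illustration. The one character sequence a CDATA section cannot hold is its own terminator, ]]>, so the sketch splits any occurrence across two adjacent sections.

```javascript
// Hypothetical helper: wrap script text in a CDATA section so that
// characters such as < and & need no entity replacement.
function wrapInCdata(text) {
  // A CDATA section ends at the first "]]>", so any occurrence inside
  // the text is split across two adjacent CDATA sections.
  const safe = text.split(']]>').join(']]]]><![CDATA[>');
  return '<![CDATA[' + safe + ']]>';
}

// Illustrative script body; the less-than operator is left untouched.
const source = 'for (var i = 0; i < 10; i++) { total += i; }';
const scriptElement =
  '<script type="text/javascript">' + wrapInCdata(source) + '</script>';

console.log(scriptElement);
// prints: <script type="text/javascript"><![CDATA[for (var i = 0; i < 10; i++) { total += i; }]]></script>
```

Unlike the entity approach, the JavaScript inside the section reads exactly as it was written, which is what lets both the XML parser and the script engine accept it.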