Java Tutorial

XML Namespaces

This is the last topic we need a little insight into before we get back into Java programming. Even though they are very simple, XML namespaces can be very confusing. The confusion arises because it is so easy to make assumptions about what they imply when you first meet them. Let's look briefly at why we have XML namespaces in the first place and then see what an XML namespace actually is.

We saw earlier that an XML document can only have one DOCTYPE declaration. This can identify an external DTD by a URI or include explicit markup declarations, or it may do both. What happens if we want to combine two or more XML documents that each have their own DTD into a single document? The short answer is we can't, or at least not easily. Since the DTD for each document will have been defined without regard for the other, element name collisions are a real possibility. It may be impossible to differentiate between elements that share a common name. In that case, dealing with the problem will require major revisions to the documents' contents as well as a new DTD, and it won't be easy.

XML namespaces are intended to help deal with this problem. They enable names used in markup to be qualified so that you can make duplicate names used in different markup unique by putting them in separate namespaces. An XML namespace is just a collection of element and attribute names that is identified by a URI. Each name in an XML namespace is qualified by the URI that identifies the namespace. Thus different XML namespaces may contain common names without causing confusion since each name is notionally qualified by the unique URI for the namespace that contains it.

I say 'notionally qualified' because you don't actually qualify names using the URI directly. You use another name called a namespace prefix whose value is the URI for the namespace. For example, I could have a namespace that is identified by the URI http://www.wrox.com/Toys, with the namespace prefix toys, that contains the name rubber_duck. I could have a second namespace with the URI http://www.wrox.com/BathAccessories and the namespace prefix BathAccessories that also contains the name rubber_duck.

The rubber_duck name from the first namespace is referred to as toys:rubber_duck and that from the second namespace is BathAccessories:rubber_duck, so there is no possibility of confusing them. Like all XML names, prefixes are case-sensitive, so a prefix must be written exactly as it was declared. The colon is used in the qualified name to separate the namespace prefix from the local name, which is why we said earlier you should avoid the use of colons in ordinary XML names.
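To see what this qualification amounts to in practice, here is a minimal sketch that parses a small document using both prefixes and asks a namespace-aware DOM parser what each element is called. The <catalog> root element and the class and method names are my own inventions for illustration; only the two URIs and the rubber_duck name come from the discussion above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class QualifiedNameDemo {
  // Hypothetical document: the <catalog> root is made up for this sketch.
  static final String XML =
      "<catalog xmlns:toys='http://www.wrox.com/Toys'"
    + " xmlns:BathAccessories='http://www.wrox.com/BathAccessories'>"
    + "<toys:rubber_duck/>"
    + "<BathAccessories:rubber_duck/>"
    + "</catalog>";

  // Returns "localName -> namespaceURI" for the child of <catalog> at the given index.
  static String describeChild(int index) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);  // the factory default is to ignore namespaces
      Document doc = factory.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(XML)));
      Node child = doc.getDocumentElement().getChildNodes().item(index);
      return child.getLocalName() + " -> " + child.getNamespaceURI();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse the sample document", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(describeChild(0));  // rubber_duck -> http://www.wrox.com/Toys
    System.out.println(describeChild(1));  // rubber_duck -> http://www.wrox.com/BathAccessories
  }
}
```

Both elements have the same local name, and it is the URI behind each prefix that tells them apart.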

Let's come back to the confusing aspects of namespaces for a moment. There is a temptation to imagine that the URI that identifies an XML namespace also identifies a document somewhere that specifies the names that are in the namespace. This is not required by the namespace specification. The URI is just a unique identifier for the namespace and a unique qualifier for a set of names. It does not necessarily have any other purpose, or even have to refer to a real document. It only needs to be unique. The definition of how names within a given namespace relate to one another and the rules for markup that uses them is an entirely separate question. This may be provided by a DTD or some other mechanism such as an XML Schema.

Namespace Declarations

A namespace declaration appears in the start tag of a particular element in a document, which can be, but does not have to be, the root element. A typical namespace declaration in an XML document looks like this:

<sketcher:sketch xmlns:sketcher="http://www.wrox.com/dtds/sketches">

A namespace declaration uses a special reserved attribute name, xmlns, within an element, and in this instance the namespace applies to the <sketch> element. The name sketcher that is separated from xmlns by a colon is the namespace prefix and it has the value http://www.wrox.com/dtds/sketches. You can use the namespace prefix to qualify names within the namespace, and since the prefix maps to the URI, the URI is effectively the qualifier for the name. The URI that I've given here is hypothetical; it doesn't actually exist, but it could. The sole purpose of the URI identifying the namespace is to ensure that names within the namespace are unique, so it doesn't matter whether it exists or not. You can add as many namespace declarations within an element as you want, and each namespace declared in an element is available within that element and its content.

With the namespace declared with the sketcher prefix, we can use the <circle> element that is defined in the sketcher namespace like this:

<sketcher:sketch xmlns:sketcher="http://www.wrox.com/dtds/sketches">
  <sketcher:circle radius="15" angle="0">
    <sketcher:color R="150" G="250" B="100"/>
    <sketcher:position x="30" y="50"/>
  </sketcher:circle>
</sketcher:sketch>

Each reference to the element name is qualified by the namespace prefix sketcher. A reference in the same document to a <circle> element that is defined within another namespace can be qualified by the prefix specified in the declaration for that namespace. By qualifying each element name by its namespace prefix, we avoid any possibility of ambiguity.
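Because the prefix is only a stand-in for the URI, two documents that use different prefixes for the same URI are referring to the same names. The following sketch illustrates this with a namespace-aware DOM parser. The class name, the method name, and the alternative prefix sk are assumptions of mine, not part of the Sketcher markup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixIndependenceDemo {
  static final String SKETCH_NS = "http://www.wrox.com/dtds/sketches";

  // Counts the <circle> elements that belong to SKETCH_NS, whatever prefix the markup uses.
  static int countCircles(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);  // the factory default is to ignore namespaces
      Document doc = factory.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(xml)));
      return doc.getElementsByTagNameNS(SKETCH_NS, "circle").getLength();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse the sample document", e);
    }
  }

  public static void main(String[] args) {
    String withSketcher =
        "<sketcher:sketch xmlns:sketcher='" + SKETCH_NS + "'>"
      + "<sketcher:circle radius='15' angle='0'/></sketcher:sketch>";
    // Same markup with a hypothetical prefix, sk, mapped to the same URI.
    String withSk =
        "<sk:sketch xmlns:sk='" + SKETCH_NS + "'>"
      + "<sk:circle radius='15' angle='0'/></sk:sketch>";
    System.out.println(countCircles(withSketcher));  // 1
    System.out.println(countCircles(withSk));        // 1
  }
}
```

The search is by URI and local name, so the prefix chosen in the markup makes no difference to the result.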

A namespace has scope, that is, a region of an XML document over which the namespace declaration is visible. The scope of a namespace declaration is the element within which it is declared, including that element's own start tag, plus all of its content, which means all direct or indirect child elements. The namespace declaration above applies to the <sketch> element and all the elements within it. If you declare a namespace in the root element for a document, its scope is the entire document.
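Scope can be checked programmatically. The sketch below assumes a Java release whose DOM supports the Level 3 method Node.lookupNamespaceURI(), which is not available in every release, and it wraps the sketch in a hypothetical <drawing> element with a hypothetical <note> sibling so that there is something outside the scope of the declaration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceScopeDemo {
  // Hypothetical document: <drawing> and <note> lie outside the scope of the declaration.
  static final String XML =
      "<drawing>"
    + "<sketcher:sketch xmlns:sketcher='http://www.wrox.com/dtds/sketches'>"
    + "<sketcher:circle radius='15' angle='0'/>"
    + "</sketcher:sketch>"
    + "<note/>"
    + "</drawing>";

  // Returns the URI bound to the sketcher prefix at the first element with the
  // given tag name, or null if the prefix is not in scope at that element.
  static String sketcherUriAt(String tagName) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(XML)));
      Node node = doc.getElementsByTagName(tagName).item(0);
      return node.lookupNamespaceURI("sketcher");
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse the sample document", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(sketcherUriAt("sketcher:circle"));  // the URI: inside the scope
    System.out.println(sketcherUriAt("note"));             // null: outside the scope
  }
}
```

The <circle> element inherits the binding from its parent, whereas <note> and <drawing> lie outside the element that carries the declaration and so see no binding for the prefix.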

You can declare a namespace without specifying a prefix. This namespace then becomes the default namespace in effect for this element and its content, and unqualified element names are assumed to belong to this namespace. Here is an example:

<sketch xmlns="http://www.wrox.com/dtds/sketches">

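There is no namespace prefix specified so the colon following xmlns is omitted. This namespace becomes the default, so we can use element names from this namespace without qualification and they are all implicitly within the default namespace. Note that the default namespace applies only to element names. Under the Namespaces in XML recommendation, an attribute name with no prefix, such as radius, is not in any namespace and simply belongs to the element on which it appears. For instance: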

<sketch xmlns="http://www.wrox.com/dtds/sketches">
  <circle radius="15" angle="0">
    <color R="150" G="250" B="100"/>
    <position x="30" y="50"/>
  </circle>
</sketch>

This markup is a lot less cluttered than the earlier version that used qualified names, which makes it much easier to read. It is therefore advantageous to declare the namespace that you use most extensively in a document as the default.

You can declare several namespaces within a single element. Here's an example of a default namespace in use with another namespace:

<sketch xmlns="http://www.wrox.com/dtds/sketches"
        xmlns:print="http://www.wrox.com/dtds/printed">
  <circle radius="15" angle="0">
    <color R="150" G="250" B="100"/>
    <position x="30" y="50"/>
  </circle>
  <print:circle print:lineweight="3" print:linestyle="dashed"/>
</sketch>

Here the namespace with the prefix print contains names for elements relating to hardcopy presentation of sketch elements. The <circle> element in the print namespace is qualified by the namespace prefix so it is distinguished from the element with the same name in the default namespace.
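Code that processes such a document can pick out each <circle> by its namespace URI. Here is a minimal sketch using a namespace-aware DOM parser. The class and method names are hypothetical, and the document is a cut-down version of the one above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class MixedNamespaceDemo {
  static final String SKETCH_NS = "http://www.wrox.com/dtds/sketches";
  static final String PRINT_NS = "http://www.wrox.com/dtds/printed";

  static final String XML =
      "<sketch xmlns='" + SKETCH_NS + "' xmlns:print='" + PRINT_NS + "'>"
    + "<circle radius='15' angle='0'/>"
    + "<print:circle print:lineweight='3' print:linestyle='dashed'/>"
    + "</sketch>";

  // Returns the first <circle> element that belongs to the given namespace.
  static Element firstCircle(String namespaceUri) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(XML)));
      return (Element) doc.getElementsByTagNameNS(namespaceUri, "circle").item(0);
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse the sample document", e);
    }
  }

  public static void main(String[] args) {
    Element drawn = firstCircle(SKETCH_NS);    // the circle in the default namespace
    Element printed = firstCircle(PRINT_NS);   // the circle in the print namespace
    System.out.println(drawn.getTagName() + " radius=" + drawn.getAttribute("radius"));
    System.out.println(printed.getTagName() + " lineweight="
        + printed.getAttributeNS(PRINT_NS, "lineweight"));
  }
}
```

The first line of output describes the element written as <circle> and the second the element written as <print:circle>, even though both have the local name circle.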

XML Namespaces and DTDs

For a document to be valid you must still have a DTD and the document must be consistent with it. The way in which a DTD is defined has no specific provision for namespaces. The DTD for a document that uses namespaces must therefore define the elements and attributes using qualified names and must also make provision for the xmlns attribute, with or without a prefix, in the markup declaration for any element in which it can appear. Since a DTD treats a qualified name as an ordinary name that happens to contain a colon, the prefixes used in the document must match those in the DTD exactly. Because the markup declarations in a DTD have no specific provision for accommodating namespaces, a DTD is a less than ideal vehicle for defining the rules for markup when namespaces are used. The XML Schema specification provides a much better solution, and overcomes a number of other problems associated with DTDs.
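To make the limitation concrete, here is a sketch that validates a document against a small internal DTD subset. The DTD is a hypothetical, cut-down one written for this illustration, not the real Sketcher DTD, and the class and method names are my own. Notice that the element names are declared complete with the prefix, and that xmlns:sketcher has to be declared like any other attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdNamespaceDemo {
  // Hypothetical internal DTD subset that declares prefixed names literally.
  static final String DTD =
      "<!DOCTYPE sketcher:sketch [\n"
    + "  <!ELEMENT sketcher:sketch (sketcher:circle)*>\n"
    + "  <!ATTLIST sketcher:sketch xmlns:sketcher CDATA #FIXED"
    + " 'http://www.wrox.com/dtds/sketches'>\n"
    + "  <!ELEMENT sketcher:circle EMPTY>\n"
    + "  <!ATTLIST sketcher:circle radius CDATA #REQUIRED>\n"
    + "]>\n";

  // Returns true if the body parses and validates against DTD, false otherwise.
  // For brevity this sketch treats any parsing failure as "not valid".
  static boolean isValid(String body) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      factory.setValidating(true);  // request DTD validation
      DocumentBuilder builder = factory.newDocumentBuilder();
      builder.setErrorHandler(new ErrorHandler() {
        public void warning(SAXParseException e) { /* ignored in this sketch */ }
        public void error(SAXParseException e) throws SAXParseException { throw e; }
        public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
      });
      builder.parse(new InputSource(new StringReader(DTD + body)));
      return true;
    } catch (Exception e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String ns = "http://www.wrox.com/dtds/sketches";
    // Uses the prefix that the DTD declares.
    System.out.println(isValid("<sketcher:sketch xmlns:sketcher='" + ns + "'>"
        + "<sketcher:circle radius='15'/></sketcher:sketch>"));
    // Same namespace, but a hypothetical prefix, sk, that the DTD knows nothing about.
    System.out.println(isValid("<sk:sketch xmlns:sk='" + ns + "'>"
        + "<sk:circle radius='15'/></sk:sketch>"));
  }
}
```

The first document should validate and the second should not, even though both place their elements in the same namespace, because the DTD can only match names character for character.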

However, the XML Schema specification has been finalized and approved relatively recently – in the first half of 2001. Consequently the present JAXP implementation distributed as part of SDK 1.4 has no specific provision for schemas so we won't go into them here.