Internationalization and Multiple Languages

Because registries must serve users from all over the world, UDDI supports multiple languages in many of its entry types. The U in UDDI stands for "universal," which means the registry needs to support multiregional and international organizations and the services they offer. UDDI can classify entities using multiple languages and/or multiple scripts of the same language. It also allows language-specific sort orders and provides consistent search results in a language-independent manner.

Flute Bank also does business in Canada and Mexico and therefore needs to support English, French, and Spanish. The bank would like its name to appear in the native language of its service users and will provide translations in the registry. Let us look at the new businessEntity for Flute Bank:

<businessEntity
  businessKey="CBC2A349-A4C1-22D4-AE87-BC8713D2BD9"
  authorizedName="MattieLee"
  operator= ... >
  <name>Flute Bank</name>
  <name xml:lang="es">banco de la flauta</name>
  <name xml:lang="fr">banque de cannelure</name>
  <name xml:lang="it">serie della scanalatura</name>
  <name xml:lang="pt">banco da flauta</name>
  <name xml:lang="de">Flötebank</name>
...
</businessEntity>

UDDI registries expose internationalization through their API sets and allow business entities to be described in several languages. The following internationalization features are supported:

Multilingual names and descriptions
Multiple names in the same language
Internationalized address format
Language-dependent collation

In the previous listing, the registration showed the bank's name in English, Spanish, French, Italian, Portuguese, and German, all of which are written in the Latin script. Registrations may also include names in other scripts, such as Chinese, Arabic, and Hindi, and the xml:lang attribute identifies the language of each name. Using the xml:lang attribute, Flute Bank can represent its name in Chinese for Chinese-speaking customers and in Russian for Russian-speaking customers, while also displaying its English name.

<businessEntity>
...
  <name xml:lang="zh"></name>
  <name xml:lang="ru"></name>
  <name xml:lang="en">Flute Bank</name>
  <name xml:lang="en">FB</name>
...
</businessEntity>

UDDI allows multiple name elements to be published. In the above listing, there are two English representations of Flute Bank: one is its full name, and the other is its acronym. In this scenario, the first name element for each language is treated as the primary name, which is used for all searching and sorting operations.
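
The primary-name rule described above can be illustrated outside the registry. The following is a minimal Python sketch, not part of any UDDI API: it parses a trimmed businessEntity fragment with the standard library and keeps the first name element seen for each language. The helper names primary_names and display_name, and the fallback to English, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Trimmed businessEntity fragment for illustration (attributes omitted).
ENTITY_XML = """
<businessEntity>
  <name xml:lang="en">Flute Bank</name>
  <name xml:lang="en">FB</name>
  <name xml:lang="es">banco de la flauta</name>
  <name xml:lang="fr">banque de cannelure</name>
</businessEntity>
"""


def primary_names(entity_xml):
    """Map each language code to the first name published in that language."""
    names = {}
    for name in ET.fromstring(entity_xml).findall("name"):
        lang = name.get(XML_LANG, "")      # "" stands for a name without xml:lang
        names.setdefault(lang, name.text)  # later names in the same language are ignored
    return names


def display_name(entity_xml, preferred, fallback="en"):
    """Return the primary name in the preferred language, else in the fallback."""
    names = primary_names(entity_xml)
    return names.get(preferred, names.get(fallback))


if __name__ == "__main__":
    print(display_name(ENTITY_XML, "fr"))
    print(display_name(ENTITY_XML, "ja"))
```

With this fragment, a request for French returns the French name, and a request for a language that was never published falls back to the English primary name rather than the acronym.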

Text in the registry is based on the Unicode 3.0 specification and ISO 10646, which support the majority of languages in use. Each language has its own behavior when it comes to sort-order collation, depending on whether the language's script is alphabetic, syllabic, or ideographic.

Languages that share the same alphabetic script, such as English, Spanish, and French, have different collation weights, depending on the other languages with which they are used. For languages whose scripts have both upper- and lowercase letters (bicameral scripts), sorting depends on whether it is specified as case-sensitive or case-insensitive. For ideographic languages, such as Chinese and others that have a large character range, collation may depend on whether stroke-order or phonetic collation is specified.
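
The difference between case-sensitive and case-insensitive ordering is easy to see in a small example. The sketch below uses only Python's built-in sorting and is not how a registry implements collation: it compares raw code points for the case-sensitive order and casefolded keys for the case-insensitive one, and it ignores language-specific collation weights entirely. The name "Zither Trust" is made up for the example.

```python
def sort_case_sensitive(names):
    """Sort by raw code point, so 'Z' (U+005A) precedes 'b' (U+0062)."""
    return sorted(names)


def sort_case_insensitive(names):
    """Sort on casefolded keys; the sort is stable, so names that
    differ only in case keep their input order."""
    return sorted(names, key=str.casefold)


if __name__ == "__main__":
    names = ["flute bank", "Zither Trust", "Flute Bank", "banco de la flauta"]
    print(sort_case_sensitive(names))
    print(sort_case_insensitive(names))
```

In the case-sensitive result, both capitalized names come before both lowercase ones; in the case-insensitive result, the two spellings of Flute Bank sit next to each other.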

The ability to support time zones is an important aspect of human communication. UDDI registries allow businesses to publish contact information, such as telephone and/or fax numbers. It is also important to have the ability to attach hours of availability. Businesses can indicate the time zone for each contact by specifying it as part of the contact's address.

UDDI also supports differing formats for postal addresses. Many parts of the world specify their postal addresses differently and may use different elements, such as lot numbers, building identification, floor numbers, subdivisions, and so on. In UDDI, the address is supported by an address element that is part of the businessEntity data structure. The address element contains a list of addressLine elements.

Addresses are specified using the ubr-uddi-org:postalAddress tModel. This addresses the common subelements of an address, such as cities, states, and so on. The address element also specifies a tModelKey attribute as well as the keyName/keyValue pair for each addressLine element. Let us look at the address fragment for Flute Bank's businessEntity registration:

<address useType="Data Center" tModelKey="uddi:ubr.uddi.org:postalAddress">
  <addressLine keyName="Street" keyValue="60">Diana Drive</addressLine>
  <addressLine keyName="House number" keyValue="70">25</addressLine>
  <addressLine keyName="City" keyValue="40">Bloomfield</addressLine>
  ...
  <addressLine keyName="Country" keyValue="20">Trinidad</addressLine>
</address>

Addresses themselves may have different language representations as well. This can be accomplished by using differing addressLine elements, depending on the language. Using the keyName/keyValue pair with the codes specified in the ubr-uddi-org:postalAddress tModel, you can determine proper address formatting programmatically. Let us look at one more example for Flute Bank and how this may be represented:

<address useType="International Office" xml:lang="en"
tModelKey="uddi:ubr.uddi.org:postalAddress">
  <addressLine keyName="House Number" keyValue="70">No. 9</addressLine>
  <addressLine keyName="Street" keyValue="60">Aberdeen Road</addressLine>
  <addressLine keyName="District" keyValue="50">Newlands Village</addressLine>
  <addressLine keyName="City" keyValue="40">Biche</addressLine>
</address>
<address useType="International Office" xml:lang="ja"
tModelKey="uddi:ubr.uddi.org:postalAddress">
  <addressLine keyName="City" keyValue="40"></addressLine>
  <addressLine keyName="Street" keyValue="60"></addressLine>
  <addressLine keyName="District" keyValue="50"></addressLine>
  <addressLine keyName="House Number" keyValue="70"></addressLine>
</address>
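
To show how the keyValue codes allow programmatic formatting, here is a minimal Python sketch under stated assumptions. It relies only on the codes that appear in the listings above (20, 40, 50, 60, 70); the contact wrapper element, the CODES table, and the helper names address_parts and format_address are introduced for illustration and are not part of the UDDI API.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# keyValue codes as they appear in the listings above (illustrative labels).
CODES = {"country": "20", "city": "40", "district": "50",
         "street": "60", "house_number": "70"}

# The <contact> wrapper is an assumption that keeps the fragment well-formed.
ADDRESSES_XML = """
<contact>
  <address useType="International Office" xml:lang="en"
           tModelKey="uddi:ubr.uddi.org:postalAddress">
    <addressLine keyName="House Number" keyValue="70">No. 9</addressLine>
    <addressLine keyName="Street" keyValue="60">Aberdeen Road</addressLine>
    <addressLine keyName="District" keyValue="50">Newlands Village</addressLine>
    <addressLine keyName="City" keyValue="40">Biche</addressLine>
  </address>
</contact>
"""


def address_parts(root, lang):
    """Return {keyValue: text} for the first address in the given language."""
    for address in root.iter("address"):
        if address.get(XML_LANG) == lang:
            return {line.get("keyValue"): (line.text or "")
                    for line in address.findall("addressLine")}
    return {}


def format_address(parts, order):
    """Join the requested parts in the given order, skipping missing ones."""
    return ", ".join(parts[CODES[key]] for key in order if CODES[key] in parts)


if __name__ == "__main__":
    parts = address_parts(ET.fromstring(ADDRESSES_XML), "en")
    print(format_address(parts, ["house_number", "street", "district", "city"]))
    print(format_address(parts, ["city", "district", "street", "house_number"]))
```

Because each line is looked up by its code rather than by its position or its keyName, the same data can be rendered smallest-unit-first or largest-unit-first, whichever convention the reader's locale expects.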

As you can see, UDDI can support global usage of services in a straightforward manner. It also provides functionality that allows you to perform language-dependent collation on the results returned from find operations. Please see the latest UDDI specification for details of this support.