SOAP Processing Model

Earlier, we mentioned how a node may send, receive, or do both with a SOAP message. The mechanics of how a node handles these SOAP messages are called the processing model. In addition to the sending and receiving nodes, a SOAP intermediary may sit between the message originator and the ultimate destination (Figure 4.6), playing both a receiver and a sender role, adding to SOAP's horizontal extensibility. Intermediaries receive a SOAP message from the originator or another SOAP intermediary and pass the message on to the intended destination or another intermediary.

Figure 4.6: SOAP nodes and intermediaries

SOAP messages can target specific headers at individual nodes, to ensure en route processing as the message goes hop by hop. The SOAP node for which a header element is intended may be the ultimate recipient of the message or an intermediary. Each header entry can identify which node must act on that header by specifying an actor attribute, like this:

<env:Header>
  <x:someheader someattribute="somevalue"
             env:actor="http://www.flutebank.com/accountcheck"/>
</env:Header>

The model of the SOAP actor attribute is intended to be quite simple and is based on two assumptions:

All SOAP nodes are identified by a unique URI.
This URI can target one or more header entries at the node.

The URI is only a logical identification criterion. The actor http://www.flutebank.com/accountcheck in the preceding example does not need to point to a physical URL but only identifies the actor responsible for validating an account number and balance. SOAP defines two special cases for the actor attribute:

A special SOAP actor URI of http://schemas.xmlsoap.org/soap/actor/next indicates that the header block is to be processed by the node that has received the message.
If no actor is specified, the header entry is intended for the ultimate recipient of the message.

These two cases are specified because these actors are so commonly used, and it is useful to specify these details independently of where the message is going or how it gets there. For example, an intermediary may route a message dynamically among a set of "next" actors, or a node may redirect the message to another node. Either way, there is no need to change the message.
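The targeting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name headers_for_node, the sample header names, and the namespace http://www.flutebank.com/headers are hypothetical and do not come from the SOAP specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_NEXT = "http://schemas.xmlsoap.org/soap/actor/next"
ACTOR_ATTR = "{%s}actor" % SOAP_ENV


def headers_for_node(envelope_xml, node_uris, is_ultimate_recipient):
    """Return the header entries that this node is expected to act on."""
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find("{%s}Header" % SOAP_ENV)
    if header is None:
        return []
    targeted = []
    for entry in header:
        actor = entry.get(ACTOR_ATTR)
        if actor is None:
            # No actor attribute: the entry is for the ultimate recipient.
            if is_ultimate_recipient:
                targeted.append(entry)
        elif actor == ACTOR_NEXT or actor in node_uris:
            # "next" targets whichever node receives the message;
            # otherwise the actor URI must name this node.
            targeted.append(entry)
    return targeted


def local_names(entries):
    """Strip the namespace part from each element tag, for display."""
    return [entry.tag.split("}")[1] for entry in entries]


# Hypothetical message with three header entries.
MESSAGE = """<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Header>
    <x:accountcheck xmlns:x="http://www.flutebank.com/headers"
        env:actor="http://www.flutebank.com/accountcheck"/>
    <x:audit xmlns:x="http://www.flutebank.com/headers"
        env:actor="http://schemas.xmlsoap.org/soap/actor/next"/>
    <x:priority xmlns:x="http://www.flutebank.com/headers"/>
  </env:Header>
  <env:Body/>
</env:Envelope>"""
```

With this message, an intermediary identified by http://www.flutebank.com/accountcheck would select the accountcheck and audit entries, while the ultimate recipient would select audit and priority.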

The act of inserting a header entry into a message is like establishing a contract between the party responsible for the message and the party receiving it. An intermediary that inserts a header entry can be thought of as acting on behalf of the initial sender, because the receiver views the message as coming from the intermediary, not the sender. It also means that an intermediary cannot take a block out if the block is not intended for it. It is worth noting, however, that nothing prevents it from looking at blocks not intended for it.

With the combination of the actor and mustUnderstand attributes, it is possible to route a message along a path with several different processing nodes. This model of distributing message processing across nodes makes for very scalable architecture, with each node providing value-added features to the request.

As an example, one value-added feature is caching responses to frequent or unchanging requests. If special caching servers are identified as actors with a particular URI, upon receiving the SOAP message with the appropriate header, a cache server can determine if the data within the message body is to be cached on the server. The header block shown below identifies the intended actor with the URI http://www.flute.com/cache. Any node acting in that capacity may choose to interpret and act upon the block. In this example, the header indicates whether the data in the body is cacheable.

<env:Header>
    <m:cache xmlns:m="http://www.flute.com/BillPay/"
          env:actor="http://www.flute.com/cache">
       <m:cacheable>1</m:cacheable>
    </m:cache>
</env:Header>
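As a rough sketch of how a cache server might read such a header, the following Python function checks whether a block targeted at the cache actor marks the body as cacheable. It assumes a well-formed message in which the cache block and its cacheable child both belong to the http://www.flute.com/BillPay/ namespace; the function name body_is_cacheable is hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
CACHE_ACTOR = "http://www.flute.com/cache"
BILLPAY_NS = "http://www.flute.com/BillPay/"


def body_is_cacheable(envelope_xml):
    """Return True if a header block for the cache actor carries cacheable=1."""
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find("{%s}Header" % SOAP_ENV)
    if header is None:
        return False
    for block in header:
        # Only look at blocks addressed to the cache actor.
        if block.get("{%s}actor" % SOAP_ENV) != CACHE_ACTOR:
            continue
        flag = block.find("{%s}cacheable" % BILLPAY_NS)
        if flag is not None and (flag.text or "").strip() == "1":
            return True
    return False
```

A node that does not act as http://www.flute.com/cache would simply skip the block and leave it in the message.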

In general, a SOAP node takes the following steps in processing a message:

Verify that the message is a SOAP message.
Identify and process header blocks targeted at the node. The node must understand each mandatory header block (a header with the mustUnderstand attribute set to "1"). If it cannot process a mandatory block, it must generate a SOAP fault with the fault code MustUnderstand.
If the node is an intermediary, forward the request.
If the node is the ultimate recipient, process the SOAP body.
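The steps above can be sketched as a single Python function. This is a simplified illustration, not a complete SOAP engine: the names process_message, SoapFault, handlers, body_handler, and forward are hypothetical, and the sketch assumes SOAP 1.1 behavior in which an intermediary removes the header blocks targeted at it before forwarding the message.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_NEXT = "http://schemas.xmlsoap.org/soap/actor/next"


class SoapFault(Exception):
    """Carries a SOAP 1.1 fault code such as "MustUnderstand"."""

    def __init__(self, faultcode, faultstring):
        super().__init__(faultstring)
        self.faultcode = faultcode


def process_message(envelope_xml, node_uri, handlers, is_ultimate,
                    body_handler=None, forward=None):
    # Step 1: verify that the message is a SOAP message.
    try:
        envelope = ET.fromstring(envelope_xml)
    except ET.ParseError as err:
        raise SoapFault("Client", "message is not well-formed XML: %s" % err)
    if envelope.tag != "{%s}Envelope" % SOAP_ENV:
        raise SoapFault("VersionMismatch", "root is not a SOAP 1.1 Envelope")

    # Step 2: identify and process header blocks targeted at this node.
    header = envelope.find("{%s}Header" % SOAP_ENV)
    blocks = list(header) if header is not None else []
    for block in blocks:
        actor = block.get("{%s}actor" % SOAP_ENV)
        if actor is None:
            targeted = is_ultimate
        else:
            targeted = actor in (node_uri, ACTOR_NEXT)
        if not targeted:
            continue
        handler = handlers.get(block.tag)  # keyed by "{namespace}localname"
        if handler is not None:
            handler(block)
        elif block.get("{%s}mustUnderstand" % SOAP_ENV) == "1":
            raise SoapFault("MustUnderstand", "cannot process %s" % block.tag)
        if not is_ultimate:
            # Assumption: blocks targeted at an intermediary are not forwarded.
            header.remove(block)

    # Step 4: the ultimate recipient processes the body.
    if is_ultimate:
        return body_handler(envelope.find("{%s}Body" % SOAP_ENV))
    # Step 3: an intermediary forwards the (possibly trimmed) message.
    return forward(ET.tostring(envelope, encoding="unicode"))
```

Here handlers maps qualified header names to callables, so a node "understands" exactly the blocks it has a handler for; a mandatory block without a handler produces the MustUnderstand fault.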

In SOAP 1.2, SOAP nodes take on different roles in processing a message and can act in more than one role. Each header block can be targeted for nodes in a particular role by tagging the header block with the name of the role. This is a refinement of the SOAP 1.1 notion of a SOAP actor and provides for a better processing model. The 1.2 version defines the following special SOAP roles:

Header blocks with the role http://www.w3.org/2002/06/soap-envelope/role/next are intended for each SOAP intermediary and the ultimate SOAP receiver.
Blocks with the role http://www.w3.org/2002/06/soap-envelope/role/none are forwarded without any processing, along with the message, to the ultimate SOAP receiver.
Blocks with role http://www.w3.org/2002/06/soap-envelope/role/ultimateReceiver are meant for the ultimate SOAP receiver, which must act upon the block.
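A minimal sketch of how a node might decide whether it acts in the role named on a header block is shown here. The function name plays_role and the custom role URI in the usage note are hypothetical, and the sketch assumes that a block with no role is treated like one targeted at ultimateReceiver.

```python
ROLE_BASE = "http://www.w3.org/2002/06/soap-envelope/role/"
ROLE_NEXT = ROLE_BASE + "next"
ROLE_NONE = ROLE_BASE + "none"
ROLE_ULTIMATE = ROLE_BASE + "ultimateReceiver"


def plays_role(role, custom_roles, is_ultimate_receiver):
    """Return True if a node should process a block tagged with this role."""
    if role is None or role == ROLE_ULTIMATE:
        # Assumption: an omitted role is equivalent to ultimateReceiver.
        return is_ultimate_receiver
    if role == ROLE_NONE:
        # No node acts in the "none" role; the block is only carried along.
        return False
    if role == ROLE_NEXT:
        # Every intermediary and the ultimate receiver act in this role.
        return True
    # Application-defined roles: the node acts only in roles it has assumed.
    return role in custom_roles
```

For example, an intermediary that has assumed the hypothetical role http://www.flute.com/cache would process blocks tagged with that role or with next, but not blocks tagged with ultimateReceiver or none.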

Though SOAP defines this processing model using actors and headers, it does not define any message routing protocol. The ebXML specifications define the Messaging Service, and IBM-Microsoft have defined the WS-Routing protocol. These are covered in Chapters 7 and 17, respectively.

SOAP Bindings

As mentioned earlier, a SOAP message essentially provides the capability of sending a one-way message. Any real-world application will require more sophisticated message exchange patterns. For example, request-response, notification, one-way with acknowledgment, asynchronous communication, and reliable messaging are richer patterns needed for enterprise applications.

The binding mechanism, which defines how SOAP messages are processed between SOAP nodes, is one way to extend SOAP functionality. SOAP relies on the underlying protocol to provide much of this functionality. In some cases (e.g., reliable messaging and secure messaging), additional specifications are needed to implement it. In either case, a SOAP message must be layered on top of, or bound to, an underlying protocol, which the various message-exchange patterns rely on for the added functionality. The SOAP 1.1 specification provides bindings for HTTP and defines how a request-response messaging model is to work over HTTP. Using SOAP over HTTPS may also satisfy some, but not all, of your security needs.

Although not advisable, it is entirely possible to create a custom implementation of a new binding, such as SOAP over raw TCP sockets. If you are considering a custom binding, you must recognize the disadvantages of such an approach, the biggest of which is that those services will not be able to interoperate with external Web services. Nor will you be able to take advantage of popular SOAP processing engines, such as Apache AXIS or Java WSDP. If you define your own SOAP headers to provide the context information for the message, you will also need to build the engine to process that information. Of course, building your own security and transactional functionality will be an enormous challenge as well.

HTTP Binding

The SOAP specification defines how the SOAP message is bound to the HTTP POST request mechanism. Figure 4.5 showed how the SOAP message is wrapped in such a request.

By formally defining a binding to HTTP, the specification ensures that a SOAP message benefits from the infrastructure already built to handle HTTP messages. For HTTP nodes capable of handling SOAP messages, the binding mechanism provides rules for uniformly processing a SOAP message wrapped in an HTTP request. For example, nodes can use HTTP response codes to determine the outcome of a request, such as the HTTP 500 response returned when a fault occurs.

The HTTP SOAPAction header indicates the type of action or the intent of the SOAP message. In Listing 4.1, no value was specified, because in a typical RPC style invocation, the intent of the message can be conveyed using other means (such as the location URL /flute/billPay), and this can be used by the processing SOAP node to identify the service to which the request is to be dispatched.

In data-oriented or document-style messaging, the meaning of the data (or the action to be taken) can be embedded in the data itself. To interpret the data, the processing node must parse the data content and extract the intent. Besides being slow, this approach violates a good design principle, wherein data and its metadata are kept separate. In such a scenario, the SOAPAction field may be used to convey the intent of the request. Because this field is part of the lower layer protocol, it can be parsed relatively quickly. This design also satisfies the guideline to keep the metadata (intent of the request) separate from the data sent within the SOAP body. The SOAPAction field may also be used by firewalls to filter SOAP messages by quickly (relatively speaking) parsing incoming traffic.
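The role of SOAPAction can be illustrated with a short Python sketch that builds (but does not send) an HTTP POST request and then dispatches on the header without parsing the SOAP body. The helper names build_soap_request and dispatch are hypothetical, as is any action URI passed to them; only the POST method, the text/xml content type, and the SOAPAction header come from the HTTP binding.

```python
import urllib.request


def build_soap_request(url, soap_xml, soap_action=""):
    """Wrap a SOAP envelope in an HTTP POST request object (nothing is sent)."""
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        # The action is carried as a quoted URI; "" states no explicit intent.
        "SOAPAction": '"%s"' % soap_action,
    }
    return urllib.request.Request(
        url, data=soap_xml.encode("utf-8"), headers=headers, method="POST"
    )


def dispatch(http_headers, routes):
    """Pick a handler from the SOAPAction header alone, without reading the body."""
    action = http_headers.get("SOAPAction", "").strip('"')
    return routes.get(action)
```

A receiving node or firewall could call dispatch with the incoming HTTP headers and a table of known action URIs, and reject or route the request before any XML parsing takes place. Note that urllib normalizes header names internally (SOAPAction is stored as Soapaction), which is harmless because HTTP header names are case-insensitive.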

By using the HTTP response, a SOAP-based Web service can implement a request-response message exchange pattern. Although not specified by the SOAP specifications, it is conceivable that in the future, other HTTP mechanisms like HTTP PUT may be used to implement a one-way with acknowledgment message-exchange pattern.

The following additional rules apply to this binding:

The HTTP response code of 2XX indicates that the processing was successful.
In case of application-defined errors, the SOAP Fault element should be used to indicate the application error.
If a request cannot be processed, the SOAP node must return an HTTP 500 internal server error code.
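These rules can be sketched as a small Python helper that pairs an HTTP status code with the SOAP envelope to return. The function names envelope and http_response are hypothetical, and the choice of the Server fault code for every error is a simplifying assumption.

```python
from xml.sax.saxutils import escape

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def envelope(body_xml):
    """Wrap already-serialized XML in a SOAP 1.1 envelope."""
    return (
        '<env:Envelope xmlns:env="%s"><env:Body>%s</env:Body></env:Envelope>'
        % (SOAP_ENV, body_xml)
    )


def http_response(result_xml=None, error=None):
    """Return (HTTP status code, SOAP envelope) for a processed request."""
    if error is None:
        # Successful processing maps to a 2XX status.
        return 200, envelope(result_xml)
    # Errors are reported in a SOAP Fault and sent with HTTP 500.
    fault = (
        "<env:Fault>"
        "<faultcode>env:Server</faultcode>"
        "<faultstring>%s</faultstring>"
        "</env:Fault>" % escape(error)
    )
    return 500, envelope(fault)
```

A client can therefore check the status code first and parse the Fault element only when it sees a 500 response.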

Several free tools can help intercept and dump a SOAP message. Apache provides the TcpMon Java utility, and Pocketsoap.com provides the TCP-trace Windows executable. These utilities act as proxies, listening on one configurable port and redirecting traffic to another. To dump the SOAP message, have the utility listen on one port (say, 8080), have the SOAP server listen on another (say, 9090), point the client at port 8080, and configure the utility to redirect all traffic from port 8080 to 9090. These tools provide a graphical user interface and automatically show all traffic in the GUI.
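The proxying idea behind these tools can be sketched in Python. This is not how TcpMon or TCP-trace are implemented; it is a minimal single-connection relay with hypothetical function names (pump, relay_once) that records each chunk of traffic in a list instead of showing it in a GUI.

```python
import socket
import threading


def pump(source, sink, label, log):
    """Copy bytes from source to sink, recording every chunk under a label."""
    while True:
        data = source.recv(4096)
        if not data:
            break  # the sending side has finished
        log.append((label, data))
        sink.sendall(data)
    try:
        # Pass the end-of-stream signal along to the other side.
        sink.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the peer may already have closed the connection


def relay_once(listen_port, target_port, log, host="127.0.0.1"):
    """Accept one client on listen_port and relay it to target_port."""
    with socket.create_server((host, listen_port)) as listener:
        client, _ = listener.accept()
        with client, socket.create_connection((host, target_port)) as server:
            upstream = threading.Thread(
                target=pump, args=(client, server, "request", log)
            )
            upstream.start()
            pump(server, client, "response", log)
            upstream.join()
```

Calling relay_once(8080, 9090, log) would block until one client connects to port 8080, relay that conversation to the server on port 9090, and leave the captured request and response chunks in log. A connection kept open by HTTP keep-alive will hold the relay until one side closes it.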

SMTP-POP Binding

We saw how single SOAP messages that represent a complete unit of work can be combined into request-response-style communication by binding to HTTP. By using email protocols such as SMTP and POP, applications can take advantage of the asynchronous store-and-forward messaging capabilities of mail systems to provide a one-way transport for SOAP. This allows SOAP to be used in a number of scenarios where a protocol such as HTTP may not be suitable.

While the SOAP 1.1 specifications do not directly address these bindings, it is easy to see how the SOAP message and its attachment may be included in an email message. Listing 4.5 shows a SOAP message containing an XML document (a purchase order) as an attachment in an email message.

Listing 4.5: An email containing a SOAP message, with an XML document as an attachment

Return-Path: <flutebank@localhost>
Received: from 127.0.0.1 ([127.0.0.1])
        by BYTECODE (JAMES SMTP Server 2.0a3-cvs) with SMTP ID 155 for
                                                                   <javaws@localhost>;
          Wed, 11 Sep 2002 15:03:18 -0400
Message-ID: <6123606.1031770997849.JavaMail.Administrator@BYTECODE>
Date: Wed, 11 Sep 2002 15:03:16 -0400 (EDT)
From: flutebank@localhost
To: officemin@localhost
Subject: PurchaseOrder
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_0_4944979.1031770996528"
X-Mozilla-Status: 8001
X-Mozilla-Status2: 00000000
X-UIDL: Mail1031770998410-7

------=_Part_0_4944979.1031770996528
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

<?xml version="1.0" encoding="UTF-8"?>
<soap-env:Envelope
          xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
  <soap-env:Header/>
  <soap-env:Body>
    <po:PurchaseOrder xmlns:po="http://www.flutebank.com/schema">
      <senderid>myuserid@Mon Aug 19 23:55:28 EDT 2002</senderid>
    </po:PurchaseOrder>
  </soap-env:Body>
</soap-env:Envelope>
------=_Part_0_4944979.1031770996528
Content-Type: application/octet-stream; name=purchaseorder.xml
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=purchaseorder.xml

<?xml version="1.0" encoding="UTF-8"?>
<purchaseorder xmlns="http://www.flutebank.com/schema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.flutebank.com/schema purchaseorder.xsd">
  <identifier>87 6784365876JHITRYUE</identifier>
  <date>29 October 2002</date>
  <billingaddress>
        <name>John Malkovich</name>
        <street>256 Eight Bit Lane</street>
        <city>Burlington</city>
        <state>MA</state>
        <zip>01803</zip>
 </billingaddress>

<!--other XML from purchase order here, not shown-->

</purchaseorder>

------=_Part_0_4944979.1031770996528--

In Chapter 11, we will look at how SOAP can be combined with the JavaMail API to build asynchronous messaging systems.
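As a language-neutral illustration of the structure in Listing 4.5, the following sketch uses Python's standard email package (not the JavaMail API) to assemble a multipart message with the SOAP envelope as the first part and the XML document as an attachment. The function name build_soap_email is hypothetical.

```python
from email.message import EmailMessage


def build_soap_email(sender, recipient, soap_xml, attachment_xml):
    """Build an email carrying a SOAP envelope plus an XML attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "PurchaseOrder"
    # First part: the SOAP envelope as text/plain, as in Listing 4.5.
    msg.set_content(soap_xml)
    # Second part: the purchase order document as a file attachment.
    # Adding it converts the message to multipart/mixed.
    msg.add_attachment(
        attachment_xml.encode("utf-8"),
        maintype="application",
        subtype="octet-stream",
        filename="purchaseorder.xml",
    )
    return msg
```

The resulting message could then be handed to an SMTP client for delivery; the receiving application would retrieve it over POP, read the envelope from the first part, and process the attachment.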