15.2 Protocols and Standards
Several protocols are mentioned throughout this chapter, as well as in Chapter 16 and Chapter 17. A detailed treatment of each is beyond the scope of this book and is not necessary for understanding how web services work, but some familiarity with them is useful.
A protocol is a set of rules that describe the transmission and receipt of data between two or more computing devices. For example, TCP/IP (Transmission Control Protocol/Internet Protocol) governs the low-level transport of packets of data on the Internet.
Layered on top of TCP/IP is HTTP (the HyperText Transfer Protocol), which is used to enable servers and browsers on the Web to communicate. It is primarily used to establish connections between servers and browsers and to transmit HTML to the client browser.
The client sends an HTTP request to the server, which then processes the request. The server typically returns HTML pages to be rendered by the client browser, although in the case of web services, the server may instead return a SOAP message containing the returned data of the web service method call.
HTTP requests pass name/value pairs from the requesting browser to a server. The request can be either of two types: HTTP-GET or HTTP-POST.
In GET requests, the name/value pairs are appended directly to the URL. The data is URL-encoded (which guarantees that only legal ASCII characters are passed over the wire), then appended to the URL, separated from the URL by a question mark.
For example, consider the following URL:
The question mark separates the URL from the name/value pairs, as is characteristic of an HTTP-GET request. The name of the variable passed to the GetName method is StockSymbol, and the value is msft.
GET requests are suitable when all the data that needs to be passed can be handled by name/value pairs, there are few fields to pass, and the length of the fields is relatively short. GET requests are also suitable when security is not an issue. This last point arises because the URL, including the appended data, is sent over the wire and is included in server logs as plain text. As such, the data can be easily captured by a network sniffer or an unscrupulous person.
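As a rough illustration of how the name/value pairs end up in a GET URL, the following Python sketch builds one using the standard library. The host and path in BASE_URL are placeholders invented for this example; only the GetName method, the StockSymbol variable, and the msft value come from the text. The second call uses a made-up company name to show how characters that are not legal in a URL get encoded.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the host and path are placeholders, not from the text.
BASE_URL = "http://example.com/StockTicker.asmx/GetName"


def build_get_url(base_url, params):
    """Append URL-encoded name/value pairs to base_url after a question mark."""
    return base_url + "?" + urlencode(params)


url = build_get_url(BASE_URL, {"StockSymbol": "msft"})
print(url)

# Characters that are not legal in a URL (here '&' and a space) are encoded.
encoded = urlencode({"Company": "AT&T Inc."})
print(encoded)

# The receiving side reverses the process to recover the name/value pairs.
print(parse_qs(urlsplit(url).query))
```

Because the pairs are part of the URL itself, anything that records the URL (a server log, for instance) also records the data.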
The .NET Framework provides a class, HttpGetClientProtocol (shown in Figure 15-4), for using the HTTP-GET protocol in your clients.
POST requests are suitable for large numbers of fields or when lengthy parameters need to be passed. Also, if security is an issue, a POST request is safer than a GET request, since the data travels in the body of the HTTP request rather than in the URL, and so does not appear in server logs. The body is still plain text, however, unless the connection itself is encrypted.
As with GET requests, with POST requests to a web service only name/value pairs can be passed. This precludes passing complex data types (such as classes, structs, or datasets).
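To make the contrast with GET concrete, here is a minimal Python sketch that prepares (but does not send) a POST request carrying the same name/value pair. The endpoint URL is a placeholder assumed for illustration. Note that the pair is encoded the same way as for GET, but it is placed in the request body and the URL stays clean.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: a placeholder, not taken from the text.
ENDPOINT = "http://example.com/StockTicker.asmx/GetName"


def build_post_request(endpoint, params):
    """Prepare a POST request whose body holds the URL-encoded pairs."""
    body = urlencode(params).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(endpoint, data=body, headers=headers, method="POST")


request = build_post_request(ENDPOINT, {"StockSymbol": "msft"})
print(request.get_method())  # the HTTP verb
print(request.full_url)      # no query string appended
print(request.data)          # the name/value pairs, carried in the body
```

The request object is only constructed here; actually sending it would require a live server at the assumed address.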
The .NET Framework provides a class, HttpPostClientProtocol (see Figure 15-4), for using the HTTP-POST protocol in your clients.
XML (eXtensible Markup Language) is an open standard promulgated by the World Wide Web Consortium (W3C) as a means of describing data (for more information visit www.w3c.org). At the time of this writing, the current version of the XML specification is Version 1.0.
XML is similar to HTML. In fact, both XML and HTML are derived from SGML (Standard Generalized Markup Language). Like HTML documents, XML documents are plain text documents containing tags. However, while HTML uses predefined tags that specify how the HTML document will display in a browser, XML allows tags to be defined by the document developer, so that virtually any data can be conveyed.
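The following Python sketch shows what developer-defined tags look like in practice. The element names (stocks, stock, symbol, name) and the data are invented for this example; nothing about them is predefined by XML itself, which is the point.

```python
import xml.etree.ElementTree as ET

# Every tag below was made up by the document author; XML predefines none.
document = """<stocks>
  <stock>
    <symbol>msft</symbol>
    <name>Microsoft</name>
  </stock>
  <stock>
    <symbol>ibm</symbol>
    <name>IBM</name>
  </stock>
</stocks>"""

root = ET.fromstring(document)

# A program reading the document looks fields up by the tag names it expects.
symbols = [stock.findtext("symbol") for stock in root.findall("stock")]
print(symbols)
```

A consumer of this document only needs to agree with the producer on the tag names and their nesting.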
XML documents are text files that are human-readable, although they are typically not meant to be read by humans, other than developers doing programming and debugging. Since tags are used to define every field in an XML document, the files are generally much larger than the same data in a proprietary binary database file. That is rarely an issue, however, since it is computer programs, not people, reading the document, and the difference in transmission time over the Internet is usually negligible at today's speeds.
One significant difference between HTML and XML is that while most HTML readers (i.e., web browsers) are tolerant of coding errors, XML readers generally are not. XML must be well-formed. (For a complete discussion of well-formed XML markup, see Chapter 4.) For example, while browsers generally do not care if tags are upper- or lowercase, XML is case-sensitive: the case of an end tag must exactly match that of its start tag or an error will be generated.
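The strictness of XML readers is easy to observe. This Python sketch wraps the standard library parser in a small helper (is_well_formed is a name chosen for this example) and feeds it three fragments: one whose start and end tags match exactly, one whose tags differ only in case, and one with a missing end tag.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text, False if it raises an error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<Symbol>msft</Symbol>"))  # start and end tags match
print(is_well_formed("<Symbol>msft</symbol>"))  # tags differ only in case
print(is_well_formed("<symbol>msft"))           # end tag is missing
```

Only the first fragment is accepted. A web browser given comparable HTML would most likely render all three without complaint.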
SOAP (Simple Object Access Protocol) is a simple, lightweight protocol for the exchange of information over the Internet, based on an XML grammar that's tailored for exchanging web service data. In a .NET web service, you'll usually send SOAP messages over HTTP. Like XML, the SOAP standard is promulgated by the W3C.
SOAP uses XML syntax to format its content. It is, by design, as simple as possible and provides a minimum of functionality. Therefore, it is very modular and flexible. Since SOAP messages consist of XML, which is plain text, and are usually carried over HTTP, they can easily pass through firewalls, unlike many proprietary, binary formats. At the time of this writing, the latest SOAP version is 1.2. The SOAP protocol was originally developed by Compaq, HP, IBM, Lotus, Microsoft, and others.
SOAP is not limited to name/value pairs as HTTP-GET and HTTP-POST are. Instead, SOAP can also be used to send more complex objects, including datasets, classes, and other objects.
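To suggest how a structured object can travel in a SOAP message, the Python sketch below wraps a nested dictionary in a SOAP envelope. Several things here are assumptions made for illustration: the PlaceOrder method, the Order and Customer fields, and the service namespace are all invented, and the envelope namespace used is the SOAP 1.1 one. Real .NET clients generate such messages for you; this is only a hand-built approximation of the general shape.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://example.com/stockticker"          # hypothetical service


def add_fields(parent, fields):
    """Recursively turn a nested dict into nested XML elements."""
    for name, value in fields.items():
        child = ET.SubElement(parent, f"{{{SERVICE_NS}}}{name}")
        if isinstance(value, dict):
            add_fields(child, value)
        else:
            child.text = str(value)


def build_envelope(method, arguments):
    """Wrap a method call and its arguments in an Envelope/Body pair."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    add_fields(call, arguments)
    return envelope


# A nested structure that flat name/value pairs could not express directly.
order = {
    "Order": {
        "StockSymbol": "msft",
        "Shares": 100,
        "Customer": {"Id": 42, "Name": "Pat"},
    }
}

envelope = build_envelope("PlaceOrder", order)
message = ET.tostring(envelope, encoding="unicode")
print(message)
```

The nesting of Customer inside Order is preserved in the message, which is what name/value pairs alone cannot do.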
One drawback to using SOAP to pass requests back and forth to web services is that SOAP messages tend to be very verbose, because of the nature of XML. Therefore, if bandwidth or transmission performance is an issue, you may be better off using either HTTP-GET or HTTP-POST.
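The size difference can be estimated with a quick comparison. In this Python sketch, the same single argument is encoded once as a name/value pair and once as a hand-written SOAP message. The service namespace is a placeholder, and the SOAP text is a simplified approximation, so the numbers are indicative only.

```python
from urllib.parse import urlencode

# The argument as it would appear in a GET query string or POST body.
form_body = urlencode({"StockSymbol": "msft"})

# Roughly the same call as a SOAP 1.1 message (namespace is hypothetical).
soap_body = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    '<GetName xmlns="http://example.com/stockticker">'
    "<StockSymbol>msft</StockSymbol>"
    "</GetName>"
    "</soap:Body>"
    "</soap:Envelope>"
)

form_size = len(form_body.encode("utf-8"))
soap_size = len(soap_body.encode("utf-8"))

print("name/value pair:", form_size, "bytes")
print("SOAP message:   ", soap_size, "bytes")
print("ratio:          ", round(soap_size / form_size, 1))
```

For a single short argument the envelope overhead dominates; with larger payloads the relative overhead shrinks, though the per-field tags still add weight.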
The .NET Framework provides a class, SoapHttpClientProtocol (see Figure 15-4), for using the SOAP protocol in your clients.
15.2.6 .NET Support for Protocols
The .NET Framework provides a number of classes for interacting with the HTTP protocol. Figure 15-4 shows a hierarchy of classes for the SOAP, HTTP-GET and HTTP-POST client protocols, all deriving from WebClientProtocol and HttpWebClientProtocol.