A Closer Look at URLs

URLs are so common now that they appear with little or no explanation on TV commercials and bubble gum wrappers. But the home page URLs you hear in the media are only a small subset of the many options available with this versatile form. The URL is defined in RFC 1738.

Not all URLs refer to HTTP. In fact, the URL form was devised as a universal method for several different Internet protocols. The protocol portion of the URL is referred to as the scheme. The scheme identifies a protocol and therefore tells the computer how to interpret the rest of the URL. The general format for a URL is described in RFC 1738 as

<scheme>:<protocol-specific-part>


Table 17.1 shows some of the scheme options defined in RFC 1738. Other schemes are also possible. In fact, some new schemes have been added in later RFCs.

Table 17.1. URL Schemes

Scheme    Reference
ftp       File Transfer Protocol
http      Hypertext Transfer Protocol
gopher    The Gopher protocol
mailto    Electronic mail
news      Usenet news
nntp      Usenet news with NNTP access
telnet    Interactive session (see Hour 15)
wais      Wide area information servers
file      Host-specific filenames

As the <protocol-specific-part> term in the general form of the URL demonstrates, the structure of the URL may differ, depending on the URL's scheme. The computer first reads the scheme, and the scheme tells the computer how to interpret the rest of the URL. Because this hour focuses on HTTP, this section concentrates on the HTTP form of the URL. It is worth noting, however, that you'll also encounter other schemes as you browse the Web. The ftp scheme is another common variant. Most modern Web browsers can recognize alternative schemes such as ftp and respond to the URL accordingly.
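The idea that the scheme is read first and then governs how the rest of the URL is handled can be sketched in a few lines of Python. This is only an illustration, not how any particular browser works: the function name scheme_and_rest is made up for this example, and the example.com hosts, paths, and mailbox are placeholders.

```python
# Hypothetical URLs; the hosts, paths, and mailbox are placeholders.
EXAMPLES = [
    "http://www.example.com/index.html",
    "ftp://ftp.example.com/pub/readme.txt",
    "mailto:user@example.com",
]


def scheme_and_rest(url):
    """Split a URL at the first colon into (scheme, remainder)."""
    scheme, _, rest = url.partition(":")
    # Scheme names are case-insensitive, so normalize to lowercase.
    return scheme.lower(), rest


for url in EXAMPLES:
    scheme, rest = scheme_and_rest(url)
    # A real client would pick a protocol handler based on the scheme
    # before looking at the remainder at all.
    print(f"{scheme:8} {rest}")
```

Note that the remainder looks quite different from one scheme to the next: the http and ftp examples carry a host and a path, while the mailto example carries only a mail address.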

The general form for an HTTP URL is

http://<host>:<port>/<path>;<parameters>?<search>


<host> is the DNS name of the server, and <path> is the path to the HTML document or other resource. The other options are less common and less familiar to the average user. They include the following:

  • <port>: The port number of the daemon or service to which the browser is connecting. (See Hour 6, "The Transport Layer," for more on port numbers.) The port number reserved for HTTP servers is TCP port 80. If the port number is omitted, port 80 is assumed.

  • <parameters>: Optional parameters supplied by the client. The user almost never has to enter parameters to access a Web site. However, parameters are sometimes passed to the server through scripts.

  • <search>: Lets the client send a query string to the server. The user almost never enters a query into a URL by hand. Watch the URL box of your Web browser when you enter a search through one of the Internet search engines. You may see a query string transmitted to the search server through the URL.
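As a rough illustration of these parts, the following Python sketch uses the standard library's urllib.parse.urlparse to pull the host, port, path, parameters, and search string out of an HTTP URL. The helper name http_parts and the sample URLs are assumptions made up for this example, and the fallback to port 80 mirrors the rule described in the list above.

```python
from urllib.parse import urlparse

DEFAULT_HTTP_PORT = 80  # assumed when the URL omits the port


def http_parts(url):
    """Return the pieces of an HTTP URL under the names used in the text."""
    parsed = urlparse(url)
    port = parsed.port if parsed.port is not None else DEFAULT_HTTP_PORT
    return {
        "host": parsed.hostname,
        "port": port,
        "path": parsed.path,
        "parameters": parsed.params,  # text after ';' in the last path segment
        "search": parsed.query,       # text after '?'
    }


# Hypothetical URLs, made up for this example.
print(http_parts("http://www.example.com:8080/docs/page.html;type=a?q=tcp"))
print(http_parts("http://www.example.com/index.html"))
```

The second call shows the common case: no port, parameters, or search string appear in the URL, so the port falls back to 80 and the other two fields come back empty.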

By the Way

Complex URLs containing ports, parameters, and queries are sometimes used to reconfigure the Web server itself. The Web server must possess the necessary extensions and scripts to process the configuration request.

If a connection has already been established, it is not necessary to use the entire URL to identify a resource. HTTP and RFC 1738 permit the use of a relative URL. The relative URL gives the location of a resource as referenced from the current page or from a default <BASE> location defined in the document. For example, if you are already on a site's home page, the relative URL to a file named fix.html in the techniques/repair directory beneath that page is techniques/repair/fix.html.

The relative URL might seem like a confusing way to save a few bits and keystrokes, but it offers benefits in building and deploying Web sites. As shown in Figure 17.3, if the Webmaster uses relative URLs for the internal links within a Web site, the complete directory structure for the site can be copied to a different server without disrupting the integrity of the links.

Figure 17.3. Relative URLs make a Web site portable.
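The portability argument can be tried out with a short Python sketch that resolves the same relative URL against two different base locations using the standard library's urllib.parse.urljoin. Both base URLs are hypothetical stand-ins for an original server and a copy of the site, and the resolve helper exists only for this example.

```python
from urllib.parse import urljoin

RELATIVE_LINK = "techniques/repair/fix.html"  # relative URL from the text

# Hypothetical base URLs: the same site hosted in two different places.
ORIGINAL_BASE = "http://www.example.com/"
MIRROR_BASE = "http://mirror.example.org/archive/"


def resolve(base, relative):
    """Combine a base URL and a relative URL into an absolute URL."""
    return urljoin(base, relative)


# The link text is identical in both cases; only the base changes.
print(resolve(ORIGINAL_BASE, RELATIVE_LINK))
print(resolve(MIRROR_BASE, RELATIVE_LINK))
```

Because the link itself never names a server, it resolves correctly wherever the directory tree is copied, which is the property the figure illustrates.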

