The SSL Protocol
SSL stands for Secure Sockets Layer and TLS stands for Transport Layer Security. These are two families of protocols that were originally designed to provide security for HTTP transactions, but they also can be used for a variety of other Internet protocols such as IMAP and NNTP. HTTP running over SSL is referred to as secure HTTP.
Netscape released SSL version 2 in 1994 and SSL version 3 in 1995. TLS is an IETF standard designed to standardize SSL as an Internet protocol, but it is just a modification of SSL version 3 with a small number of added features and minor cleanups. The TLS acronym is the result of arguments between Microsoft and Netscape over the naming of the protocol because each company proposed its own name. However, the name has not stuck and most people refer to these protocols simply as SSL. Unless otherwise specified, the rest of this chapter refers to SSL/TLS as SSL.
You specify that you want to connect to a server using SSL by replacing http with https in the protocol component of a URI. The default port for HTTP over SSL is 443.
The following sections explain how SSL addresses the confidentiality, integrity, and authentication requirements outlined above. You will also learn a bit about the underlying mathematical and cryptographic principles at the core of SSL.
Addressing the Need for Confidentiality
The SSL protocol protects data by encrypting it. Encryption is the process of converting a message, the plaintext, into a new encrypted message, the ciphertext. Although the plaintext is readable by everyone, the ciphertext is completely unintelligible to anyone who might intercept it. Decryption is the reverse process, which transforms the ciphertext back into the original plaintext.
Usually, the encryption and decryption process involves an additional piece of information: a key. If both sender and receiver share the same key, the process is referred to as symmetric cryptography. If sender and receiver have different, complementary keys, the process is called asymmetric or public key cryptography.
If the key used to both encrypt and decrypt the message is the same, the process is known as symmetric cryptography. DES, Triple-Des, RC4, and RC2 are algorithms used for symmetric key cryptography. Many of these algorithms can have different key sizes, measured in bits. In general, given an algorithm, the greater the number of bits in the key, the more secure the algorithm is and the slower it will run because of the increased computational needs of performing the algorithm.
Symmetric cryptography is relatively fast compared to public key cryptography, which is explained in the next section. Symmetric cryptography has two main drawbacks, however. One is that keys must be changed periodically to avoid providing an eavesdropper with access to large amounts of material encrypted with the same key. The other issue is the key distribution problem: How do you get the keys to each one of the parties, and in a safe manner? This was one of the original limiting factors of symmetric cryptography; the problem was solved by periodically having people traveling around with suitcases full of keys. Then, along came public key cryptography.
Public Key Cryptography
Public key cryptography takes a different approach than its symmetric predecessor. Instead of both parties sharing the same key, a pair of keys exists: one public and the other private. The public key can be widely distributed, whereas the owner keeps the private key secret. These two keys are complementarya message encrypted with one of the keys can be decrypted only by the other key.
Using this method, anyone wanting to transmit a secure message to you can encrypt the message using your public key, assured that only the owner of the private keyyoucan decrypt it. Even if an eavesdropper has access to the public key, he cannot decrypt the communication meant for you. In fact, you want the public key to be as widely available as possible, so more people can send encrypted messages to you. Public key cryptography can also be used to provide message integrity and authentication. People with public keys will place these keys on public key servers or simply send the keys to others with whom they want to have secure email exchanges. Using the appropriate software tools, such as PGP or GnuPG, the sender will encrypt the outgoing message based on the recipient's public key.
The assertion that only the owner of the private key can decrypt a message meant for them means that with the current knowledge of cryptography and availability of computing power, brute force alone will not break the encryption in a reasonable timeframe, however, if the underlying algorithm or its implementation is flawed, such attacks are possible.
By the Way
Public key cryptography is similar to giving away many identical padlocks and retaining the master key. Anybody who wants to send you a message privately can do so by putting it in a safe and locking it with one of those padlocks (public keys) before sending it to you. Only you have the appropriate key (private key) to open that padlock (decrypt the message).
Addressing the Need for Integrity
Data integrity is preserved by performing a special calculation on the contents of the message and storing the result with the message itself. When the message arrives at its destination, the recipient then performs the same calculation and compares the results. If the contents of the message changed, the results of the calculation will be differentand you'll know someone else has tampered with it.
Digest algorithms perform just that process, creating message digests. A message digest is a method of creating a fixed-length representation of an arbitrary message that uniquely identifies itlike a fingerprint. A good message digest algorithm should be irreversible and collision resistant, at least for practical purposes. Irreversible means that the original message cannot be obtained from the digest and collision resistant means that no two different messages should have the same digest. Examples of digest algorithms are MD5 and SHA.
Message digests alone, however, do not guarantee the integrity of the messagean attacker could change the text and the message digest. Message authentication codes, or MACs, are similar to message digests, but incorporate a shared secret key in the process. The result of the algorithm depends both on the message and the key used. Because the attacker has no access to the key, he cannot modify both the message and the digest. HMAC is an example of a message authentication code algorithm.
The SSL protocol uses MAC codes to avoid replay attacks and to assure integrity of the transmitted information.
Addressing the Need for Authentication
SSL uses certificates to authenticate the parties in a communication. Public key cryptography can be used to digitally sign messages. In fact, just by encrypting a message with your secret key, the receiver can guarantee it came from you. Other digital signature algorithms involve first calculating a digest of the message, and then signing the digest.
You can tell that the person who created that public and private key pair is the one sending the message, but how do you tie that key to a person or organization that you can trust in the real world? It's plausible that an attacker could impersonate a sender's identity and distribute a different public key, claiming it is the legitimate one.
Trust can be achieved by using digital certificates. Digital certificates are electronic documents that contain a public key and information about its owner (name, address, and so on). To be useful, the certificate must be signed by a trusted third party (certification authority, or CA) who certifies that the information is correct. There are many different kinds of CAs, as described later in the chapter. Some of them are commercial entities, providing certification services to companies conducting business over the Internet. Companies providing internal certification services create other CAs.
The CA guarantees that the information in the certificate is correct, and that the key belongs to that individual or organization. Certificates have a period of validity and can expire or be revoked. Certificates can be chained so that the certification process can be delegated. For example, a trusted entity can certify companies, which in turn can take care of certifying its own employees.
If this whole process is to be effective and trusted, the certificate authority must require appropriate proof of identity from individuals and organizations before it issues a certificate.
SSL and Certificates
You can check a real-life certificate by connecting to a secure server with your browser. If the connection has been successful, a little padlock icon or another visual clue will be added to the status bar of your browser. Depending on your browser, you should be able to click the representative icon in order to view information on the SSL connection and the remote server certificate. Open the https://www.zend.com/ URL in your browser and analyze the certificate, following the steps outlined in the preceding paragraph. You can see how the issuer of the certificate is Thawte CA. The page downloaded seamlessly because Thawte is a trusted CA that has its own certificates bundled inside Web browsers.
Figure 28.1. SSL Certificate in use at www.zend.com.
You can see that both issuer and subject are provided as distinguished names (DN), a structured way of providing a unique identifier for every element on the network. In the case of the Thawte certificate, the DN is C=IL, S=Mehoz Tel Aviv, L=Ramat Gan, O=Zend Technologies Ltd., CN=www.zend.com.
C stands for country, S for state, L for locality, O for organization, and CN for common name. In the case of a Web site certificate, the common name identifies the fully qualified domain name of the Web site. This is the server name part of the URL; in this case, www.zend.com. If this does not match what you typed in the top bar, the browser will issue an error.
SSL Protocol Summary
The process to establish an SSL connection is the following: