No recent development in email technology has had more impact than the rise of spam. Spam is the nickname for the mass-mailed email messages that have begun to clutter the mailboxes of millions of Internet users. The messages advertise bank loans, dietary aids, charity scams, and various products and services on the theme of ephemeral gratification. Technologically, spam is just email, and that's why it works. The email servers routing a message don't know whether the message was generated by an odious automation scheme or by a loved one of the recipient.
Luckily, the receiver has some options for identifying and eliminating spam. Some of the techniques used for fighting spam are based on TCP/IP principles, and are therefore relevant to this book. However, as you will see, the creators of spam are good at finding their way around the antidotes, so no solution lasts forever. Newer techniques focus primarily on analyzing the text of the email message.
When the spam industry started, recipients began to realize that much of the spam came from a small number of sources. Spam foes have accumulated large databases of email addresses, domain names, and IP addresses that are thought to be associated with spam. (These databases are known as blacklists. See the Open Relay Database at http://www.ordb.org for an example of a blacklist of open relay SMTP servers that are, or are likely to be, used by the spam industry.) Firewalls, mail servers, or client programs can scan incoming messages for evidence of a blacklisted address.
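As a rough illustration of scanning a message for a blacklisted address, the following Python sketch checks the From header and any bracketed IP addresses in the Received headers against a small local list. The blacklist entries, the function name blacklist_hits, and the sample addresses are all assumptions made for this example; a real filter would consult a much larger, regularly updated database.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical blacklist entries (documentation-only names and addresses).
BLACKLIST = {"bulk@spam.example", "spam.example", "192.0.2.25"}

# Matches bracketed IPv4 addresses such as [192.0.2.25] in Received headers.
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def blacklist_hits(raw_message, blacklist=BLACKLIST):
    """Return the blacklisted sender address, domain, or relay IPs found."""
    msg = message_from_string(raw_message)

    # Extract the bare address from a header like "Deals <bulk@spam.example>".
    _, sender = parseaddr(msg.get("From", ""))
    sender = sender.lower()
    candidates = {sender, sender.rpartition("@")[2]}

    # Each server that handled the message adds a Received header.
    for received in msg.get_all("Received", []):
        candidates.update(IP_IN_BRACKETS.findall(received))

    return candidates & set(blacklist)


if __name__ == "__main__":
    sample = (
        "Received: from relay.example ([192.0.2.25]) by mx.home.example\n"
        "From: Friend <friend@home.example>\n"
        "Subject: hello\n"
        "\n"
        "Hi there\n"
    )
    print(blacklist_hits(sample))
```

Note that the sample message is flagged because of the relay's IP address even though the From address looks harmless, which is one reason blacklists often track servers rather than individual senders.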
Spammers often change IP addresses and domain names to avoid the blacklist. A blacklist is considered a good first line of defense, but it is not adequate for completely controlling spam. Spammers, in fact, sometimes use the mail servers of other unsuspecting companies to forward spam messages. As you learned earlier in this hour, an SMTP mail server simply waits for a message from a client and forwards it. The idea, of course, is that only the owners of the mail server will use it to forward messages, but a mail server that isn't properly locked down can be used by anybody, including spammers at another location (see Figure 18.7). Sometimes legitimate companies and totally innocent individuals find themselves on email blacklists because spammers are using their server as a relay.
Figure 18.7. Spammers can sometimes use someone else's unsecured and unsuspecting email server to send their messages.
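The difference between an open relay and a locked-down server can be sketched as a simple acceptance decision. The Python example below is only a model of that decision, not the configuration of any real mail server: the network range, the domain name, and the should_accept function are hypothetical, and the policy shown (accept mail from local clients or for local recipients) is one common way to restrict relaying, assumed here for illustration.

```python
import ipaddress

# Assumed values for illustration: the organization's network and mail domain.
LOCAL_NETWORK = ipaddress.ip_network("192.168.0.0/16")
LOCAL_DOMAINS = {"home.example"}


def should_accept(client_ip, recipient, open_relay=False):
    """Decide whether the server accepts a message for this recipient."""
    if open_relay:
        # An unsecured server forwards anything, from anyone, to anyone.
        return True

    client_is_local = ipaddress.ip_address(client_ip) in LOCAL_NETWORK
    recipient_is_local = recipient.rpartition("@")[2].lower() in LOCAL_DOMAINS

    # Locked down: forward for our own clients, or deliver to our own users.
    return client_is_local or recipient_is_local


if __name__ == "__main__":
    outsider = "203.0.113.9"
    target = "victim@elsewhere.example"
    print(should_accept(outsider, target, open_relay=True))  # spammer succeeds
    print(should_accept(outsider, target))                   # spammer refused
```

With open_relay set to True, the outside client can push mail through to any destination, which is exactly the situation the figure depicts.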
Spam foes have struck back against this tactic with their own remedy. By placing the mail server inside the corporate firewall and blocking incoming SMTP requests at the firewall (see Figure 18.8), an organization protects itself from becoming a spam relay. As shown in Figure 18.8, mail clients from inside the firewall can use the mail server to forward messages, but clients located outside cannot reach the mail server. This technique keeps outsiders from relaying spam through the server, but it also creates a limitation. A user from the home network who is traveling with a laptop or checking mail from some outside location might find it impossible to send messages without reconfiguring the email client to point to a different SMTP server.
Figure 18.8. Placing the SMTP server behind a firewall and blocking incoming SMTP requests protects the server from misuse by spammers.
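The firewall rule described above amounts to a check on the source address and destination port of each connection. The following Python sketch models only that one rule; the internal address range and the firewall_allows function are assumptions for this example, and the handling of non-SMTP traffic is outside its scope.

```python
import ipaddress

INTERNAL_NETWORK = ipaddress.ip_network("10.0.0.0/8")  # assumed internal range
SMTP_PORT = 25  # the well-known TCP port for SMTP


def firewall_allows(source_ip, dest_port):
    """Model a single rule: drop SMTP connections that start outside."""
    source_is_internal = ipaddress.ip_address(source_ip) in INTERNAL_NETWORK
    if dest_port == SMTP_PORT and not source_is_internal:
        return False
    # Other traffic is not modeled by this sketch and is simply passed.
    return True


if __name__ == "__main__":
    print(firewall_allows("10.1.2.3", 25))      # office client: allowed
    print(firewall_allows("198.51.100.7", 25))  # outside client: blocked
```

The second call also shows the traveling-laptop problem: from the firewall's point of view, a legitimate user connecting from a hotel network looks the same as any other outside client.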
Other techniques for fighting spam rely on analysis of the message content. Certain terms and phrases occur more frequently in spam headers and message bodies than in legitimate mail. Some spam filters impound messages based on rules. For instance, a filter might impound any message that contains curse words or other terms associated with gratuitous anatomical descriptions. More sophisticated methods use probabilistic techniques to analyze the word use within the message and assign a score that indicates the likelihood that the message is spam.
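A rule-based filter of the kind just described can be approximated in a few lines of Python. The patterns below are invented for this example and are far tamer than what a real filter would carry; the names RULES, matched_rules, and should_impound are likewise hypothetical.

```python
import re

# Hypothetical rules: each pattern is an assumption made for illustration.
RULES = [
    re.compile(r"\bfree money\b", re.IGNORECASE),
    re.compile(r"\bact now\b", re.IGNORECASE),
    re.compile(r"\bmiracle diet\b", re.IGNORECASE),
]


def matched_rules(subject, body):
    """Return the patterns that match anywhere in the subject or body."""
    text = f"{subject}\n{body}"
    return [rule.pattern for rule in RULES if rule.search(text)]


def should_impound(subject, body):
    """Impound the message if at least one rule matches."""
    return bool(matched_rules(subject, body))


if __name__ == "__main__":
    print(should_impound("Act now!", "Free money inside"))
    print(should_impound("Lunch?", "Are you free at noon?"))
```

Because the rules match whole phrases, the word "free" on its own does not trigger the filter. Even so, fixed rules are easy for a spammer to sidestep with creative spelling, which is part of the motivation for probabilistic scoring.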
These methods are effective at catching spam, but they sometimes create false positives, in which legitimate messages are impounded for exhibiting a spamlike profile. The best techniques provide a means for "training" the filter: the user shows it each false positive so that it can recalculate the probabilities and avoid making the same mistake twice.
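To make the idea of scoring and training concrete, here is a minimal Python sketch of a word-frequency scorer in the style of a naive Bayes classifier. The text above says only that filters use probabilistic techniques, so the choice of naive Bayes, the SpamScorer class, the smoothing constants, and the tiny training set are all assumptions for illustration rather than a description of any particular product.

```python
import math
import re
from collections import Counter


class SpamScorer:
    """Toy naive Bayes style scorer that can be retrained on mistakes."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, label):
        """Record the words of one message under 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return a score between 0 and 1; higher means more spamlike."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_odds = 0.0
        for label, sign in (("spam", 1), ("ham", -1)):
            # Smoothed prior: how common this kind of message has been.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            log_prob = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                # Add-one smoothing so unseen words never give probability 0.
                count = self.word_counts[label][word] + 1
                log_prob += math.log(count / (total_words + len(vocabulary) + 1))
            log_odds += sign * log_prob
        # Clamp to keep math.exp within floating-point range.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))


if __name__ == "__main__":
    scorer = SpamScorer()
    scorer.train("cheap loans act now", "spam")
    scorer.train("lose weight fast cheap pills", "spam")
    scorer.train("meeting notes for the project", "ham")
    scorer.train("lunch with the project team", "ham")

    legit = "cheap flights booked"
    print(scorer.spam_probability(legit))  # spamlike because of "cheap"
    scorer.train(legit, "ham")             # the user corrects the mistake
    print(scorer.spam_probability(legit))  # the score drops after retraining
```

In this toy data set the word "cheap" has appeared only in spam, so the legitimate travel message initially scores above one half. Marking it as legitimate shifts the word counts, and the same message scores lower the next time.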