Stopping Spam at the Gateway

I hate spam. You hate spam. We all hate spam. But none of us hates spam as much as ISP and business network administrators do. Alexis Rosen, president and co-owner of Public Access Networks, which runs Panix, one of the oldest ISPs, concedes that spam's "not as bad as Adolph Hitler," but "it is morally evil."

Well, that's clear enough. Why such strong feelings? Rosen explains that spam "chews up a lot of bandwidth and disk space." The non-stop disk I/O also sucks down system resources and significantly stresses the mail server. Why is this so annoying? Because it directly interferes with an ISP's ability to do its job, and that, in turn, hurts the bottom line. This isn't just Panix's problem. All ISPs and corporate networks face it.

So what can you do about it?

Stopping Spam at the Gateway.

There are four basic ways you can try to block spam at the gateway: blacklists, whitelists, rule-based filtering and Bayesian filtering. None of them is perfect. None of them will ever be perfect. All of them working together will never be perfect.

The fundamental problem with anti-spam protection, as David Ferris, president of leading e-mail researcher Ferris Research, says, is that "the ideal goal is: 100% effectiveness, with 0% false positives. An impossible ideal." Still, "most people will find high false positive rates, of the order of one in 1,000, quite acceptable." Unfortunately, even the very best anti-spam programs, when set to stop as much spam as possible, average one false positive in a hundred.
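To put those two rates side by side, here is a small back-of-the-envelope sketch. The daily mail volume is a made-up figure for illustration only; substitute your own server's numbers.

```python
def expected_false_positives(legit_messages: int, false_positive_rate: float) -> float:
    """Expected number of legitimate messages wrongly flagged as spam."""
    return legit_messages * false_positive_rate


# Hypothetical volume of legitimate mail per day (an assumption, not a figure from Ferris).
DAILY_LEGIT_MAIL = 50_000

for label, rate in [("one in 1,000", 1 / 1000), ("one in 100", 1 / 100)]:
    flagged = expected_false_positives(DAILY_LEGIT_MAIL, rate)
    print(f"{label}: about {flagged:.0f} good messages flagged per day")
```

At that assumed volume, the gap between the "acceptable" rate and the real-world rate is the difference between roughly 50 and roughly 500 misfiled messages a day.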

Still, for the sake of end-users, not to mention the workload on your mail servers and network bandwidth, a network engineer must do the best they can.


Blacklists

The idea is simple. Determine the domain names or IP addresses of known spammers and their ISPs, and then block them. Typically, you subscribe to a blacklist and then use it at your gateway to refuse any incoming SMTP traffic from the spammers. Unfortunately, blacklists can also block perfectly fine users who happen to be at the same ISP, or just in the same IP address range, as a known spammer.

Worse still, blacklists are as subject to human error as any such listing, and many users or their e-mail systems end up unfairly blacklisted. Adding insult to injury, getting off some blacklists can be almost impossible for ISPs or individual owners.

SpamCop, for example, is infamous for being overaggressive in blocking possible spam sites. Another problem is that, when a spammer can change his e-mail address faster than you can change your underpants, the overall effectiveness of blacklisting drops enormously. For example, Giga reported in "MAPS Realtime Blackhole List Under Fire" that even the well-respected Mail Abuse Prevention System Realtime Blackhole List (MAPS RBL) snags only 25% of spam and can block 34% of good mail.

That said, careful use of blacklists can still help keep spam from ever getting past your network perimeter. The Spamhaus Project, for example, has a reputation for accurate and up-to-date spammer lists, and the Open Relay Database remains useful for identifying unsecured mail servers that can easily be used for spamming.
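Many blacklists are published over DNS: the gateway reverses the octets of the connecting IP address, appends the list's zone, and looks the name up. An answer means the address is listed; a "no such name" response means it is not. The following is a minimal sketch of that lookup, assuming IPv4 only. The zone name `dnsbl.example.org` is a placeholder, not a real list, and the `resolve` parameter is an illustrative hook so the logic can be exercised without touching the network.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the blacklist zone answers for this address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # The name did not resolve, so the address is not on this list.
        return False


# Example (placeholder zone; 192.0.2.0/24 is a documentation-only range):
# is_listed("192.0.2.99", "dnsbl.example.org")
```

A production gateway would normally let the mail server's built-in blacklist support do this, and would also treat DNS timeouts separately from "not listed" so that a list outage does not silently disable or over-trigger blocking.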


Whitelists

Whitelists sound like a good idea. Users simply refuse to get mail from anyone unless they've approved the specific message or the sender. This works in two ways. In the first kind, users simply block any message from someone who's not on their approved list. In the other kind, software automatically replies with a verification message to e-mails sent from unknown addresses. These messages usually require the sender to send a message back showing that there's a real person on the other end of the Internet.

So we have two kinds of whitelists, but they have two problems in common: they're cumbersome and they don't always work. For example, if a user likes getting mail from an e-mail list, he must set up rules to allow this. If a friend moves to a different e-mail address, the user must update his whitelist. If someone in HR, not his friend at the company, sends him a job offer, he may never see it.

The list goes on and on. Whitelists only sound like a good idea; they're too much of a pain for most users to be worth considering. Worse still, from an ISP's viewpoint, they're very cumbersome since they can generate tons of mail asking spammers for response messages, which is likely to only cause more spam.
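The second, challenge-response kind of whitelist can be sketched in a few lines. This is an illustration of the idea only, not how any particular product works; the class and method names are invented for the example, and real systems would also have to expire stale challenges and actually send the verification e-mail.

```python
import secrets


class ChallengeWhitelist:
    """Toy challenge-response whitelist for a single mailbox."""

    def __init__(self):
        self.approved = set()  # senders the user has accepted
        self.pending = {}      # challenge token -> sender awaiting verification

    def receive(self, sender: str) -> str:
        """Deliver mail from approved senders; challenge everyone else."""
        sender = sender.lower()
        if sender in self.approved:
            return "deliver"
        token = secrets.token_hex(8)
        self.pending[token] = sender
        # Each unknown sender triggers an outbound message, which is the
        # extra traffic the article warns about.
        return f"challenge:{token}"

    def confirm(self, token: str) -> bool:
        """Approve the sender who answers a challenge with a valid token."""
        sender = self.pending.pop(token, None)
        if sender is None:
            return False
        self.approved.add(sender)
        return True
```

Note that every message from an unknown address, including spam with a forged return address, produces a challenge, so the gateway ends up sending mail in proportion to the spam it receives.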


Rule-based filters

The idea of rule-based filtering is very simple. The filter runs many spam identification tests on the mail headers and body text. In this method, the software looks for terms like "SEX!" or "Hair Growth" and then deletes the matching messages at the mail server.

This approach's problem is that it's always a step behind the spammers. For instance, we know that a message with the subject of "F R E E V I A G A R A" is spam. But a rule-based program might miss it. The rule-based approach is a good one, but keeping the rules accurate and up to the minute is a never-ending job. Another problem is that the better a rule-based program gets, the more rules it has to check and the slower it will run.

Make no mistake about it; trainable rule-based filters are an excellent technique. But they're condemned to always be a step behind. And they come with a built-in, eternally growing performance hit.
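Here is a minimal sketch of the rule-based idea and of why it lags behind. The three rules, their weights, and the function names are invented for illustration; real rule sets are far larger. The spaced-out subject line slips past the plain rules, and it takes an extra normalization step, written after the trick has been seen, to catch it.

```python
import re

# Hypothetical rules: (pattern, score). A real filter carries hundreds of these.
RULES = [
    (re.compile(r"sex!", re.IGNORECASE), 3.0),
    (re.compile(r"hair\s+growth", re.IGNORECASE), 3.0),
    (re.compile(r"free", re.IGNORECASE), 1.0),
]


def collapse_spaced_letters(text: str) -> str:
    """Join runs of single letters separated by spaces: 'F R E E' -> 'FREE'."""
    return re.sub(
        r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b",
        lambda match: match.group(0).replace(" ", ""),
        text,
    )


def score(text: str, normalize: bool = False) -> float:
    """Sum the scores of every rule that matches the text."""
    if normalize:
        text = collapse_spaced_letters(text)
    return sum(weight for pattern, weight in RULES if pattern.search(text))


subject = "F R E E V I A G A R A"
print(score(subject))                  # no rule matches the spaced-out text
print(score(subject, normalize=True))  # the 'free' rule now fires
```

Every rule and every normalization pass is another scan over each message, which is where the growing performance cost comes from.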

Bayesian filters

At first glance, Bayesian filtering looks a lot like rule-based filtering, but instead of starting with preset rules, a Bayesian filter, with a user's or administrator's help, learns to tell the difference between spam and good mail. The result is expressed as a probability, so after a few hundred messages, a good Bayesian filter will automatically recognize that any message with 'sex' in the subject and the HTML coding for bright red text is almost certainly spam.

Bayesian filtering, because it's simple to program and highly accurate (success rates of 98% are not uncommon), has become the hottest anti-spam technology.
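To show how little code the core idea needs, here is a bare-bones naive Bayes sketch. It is a textbook version with add-one smoothing, not the algorithm of any product named in this article; the class name, tokenizer, and training messages are all made up for the example. It assumes at least one spam and one good message have been trained before scoring.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Toy word-based naive Bayes spam filter."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9$!']+", text.lower())

    def train(self, text, label):
        """Record one message the user marked as 'spam' or 'ham' (good mail)."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | words); requires both classes to have training data."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        tokens = [t for t in self.tokenize(text) if t in vocab]  # skip unseen words
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            log_p = math.log(self.message_counts[label] / total_messages)  # prior
            for token in tokens:
                # Add-one smoothing keeps a word unseen in one class from zeroing it out.
                count = self.word_counts[label][token] + 1
                log_p += math.log(count / (total_words + len(vocab)))
            log_scores[label] = log_p
        # Convert the two log scores into a probability without underflow.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)


f = NaiveBayesFilter()
f.train("cheap sex pills free", "spam")
f.train("free money now", "spam")
f.train("meeting agenda for monday", "ham")
f.train("lunch on monday", "ham")
print(f.spam_probability("free pills") > 0.5)      # True
print(f.spam_probability("monday meeting") < 0.5)  # True
```

With only four training messages the numbers mean little; the point is that the filter's "rules" are just word counts that update every time a user marks a message, which is why it adapts without anyone rewriting a rule set.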

At the Gateway

There are more than a dozen commercial anti-spam programs. These include: Brightmail, Cloudmark Authority, CipherTrust IronMail, Trend Micro and Tumbleweed. All these companies use several, if not all, of the anti-spam methods to try to build the perfect anti-spam program.

They're all trying but no one is close to perfection yet. You really must obtain evaluation copies and test them with your users and network before you'll be able to make an informed choice.

Many ISPs and companies build their own solutions. Of these, most are built on the foundation of the procmail Unix mail processing utility and SpamAssassin, a powerful Unix-based, open source mail filtering program.

SpamAssassin isn't just for Unix and Linux shops, though. There are many versions available, including Network Associates' McAfee System Protection SpamKiller for Microsoft Exchange Small Business for Exchange 2000. There are also a variety of other commercial and open source programs based on SpamAssassin that will work in concert with almost any mail server.

None of these anti-spam programs, however, is that fast. Most network administrators find that these programs require their own servers for effective mail throughput. Other administrators use outsourced anti-spam services such as those provided by Postini and MessageLabs.

If you do elect to use your own in-house server, it needs fast connections to both your Internet gateway and the e-mail server. I'd recommend Fast Ethernet at a minimum and, if you have more than 500 user mailboxes, I think gigabit Ethernet for inter-server connections should be seriously considered.

The machines themselves should have ample memory, at least 512MB of RAM, and fast 120GB+ hard drives. System speed, while important, isn't as critical as memory and disk space. That's because when you boil spam-protection down to its basics, it comes down to lots and lots of string comparisons. Such procedures always tend to be processor light but memory intensive. Finally, these machines should have no other jobs except spam-bashing.

If possible, as Ferris recommends, end-users should have direct access to spam messages. You may be sure a given message is spam. The program may be certain that it's spam, but only the user can tell if it really is spam. If the user has to go through a help desk to get at the message, he's not going to be a happy user. Some server programs, like ActiveState's PureMessage, already enable users to get directly at their 'spam' mail.

Does this sound like building server anti-spam protection will either be a lot of trouble or expensive if you outsource it? You're right; it will be one or the other.

Is it worth it? You tell me. Are your users sick of spam? Are you tired of having large chunks of your Internet bandwidth taken up by spam? Are you tired of watching your mail servers' hard drives glow from constant use? If you answer yes to three or more of those questions, it's time to add anti-spam services to your network.

