Now I Just Need A Virus Ninja

The War on Spam has reached the point where I need better automated tools; having to manually adjust procmail filters for every new trick was quickly becoming annoying.

So, I finally decided to give SpamAssassin a whirl. It’s always been highly recommended, but I’d been a bit hesitant to use it since it looked like it might be overkill. It’s the kind of thing you install to handle spam across entire corporate networks (we use it at work here too), so I was expecting something sendmail-like in its difficulty to configure and administer. It turned out to be pretty painless, though: do the ‘make install’, set up the .forward and .procmailrc files (the samples included work just fine), and ta-da, you’re done and your e-mail is now being spam-filtered.

The important question is, of course: does it work? The answer is yes…and no. So far it has caught a good number of spam messages without flagging any valid e-mail as spam (no false positives). There are, however, still a few types of messages making it through the filter:

1) Viruses. Unfortunately Swen and its ilk are still circulating around the net far too much, and with little text in the message to parse and none of the spammer’s tricks being used, it’s hard for SpamAssassin to catch these. Technically it’s not really SA’s job to catch viruses; I’ll have to find another package and use it to do additional virus filtering instead.

2) Short, generic spam. These are the messages with vague subjects like “hi”, “lose it”, “can u spend few mins?” etc., worded like a friendly greeting. Since most of the usual spammers’ tricks are missing, SA can’t judge these very well.

However, there is hope for the latter case: SA uses ‘Bayesian filtering’ to learn what spam looks like. If I keep feeding messages like those to its database, it should eventually be able to tell them apart from valid e-mail and filter them automatically. In theory, anyway. Only time will tell how well it works, as it has to build up a history first.
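
To make the ‘learning’ idea a bit more concrete, here is a toy sketch in Python of how a Bayesian filter can build up a history from messages it is shown. This is not SpamAssassin’s actual code or scoring: the class name TinyBayes, its learn and spam_probability methods, and the sample messages are all made up for illustration, and the add-one smoothing is just the simplest choice I could think of.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam scorer (illustrative only, not SpamAssassin's algorithm)."""

    def __init__(self):
        # For each label: how many learned messages contained each word.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        # For each label: how many messages have been learned in total.
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Crude tokenizer (an assumption): lowercase, split on whitespace, drop duplicates.
        return set(text.lower().split())

    def learn(self, text, label):
        """Record one message as 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(self.tokens(text))

    def _word_prob(self, word, label):
        # Add-one smoothing so unseen words never produce a zero probability.
        return (self.word_counts[label][word] + 1) / (self.message_counts[label] + 2)

    def spam_probability(self, text):
        """Return an estimate between 0 and 1 that the text is spam."""
        total = self.message_counts["spam"] + self.message_counts["ham"]
        prior_spam = (self.message_counts["spam"] + 1) / (total + 2)
        # Work in log-odds to avoid multiplying many small numbers together.
        log_odds = math.log(prior_spam) - math.log(1 - prior_spam)
        for word in self.tokens(text):
            log_odds += math.log(self._word_prob(word, "spam"))
            log_odds -= math.log(self._word_prob(word, "ham"))
        return 1 / (1 + math.exp(-log_odds))


# Feed it a few hand-labelled messages, then score new ones.
model = TinyBayes()
model.learn("hi can u spend few mins", "spam")
model.learn("lose it fast hi", "spam")
model.learn("meeting notes for the project", "ham")
model.learn("lunch on friday at noon", "ham")

print(model.spam_probability("hi lose it"))
print(model.spam_probability("project meeting friday"))
```

With only four messages learned, the scores are rough, which is exactly the ‘has to build up a history first’ problem: the more spam and valid mail the filter sees, the better its word statistics get.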

In the meantime I still have to handle those viruses and some spam messages manually. It’s still better than no filtering at all though, and using the SA tools to teach it about the spam it misses isn’t too much of a hassle.
