New Mexico Internet Access, Inc.
 

The problem with spam filtering.

NMIA would like to point out that we are utterly opposed to spam, whether from our subscribers or from anyone else.

We are trying to reduce it by every means we know. We have settled on Spam Assassin (SA) as our primary scoring tool for incoming email messages (note that it does not filter; it only scores). The score can then be used to filter the message (i.e., remove, redirect, or classify it) as it is being received. This can be done on our systems, so you don't have to download the message, or on your computer using any modern email handler (Outlook, Eudora, etc.).
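As an illustration of filtering on the score, here is a minimal sketch in Python. It assumes the score arrives in an "X-Spam-Status" header of the form "score=7.2 required=5.0" (some older SA releases wrote "hits=" instead, as far as we know). The function names, the threshold default, and the action labels are hypothetical and are not part of SA or of our mail system.

```python
import re
from email import message_from_string

# Assumed header format: "X-Spam-Status: Yes, score=7.2 required=5.0 ..."
# (accepting "hits=" as well, for older SA releases).
_SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message):
    """Return the score found in the X-Spam-Status header, or None."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = _SCORE_RE.search(status)
    return float(match.group(1)) if match else None


def classify(raw_message, threshold=5.0):
    """Map a score to an action; the action names are hypothetical."""
    score = spam_score(raw_message)
    if score is None:
        return "deliver"        # unscored mail is left alone
    if score >= threshold:
        return "spam-folder"    # redirect rather than delete
    return "deliver"


sample = (
    "From: someone@example.com\n"
    "Subject: hello\n"
    "X-Spam-Status: Yes, score=7.2 required=5.0 tests=EXAMPLE_RULE\n"
    "\n"
    "body text\n"
)
print(classify(sample))
```

Redirecting to a folder instead of deleting is a deliberate choice in this sketch: a falsely scored ham message can still be recovered.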

The nature of the problems.

1st: No spammer uses a traceable IP address. Almost all spam sent today is relayed through innocent mail servers. 99% of these are ordinary PCs running Windows XP that either have a misconfigured XIMS server or have been compromised by a virus that installs an email server trojan, usually without alerting the owner. These machines are apparently operating without firewall protection on always-on connections (e.g., DSL, Comcast, etc.).

We have made a careful analysis of 200 recent spam messages, and found that all came from always-on connections such as DSL or cable. All appear to be non-commercial connections in the US, France, Germany, Austria, Switzerland, Italy, and Taiwan. All are simple mail relays, and each of the 200 messages was relayed from a unique address. That last point makes IP filtering essentially impossible.

We do use the Black Hole lists of known and reported spammers' IP addresses (RBL, MAPS, etc.), but these are becoming less and less effective as spammers become more sophisticated.
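For readers curious how such a list is consulted, the usual convention for DNS-based lists is to reverse the octets of the sender's IPv4 address, append the list's zone name, and look the result up in DNS; an answer means the address is listed. The sketch below follows that convention. The zone "dnsbl.example.org" is a placeholder, not a real list, and the function names are our own for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets + list zone."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """True if the list's zone returns an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False   # not listed, or the lookup itself failed


# Placeholder zone; substitute the zone of a list you actually use.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Because every lookup is keyed on a single address, a list of this kind only helps when the same address sends spam more than once, which is why one-message-per-relay spam slips past it.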

2nd: As we're sure you have seen, much of the spam that leaks through SA with a low score has a "Subject:" line beginning with several random letters. Their randomness makes it almost impossible to find an algorithm that matches such a string. They are never words, but proving that a bunch of letters isn't a word is very compute-intensive. Moreover, a legitimate email containing a typo might be falsely rejected.

The message body also often contains a nonsensical list of words from a dictionary. This is an attempt to poison any self-adjusting (Bayesian) filter trying to learn the patterns in spam mail.

The irregular learning ability of Bayesian filters and the disruptive effects of spammers' poisoning have occasionally caused high false-positive rates, which are very serious because ham mail can be lost. For this reason, we have discontinued the Bayesian filter part of SA, for the time being.
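To see why dictionary padding hurts a learning filter, consider a toy calculation. The sketch below uses one simple, well-known way of combining per-word spam probabilities (a naive Bayes product); SA's own Bayes code is more elaborate, so this is not its actual formula, and every probability shown is made up for illustration.

```python
from math import prod


def combined_spam_probability(token_probs):
    """Naive-Bayes style combination of per-token spam probabilities."""
    spammy = prod(token_probs)
    hammy = prod(1.0 - p for p in token_probs)
    return spammy / (spammy + hammy)


# Hypothetical per-token probabilities a trained filter might hold.
sales_pitch = [0.95, 0.90, 0.85]        # words typical of spam
padding = [0.20, 0.15, 0.20, 0.10]      # ordinary dictionary words

before = combined_spam_probability(sales_pitch)
after = combined_spam_probability(sales_pitch + padding)
print(f"pitch alone:     {before:.3f}")
print(f"pitch + padding: {after:.3f}")
```

With these invented numbers, a message that would be scored as almost certainly spam drops to roughly a coin toss once a handful of innocuous words is appended. If such messages are then fed back as training data, the innocuous words themselves start to look spammy, which is how ham ends up falsely flagged.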

3rd: Recently, some (we believe, misguided) entrepreneurs have tried to start a new service related to avoiding spam. The archetype is Habeas.com, which has created a copyrighted haiku that, for a fee, one may insert in the header of an email that one wishes to be delivered in spite of any level of spam filtering. This depends on the cooperation of the programmers of the spam filter software used. The programmer inserts a rule that bypasses the filter on detecting the haiku in the header, delivering the message immediately. (I'm not making this up.)

The vendor of the haiku expects to prosecute anyone who uses the haiku to send spam, or who hasn't paid for its use. They have actually sued one of the two such senders who were identified. The problem is that, out of the 40,000 spam messages in one recent sample, we have been unable to identify any of the actual senders, since they all used one of the millions of open mail relays mentioned above.

Incidentally, the provider of the scoring software we use (i.e., Spam Assassin) did not inform us of the addition of this special rule, though it is mentioned, sub rosa, on their web site. We were alerted when we noticed that our staff had received several hundred spam messages with the haiku in place. That rule has now been removed.

Finally, the above is not a complete description of the problem, which is evolving hour by hour. We are periodically overwhelmed by the introduction of a new spammer weapon, and this causes the variations in spam volume that our subscribers sometimes report. We do appreciate reports of any egregious increase or offense. In general, samples are not needed. In fact, sending such samples often results in their being deleted by our spam filters.

Just so you have something to compare, on an average day, each NMIA staff member will receive about 2,000 email messages. Perhaps 5% to 10% of those are ham (i.e., desired email, mostly from you); the remainder are spam. SA will typically find about 85% to 90% of the spam with a score of 5 or more, usually with less than 1% being ham falsely identified as spam. In the score range of 1 to 4, there is often an even mix of spam and ham, and as the spammers develop another subterfuge, a spam can show up even with a score of 0 or less. Each of us deals with over a hundred messages from our subscribers, which we must differentiate from one or two hundred spam messages. So rest assured we feel the pain of spam.
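The figures above can be worked through in a few lines. This is only back-of-the-envelope arithmetic on the rounded numbers quoted in the paragraph, and the variable names are ours.

```python
daily_messages = 2000          # about 2,000 messages per staff member
ham_share = (0.05, 0.10)       # 5% to 10% of the mail is ham
catch_rate = (0.85, 0.90)      # share of spam that scores 5 or more

ham_low = daily_messages * ham_share[0]
ham_high = daily_messages * ham_share[1]

# The less ham there is, the more spam, so the bounds swap.
spam_low = daily_messages - ham_high
spam_high = daily_messages - ham_low

# Spam that scores below 5 and must be sorted from ham by hand.
missed_low = spam_low * (1 - catch_rate[1])
missed_high = spam_high * (1 - catch_rate[0])

print(f"ham per day:          {ham_low:.0f} to {ham_high:.0f}")
print(f"spam per day:         {spam_low:.0f} to {spam_high:.0f}")
print(f"spam scoring under 5: {missed_low:.0f} to {missed_high:.0f}")
```

That works out to roughly 100 to 200 ham messages sitting alongside something like 180 to 285 low-scoring spam messages per person per day.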

We will be pleased to discuss any aspect of managing your individual email flow, including a completely open server, and white and black lists.

Just send email with your questions to help@nmia.com or call 247-0888.