Mail filter requirements

It’s time to update the spam filter at The Internet Company again.

Users of both my own system and another one I administer keep telling me that they need several specific things from a spam filter.

My users need:

  • The ability to retrieve a filtered message. Even when a message is rejected, in most cases it needs to be retrievable from a quarantine. Some things can be hard rejects, such as virus-infected mail and mail from very obvious spam sources, but the grey area needs to be very wide.
  • Some degree of control over which techniques are used: the degree of quarantining, whether blacklists are used, and whether those techniques reject or merely quarantine mail.
  • Whitelisting, both by individual user and by domain.
  • Blacklisting, both by individual user and by domain, including whether to quarantine or reject.
  • The ability to retrain a learning filter while still using a POP3 mail client. This means a ‘signature’ backed by the saved full text of the message, as DSPAM and CRM114’s mailreaver do, so that mail can be forwarded back for retraining even after it has been altered by mail clients with no interest in preserving formatting, such as Microsoft Outlook, or so that there can be a web interface for retraining.
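
To make the interaction between these requirements concrete, here is a minimal sketch of a per-user decision function. It is not taken from any existing filter: the names `UserPolicy`, `decide` and `Action`, the 0-to-1 `spam_score` scale, and the reading of ‘blacklists’ as DNS blacklists are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    QUARANTINE = "quarantine"
    REJECT = "reject"


@dataclass
class UserPolicy:
    """Hypothetical per-user settings; every field name is illustrative."""
    whitelist: set = field(default_factory=set)    # lowercase addresses or domains
    blacklist: dict = field(default_factory=dict)  # address or domain -> Action
    use_dnsbl: bool = True                         # assumed meaning of "blacklists"
    dnsbl_action: Action = Action.QUARANTINE       # user chooses reject or quarantine
    quarantine_threshold: float = 0.5              # assumed score scale of 0..1


def _match(sender, entries):
    """Return the entry matching the full address or its domain, else None."""
    sender = sender.lower()
    domain = sender.rsplit("@", 1)[-1]
    for key in (sender, domain):
        if key in entries:
            return key
    return None


def decide(policy, sender, spam_score, has_virus=False, on_dnsbl=False):
    """Pick an action for one message under one user's policy."""
    if has_virus:
        return Action.REJECT                  # hard reject, not user-configurable
    if _match(sender, policy.whitelist) is not None:
        return Action.DELIVER
    hit = _match(sender, policy.blacklist)
    if hit is not None:
        return policy.blacklist[hit]          # per-entry reject or quarantine
    if policy.use_dnsbl and on_dnsbl:
        return policy.dnsbl_action
    if spam_score >= policy.quarantine_threshold:
        return Action.QUARANTINE              # the learning filter never rejects alone
    return Action.DELIVER
```

The order of the checks is a design choice rather than a requirement: in this sketch a virus hit beats a whitelist entry, and a personal whitelist entry beats every blacklist.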

The overall themes here are ‘user control’ and ‘ability to retrieve a missed message’. Spam filters can be highly accurate in practice when well-trained users understand how the filters work, but most users aren’t accurate enough or careful enough while training for mail to be rejected on the strength of a learning filter alone. Business users could lose a thousand dollars or more by missing certain emails from previously unknown senders, so the ability to review and recover from the filter’s decisions is very important.
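
The signature idea behind retraining and recovery can be sketched as follows. This is only an illustration of the general approach, not how DSPAM or mailreaver actually store or tag messages: the `MessageStore` class, the `retrain` function, the `X-Example-Signature` marker and the `train` callback are all hypothetical.

```python
import hashlib
import re

# Hypothetical marker that the filter would add to every message it handles.
SIGNATURE_RE = re.compile(r"X-Example-Signature: ([0-9a-f]{16})")


class MessageStore:
    """In-memory stand-in for the saved full text of every filtered message."""

    def __init__(self):
        self._saved = {}  # signature -> original message bytes

    def save(self, raw):
        """Store the untouched message and return its short signature."""
        signature = hashlib.sha256(raw).hexdigest()[:16]
        self._saved[signature] = raw
        return signature

    def fetch(self, signature):
        """Return the original message, or None if the signature is unknown."""
        return self._saved.get(signature)


def retrain(store, forwarded_text, is_spam, train):
    """Retrain on the saved original, ignoring the mangled forwarded body.

    `train` is an assumed callback taking (original_bytes, is_spam).
    Returns True if a known signature was found and training ran.
    """
    match = SIGNATURE_RE.search(forwarded_text)
    if match is None:
        return False
    original = store.fetch(match.group(1))
    if original is None:
        return False
    train(original, is_spam)
    return True
```

Because only the signature has to survive the round trip, it does not matter how badly the mail client reformats the forwarded message, and the same `fetch` call could back a web interface for reviewing quarantined mail.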