Mail filter actions

August 15th, 2010

Most mail filters get something major wrong: they run an ordered list of actions, each limited to a narrow scope, in the order the corresponding events occur in SMTP. First check the sender, then the receivers, then the content.

Mail filter plugins should be run in the order of the processing phase they need, but their results should be evaluated in order of how final their decisions are. Check the RBLs that outright block hosts first, then the ones used to decide whether to quarantine. Then check for viruses and anything else that gets a message rejected or quarantined outright, and only then check spam filters.

Execute in parallel, in fact. Many checks involve waiting on networks, disks and other resources, so there’s no reason not to set several actions off at once and wait for completion.
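As a rough illustration of setting several checks off at once, here is a minimal Python sketch using asyncio. The two check functions are hypothetical stand-ins (the sleeps simulate network and scanner latency), and the sketch ignores SMTP phases: in practice the body is not available until DATA.

```python
import asyncio

async def check_rbl(host: str) -> str:
    # Hypothetical stand-in for a DNS blocklist lookup.
    await asyncio.sleep(0.02)
    return "match" if host == "127.0.0.2" else "no-match"

async def check_virus(body: bytes) -> str:
    # Hypothetical stand-in for a round trip to a virus scanner.
    await asyncio.sleep(0.05)
    return "match" if b"EICAR" in body else "no-match"

async def run_checks(host: str, body: bytes) -> dict:
    # Start both checks together and wait for both to complete;
    # the total wait is roughly the slowest check, not the sum.
    rbl, virus = await asyncio.gather(check_rbl(host), check_virus(body))
    return {"rbl": rbl, "virus": virus}

results = asyncio.run(run_checks("127.0.0.2", b"hello"))
```

The results come back keyed by check, so the order in which they are evaluated can be decided separately from the order in which they finished.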

There are several streams of actions, which the rules below call smtp, receiver, message and system: responses to the SMTP client that’s sending us the message, internal processing of the message, logs, and notices to receivers about exceptional events. Once a message is accepted at SMTP time, we no longer have the option to bounce it: if it disappears into the aether, it had better really be junk, because nobody will know what happened to it. Each stream of actions is independent: rules will continue to be evaluated until all specified actions have been satisfied.

The actions one might want: tempfail, accept, reject, notify, drop, log, record, add-header, add-footer, filter-message, redirect, quarantine, and continue.

The redirect and quarantine actions merely change the destination of the message, and don’t stop processing.

I figure on grouping them numerically by priority, with the highest priority overriding any lower ones. Let groups be ORed together, and stop when you have a definite answer.
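One possible reading of "independent streams, highest priority wins, stop when you have a definite answer" can be sketched in Python as follows. The `evaluate` function and the rule tuples are made up for illustration. The sketch also assumes that any action settles the stream it targets, so actions that should not stop processing (redirect, quarantine) would need extra handling.

```python
STREAMS = ("smtp", "receiver", "message", "system")

def evaluate(rules):
    """rules: (action, targets) pairs, highest priority first.

    targets is either the string "all" or a tuple of stream names.
    """
    decided = {}
    for action, targets in rules:
        for stream in (STREAMS if targets == "all" else targets):
            # The first (highest-priority) rule to name a stream settles it;
            # lower-priority rules cannot override that answer.
            decided.setdefault(stream, action)
        if len(decided) == len(STREAMS):
            break  # every stream has an answer, so stop evaluating
    return decided

rules = [
    ("reject", ("smtp",)),
    ("drop", ("message",)),
    ("log", ("system", "receiver")),
    ("accept", "all"),  # never reached: all four streams are already settled
]
decided = evaluate(rules)
```

Here the final "accept all" rule changes nothing, because each stream was already claimed by a higher-priority rule.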

There are two kinds of actions. `on` actions react to the conditions of the group: whether a whitelist matches or not, or whether a spam filter returns 'spam', 'not spam' or 'unsure'. `on .. when` actions are triggered only when the condition of the `when` clause matches as well, forming a primitive boolean AND while still respecting the idea of priorities.

defaults { on error tempfail all; on success continue all; on any log all; }

group virus { checkcontent clamd; on match reject all, log system, log receiver; }

group user-whitelist { check whitelist; on match accept all; on match when virus match notify receiver; }

group { checkrbl b.barracudacentral.com; checkrbl b.spamcop.org; on match reject all, log system; }

group { checkcontent lmtp:///tmp/spamd.sock; checkcontent blacklistedwords; on spam accept smtp, quarantine message; }

finally { on any accept all; }
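The difference between a plain `on` rule and an `on .. when` rule, as in the user-whitelist group above, can be sketched in Python. The `Rule` class and `fires` function are hypothetical names, and the sketch only decides which rules fire; it does not model priorities or which stream an action settles.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class Rule:
    group: str                              # group whose result this rule reacts to
    on: str                                 # result to react to, or "any"
    action: str                             # e.g. "accept all"
    when: Optional[Tuple[str, str]] = None  # (other group, result it must have)

def fires(rule: Rule, results: Dict[str, str]) -> bool:
    # The "on" part: the rule's own group must have produced the named result.
    if rule.on != "any" and results.get(rule.group) != rule.on:
        return False
    # The "when" part: a second condition, ANDed with the first.
    if rule.when is not None:
        other_group, needed = rule.when
        return results.get(other_group) == needed
    return True

results = {"virus": "match", "user-whitelist": "match"}
rules = [
    Rule("user-whitelist", "match", "accept all"),
    Rule("user-whitelist", "match", "notify receiver", when=("virus", "match")),
]
fired = [r.action for r in rules if fires(r, results)]
```

With a clean message from the same whitelisted sender, only the plain `on` rule would fire, since the `when` clause no longer holds.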

A message comes in from 127.0.0.2: the RBLs say to block it. Because no higher rule will accept it, it gets rejected before DATA. The connection attempt is logged to the user, but no message is accepted at all.

A virus-bearing message comes in from 1.2.3.4, from a white-listed sender: the RBLs don’t reject it, since the IP isn’t listed. The SMTP connection gets as far as DATA, the virus scanner is fired off, and it returns a ‘virus’ response. The whitelist is lower priority than the virus scanner, so the message is still rejected on the SMTP side. However, since there is also an action aimed at the receiver, that event fires and a notice is sent to the receiver of the message with the details. At this point, evaluation stops since there are no more actions that could happen.

Thoughts and suggestions are welcome.