Beating 'em up some more: spamdb and greytrapping

Behind the scenes, rarely mentioned and barely documented, are two of spamd's helpers: the spamdb database tool and the spamlogd whitelist updater, both of which perform essential functions for the greylisting feature. Of the two, spamlogd works quietly in the background, while spamdb has been developed to offer some interesting features.

Note: Restart spamd to enable greylisting

If you followed all steps in the tutorial exactly up to this point, spamlogd has been started automatically already. However, if your initial spamd configuration did not include greylisting, spamlogd may not have been started, and you may experience strange symptoms, such as your greylists and whitelist not getting updated properly.

Under normal circumstances, you should not have to start spamlogd by hand. Restarting spamd after you have enabled greylisting ensures spamlogd is loaded and available too.

spamdb is the administrator's main interface for managing the black, grey and white lists via the contents of the /var/db/spamdb database.

Early versions of spamdb simply offered options to add whitelist entries to the database or update existing ones (spamdb -a nn.mm.nn.mm) and to delete whitelist entries (spamdb -d nn.mm.nn.mm) to compensate for shortcomings in either the blacklists used or the effects of the greylisting algorithms.

By the time the development cycle for OpenBSD 3.8 started during the first half of 2005, spamd users and developers had accumulated significant amounts of data and experience on spammer behaviour and spammer reactions to countermeasures.

We already know that spam senders rarely use a fully compliant SMTP implementation to send their messages. That's why greylisting works. Also, as we noted earlier, not only do spammers send large numbers of messages, they rarely check that the addresses they feed to their hijacked machines are actually deliverable. Combine these facts, and you see that if a greylisted machine tries to send a message to an invalid address in your domain, there is a significant probability that the message is spam or, for that matter, malware.

Enter greytrapping

Consequently, spamd had to learn greytrapping. Greytrapping as implemented in spamd puts offenders in a temporary blacklist, dubbed spamd-greytrap, for 24 hours. Twenty-four hours is short enough not to cause serious disruption of legitimate traffic, since real SMTP implementations will keep trying to deliver for a few days at least. Experience from large scale implementations of the technique shows that it rarely if ever produces false positives[1]. Machines which continue spamming after 24 hours will make it back to the tarpit soon enough.

Your own traplist

To set up your own traplist, you use spamdb's -T option. In my case, the strange address I mentioned earlier[2] was a natural candidate for inclusion:

$ spamdb -T -a wkitp98zpu.fsf@datadok.no

Sure enough, the spammers thought this address was just as usable as it had been almost two years earlier:

Nov  6 09:50:25 delilah spamd[23576]: 210.214.12.57: connected (1/0)
Nov  6 09:50:32 delilah spamd[23576]: 210.214.12.57: connected (2/0)
Nov  6 09:50:40 delilah spamd[23576]: (GREY) 210.214.12.57: 
<gilbert@keyholes.net> -> <wkitp98zpu.fsf@datadok.no>
Nov  6 09:50:40 delilah spamd[23576]: 210.214.12.57: disconnected 
after 15 seconds.
Nov  6 09:50:42 delilah spamd[23576]: 210.214.12.57: connected (2/0)
Nov  6 09:50:45 delilah spamd[23576]: (GREY) 210.214.12.57: 
<bounce-3C7E40A4B3@branch15.summer-bargainz.com> -> 
<adm@dataped.no>
Nov  6 09:50:45 delilah spamd[23576]: 210.214.12.57: disconnected 
after 13 seconds.
Nov  6 09:50:50 delilah spamd[23576]: 210.214.12.57: connected (2/0)
Nov  6 09:51:00 delilah spamd[23576]: (GREY) 210.214.12.57: 
<gilbert@keyholes.net> -> <wkitp98zpu.fsf@datadok.no>
Nov  6 09:51:00 delilah spamd[23576]: 210.214.12.57: disconnected 
after 18 seconds.
Nov  6 09:51:02 delilah spamd[23576]: 210.214.12.57: connected (2/0)
Nov  6 09:51:02 delilah spamd[23576]: 210.214.12.57: disconnected 
after 12 seconds.
Nov  6 09:51:02 delilah spamd[23576]: 210.214.12.57: connected (2/0)
Nov  6 09:51:18 delilah spamd[23576]: (GREY) 210.214.12.57: 
<gilbert@keyholes.net> -> <wkitp98zpu.fsf@datadok.no>
Nov  6 09:51:18 delilah spamd[23576]: 210.214.12.57: disconnected 
after 16 seconds.
Nov  6 09:51:18 delilah spamd[23576]: (GREY) 210.214.12.57: 
<bounce-3C7E40A4B3@branch15.summer-bargainz.com> -> 
<adm@dataped.no>
Nov  6 09:51:18 delilah spamd[23576]: 210.214.12.57: disconnected 
after 16 seconds.
Nov  6 09:51:20 delilah spamd[23576]: 210.214.12.57: connected (1/1), 
lists: spamd-greytrap
Nov  6 09:51:23 delilah spamd[23576]: 210.214.12.57: connected (2/2), 
lists: spamd-greytrap
Nov  6 09:55:33 delilah spamd[23576]: (BLACK) 210.214.12.57: 
<gilbert@keyholes.net> -> <wkitp98zpu.fsf@datadok.no>
Nov  6 09:55:34 delilah spamd[23576]: (BLACK) 210.214.12.57: 
<bounce-3C7E40A4B3@branch15.summer-bargainz.com> -> 
<adm@dataped.no>

This log fragment shows how the spammer's machine is greylisted at first contact, and then clumsily tries to deliver messages to my greytrap address, only to end up in the spamd-greytrap blacklist after a few minutes. By now we all know what it will be doing for the next twenty-odd hours.

Deleting, handling trapped entries

spamdb offers a few more options you should be aware of. The -T option combined with -d lets you delete traplist mail address entries, while the -t (lowercase) option combined with -a or -d lets you add or delete trapped IP address entries from the database.
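The option combinations described above can be sketched as a handful of small shell wrappers. This is only an illustration: the function names are made up for this example, and the SPAMDB variable is a hypothetical convenience that lets you substitute echo for a dry run. On a real system the commands need to run with root privileges.

```shell
# Hypothetical wrappers around the spamdb options described above.
# Set SPAMDB=echo to print the arguments instead of changing the database.

# Add or delete a spamtrap mail address (uppercase -T).
spamtrap_add() { ${SPAMDB:-spamdb} -T -a "$1"; }
spamtrap_del() { ${SPAMDB:-spamdb} -T -d "$1"; }

# Add or delete a trapped IP address (lowercase -t).
trapped_add() { ${SPAMDB:-spamdb} -t -a "$1"; }
trapped_del() { ${SPAMDB:-spamdb} -t -d "$1"; }
```

For instance, trapped_del 210.214.12.57 would release the host from the log fragment above before its 24 hours are up.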

Exporting your list of currently trapped addresses can be as simple as putting together a simple one-liner with spamdb, grep and a little imagination.
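One possible shape for that one-liner, as a sketch: spamdb prints one pipe-separated record per line, and according to spamdb(8) trapped hosts show up as TRAPPED|address|expiry. The function name and the output file below are my own choices, not anything spamdb prescribes.

```shell
# Read spamdb output on stdin and print only the trapped IP addresses.
# Assumes the TRAPPED|ip|expire record layout described in spamdb(8).
trapped_ips() {
    grep '^TRAPPED|' | cut -d'|' -f2
}

# Typical use on the spamd host (hypothetical output path):
#   spamdb | trapped_ips > /var/tmp/trapped.txt
```

Filtering on the record type at the start of the line keeps GREY, WHITE and SPAMTRAP entries out of the export.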

The downside: some people really do not get it

We have already learned that the main reason why greylisting works is that any standards compliant mail setup is required to retry delivery after some “reasonable” amount of time. However, as Murphy will be all too happy to tell you, life is not always that simple.

For one thing, the first email message sent from any site which has not contacted you for as long as the greylister keeps its data around will be delayed for some unpredictable amount of time which depends mainly on the sender's retry interval. There are some circumstances where avoiding even a minimal delay is desirable. If, for example, you have some infrequent customers who always demand your immediate and urgent attention to their business when they do contact you, an initial delivery delay of what could be several hours may not be optimal.

In addition, you are bound to encounter misconfigured mail servers which either do not retry at all or retry too quickly, perhaps stopping delivery retries after a few attempts. As luck would have it, in your case one of these is likely to be at an important customer's site, run by an incompetent who will not listen to reason, or possibly at a site owned and operated by your boss' boyfriend.

Finally, there are some sites which are large enough to have several outgoing SMTP servers, and which do not play well with greylisting since they are not guaranteed to retry delivery of any given message from the same IP address as the last delivery attempt for that message. Those sites can sincerely claim to comply with the retry requirements, since the RFCs do not state that the new delivery attempts have to come from the same IP address, but this is clearly one of the few remaining downsides of greylisting.

If you need to compensate for such things in your setup, it is fairly easy to do. One useful approach is to define a table for a local whitelist, to be fed from a file in case of reboots:

table <localwhite> file "/etc/mail/whitelist.txt"

To make sure SMTP traffic from the addresses in the table is not fed to spamd, you add a no rdr rule at the top of your redirection block:

no rdr proto tcp from <localwhite> to $mailservers port smtp

Once you have these changes added to your rule set, you enter the addresses you need to protect from redirection into the whitelist.txt file, then reload your rule set using pfctl -f. You can then use all the expected table tricks on the <localwhite> table, including replacing its content after editing the whitelist.txt file. See the Chapter called Tables make your life easier or man pfctl for a few pointers.
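A few of those table tricks, sketched as shell wrappers around pfctl's -t and -T options. The function names and the PFCTL override (handy for a dry run with echo) are hypothetical helpers for this example; the table name and file path are the ones used above.

```shell
# Hypothetical wrappers for managing the <localwhite> table.
# Set PFCTL=echo to print the arguments instead of touching the live table.

# Replace the table contents with what is in the whitelist file.
localwhite_reload() {
    ${PFCTL:-pfctl} -t localwhite -T replace -f "${1:-/etc/mail/whitelist.txt}"
}

# List the addresses currently in the table.
localwhite_show() {
    ${PFCTL:-pfctl} -t localwhite -T show
}

# Add one or more addresses to the live table only; also add them to
# whitelist.txt if they should survive a rule set reload or a reboot.
localwhite_add() {
    ${PFCTL:-pfctl} -t localwhite -T add "$@"
}
```

Replacing the table contents this way takes effect immediately and does not require reloading the whole rule set.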

Note: Even better: copy rules from the man page and use nospamd

Recent-ish versions of the spamd(8) man page have exactly the rules you need, including the nospamd table:

table <spamd-white> persist
table <nospamd> persist file "/etc/mail/nospamd"
pass in on egress proto tcp to any port smtp divert-to 127.0.0.1 port spamd
pass in on egress proto tcp from <nospamd> to any port smtp
pass in log on egress proto tcp from <spamd-white> to any port smtp
pass out log on egress proto tcp to any port smtp

Please stick to exactly those rules unless you have a very good reason to introduce local variations. If you need some initial content for your nospamd file, you can fetch mine (which is based on sediments of extracted SPF records from various domains over the years) from here or use Aaron Poffenberger's excellent spf_fetch, which aims to automate fetching of SPF records for a list of domains.
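To give an idea of what extracting SPF records into a nospamd file can involve, here is a deliberately naive sketch. It is not how spf_fetch works; the function name is made up, and it only picks out literal ip4: and ip6: networks from SPF TXT records fed to it on standard input, without following include: or redirect= entries.

```shell
# Print the networks named by ip4: and ip6: mechanisms in SPF records
# read on stdin, one per line, ready to append to a pf table file.
# Naive by design: include: and redirect= are not followed.
spf_networks() {
    # Split on spaces, tabs and double quotes, then keep the ip4:/ip6: values.
    tr -s ' "\t' '\n\n\n' | sed -n 's/^ip[46]://p'
}

# Possible use (example.com is a placeholder domain):
#   dig +short TXT example.com | spf_networks >> /etc/mail/nospamd
```

Review the output before adding it to the file, since anything listed in nospamd bypasses spamd entirely.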

Notes

[1]

One prime example was Bob Beck's "ghosts of usenet postings past" based traplist, which rarely contained fewer than 20,000 entries. The number of hosts varied widely and at times was as high as roughly 670,000. At the time of writing (mid November 2010), the list typically contained around 55,000 entries. While still officially in testing, the list was made publicly available on January 30th, 2006. The list, which to my knowledge never produced any false positives and was available from http://www.openbsd.org/spamd/traplist.gz for your spamd.conf, was however retired from service in May 2016 and is no longer available.

[2]

That address is completely bogus. It is probably based on a GNUS message-ID, which in turn was probably lifted from a news spool or some unfortunate malware victim's mailbox.