This is a graph of my spam-count:

[Image: a graph of aquarion's spam]

This is how I created my spam-count. It’s a combination of spamassassin, procmail, shell-script and PHP, and therefore full of Stuff Wot No Man Was To Wot Of. Or something.

First, I run spamassassin, which just generally rocks. All my email comes into one mailbox, aq which is at gkhs dot net. That includes every mailing list, every aquarionics dot com address, all my suespammers mail, and my hotmail account (via the really quite nice program, gotmail). It then gets fed through procmail, which uses various script-fu to deposit mailing lists into newsgroups on the local news server (I prefer reading discussion via usenet), and everything else gets sent through spamassassin like this:

:0fw:/home/aquarion/logs/sa.lock
| /usr/local/bin/spamassassin

:0:
* ^X-Spam-Status: YES
$MAILDIR/spam/`date +%Y-%m-%d`

meaning that everything SA thinks is spam gets delivered to a mailbox within my spam folder, named after the date it arrived. Most solutions I saw for this kind of statistics generation put all the mail into one box and then grab the date from each message. For the way I’m doing it, that’s a waste of processing, plus it ignores Rule One: Spammers Lie. The date on the spam usually has no relation to the date you got it.
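To make that second recipe a bit more concrete, here is a rough shell sketch of the same decision made outside procmail. It is not part of my setup: `spam_mailbox` is a made-up name, and the default `$HOME/Mail` directory is an assumption. It reads one message on stdin and, if the headers say spam, prints the mailbox the message would land in.

```shell
#!/bin/sh
# Hypothetical helper (not from the real setup): mimics the decision the
# second procmail recipe makes. Reads one message on stdin; prints the dated
# spam mailbox path if the headers carry "X-Spam-Status: YES", else nothing.
spam_mailbox() {
    maildir=${1:-$HOME/Mail}   # assumed default; pass your own as $1
    # Only look at the headers (everything up to the first blank line) and,
    # like procmail, match case-insensitively.
    if sed '/^$/q' | grep -i '^X-Spam-Status: YES' >/dev/null; then
        # Name the mailbox after today's date, not the message's Date: header.
        echo "$maildir/spam/$(date +%Y-%m-%d)"
    fi
}
```

Something like `spam_mailbox ~/Mail < some-message.txt` should print a path ending in today's date for a message SA has flagged, and nothing otherwise.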

So, we now have a box called – for example – 2003-06-09 containing today’s spam. (On the second day of every month, a cron-job wraps all of last month’s spam into a tarball and dumps it somewhere to rot.) Every morning at 1:12am, the following runs:

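#!/bin/sh
# Name of today's spam mailbox (one mbox file per day)
DATE=`date +%Y-%m-%d`
# One line of output per message, so the line count is the spam count
SPAMTODAY=`from -f ~/Mail/spam/$DATE | wc -l`

echo $DATE, $SPAMTODAY >> ~/logs/spam.log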

(from is a program that displays the From: header of every mail in a mailbox. In this case, it generates one line per mail, which is what we want.) That gives us a file like this:

--snip--
2003-06-02, 328
2003-06-03, 134
2003-06-04, 130
2003-06-05, 152
2003-06-06, 125
2003-06-07, 123
2003-06-08, 267
--snip--

We can do whatever we want with that file. In this case, a PHP script (or you can view that nicely formatted) generates a graph from it.
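The PHP isn't reproduced here, but the idea is simple enough to sketch in shell. This is a crude stand-in, not the real script: `spam_bars` is a made-up name, and the scale of one `#` per ten spams is an arbitrary choice. It assumes the log is in the "date, count" format shown above.

```shell
#!/bin/sh
# Hypothetical stand-in for the graphing step (not the real PHP script):
# print one text bar per logged day, one '#' per 10 spams.
spam_bars() {
    # Fields are separated by a comma plus optional spaces: "2003-06-02, 328"
    awk -F', *' 'NF == 2 {
        bar = ""
        for (i = 0; i < int($2 / 10); i++) bar = bar "#"
        printf "%s %4d %s\n", $1, $2, bar
    }' "$1"
}
```

Run it as `spam_bars ~/logs/spam.log` and you get one row per day, which is ugly but enough to spot the bad days.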

And that’s how I know I get about 100 to 300 pieces of spam every day.

Best yet, that’s the only way I know I get 100 -> 300 pieces of spam a day 🙂
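If you'd rather have the machine check that range than eyeball a graph, something along these lines should do it. Again this is only a sketch: `spam_stats` is a made-up name, and it assumes the same "date, count" log format as above.

```shell
#!/bin/sh
# Hypothetical helper: summarise the spam log as days, min, max and mean.
spam_stats() {
    awk -F', *' 'NF == 2 {
        v = $2 + 0              # force a numeric comparison
        n++; sum += v
        if (n == 1 || v < min) min = v
        if (n == 1 || v > max) max = v
    }
    END {
        # Print nothing at all for an empty log rather than divide by zero.
        if (n) printf "days=%d min=%d max=%d mean=%.1f\n", n, min, max, sum / n
    }' "$1"
}
```

Call it as `spam_stats ~/logs/spam.log`.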