aqWiki Imported From Epistula PHP

Wiki wiki wild, wiki wiki wild wild web.

About a month ago, I conceived a Project.

Basically, it was when ESR shifted the focus of the Jargon File by changing the “J Random Hacker” entry to reflect his own beliefs more closely. My plan to combat this was – and is – to file the whole Jargon File into a Wiki and let the world at it. Think an entry is biased? Change it.

All I needed to do was to get it out of the Jargon File format and into something that I could import into a wiki. Then I discovered something really, really fundamental.

All wikis suck.

In fact, the particular way all these wikis sucked was twofold. The first was the most important: No existing wiki that I could install on this server could import data from an external source. All the ones that backed onto plain text files I couldn’t – for various reasons – install. All the ones that backed onto a MySQL database had data structures six feet deep that I couldn’t hack my way around.

The second reason all wikis suck is the really, really horribly nasty text formatting that has become standard. ''this is italic'' '''this is bold''' is a little too baroque, verbose, and nasty for my liking.

This weekend, my project was to play around with PEAR (the PHP equivalent of CPAN, crossed with apt-get. It rocks), for which I needed a project. Plus, Dean Allen has just released Textile 2 beta, the best text-formatting library for PHP (and now Perl) bar none. Aha, I thought. This will solve several problems.

So, this afternoon about 16:00 I started coding my own Wikilike, and now at 1am, I’ve finished the first cut. It uses textile for formatting, it does Wikilike things, and it backs on to the Aquarionics User System (currently only used for Forever, so if you had an account on Forever at about 8pm this evening – when I copied the database locally – you have an account on the Wiki. I haven’t gotten around to writing an account creation system for the wiki locally yet, so you’ll have to be anonymous if you don’t).

The current Wiki is up on my local server, and it inherently supports multiple wikis per server, but I’ve still got to put in the really cool bits, like the XML-RPC interface, the ability for admins to lock pages, and stuff.

And the name? Well, it was done quickly (A Qwiki), it’s mine (Aq Wiki) and it’s slightly sick (Aqw Iki)

But it’s there, it’s working, and since I’m working in the morning, I’d better head to bed…

epistula Imported From Epistula PHP

Gzip Compression with PHP

Mark re-recommended compressing pages today, which is always a useful idea. Sneaky (The server Aqcom wallows on) has mod_gzip installed, which automagically compresses all static pages before sending them. Great, but not so cool for the dynamic PHP pages that the rest of the site works on.

My solution to this was to kill two birds with one stone. The first is that since Aquarionics caches every page the first time it’s loaded, the cache directory can get quite large – especially when something like Google comes in and looks at (and therefore generates) all the pages on the site. First thing was then to modify my cache generation system as follows:

The cache-save system – which writes the contents of the output buffer to a file – just used @gzopen@ and @gzwrite@ instead of the standard PHP file-write code, and the cache-load decompressed it.
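As a minimal sketch of that swap (the `$cachefile` and `$buffer` names here are hypothetical stand-ins for Epistula’s own cache path and output-buffer contents):

```php
<?php
// Hypothetical stand-ins for Epistula's cache path and output buffer.
$cachefile = "/tmp/page.cache";
$buffer    = "<html>...rendered page...</html>";

// Cache save: gzwrite() instead of fwrite(), so the file on disk is gzipped.
$zp = gzopen($cachefile, "w9");   // "9" = maximum compression
gzwrite($zp, $buffer);
gzclose($zp);

// Cache load: gzread() transparently decompresses on the way back out.
$zp = gzopen($cachefile, "r");
$contents = gzread($zp, 1000000);
gzclose($zp);
```

The nice property is that the file on disk is a perfectly ordinary gzip stream, which is what makes the send-it-raw trick below possible.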

Next, testing for compression and sending if possible. The first part was simple: PHP gives you an array, $HTTP_SERVER_VARS, containing all the stuff the client sends, so I just needed to check the HTTP_ACCEPT_ENCODING variable for the string “gzip”:

if (strstr($HTTP_SERVER_VARS["HTTP_ACCEPT_ENCODING"], "gzip")) {
    $compress = true;
} else {
    $compress = false;
}

and then the complicated bit, sending the right version. If compression was on, all I needed to do was send the right headers and pass through the contents of the cache. If not, I would decompress the cache and pass that. Note that while there is a function to directly output the contents of a gzip file (gzpassthru), I tend to avoid it because I want to send a Content-Length header too, and I can’t get that if I only know the length after the output. gzpassthru returns the number of bytes out, but header("Content-Length: ".gzpassthru($cachename)) was abandoned for readability, though it’s perfectly valid. Anyway, the code:

header("ETag: \"$etag\"");
if ($etagval == $etag || $HTTP_IF_MODIFIED_SINCE == $gmt_mtime) {
    header("HTTP/1.1 304 Not Modified");
} elseif ($compress) {
    header("X-Compression: gzip");
    header("Content-Encoding: gzip");
    header("Content-Length: ".filesize($_EP['cachedir']."/".$cachename));
    readfile($_EP['cachedir']."/".$cachename) or die("Couldn't open cache");
} else {
    header("X-Compression: None");
    $zd = gzopen($_EP['cachedir']."/".$cachename, "r") or die("Couldn't open cache");
    $contents = gzread($zd, 1000000);
    header("Content-Length: ".strlen($contents));
    echo $contents;
    gzclose($zd);
}

(I’ve left the ETag code in for context.) So now all pages should be gzipped any time they’re accessed after the first (the first time, it generates the content, echoes it, then compresses it and writes it to the cache). The full code for all this is at the top and bottom of Epistula.php
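That first-hit path can be sketched like so (names hypothetical – Epistula’s real cache path lives under $_EP['cachedir']): buffer the rendered page, flush it plain to the first visitor, then gzip it into the cache for everyone after.

```php
<?php
// Hypothetical cache path; Epistula's real one lives under $_EP['cachedir'].
$cachefile = "/tmp/first-hit.cache";

ob_start();
echo "<html>...page rendered here...</html>";  // stand-in for page generation
$page = ob_get_contents();
ob_end_flush();                   // first visitor gets the uncompressed page

// ...then the same output is compressed into the cache for later hits.
$zp = gzopen($cachefile, "w9");
gzwrite($zp, $page);
gzclose($zp);
```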

Imported From Epistula linux PHP

Everyone loves Spam Statistics

This is a graph of my spam-count:

A graph of aquarion's spam

This is how I created my spam-count. It’s a combination of spamassassin, procmail, shell-script and PHP, and therefore full of Stuff Wot No Man Was To Wot Of. Or something.

First, I run SpamAssassin, which just generally rocks. All my email comes into one mailbox, aq, which is at gkhs dot net. That includes every mailing list, every aquarionics dot com address, all my suespammers mail, and my hotmail account (via the really quite nice program gotmail). It then gets fed through procmail, which uses various script-fu to deposit mailing-lists into newsgroups on the local news server (I prefer reading discussion via usenet), and everything else gets sent through SpamAssassin like this:

:0fw
| /usr/local/bin/spamassassin

:0:
* ^X-Spam-Status: YES
$MAILDIR/spam/`date +%Y-%m-%d`

meaning that everything that SA thinks is spam gets forwarded to a mailbox within my spam folder with the name as the date. Most solutions I saw for this kind of statistics generation put all the mail into one box and then grab the date from it. For the way I’m doing it, that’s a waste of processing, plus it ignores Rule One: Spammers Lie. The date on the spam usually has no relation to the date you got it.

So, we now have a box called – for example – 2003-06-09 containing today’s Spam (on the second day of every month, a cron-job wraps all the last month’s spam into a tarball and dumps it somewhere to rot). Every morning at 1:12am, the following runs:

DATE=`date +%Y-%m-%d`
SPAMTODAY=`from -f ~/Mail/spam/$DATE | wc -l`

echo $DATE, $SPAMTODAY >> ~/logs/spam.log

(from is a program that displays the From: header of every mail in a mailbox; in this case it generates one line per mail, which is what we want.) That gives us a file like this:

2003-06-02, 328
2003-06-03, 134
2003-06-04, 130
2003-06-05, 152
2003-06-06, 125
2003-06-07, 123
2003-06-08, 267

Which we can do whatever we want with. In this case, a PHP script (or you can view that nicely formatted) which generates a graph.
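The graphing script itself is linked above rather than reproduced; a minimal sketch of its parsing half might look like this, with the sample log lines inlined instead of read from ~/logs/spam.log:

```php
<?php
// Sample lines from spam.log, inlined here; the real script reads the file.
$log = "2003-06-02, 328\n2003-06-03, 134\n2003-06-04, 130";

$counts = array();
foreach (explode("\n", $log) as $line) {
    list($date, $count) = explode(", ", $line);
    $counts[$date] = (int)$count;
}

// A graphing library (GD, say) would scale each count to a bar height;
// print a crude ASCII version instead.
foreach ($counts as $date => $count) {
    printf("%s %s\n", $date, str_repeat("#", (int)($count / 20)));
}
```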

And that’s how I know I get about 100 to 300 pieces of spam every day.

Best yet, that’s the only way I know I get 100 to 300 pieces of spam a day 🙂

Imported From Epistula PHP Projects


MusicDB

New software release for all y’all, and it’s not even a weblog-related thing. Be afraid.

It says on my About page that I “write things to put things into databases and take them out again,” and MusicDB is no exception. Its entire purpose in life is to put things (in this case, references to MP3 files) into a MySQL database, and then take them out again according to the criteria you specify via the command-line or web-based client.

It’s in Perl, and SQL, and PHP. It runs my somewhat excessive MP3 playlists, and it’s reached 1.0 and been thrown into the universe with nothing but a GPL to its name. Go have fun.