Aquarionics

Logging

Aquarionics' logging system was designed to work against mod_log_sql, a module that, er, logs to an SQL database. This worked until we upgraded to Apache 2, which log_sql didn't support until recently. Since part of the logging system is the bit of AqCom that shows who linked here recently, I'd rather not convert it to run off plain text files (though I may be converting it to use Sqlite at some point), so I created a perl script that feeds the log into the database in log_sql's format. It looks like this:

#!/usr/bin/perl

use DBD::mysql;

#Database options:
$dbUser = "user";
$dbPass = "password";
$dbName = "epistula";

$database = DBI->connect("dbi:mysql:$dbName:localhost:1114", $dbUser, $dbPass);

#204.95.98.252 - - [24/Dec/2003:15:23:38 +0000] "GET /archive/writing/2003/08/ 
	19 HTTP/1.0" 200 11873 "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"

while (<>) {
  my ($client, $identuser, $authuser, $date, $method,
      $url, $protocol, $status, $bytes, $referer,$agent) =

/^(S+) (S+) (S+) [(.*?)] "(S+) (.*?) (S+)" (S+) (S+) "(.*?)" "(.*?)"$/;
  # ...
        #$database->quote($thisdir);
        $q = "insert into apachelogs (remote_host, remote_user, request_time, 
			request_method, request_uri, request_protocol, status, bytes_sent, referer, agent)
        values
        (".$database->quote($client).", ".$database->quote($authuser).", '".$date."', "
			.$database->quote($method).", ".$database->quote($url).", "
			.$database->quote($protocol).", ".$database->quote($status).", "
			.$database->quote($bytes).", ".$database->quote($referer).", "
			.$database->quote($agent).")";

        #print $database->quote($url)."n";
        my $sth = $database->prepare($q);
        $sth->execute();

}

...and is run using this crontab line:

@reboot tail -f /var/log/apache2/www.aquarionics.com | $EPBIN/apache2db.pl &

Now, the important thing to remember is that this gets pretty big pretty quickly, since it logs every line. It's vitally important that you don't under any circumstances, forget that you commented out this crontab line:

@daily echo "delete from apachelogs where time_stamp < `date +%Y%m%d --date '1 month ago'`" | mysql epistula

Because otherwise you'll discover that your daily database dumps start to hit 16Mb each... BZ compressed... 380Mb uncompressed... oh, lets say four months and twelve days later.

For example.

(I ran the above query, or one like it, just before I started this entry. It's just stopped:

mysql> delete from apachelogs where time_stamp < 20040825;
Query OK, 913830 rows affected (21 min 44.87 sec)

Reformatting for the girlymen who don't have 2000px wide displays and are reading the RSS feed. See? This is why I want to only do partial content, because that way when I do something like this it only fucks up in IE


Nicholas 'Aquarion' Avenell is a web developer in London, you can find out more about him or how to get in touch.

There are more Articles, Projects, Journal Entries, Photographs and things that defy description here, too.

If you're looking for something specific, there are Calendar & Category -based lists of everything.

And if you want to follow stuff that appears here, try a Syndication Feed, or the generic Feed of everything.


Aquarion's last Twitter was: [updating]
Twitter last updated


More Journal:

[RSS Icon]
[ESF Icon]
[CDF Icon]

That which is relevant:


Explain Ads
© 2000 to 2008 inclusive Nicholas Avenell
All comments are the property of their creators, published with permission
(Unless otherwise indicated, the opinions and sentiments expressed on this site are those of the author and not of any organisation of which he is an affiliate, including his employer. Caveat Lector, E&OE. sigh)
0.251 seconds, 11 queries, 2.61Mb on Mon, 25 Aug 2008 16:43:03 +0000
Generated by Epistula Version 2.0.3