<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" 
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
  xmlns:admin="http://webns.net/mvcb/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
<channel>
<title>Aquarionics - Category - Perl</title>
<link>http://www.aquarionics.com/category/Perl</link>
<description></description>
<dc:language>en-gb</dc:language>
<dc:creator>Aquarion (nicholas@aquarionics.com)</dc:creator>
<dc:rights>Copyright 2008 Aquarion</dc:rights>
<dc:date>2008-10-01T19:25:52+00:00</dc:date>
<admin:generatorAgent rdf:resource="http://www.aquarionics.com/epistula/?v=2.0.3" />
<admin:errorReportsTo rdf:resource="mailto:nicholas@aquarionics.com"/>
<sy:updatePeriod>daily</sy:updatePeriod>
<sy:updateFrequency>8</sy:updateFrequency>
<sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
<item>
	<title>Logging</title>
	<link>http://www.aquarionics.com/journal/2004/09/25/Logging</link>
	<comments>http://www.aquarionics.com/journal/2004/09/25/Logging</comments>
	<description>In which Aquarion fixes the logging system. Beware, contains lines of a dangerous width.</description>
	<guid isPermaLink="true">http://www.aquarionics.com/journal/2004/09/25/Logging</guid>
	<content:encoded><![CDATA[<p>Aquarionics' logging system was designed to work against <a href="http://www.outoforder.cc/projects/apache/mod_log_sql/">mod_log_sql</a>, a module that, er, logs to an SQL database. This worked until we upgraded to Apache 2, which log_sql didn't support until recently. Since part of the logging system is the bit of AqCom that shows who linked here recently, I'd rather not convert it to run off plain text files (though I may be converting it to use <a href="http://www.sqlite.org/">SQLite</a> at some point), so I created a Perl script that feeds the log into the database in log_sql's format. It looks like this:</p>

<pre><code lang="perl">#!/usr/bin/perl

# DBI loads the DBD::mysql driver named in the connect string
use DBI;

#Database options:
$dbUser = "user";
$dbPass = "password";
$dbName = "epistula";

$database = DBI->connect("dbi:mysql:$dbName:localhost:1114", $dbUser, $dbPass);

# Sample log line (wrapped for width):
#204.95.98.252 - - [24/Dec/2003:15:23:38 +0000] "GET /archive/writing/2003/08/
#	19 HTTP/1.0" 200 11873 "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"

while (&lt;&gt;) {
  # Split the combined-format log line into its fields
  my ($client, $identuser, $authuser, $date, $method,
      $url, $protocol, $status, $bytes, $referer,$agent) =
    /^(\S+) (\S+) (\S+) \[(.*?)\] "(\S+) (.*?) (\S+)" (\S+) (\S+) "(.*?)" "(.*?)"$/;
  # ...
        #$database->quote($thisdir);
        $q = "insert into apachelogs (remote_host, remote_user, request_time, 
			request_method, request_uri, request_protocol, status, bytes_sent, referer, agent)
        values
        (".$database->quote($client).", ".$database->quote($authuser).", '".$date."', "
			.$database->quote($method).", ".$database->quote($url).", "
			.$database->quote($protocol).", ".$database->quote($status).", "
			.$database->quote($bytes).", ".$database->quote($referer).", "
			.$database->quote($agent).")";

        #print $database->quote($url)."\n";
        my $sth = $database->prepare($q);
        $sth->execute();

}</code></pre>

<p>...and is run using this crontab line:</p>

<pre><code lang="shell">@reboot tail -f /var/log/apache2/www.aquarionics.com | $EPBIN/apache2db.pl &amp;</code></pre>
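
<p>Going back to the script for a moment: a minimal sketch of the same insert done with DBI placeholders instead of string concatenation, so the driver escapes every value, the date included. The names <code>parse_log_line</code>, <code>insert_log_line</code> and <code>$INSERT_SQL</code> are made up for illustration and are not part of the script above, and it assumes the same <code>apachelogs</code> columns. The sample line is the one from the script's comment, unwrapped.</p>

<pre><code lang="perl">#!/usr/bin/perl
use strict;
use warnings;

# One placeholder per column, in the same order as the script above.
my $INSERT_SQL = 'insert into apachelogs (remote_host, remote_user, request_time, '
    . 'request_method, request_uri, request_protocol, status, bytes_sent, referer, agent) '
    . 'values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)';

# Split one combined-format log line into its eleven fields.
# Called in list context; returns an empty list if the line does not match.
sub parse_log_line {
    my ($line) = @_;
    chomp $line;
    return $line =~
        /^(\S+) (\S+) (\S+) \[(.*?)\] "(\S+) (.*?) (\S+)" (\S+) (\S+) "(.*?)" "(.*?)"$/;
}

# Hypothetical helper: run one parsed line through a prepared statement.
# $sth is assumed to be the result of prepare($INSERT_SQL) on a DBI handle.
# Returns 0 for lines that do not parse, otherwise whatever execute returns.
sub insert_log_line {
    my ($sth, $line) = @_;
    my ($client, $identuser, $authuser, $date, $method,
        $url, $protocol, $status, $bytes, $referer, $agent) = parse_log_line($line)
        or return 0;
    return $sth->execute($client, $authuser, $date, $method, $url,
                         $protocol, $status, $bytes, $referer, $agent);
}

# Parse the sample line without touching a database.
my $sample = '204.95.98.252 - - [24/Dec/2003:15:23:38 +0000] "GET /archive/writing/2003/08/19 HTTP/1.0" 200 11873 "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"';
my @fields = parse_log_line($sample);
print scalar(@fields), " fields, url $fields[5]\n";</code></pre>

<p>With a real handle you would call <code>$database->prepare($INSERT_SQL)</code> once before the loop and pass the resulting statement handle to <code>insert_log_line</code> for each line read, which also avoids re-preparing the query for every request.</p>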

<p>Now, the important thing to remember is that this gets pretty big pretty quickly, since it logs every line. It's vitally important that you don't, under any circumstances, forget that you commented out <em>this</em> crontab line:</p>

<pre><code lang="shell">@daily echo "delete from apachelogs where time_stamp &lt; `date +%Y%m%d --date '1 month ago'`" | mysql epistula</code></pre>

<p>Because otherwise you'll discover that your daily database dumps start to hit 16MB each... BZ compressed... 380MB uncompressed... oh, let's say four months and twelve days later.</p>

<p>For example.</p>

<p>I ran the above query, or one like it, just before I started this entry. It's just finished:</p>

<pre>mysql&gt; delete from apachelogs where time_stamp &lt; 20040825;
Query OK, 913830 rows affected (21 min 44.87 sec)</pre>
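
<p>The cutoff in that query is just a YYYYMMDD number for "one month ago". If the cleanup ever moved into the Perl script, the number could be worked out there too. A minimal sketch, with <code>is_leap_year</code> and <code>month_ago_cutoff</code> as hypothetical helpers; it clamps the day to the end of a shorter month, which is an assumption of this sketch and not what GNU <code>date</code> does (that rolls over into the following month).</p>

<pre><code lang="perl">#!/usr/bin/perl
use strict;
use warnings;

# Days in each month, January first; February is adjusted for leap years below.
my @DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

# Gregorian leap year rule.
sub is_leap_year {
    my ($year) = @_;
    return 0 if $year % 4;
    return 1 if $year % 100;
    return $year % 400 ? 0 : 1;
}

# Return the YYYYMMDD string for the same day one month before the given date,
# clamping the day when the earlier month is shorter.
sub month_ago_cutoff {
    my ($year, $month, $day) = @_;
    $month--;
    if ($month == 0) {
        $month = 12;
        $year--;
    }
    my $last = $DAYS_IN_MONTH[$month - 1];
    $last = 29 if $month == 2 and is_leap_year($year);
    $day = $last if $day > $last;
    return sprintf('%04d%02d%02d', $year, $month, $day);
}

# The day this entry was posted; prints 20040825, the value in the query above.
print month_ago_cutoff(2004, 9, 25), "\n";</code></pre>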

<p><ins datetime="2004-09-25T16:49:00+0000">Reformatting for the girlymen who don't have 2000px wide displays and are reading the RSS feed. See? This is why I want to only do partial content, because that way when I do something like this it only fucks up in IE</ins></p>]]></content:encoded>
	<dc:date>2004-09-25T12:32:13+00:00</dc:date>
	<dc:subject>aqcom</dc:subject>
	<dc:subject>Perl</dc:subject>
	<dc:subject>programming</dc:subject>
	<slash:comments>0</slash:comments>
	<slash:section>journal</slash:section>
	<trackback:ping>http://www.aquarionics.com/trackback/journal/1515</trackback:ping>
</item>
<item>
	<title>Geek at Christmas</title>
	<link>http://www.aquarionics.com/journal/2003/12/24/Geek_at_Christmas</link>
	<comments>http://www.aquarionics.com/journal/2003/12/24/Geek_at_Christmas</comments>
	<description>So, as is traditional I spend my Christmas holidays playing with epistula. Now I have referer tracking working again.

	The problem with referer tracking is extracting the data from log files. When the server had mod_log_sql it was easy (I have an entire log stats suite built for mod_log_sql), but since log_sql doesn&#8217;t support Apache 2 yet (A patch to make it do so was released...</description>
	<guid isPermaLink="true">http://www.aquarionics.com/journal/2003/12/24/Geek_at_Christmas</guid>
	<content:encoded><![CDATA[<p>So, as is <a href="http://www.aquarionics.com/journal/2001/12/21/Klide_1.6">traditional</a> I spend my Christmas holidays playing with epistula. Now I have referer tracking working again.</p>

	<p>The problem with referer tracking is extracting the data from log files. When the server had <a href="http://www.grubbybaby.com/mod_log_sql/">mod_log_sql</a> it was easy (I have an entire log stats suite built for mod_log_sql), but since log_sql doesn&#8217;t support Apache 2 yet (A patch to make it do so was released yesterday. It remains untested) I had to brush off my extremely limited Perl skillz to create this, a Perl program to send Apache logs to MySQL:</p>

<pre><code lang="perl">#!/usr/bin/perl
# DBI loads the DBD::mysql driver named in the connect string
use DBI;

#Database options:
$dbUser = "username";
$dbPass = "password";
$dbName = "database";

$database = DBI-&gt;connect(
    "dbi:mysql:$dbName:localhost:1114",
    $dbUser, $dbPass
);

while (&lt;&gt;) {
  # Split the combined-format log line into its fields
  my ($client, $identuser, $authuser, $date, $method,
      $url, $protocol, $status, $bytes, $referer,$agent) =
    /^(\S+) (\S+) (\S+) \[(.*?)\] "(\S+) (.*?) (\S+)" (\S+) (\S+) "(.*?)" "(.*?)"$/;

        $q = "insert into apachelogs
        (remote_host, remote_user, request_time, request_method,
        request_uri, request_protocol, status, bytes_sent, referer, agent)
        values
        (".$database-&gt;quote($client).", ".$database-&gt;quote($authuser).", '"
        .$date."', ".$database-&gt;quote($method).", ".$database-&gt;quote($url)
        .", ".$database-&gt;quote($protocol).", ".$database-&gt;quote($status)
        .", ".$database-&gt;quote($bytes).", ".$database-&gt;quote($referer)
        .", ".$database-&gt;quote($agent).")";

        my $sth = $database-&gt;prepare($q);
        $sth-&gt;execute();

}</code></pre>]]></content:encoded>
	<dc:date>2003-12-24T17:55:57+00:00</dc:date>
	<dc:subject>aqcom</dc:subject>
	<dc:subject>Perl</dc:subject>
	<dc:subject>Christmas</dc:subject>
	<slash:comments>1</slash:comments>
	<slash:section>journal</slash:section>
	<trackback:ping>http://www.aquarionics.com/trackback/journal/1267</trackback:ping>
</item>
</channel>
</rss>