The Importance of Analyzing your Logs
Posted by kevin Tue, 07 Jun 2005 04:19:17 GMT
You might have noticed that I run a big poetry website called allpoetry.com. For the first few years I ran a nightly analysis of the logfiles, then an hourly one as the site got bigger, but it kept slowing things down.
And I never looked at it. So I disabled it, and I just clean up the log by hand every few months once it reaches a few gigabytes.
The important thing with statistics is figuring out what questions you want to be answering. To me that’s one of the big problems with research sometimes – rummaging through answers to questions we don’t have, and might not ever.
Anyways, I’m playing around with me-driven logfile analysis of the last 100,000 rows (just using ‘tail’ whenever I want to look at the stats). It’s not quite as useful as historical stats, but it’s faster.
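If you would rather stay inside a script than pipe from ‘tail’, here is a rough Python sketch of the same idea. The function name last_lines and the default of 100,000 are just my placeholders. Note that it still reads through the whole file once; it only bounds how many lines are kept in memory.

```python
from collections import deque


def last_lines(path, n=100_000):
    """Return the last n lines of a text file as a list.

    Hypothetical helper: a deque with maxlen=n discards older lines
    as newer ones arrive, so memory use stays bounded by n lines.
    """
    # errors="replace" keeps odd bytes in a logfile from raising
    with open(path, errors="replace") as f:
        return list(deque(f, maxlen=n))
```

For a logfile of a few gigabytes, plain ‘tail’ is likely quicker since it can seek from the end of the file; this version is mostly handy when the rest of the analysis is already in Python.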
Another big problem was that, since it’s a dynamic site, many pages are unique. I really want a tool that will ‘stem’ my pages, so /poem/14441?reply=yes will just be /poem. It’s more useful to see that there were 20,000 hits on poems than 20 hits on poem number 14441.
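Here is a minimal sketch of what that kind of stemming could look like in Python. It is not an existing tool, and it rests on a couple of assumptions: the names stem, count_stems and REQUEST_RE are made up for illustration, "stemming" here just means keeping the first path segment, and the regex assumes each log line contains a quoted request such as "GET /poem/14441?reply=yes HTTP/1.1" (adjust it to whatever your server actually writes).

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed log layout: the request appears as "METHOD /path HTTP/x.y"
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)')


def stem(url):
    """Reduce a URL to its first path segment, dropping the query string."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    return "/" + segments[0] if segments else "/"


def count_stems(log_lines):
    """Count hits per stemmed page; lines without a request are skipped."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[stem(match.group(1))] += 1
    return counts
```

Calling count_stems(lines).most_common(10) on a list of log lines would then give the ten busiest sections of the site rather than the ten busiest individual pages.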
I’m sure others have confronted this problem before; if you have any ideas, let me know!
