Why read stats?

Before making your own web stats, it is best to first understand 'why' you want to read the statistics. The most common reason is for advertising purposes and to measure the popularity of one website versus another. These might be important things to measure, but web log files are not an accurate measure. A very good explanation of this can be found at http://www.cranfield.ac.uk/docs/stats/ .
Usefulness of web stats

Web stats are most useful for pinpointing key pages and directories within a website and measuring the comparative number of hits and amount of data transferred relative to parallel pages and directories. In this way, a web author can try rearranging links on a webpage in a more 'directed' fashion.
How to collect web stats

If all you want is hit counts and overall traffic figures for your website, you can check these statistics at http://password.io.com. If you want to look at more details, please read on.
There are a couple of different ways normal users on io.com can go about getting their individual web stats (virtual domain customers already have stats produced for them).
Our main web server log files are split each night at midnight and can be found in the directory /home/www/logs/80/YYYY/MM/DD/ where YYYY, MM, and DD are the numerical values for year, month, and day of the month. Our example user will be called 'derf'.
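As a quick sketch of how a calendar date maps onto that directory layout, the path for today's logs can be assembled with the standard date(1) format codes (the /home/www/logs/80 prefix is the layout described above):

```shell
#!/bin/sh
# Build today's log directory path from the YYYY/MM/DD layout above.
# %Y, %m, and %d are standard strftime codes: year, month, day of month.
LOGDIR="/home/www/logs/80/$(date +%Y/%m/%d)/"
echo "$LOGDIR"
```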
Derf can look at the 1 megabyte report that's made each night and see where his directory stands in the main directory listing, or he can get his own original log file for each day by zgrepping the main log file (from the Unix shell):
zgrep -h -E '(~|%7E)derf' /home/www/logs/80/1998/02/03/combined_log* > ~/access_log

(Note the -E flag: the pattern uses extended-regex alternation to match both the literal tilde and its URL-encoded form, %7E.) If he wanted a log file for several days at once, he can use a similar solution:

zgrep -h -E '(~|%7E)derf' /home/www/logs/80/1998/02/??/combined_log* > ~/access_log

which would give him his logs for the entire month of February, or:

zgrep -h -E '(~|%7E)derf' /home/www/logs/80/1998/??/??/combined_log* > ~/access_log

which would give him logs for the entire year of 1998.
Each day's log file is between 80 and 120 megabytes, so the more log files you are zgrepping, the longer it will take. Also, the more popular your web pages are, the bigger your own extracted log file will be.
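Since each day's file is read independently, the month-long extraction can also be done one day at a time in a loop, which makes it easy to add progress output or resume after an interruption. The sketch below builds a throwaway copy of the log layout so it is runnable anywhere; on io.com you would point LOGROOT at /home/www/logs/80/1998/02 and skip the setup section:

```shell
#!/bin/sh
# --- Setup: a throwaway stand-in for the real log tree (demo only) ---
LOGROOT=$(mktemp -d)
mkdir -p "$LOGROOT/03"
printf 'host - - [03/Feb/1998] "GET /~derf/index.html" 200 1024\n' \
    > "$LOGROOT/03/combined_log"
gzip "$LOGROOT/03/combined_log"

# --- Extract derf's hits, one day's directory at a time ---
for day in "$LOGROOT"/*/; do
    echo "processing $day" >&2
    zgrep -h -E '(~|%7E)derf' "$day"combined_log* >> "$LOGROOT/access_log"
done
wc -l < "$LOGROOT/access_log"
```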
What to do with that log file? Well, there are a few hundred tools available on the web for analyzing them. The one IO uses is called Analog. It is rated as the fastest one out there, and it's available for almost any platform (Mac, Windows, *nix, etc.). It's also free! The official home page for Analog is
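Once an access_log has been extracted with zgrep, a minimal Analog configuration might look like the following. This is only a sketch: LOGFILE, OUTFILE, and HOSTNAME are standard Analog configuration directives, but the paths and the page title are made-up examples for user derf, and you should check the Analog documentation for the version you install:

```
# analog.cfg -- minimal example; adjust paths for your own account
# Log file produced by the zgrep commands above:
LOGFILE /home/derf/access_log
# Where to write the HTML report (here, inside derf's web space):
OUTFILE /home/derf/public_html/stats.html
# Title shown at the top of the report:
HOSTNAME "Derf's Web Pages"
```

Running analog with that file in place produces a single HTML report you can view in any browser.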