IO Web Helpdesk - Web Stats


Why read stats?

Before collecting your own web stats, it is best to first understand why you want to read the statistics. The most common reason is advertising: measuring the popularity of one website against another. These may be important things to measure, but web log files are not an accurate way to measure them. A very good explanation of this can be found at http://www.cranfield.ac.uk/docs/stats/ .

Usefulness of web stats

Web stats are most useful for pinpointing key pages and directories within a website and measuring their hits and data transferred against those of parallel pages and directories. With that information, a web author can try rearranging links on a webpage in a more 'directed' fashion.

How to collect web stats

If all you want is hit counts and the amount of traffic to your website, you can check those statistics at http://password.io.com. If you want more detail, please read on.

There are a couple of different ways normal users on io.com can collect their individual web stats (virtual domain customers already have stats produced for them).

Our main web server log files are split each night at midnight and can be found in the directory /home/www/logs/80/YYYY/MM/DD/, where YYYY, MM, and DD are the numerical year, month, and day. Our example user will be called 'derf'.
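For reference, the directory name for a given day can be built with a little date arithmetic. This is just a sketch assuming GNU date's -d option; the path layout is the one described above:

```shell
# Build the log directory path for a given day (GNU date's -d option assumed).
DAY="1998-02-03"
LOGDIR="/home/www/logs/80/$(date -d "$DAY" +%Y/%m/%d)/"
echo "$LOGDIR"
```

Substituting 'yesterday' for the literal date gives the most recent complete day's logs.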

Derf can look at the 1-megabyte report that's made each night to see where his directory stands in the main directory listing, or he can extract his own log file for each day by zgrepping the main log file (from the Unix shell):

zgrep -h -E '(~|%7E)derf' /home/www/logs/80/1998/02/03/combined_log* > ~/access_log
If he wants log files for several days at once, he can use a similar command:
zgrep -h -E '(~|%7E)derf' /home/www/logs/80/1998/02/??/combined_log* > ~/access_log

which would give him his logs for the entire month of February 1998.

OR

zgrep -h -E '(~|%7E)derf' /home/www/logs/80/1998/??/??/combined_log* > ~/access_log

which would give him his logs for all of 1998.
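The (~|%7E) pattern catches both the literal tilde and its URL-encoded form, so requests written either way are matched. Here is a self-contained check of the pattern with made-up paths and log lines (the alternation needs grep's extended regex syntax, hence the -E flag):

```shell
# Self-contained demo of the zgrep pattern; paths and log lines are made up.
mkdir -p /tmp/logdemo/1998/02/03
printf '%s\n' \
  'remote - - [03/Feb/1998:12:00:00 -0600] "GET /~derf/index.html HTTP/1.0" 200 512' \
  'remote - - [03/Feb/1998:12:00:01 -0600] "GET /%7Ederf/pics/me.gif HTTP/1.0" 200 2048' \
  'remote - - [03/Feb/1998:12:00:02 -0600] "GET /~alice/index.html HTTP/1.0" 200 128' \
  | gzip > /tmp/logdemo/1998/02/03/combined_log.gz
# Both the ~derf and %7Ederf requests match; ~alice's request does not.
zgrep -h -E '(~|%7E)derf' /tmp/logdemo/1998/02/03/combined_log* > /tmp/access_log
cat /tmp/access_log
```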

Each day's log file is between 80 and 120 megabytes, so the more log files you zgrep, the longer it will take. Also, the more popular your web pages are, the bigger your resulting log file will be.

What to do with that log file? There are a few hundred tools available on the web for analyzing them. The one IO uses is called Analog. It is rated as the fastest one out there, it's available for almost any platform (Mac, Windows, *nix, etc.), and it's free! The official home page for Analog is http://www.statslab.cam.ac.uk/~sret1/analog/ .
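Once installed, Analog is driven by a configuration file, usually analog.cfg. A minimal sketch follows; the directive names are Analog's, but the file names and title here are placeholders:

```
# Minimal analog.cfg sketch; file names and title are placeholders.
LOGFILE access_log
OUTFILE report.html
HOSTNAME "derf's web pages"
```

With that file in place, running analog produces report.html from the extracted log; see Analog's own documentation for the full directive list.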



Last revised December 6, 1999