The Lowdown on Log Analysis
Reprinted from The Triangle Technical Journal
So you’ve invested in a corporate web site. Now what? Many companies treat their web sites like the OSHA posters they are required to display: everyone has one, so there’s ours. That is a bad move. Web sites at the corporate level can be significant investments and should be treated as such. To get in the right mindset, organizations should think of their web sites in the same manner as physical expansions; web sites are essentially virtual expansions. Their performance should therefore be evaluated to determine effectiveness, impact, and return on investment. “How?” you ask. “Server log analysis,” I reply.
Just like footprints in the sand, each of us leaves cybertracks when we surf the Web. With every click of the mouse and site visited, our virtual “footprints” reveal information that can be very useful to an organization. If you’re interested in knowing: “How many hits is my web site getting?” “How many visitors are going to my site?” “What pages are people looking at?” “Where are my users coming from?” “Are any of my links broken?” “What browsers/platforms are being used?” “How do I go about measuring the usage traffic on my site?” and “What information can I get?”, then server log analysis is for you.
Server log files are basically records of web server activity. They provide details about file requests to a web server and the server’s response to those requests. Each line of the main log file (aka the access log) describes the source of a request, the file requested, the date and time of the request, the content type and length of the transferred file, and other data such as errors and the identity of referring pages.
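To make that concrete, here is a minimal sketch of pulling the fields out of a single access log line. It assumes an Apache-style "combined" log layout, which this article does not specify, and the sample line, the `LOG_PATTERN` name, and the `parse_line` helper are all made up for illustration. Your server's format may well differ, so treat the pattern as a starting point rather than a standard.

```python
import re
from datetime import datetime

# Assumed layout: Apache-style "combined" log format. The article does not
# name a format, so adjust the pattern to match your server's configuration.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the text fields that are really dates or numbers.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# Made-up sample line, for illustration only.
sample = (
    '192.0.2.10 - - [10/Oct/2005:13:55:36 -0400] '
    '"GET /products/index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I)"'
)
entry = parse_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["size"], entry["referrer"])
```

Returning `None` for lines that do not match keeps the sketch tolerant of the odd malformed entry, which real log files tend to contain.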
As you can see, this can be an overwhelming amount of information (some useful and some not so useful), so it will be important to select tools that capture and summarize data in a format that is meaningful to your organization. Though data collected in log files can vary from one server to another, the ideal log file for web analytics contains the following:
- Who is visiting your site. You want unique visitor identification so that you know whether a visitor is new or returning.
- The path visitors take through your pages. With insight into each page a visitor viewed and the order in which those pages were visited, you can identify trends in how visitors navigate through your site. You also want to know what element (link, icon) a visitor clicked on to navigate between the pages of your site.
- How much time visitors spend on each page. A pattern of an extended period of time being spent on a single page may indicate that the page is either very interesting or very confusing.
- Where visitors are leaving your site. The exit page for a visitor’s session can be evaluated to determine whether it was a logical place to end the visit, or whether the visitor bailed out due to usability issues or a mere lack of interest.
- The success of users’ experiences at your site. Completed purchases, successful downloads, and viewed information are concrete indicators of tasks accomplished.
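As a rough illustration of how the items above (paths, time on page, exit pages) might be derived from parsed log entries, here is a hedged sketch. It makes two assumptions that are not from this article: a visitor is identified by a single string such as an IP address, which is only an approximation of a unique visitor, and a gap of more than 30 minutes between requests starts a new session. The function names and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a gap of more than 30 minutes between requests starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)

def build_sessions(requests):
    """Group (visitor, timestamp, page) tuples into per-visitor sessions."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in requests:
        by_visitor[visitor].append((timestamp, page))

    sessions = []
    for visitor, hits in by_visitor.items():
        hits.sort()  # order each visitor's requests by time
        current = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - current[-1][0] > SESSION_TIMEOUT:
                sessions.append((visitor, current))
                current = []
            current.append(hit)
        sessions.append((visitor, current))
    return sessions

def summarize(session):
    """Return the page path, seconds spent per page, and exit page for one session."""
    visitor, hits = session
    path = [page for _, page in hits]
    # Time on a page is the gap until the next request; the time spent on
    # the last page cannot be known from the log alone.
    time_on_page = [
        (hits[i][1], (hits[i + 1][0] - hits[i][0]).total_seconds())
        for i in range(len(hits) - 1)
    ]
    return {
        "visitor": visitor,
        "path": path,
        "time_on_page": time_on_page,
        "exit_page": path[-1],
    }

# Made-up requests, for illustration only.
t0 = datetime(2005, 10, 10, 13, 0, 0)
requests = [
    ("192.0.2.10", t0, "/index.html"),
    ("192.0.2.10", t0 + timedelta(seconds=40), "/products.html"),
    ("192.0.2.10", t0 + timedelta(seconds=130), "/order.html"),
    ("192.0.2.10", t0 + timedelta(hours=2), "/index.html"),
    ("198.51.100.7", t0 + timedelta(seconds=5), "/index.html"),
]
sessions = build_sessions(requests)
for session in sessions:
    print(summarize(session))
```

Note the limitation in the comment: the log records when a page was requested, not when the visitor stopped reading it, so the exit page never gets a duration.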
Basically, the goal of data collection is to have enough information to reconstruct the entire session of the user’s visit to your web site. Armed with such information, an organization is poised to leverage that insight to not only optimize its web site for increased customer loyalty and business benefits, but also to assist with the following:
- Planning Web infrastructure capacity to handle future growth;
- Understanding qualities of new and repeat visitors;
- Targeting marketing offers and campaigns to specific categories of visitors;
- Determining a budget that is appropriate for future web site improvements and online advertising campaigns;
- Identifying potential partnerships based on generated referral traffic and realized profits;
- Modifying investments in, altering the navigation to, or editing the content in web pages based on user patterns.
Now, the bad news: log files are not fun to look at. In fact, they’re ugly and overwhelming. But there is good news: these beasts can be tamed. There are some excellent software solutions out there that can summarize and present raw log data in a format that is useful for you, and even better news…some are free! These tools have many similarities, but they vary in features and price. So before evaluating the options available, it is important to determine the type of data you wish to see. Log analysis programs can display everything from a list of IP addresses that connected to the Web server to a pie chart detailing which files were accessed most often. Also, if your server produces an unusual or proprietary log format, you’ll need to check whether your desired software package can understand the flavor of log file you want to analyze.
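To give a feel for what such a tool does under the hood, here is a small sketch that boils a handful of already-parsed log entries down to a few headline numbers: total hits, distinct hosts, most-requested files, and broken links. It is not how any particular product works. The entry layout (`host`, `path`, `status`), the `summarize_traffic` name, and the sample data are assumptions made for illustration, and counting distinct hosts only approximates counting visitors.

```python
from collections import Counter

def summarize_traffic(entries, top_n=3):
    """Summarize parsed log entries (dicts with host, path, and status keys)."""
    # Count successful requests per file, and "not found" responses separately.
    pages = Counter(e["path"] for e in entries if e["status"] == 200)
    broken = Counter(e["path"] for e in entries if e["status"] == 404)
    return {
        "hits": len(entries),
        "unique_hosts": len({e["host"] for e in entries}),
        "top_pages": pages.most_common(top_n),
        "broken_links": sorted(broken),
    }

# Made-up entries, for illustration only.
entries = [
    {"host": "192.0.2.10", "path": "/index.html", "status": 200},
    {"host": "192.0.2.10", "path": "/products.html", "status": 200},
    {"host": "198.51.100.7", "path": "/index.html", "status": 200},
    {"host": "198.51.100.7", "path": "/old-page.html", "status": 404},
    {"host": "203.0.113.5", "path": "/index.html", "status": 200},
    {"host": "203.0.113.5", "path": "/products.html", "status": 200},
    {"host": "203.0.113.5", "path": "/contact.html", "status": 200},
]
report = summarize_traffic(entries)
for key, value in report.items():
    print(key, value)
```

A real analyzer layers charts, date ranges, and many more reports on top, but the core idea is the same kind of counting.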
Almost every log file analyzer runs locally on your computer, but some software companies have moved to an application service provider (ASP) model, in which the log file report generators are hosted by the vendor. These options are “lightweight” yet powerful, and they can be accessed from any computer with Web access, regardless of whether it’s directly connected to your server. Though their functionality is limited and security issues are always a concern, a hosted solution is often less costly than purchasing a large software application.
If you find yourself ready to dive into web analytics, remember to work closely with your technical staff, who have expertise in the software and configuration choices used in producing the log file. You may also want to seek advice from, or involve in your project, someone who has actually performed server log analysis. As in any dynamic technical field, spending a few hours with people who have pioneered the technique can radically reduce the low-productivity weeks you’d spend toiling on your own, and accelerate the useful results that are to come from your log file analysis. But to give you a head start, some of the more commonly deployed log analysis tools include:
- Webalizer – [Open Source, Free of Charge]: It is one of the most popular log analysis tools. Webalizer is a C-based analysis tool that has to be run from the command line. The graphics are not optimal, but the reports are sufficient for providing a quick glimpse of a few important data points; namely “What pages are being accessed?” and “How many hits are we getting?”
- Analog – [Open Source, Free of Charge]: It is a C-based analysis tool that can be run from a web page or the command line. Analog attempts to present everything, but it is an example of how to include too much information for normal human consumption. By default, everything is presented on the same Web page. A navigation bar at the top (Analog’s saving grace) allows users to click on a specific report, which drills down to another section of the page. Analog’s more interesting reports include listings of: how many hits come from each country, search engine queries that brought users to the site, and which browsers and operating systems visitors used.
- Summary – [Commercial, 30-day Trial Version Available]: It is a commercial log analysis tool that includes all possible information and lists options in a text web page for users to click on. With over sixty reports, you can find out more about your visitors, connection bandwidth, referrers, and all aspects of your website than with any other tool. It offers many unique reports, including: search words used to find your site in major search engines, periods when the server was down, most commonly used entry points into your site, and visitors’ connection (modem) speed. Although the GUI can be a bit cumbersome, the cost of Summary is not prohibitive and the reports are decent.
- WebTrends – [Commercial]: Preface: WebTrends is not for companies with skinny wallets. WebTrends has been around for more than a decade and plays nicely with IIS. It is the industry leader for log file reporting applications on the enterprise and small-to-medium business level. The advantages of the tool are its ease of use and its ability to generate just about any report you could possibly want. The disadvantages of WebTrends are that it runs only on Windows and that it is expensive. But the company’s claims of usability appear founded, and it has even included a way to access all of the information available from Web server logs.
- AWStats – [Open Source, Free of Charge]: It is a Perl-based log analysis tool that can be run from a web page or the command line. AWStats is by far the best looking of all of the free Web log analysis tools. Its graphics are superb, and its information is effectively presented. At a glance, users can view all available reports and navigate seamlessly between them.
Having your own server log analysis statistics has distinct benefits. First and foremost, statistics about your web site are available on demand and in a format that is useful to you. Secondly, you can change the data you are gathering as needed. And finally, log analysis statistics provide a more complete record of how your web site is being used – also known as…invaluable insight. As we’ve all heard, knowledge is power – and by improving your understanding of your users through server log analysis, you will have a direct effect on your business success and customer loyalty.
