rbechtel
 
How To Interpret Website Traffic Data

What can we learn from the following sample of raw data generated by a website’s traffic?

199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /commercialinvestmentblog.html HTTP/1.1" 200 26828 "https://www.google.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36" 199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /background.jpg HTTP/1.1" 200 5145 "http://www.gsn.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36" 199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /2.jpg HTTP/1.1" 200 53918 "http://www.gsn.com/aboutus.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36" 199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /2.jpg HTTP/1.1" 200 53918 "http://www.gsn.com/contactus.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36" 199.64.0.252- - [11/Nov/2014:09:55:34 -0500] "GET /bestfirms2015.jpg HTTP/1.1" 304 - "http://gsn.com/" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) "


Answer: At about 9:55 a.m. on Nov. 11, 2014, an employee of Honeywell Inc. in Morristown, New Jersey, zip code 07960 (website: http://honeywell.com/Pages/Home.aspx), used Google search to access a blog page titled “Commercial Investment.” The visitor then navigated to the website’s About Us page, and after that to its Contact Us page.

Below are instructions on how to access and interpret the website traffic data that most hosting companies provide upon request. Because our customers’ success is our success, we offer them this guide so that they can enjoy the competitive advantages this intelligence delivers. While metrics such as search engine rankings and activity reports speak to a website’s general performance, a website’s traffic data covers every visit to the website and removes most of the guesswork as to who the visitors are and which pages they are visiting.

Step 1

Contact the company that hosts your website and request traffic data. Most reputable companies can and will supply this online, compiling free weekly or monthly reports. Companies such as XO Communications, whose data we are using in this article's examples, enable customers to activate weekly or monthly reports in the customer portal.

Step 2

If the data is one uninterrupted mass, you will need to separate it into blocks. To do this:

  • Copy and paste all the data into a Word document. This will make it more manageable.

  • Identify and separate individual blocks of data. Each block is preceded by either an IP Address or a Host Name.

IP Address

This consists of four groups of numbers separated by periods. It is assigned to each computer participating in a computer network.  

Example:  12.144.20.254 

Sometimes the IP Address is hyphenated, e.g., 12-144-20-254.

Host Name

A host name is a readable name that stands in for an IP address and identifies a computer, or group of computers, assigned to that address. Host names can take various forms. Examples:

  • tx-node1.gmacm.com

  • 94.sub-70-211-73.myvzw.com

  • spxysfo1.bankofamerica.com 

  • sf208-121-64-3.sfgov.org

  • powellrogersandspeaks.com

Below is an example of separated blocks:

199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /commercialinvestmentblog.html HTTP/1.1" 200 26828 "https://www.google.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /background.jpg HTTP/1.1" 200 5145 "http://www.gsn.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

199.64.0.252--[11/Nov/2014:09:54:47 -0500] "GET /2.jpg HTTP/1.1" 200 53918 "http://www.gsn.com/contactus.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

66.249.64.108 - - [17/Nov/2014:14:10:09 -0500] "GET /biographywilson.html HTTP/1.1" 200 4635 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.64.112 - - [17/Nov/2014:14:17:52 -0500] "GET /biographyjones.html HTTP/1.1" 404 213 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"

tx-node1.gmacm.com - - [24/Nov/2014:13:00:42 -0500] "GET /profileellis.html HTTP/1.1" 200 56216 "https://www.google.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36"

tx-node1.gmacm.com - - [24/Nov/2014:13:00:42 -0500] "GET /profilelinks.jpg HTTP/1.1" 200 29463 "http://www.ellislawgrp.com/profileellis.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36"

tx-node1.gmacm.com - - [24/Nov/2014:13:00:43 -0500] "GET /elloissuper.jpg HTTP/1.1" 200 35289 "http://www.ellislawgrp.com/profileellis.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36"

tx-node1.gmacm.com - - [24/Nov/2014:13:00:42 -0500] "GET /largephotos/ellis.jpg HTTP/1.1" 200 74159 "http://www.ellislawgrp.com/profileellis.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36"
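
The separation described in this step can also be automated. The sketch below is one possible approach under stated assumptions, not a tool supplied by any hosting company: it assumes every block starts with an IP address or host name, followed by two hyphens and a bracketed timestamp, as in the samples above. The names ENTRY_START and split_blocks are hypothetical.

```python
import re

# Assumption: a block begins with an IP address or host name, then two
# hyphens (with or without spaces, since both appear in the samples), then
# the opening bracket of a timestamp such as [11/Nov/2014:09:54:47 -0500].
ENTRY_START = re.compile(r'[\w.-]+? ?- ?- ?\[\d{2}/[A-Za-z]{3}/\d{4}:')


def split_blocks(raw):
    """Split one uninterrupted mass of log text into individual blocks."""
    starts = [match.start() for match in ENTRY_START.finditer(raw)]
    bounds = starts + [len(raw)]
    return [raw[a:b].strip() for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Two blocks from this article, run together with no separator.
    mass = (
        '66.249.64.108 - - [17/Nov/2014:14:10:09 -0500] '
        '"GET /biographywilson.html HTTP/1.1" 200 4635 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
        '66.249.64.112 - - [17/Nov/2014:14:17:52 -0500] '
        '"GET /biographyjones.html HTTP/1.1" 404 213 "-" '
        '"Googlebot/2.1 (+http://www.google.com/bot.html)"'
    )
    for block in split_blocks(mass):
        print(block)
        print()
```

If a report uses a different layout, the pattern would need to be adjusted; text that appears before the first recognizable block is dropped.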

Step 3

Ignore data blocks that represent a webbot visit. There are two types of visitors to websites: people and webbots. Webbots crawl page text and metatags on behalf of search engines. Any data block that includes “bot” represents a webbot visit, e.g., the blocks below generated by Google’s webbot:

66.249.64.112 - - [17/Nov/2014:14:17:52 -0500] "GET /biographyjones.html HTTP/1.1" 404 213 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"

66.249.64.108 - - [17/Nov/2014:14:10:09 -0500] "GET /biographywilson.html HTTP/1.1" 200 4635 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Webbots are also concerned only with a website’s html text and do not request page graphics. For instance, the webbot above requested only the html pages for the biographies of Jones and Wilson.

By contrast, visits by people include blocks for graphics and videos (jpegs, gifs, pngs, bmps, etc.) in addition to website (html) pages. For instance, the first block below indicates a visit to the biography page of Ellis, and the second block indicates the photo of Ellis contained in that page.

tx-node1.gmacm.com - - [24/Nov/2014:13:00:42 -0500] "GET /biographyellis.html HTTP/1.1" 200 56216 "https://www.google.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36"

tx-node1.gmacm.com - - [24/Nov/2014:13:00:42 -0500] "GET /ellisphoto.jpg HTTP/1.1" 200 29463 "http://www.ellislawgrp.com/biographyellis.html" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36"
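
The “bot” test above can be sketched in a few lines. This is a minimal illustration, assuming each block is complete and that the last quoted field is the visitor's browser or webbot description, as in the samples; the function names are hypothetical. Checking only that last field, rather than the whole block, is a design choice of this sketch so that a page name that happens to contain “bot” does not trigger a false match.

```python
import re

# Every quoted field in a block: the request, the referring page, and
# finally the description of the browser or webbot.
QUOTED_FIELD = re.compile(r'"([^"]*)"')


def visitor_description(block):
    """Return the last quoted field of a block, or '' if there is none."""
    fields = QUOTED_FIELD.findall(block)
    return fields[-1] if fields else ""


def is_webbot(block):
    """Apply the article's rule: a description containing 'bot' is a webbot."""
    return "bot" in visitor_description(block).lower()


def human_blocks(blocks):
    """Keep only the blocks that do not look like webbot visits."""
    return [block for block in blocks if not is_webbot(block)]
```

For example, is_webbot returns True for both Google blocks shown in this step and False for the two tx-node1.gmacm.com blocks.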

Step 4a

If the visitor is identified by an IP address, research it using one or more websites available on the Internet for researching IP addresses, e.g., http://cqcounter.com/whois. Note that if the IP address is hyphenated, you will need to replace the hyphens with periods for a search to work.

At the very least, search results will tell you the city, state and zip code of the visitor’s server if in America or, if foreign, the visitor’s city and nation. If the visitor’s organization owns the entire group of IP addresses to which the address belongs, search results will also reveal the organization’s name. In sum, check the results for the name of the organization. If it is not an Internet hosting company, the organization listed is that of the visitor.

For example, a search for 12.144.20.254 results in “Duane Morris” being listed as the organization. Googling “Duane Morris” leads to www.duanemorris.com and the fact that Duane Morris is an international law firm based in Philadelphia.   
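
Replacing hyphens with periods before a search is easy to get wrong by hand. The sketch below shows one way to do it and to confirm the result is a well-formed IP address, using Python's standard ipaddress module; the function name normalize_ip is hypothetical, and the sketch does not perform the lookup itself.

```python
import ipaddress


def normalize_ip(text):
    """Turn '12-144-20-254' or '12.144.20.254' into dotted form.

    Returns None when the text is not four valid number groups, which is
    a hint that it is a host name rather than an IP address.
    """
    candidate = text.strip().replace("-", ".")
    try:
        return str(ipaddress.IPv4Address(candidate))
    except ValueError:
        return None


if __name__ == "__main__":
    print(normalize_ip("12-144-20-254"))
    print(normalize_ip("tx-node1.gmacm.com"))
```

The returned string can then be pasted into an IP address research website.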

Step 4b

If the visitor is identified by host name, different research possibilities exist.

  • Does the host name contain its IP address? This is true for the host name
    ip68-11-147-8.br.cox.net. Extract the numbers, add periods and research 68.11.147.8.

  • Does the host name end with a domain? Example: tx-node1.gmacm.com. Google gmacm.com and learn the visitor is an employee of GMAC Mortgage in Detroit.


Below are some examples of this kind of host name (the domain is the last two units, e.g., ml.com), each followed by the name of the organization:

  • tx-node1.ml.com = Merrill Lynch

  • mail.sdsheriff.net = San Diego Sheriff's Dept.

  • dyn-rev.berkeley.edu = University of California, Berkeley
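
Both host name checks in this step can be sketched as small helper functions. This is an illustration only, with hypothetical function names. It assumes an embedded IP address appears as four fully hyphenated number groups, and it treats the last two units of the host name as the domain, a rule of thumb that will not hold for every domain ending.

```python
import re

# Four number groups joined by hyphens, not touching other digits.
EMBEDDED_IP = re.compile(r'(?<!\d)(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})(?!\d)')


def embedded_ip(host_name):
    """Return the IP address hidden in a host name, or None."""
    match = EMBEDDED_IP.search(host_name)
    if match is None:
        return None
    groups = match.groups()
    if all(int(group) <= 255 for group in groups):
        return ".".join(groups)
    return None


def trailing_domain(host_name):
    """Return the last two units of a host name, e.g. 'gmacm.com'."""
    units = host_name.lower().rstrip(".").split(".")
    if len(units) < 2:
        return None
    return ".".join(units[-2:])


if __name__ == "__main__":
    print(embedded_ip("ip68-11-147-8.br.cox.net"))
    print(trailing_domain("tx-node1.gmacm.com"))
```

A host name such as 94.sub-70-211-73.myvzw.com mixes periods and hyphens, so embedded_ip returns None for it and the domain check is the one to use.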

Step 4c

Occasionally a host name is indecipherable. However, it is not uncommon for a series of blocks bearing a host name to be preceded by one block bearing the series’ IP address, e.g.:

98.235.10.47 - - [01/Nov/2014:09:51:48 -0400] "GET /article10fcra.html HTTP/1.1" 200 44299 "http://www.google.com/" "Mozilla/5.0 (Linux; Android 4.4; Elite7Q Build/KRT16S) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.114 Safari/537.36"

Snda06a8ns01.cpe.twtelecom.net - - [01/Nov/2014:09:51:49 -0400] "GET /background.jpg HTTP/1.1" 200 5145 "http://www.google.com/" "Mozilla/5.0 (Linux; Android 4.4; Elite7Q Build/KRT16S) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.114 Safari/537.36"

Snda06a8ns01.cpe.twtelecom.net - - [01/Nov/2014:09:51:49 -0400] "GET /2.jpg HTTP/1.1" 200 53918 "http://www.google.com/" "Mozilla/5.0 (Linux; Android 4.4; Elite7Q Build/KRT16S) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.114 Safari/537.36"

Snda06a8ns01.cpe.twtelecom.net - - [01/Nov/2014:09:51:49 -0400] "GET /smith.jpg HTTP/1.1" 404 209 "http://www.google.com/" "Mozilla/5.0 (Linux; Android 4.4; Elite7Q Build/KRT16S)

How can we be confident 98.235.10.47 is the IP address for the host name Snda06a8ns01.cpe.twtelecom.net? Because the blocks following the IP address block are for graphics, were recorded within one second of it and describe the same device and browser. The only html web page they can belong to is article10fcra.
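
The reasoning in this step can be expressed as a rough rule: when a block identified by an IP address requests an html page, and blocks identified by a host name request graphics immediately afterward, the host name probably belongs to that IP address. The sketch below is one way to apply the rule. The five-second window, the function names and the restriction to GET requests are assumptions made for the illustration, and the timestamp parsing assumes English month abbreviations.

```python
import re
from datetime import datetime, timedelta

# Visitor, timestamp and requested path at the start of a block.
BLOCK = re.compile(r'^(\S+?) ?- ?- ?\[([^\]]+)\] "GET (\S+)')
IPV4 = re.compile(r'^\d{1,3}(?:\.\d{1,3}){3}$')


def parse_block(block):
    """Return (visitor, time, path), or None if the block is unreadable."""
    match = BLOCK.match(block.strip())
    if match is None:
        return None
    visitor, stamp, path = match.groups()
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return visitor, when, path


def is_page(path):
    """A Home page ('/') or an html page, as opposed to a graphic."""
    return path == "/" or path.lower().endswith((".html", ".htm"))


def link_hosts_to_ips(blocks, window=timedelta(seconds=5)):
    """Map each host name to the IP address of the page request before it."""
    links = {}
    last_page = None  # (visitor, time) of the most recent page request
    for block in blocks:
        parsed = parse_block(block)
        if parsed is None:
            continue
        visitor, when, path = parsed
        if is_page(path):
            last_page = (visitor, when)
        elif (last_page is not None
              and not IPV4.match(visitor)
              and IPV4.match(last_page[0])
              and abs(when - last_page[1]) <= window):
            links.setdefault(visitor, last_page[0])
    return links
```

Applied to the blocks above, the sketch links Snda06a8ns01.cpe.twtelecom.net to 98.235.10.47. A match of this kind is a strong hint rather than proof, so it is worth confirming that the blocks also share the same browser description.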

Step 5

Note the pages visited.

In the sample below, the Home page is designated simply by / as opposed to /index.html or /default.html. This is not uncommon for Home pages and applies to Home pages only.

Otherwise, the pages visited by 107.146.25.188 are the blocks in which GET is followed by an html page, namely /aboutus.html and /contactus.html.

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET / HTTP/1.1" 200 9377 "http://www.acainternational.org/memberdirectory.aspx" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /indexmenu.jpg HTTP/1.1" 200 12134 "http://www.ellislawgrp.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /indexlogo.jpg HTTP/1.1" 200 15204 "http://www.ellislawgrp.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /indexaddress.jpg HTTP/1.1" 200 30106 "http://www.ellislawgrp.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /indexanimation.gif HTTP/1.1" 200 326694 "http://www.ellislawgrp.com/" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /aboutus.html HTTP/1.1" 200 25204 "https://www.google.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /background.jpg HTTP/1.1" 200 5145 "http://www.ellislawgrp.com/aboutus.html" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /contactus.html HTTP/1.1" 200 25204 "https://www.google.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"

107.146.25.188 - - [05/Nov/2014:14:40:21 -0500] "GET /2.jpg HTTP/1.1" 200 53918 "http://www.ellislawgrp.com/contactus.html" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D257 Safari/9537.53"
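
Picking out the pages from a series like the one above can be sketched as follows. The function name pages_visited is hypothetical, and the sketch assumes pages end in .html or .htm, or are the bare / of a Home page; a site built with other page types would need a longer list.

```python
import re

# The path that follows GET inside the quoted request field.
REQUEST = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def pages_visited(blocks):
    """Return the html pages requested, in the order they appear."""
    pages = []
    for block in blocks:
        match = REQUEST.search(block)
        if match is None:
            continue
        path = match.group(1)
        # Keep the Home page and html pages; skip graphics and other files.
        if path == "/" or path.lower().endswith((".html", ".htm")):
            pages.append(path)
    return pages
```

For the nine blocks above, the sketch yields the Home page, /aboutus.html and /contactus.html, and leaves out the jpg and gif requests.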

Miscellaneous

A series can be split up within a report. If you find the same IP address or host name in two series close together in time, chances are they represent a single visit.

Raw data may be reported in reverse temporal order. Check the time sequences to determine the order.
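
When a report arrives in reverse order, the blocks can be put back in time sequence by reading the bracketed timestamp in each one. The sketch below is a minimal illustration with hypothetical function names; it assumes timestamps in the format shown throughout this article, with English month abbreviations, and it sets aside any block whose timestamp cannot be read.

```python
import re
from datetime import datetime

# A timestamp such as [05/Nov/2014:14:40:21 -0500].
STAMP = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]')


def block_time(block):
    """Return the time a block was recorded, or None if it is missing."""
    match = STAMP.search(block)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def chronological(blocks):
    """Return the blocks sorted from earliest to latest."""
    dated = [(block_time(block), block) for block in blocks]
    dated = [(when, block) for when, block in dated if when is not None]
    # Sorting on the time alone keeps blocks with equal times in their
    # original relative order.
    return [block for when, block in sorted(dated, key=lambda pair: pair[0])]
```

Because the time zone offset is read as part of each timestamp, blocks recorded before and after a daylight saving change (-0400 versus -0500 in the samples) still sort correctly.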

Checklist

1. Do blocks in a series contain “bot” and/or represent website (html) pages only as opposed to some blocks representing graphics (jpegs, gifs, etc.)?  If so, ignore the series. The visitor is a webbot.

2. If the series is identified by an IP address, copy and paste the IP address into an IP address search website. (If the address’s number groups are hyphenated, replace the hyphens with periods before activating the search.) In the search results, check the listing under “Organization.” If the organization listed is not an Internet company, it is the name of the visitor’s organization. Google the name.

3. If the series is identified by a host name, first look for a domain at the end of the host name. Unless it belongs to an Internet company, any domain will be that of the visitor’s organization. 

4. If a domain is not an option, see if the host name includes four groups of hyphenated numbers that can be converted into an IP address. (The numbers must be completely hyphenated. If it is a combination of hyphens and periods, this is not an IP address.) Replace the hyphens with periods and search.

5. If an embedded IP address is not an option, check to see if the host name’s series is preceded by an IP address block. If so, is the content of the IP address block a website page? If so, do the host name entries that follow represent graphic content for that html page? If so, research the IP address.

6. If a series identified by an unidentifiable host name is not preceded by an IP address, then look down the log for another series of the same host name. If this series does have an IP address, the two series are really one. Cut and paste the IP address.

7. If a host name is three units separated by periods, and the last two units consist of a domain, the domain is likely that of the visitor, not an Internet company. Google the domain.

©Randy Bechtel 2015
