
Some of my colleagues were ecstatic this week about a spike in website visitor traffic. The excitement disappeared quickly after we discovered something very strange in our Google Analytics account.
As most of you know (or should know), any good website analytics package automatically removes traffic that does not come from an actual person.
The concept is easier to understand when you think about it the opposite way. Programs like Google Analytics have a list of crawlers (also known as "spiders" or "bots") and usually exclude them from any website traffic reports. Identified by an IP address just like a "real" website visitor, these crawlers traverse your website, sometimes maliciously, and index your content so it can be found on search engines and elsewhere.
When looking at our Visitors Report, one of the top Browsers was "Googlebot-Image," which by its name alone was a red flag that something wasn't right (take a look at the first of the three images above). Googlebot-Image is the name of Google's spider that indexes your website's images and displays them in search results. You could stop the spider from crawling your images altogether by adding an exclusion to your website's robots.txt file.
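For anyone curious what that robots.txt exclusion might look like, here is a minimal sketch using Python's standard `urllib.robotparser` to check the rules before publishing them. The two-line rule set and the `is_allowed` helper are assumptions for illustration, not something taken from our actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: block only Google's image crawler, site-wide.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given crawler may fetch the path under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The image crawler is blocked; other crawlers are unaffected by this rule.
    print(is_allowed("Googlebot-Image", "/images/logo.png"))
    print(is_allowed("Googlebot", "/index.html"))
```

Keep in mind the trade-off: a rule like this would also keep your images out of Google's image search results, which may not be what you want.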
I've always had doubts about Google and continue to be careful with what we agree to and share with them based on the company's terms of service. After all, anything free has to raise some suspicion.
From what I can tell, traces of Googlebot-Image started showing up in our analytics around April 2010. Why wouldn't we have noticed this before now? It seems that Google snuck this in and for some unknown reason has not excluded its own browser agent from reports. We quickly added an exclusion filter for the specific Visitor Browser Program (see the second image) and voilà, Googlebot-Image is no longer being counted in our visitor traffic (see the third image).
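If you export your visit data and want to apply the same kind of exclusion outside of Google Analytics, the idea can be sketched in a few lines of Python. This is only an approximation of what the filter does: the `browser` field name, the `exclude_crawlers` helper, and the sample rows are all made up for illustration.

```python
# Browser names to drop from reports, compared case-insensitively.
EXCLUDED_BROWSERS = {"googlebot-image"}


def exclude_crawlers(visits, excluded=EXCLUDED_BROWSERS):
    """Return only the visits whose browser name is not an excluded crawler."""
    return [
        visit
        for visit in visits
        if visit.get("browser", "").strip().lower() not in excluded
    ]


if __name__ == "__main__":
    # Illustrative rows only, not real traffic data.
    sample_visits = [
        {"browser": "Firefox", "page": "/"},
        {"browser": "Googlebot-Image", "page": "/images/logo.png"},
        {"browser": "Internet Explorer", "page": "/about"},
    ]
    for visit in exclude_crawlers(sample_visits):
        print(visit["browser"], visit["page"])
```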
While on the subject of website analytics, make sure you look at your referral traffic more closely: as of last week, all traffic from Twitter and programs like TwitterFeed and TweetDeck shows up as coming from t.co.
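If you track referrers in your own reports, one way to keep Twitter traffic grouped together is to map the t.co hostname back to a single label. The sketch below is an assumption about how you might do that with Python's standard `urllib.parse`; the `referral_source` helper and the host list are hypothetical.

```python
from urllib.parse import urlparse

# Hostnames treated as Twitter for reporting purposes (an assumed list).
TWITTER_HOSTS = {"t.co", "twitter.com", "www.twitter.com"}


def referral_source(referrer_url: str) -> str:
    """Map a referrer URL to a reporting label, grouping t.co under Twitter."""
    host = (urlparse(referrer_url).hostname or "").lower()
    if host in TWITTER_HOSTS:
        return "Twitter"
    # An empty referrer has no hostname, so label it as direct traffic.
    return host or "(direct)"


if __name__ == "__main__":
    print(referral_source("http://t.co/abc123"))
    print(referral_source("http://www.example.com/page"))
```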
- Categories:
- Web Strategy
