Why Google Analytics is not always right
Working with Google Analytics is a polarizing experience. Much has been written for and against using Google’s free analytics tools for data-critical decisions. In this post we highlight some of the concerns raised by Xerago’s best, who continue working with Google Analytics on a day-to-day basis.
There’s a reason Google Analytics is so popular among marketers: it’s free. And the best things in life are free, as the saying goes. But apply that logic to analytics and you’ll end up dazed and confused.
In small and medium-sized engagements, large web numbers are rarely an issue. Visitor traffic tends to be in the tens of thousands rather than the millions, making it easier for decision makers to make data-driven decisions. Low traffic means few problems.
For Google, this is a match made in heaven. Why? For the simple reason that it masks Google’s confounding methods of data collection and sampling. Google Analytics samples data when a report request exceeds 500,000 sessions: only a subset of the raw data, known as a ‘sample’, is analyzed to produce the requested reports. Google indicates this number as ‘N sessions’ at the top of reports.
The idea behind sampling is to reduce processing resources while still providing an ‘acceptable’ level of data for analysis and decision making. In practice this means an inverse relationship between the sheer amount of data and its accuracy: the more sessions a report covers, the more heavily it is sampled. The margin of error can be anywhere between 5 and 10%, but data is still data.
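The effect can be sketched with a quick simulation. This mimics plain random sampling, not Google’s actual sampling algorithm, and the traffic and conversion figures are invented for illustration:

```python
import random

random.seed(42)

# Hypothetical site: 5 million sessions, true conversion rate of 2%.
TRUE_RATE = 0.02
POPULATION = 5_000_000
SAMPLE_SIZE = 500_000  # the threshold at which sampling kicks in

# One boolean per session: did it convert?
sessions = [random.random() < TRUE_RATE for _ in range(POPULATION)]
true_conversions = sum(sessions)

# A sampled report looks at 500k sessions and scales the count back up.
sample = random.sample(sessions, SAMPLE_SIZE)
estimated_conversions = sum(sample) * (POPULATION / SAMPLE_SIZE)

error_pct = abs(estimated_conversions - true_conversions) / true_conversions * 100
print(f"True conversions:      {true_conversions}")
print(f"Estimated conversions: {estimated_conversions:.0f}")
print(f"Relative error:        {error_pct:.2f}%")
```

On site-wide totals like this, the error usually stays within a few percent, which is exactly why sampling feels ‘acceptable’ until the stakes rise.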
Now, this works fine for small and, to an extent, medium-sized businesses. However, for an eCommerce-heavy site where transactions and conversions run into the millions per day, even decimal places matter, and Google Analytics’ sampled results just aren’t going to cut it.
Why sampling is not a one-size-fits-all solution
The inaccuracy is amplified when advanced drill-down actions take place. Introduce segments and dimensions into the picture, and the mismatches become more conspicuous. Look beyond the usual visit and session numbers at transactions, revenue and conversion rates, and you’re in for a rude awakening. Compare your sampled and unsampled reports to see for yourself.
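The segment problem can be illustrated the same way: a segment covering, say, 1% of traffic contributes only about 5,000 of the 500,000 sampled sessions, so its scaled-up counts swing far more than site-wide totals do. Again, this is a simple random-sampling sketch with made-up numbers, not Google’s actual algorithm:

```python
import random

random.seed(7)

POPULATION = 5_000_000    # total sessions on the site
SAMPLE_SIZE = 500_000     # size of the sampled report
SEGMENT_SHARE = 0.01      # hypothetical segment: 1% of all sessions
SEGMENT_CONV_RATE = 0.05  # hypothetical conversion rate inside that segment

# Encode each session: 0 = outside segment, 1 = in segment, 2 = in segment and converted.
def simulate_session():
    if random.random() < SEGMENT_SHARE:
        return 2 if random.random() < SEGMENT_CONV_RATE else 1
    return 0

sessions = [simulate_session() for _ in range(POPULATION)]
true_seg_conversions = sum(1 for s in sessions if s == 2)

# A sampled report sees only 500k sessions and scales the segment's counts up.
sample = random.sample(sessions, SAMPLE_SIZE)
scale = POPULATION / SAMPLE_SIZE
est_seg_conversions = sum(1 for s in sample if s == 2) * scale

error_pct = abs(est_seg_conversions - true_seg_conversions) / true_seg_conversions * 100
print(f"Segment sessions seen in sample: {sum(1 for s in sample if s != 0)}")
print(f"Relative error on segment conversions: {error_pct:.2f}%")
```

The relative error on the segment’s conversions is typically several times larger than the error on the site-wide figure, which is why segmented and dimensioned reports diverge most visibly from the unsampled truth.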
The incongruence comes to a head when trying to integrate Google Analytics with spreadsheet packages. Google does not support native Excel manipulation, so data capture is done through third-party tools like Excellent Analytics. The imported data in such cases is also subject to discrepancies due to sampling, and the differences are plain to see when compared with the original data in Google Analytics. What’s more, the accuracy varies between tools, making the entire exercise futile.
In Google’s defense, they do offer a choice: sampled reports with quicker processing, versus less sampling with slower processing. For unsampled data, you’re going to need a premium account.
How do we decide what is best?
For those in the know, sampling is possible to work around. For novice users, however, this is a conundrum. Should they trust questionable data, or spend insane amounts on an analytics package with all the bells and whistles? And even if they do, do their inbound traffic and online conversions justify the investment? The questions are many; the answers, not so much.
Xerago has consulted for clients across various verticals and helped identify the right solutions for their analytics needs. Xerago continues to advise clients on understanding the scope of their web assets and maximizing their data-driven value.
What is your experience with analytics? Have you witnessed the same problems we did? Let us know in the comments below.