The Fair Share of Ratios

As Gary Angel so rightly pointed out in his very popular piece here on the WAO/FACTOR, analysis context is a crucial part of communication. You see, numbers can be very, very tricky. First, because they are exactly that, numbers, and we tend to give a lot of credibility to anything that is quantified. Numbers speak, and speak loudly. Second, because they can get quite pernicious when they represent ratios, variations, proportions, etc.

Let us take, as a first example, the variation of a simple percentage, say “Percentage of Web Sales over Total Sales”. In a business with online/offline activities, this is usually something that interests Digital Marketing teams a lot. Heck, this kind of number is often seen as a good representation of how much the Web contributes to the business. So, we like it, sometimes call it a KPI, and put it up there on the dashboard (done that). It gets tricky, though, when we start explaining its variations. You see, the percentage can very well go down in a given month, simply because another department, say Direct Marketing, launched a massive campaign that was successful. This, even if actual Web sales went up! So, you find yourself with a current percentage that is lower than the previous one, with a nice red arrow pointing down[1]. Not exposing the actual sales numbers, and the fact that Web sales did go up, can give report readers the wrong idea, or at least a very incomplete one.
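To make the arithmetic concrete, here is a minimal sketch in Python. The sales figures and the `web_share` helper are invented purely for illustration; they only show how the share can fall while Web sales themselves rise.

```python
def web_share(web_sales, other_sales):
    """Return Web sales as a percentage of total sales."""
    return 100.0 * web_sales / (web_sales + other_sales)

# Hypothetical monthly sales, in dollars (not real data).
previous = {"web": 200_000, "direct": 800_000}
current = {"web": 220_000, "direct": 1_180_000}  # big Direct Marketing campaign

prev_share = web_share(previous["web"], previous["direct"])
curr_share = web_share(current["web"], current["direct"])
web_growth = 100.0 * (current["web"] - previous["web"]) / previous["web"]

print(f"Web share: {prev_share:.1f}% -> {curr_share:.1f}%")  # 20.0% -> 15.7%
print(f"Web sales growth: {web_growth:+.1f}%")               # +10.0%
```

With these made-up numbers, the dashboard would show the share dropping by more than four points in a month where Web sales actually grew by 10%.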

This is also the reason why conversion rates calculated on total traffic to a site can be very misleading (as Gary pointed out), especially when you try to optimize them. On most sites, there can be more than one reason why someone would be there. The rate can thus vary due to increased visits to parts of the site that are not directly connected to sales, say, the customer support section. Experienced analysts know that it is way more productive to lose sleep over the purchase process conversion rate, or a campaign one, rather than the overall site percentage.
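The same effect can be sketched for conversion rates. The visit and order counts below are hypothetical, as is the split between "all visits" and "visits entering the purchase process"; the point is only that the site-wide rate moves while the purchase process rate does not.

```python
def conversion_rate(orders, visits):
    """Return orders as a percentage of visits."""
    return 100.0 * orders / visits

# Hypothetical periods: the only change is 4,000 extra visits
# to the customer support section in period B.
period_a = {"all_visits": 10_000, "purchase_visits": 2_000, "orders": 400}
period_b = {"all_visits": 14_000, "purchase_visits": 2_000, "orders": 400}

for name, p in (("A", period_a), ("B", period_b)):
    overall = conversion_rate(p["orders"], p["all_visits"])
    funnel = conversion_rate(p["orders"], p["purchase_visits"])
    print(f"Period {name}: overall {overall:.1f}%, purchase process {funnel:.1f}%")
```

In this sketch the overall rate falls from 4.0% to about 2.9%, yet nothing changed in how well the purchase process converts (20.0% in both periods).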

Another example typical of Digital Analytics is comparing the share that each of several entities holds against its share in the previous period. Let’s say we look at traffic source shares, Organic Search in particular, and how its overall share changed this quarter over the last one. Here too, the number is very much dependent on what happened with other sources; the percentage can be smaller, even though you got way more visits from that source.

There is also the common problem of the reference period. By that, I mean the period against which you should compare a number. A very seasonal business should probably not report month-over-month variations, which only create unnecessary explaining to do. A year-over-year comparison (current month over the same month last year) certainly works better here.
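As a rough illustration of how much the reference period matters, the sketch below compares the same January figure against two different baselines. The monthly sales are invented for a hypothetical seasonal business with a strong December.

```python
def pct_change(current, reference):
    """Return the percentage change of current relative to reference."""
    return 100.0 * (current - reference) / reference

# Hypothetical monthly sales (in thousands), keyed by (year, month).
sales = {
    (2023, 1): 200,
    (2023, 12): 540,
    (2024, 1): 230,
}

january = sales[(2024, 1)]
mom = pct_change(january, sales[(2023, 12)])  # versus the previous month
yoy = pct_change(january, sales[(2023, 1)])   # versus the same month last year

print(f"Month over month: {mom:+.1f}%")  # -57.4%
print(f"Year over year:   {yoy:+.1f}%")  # +15.0%
```

The same January looks like a collapse or like healthy growth, depending only on which period it is compared to.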

Then there are big variations of small things. In a table (or a dashboard, which is just a better looking version), big percentages certainly attract attention. However, here too context is fundamental, because it is still very important to keep sight of what exactly we are talking about. A 50% or 100% increase of something very small doesn’t make it big!
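One simple way to keep that context in view is to show the absolute change next to the percentage. The sketch below does this for two hypothetical traffic sources; the names and visit counts are made up.

```python
# Hypothetical visits per source: (previous period, current period).
sources = {
    "Tiny referrer": (10, 20),
    "Organic Search": (50_000, 52_500),
}

changes = {}
for name, (previous, current) in sources.items():
    absolute = current - previous
    relative = 100.0 * absolute / previous
    changes[name] = (absolute, relative)
    print(f"{name}: {relative:+.0f}% ({absolute:+,} visits)")
```

Here the tiny referrer shows +100% but adds only 10 visits, while Organic Search shows a modest +5% that represents 2,500 additional visits.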

There are many other examples of how distorted our view can get when ignoring number context, and I hope the statistically-oriented readers will forgive the overly simplistic examples I gave. They are still common mistakes I see (and have committed myself too often!).

This certainly brings up the question of dashboards, and how often the desirable simplicity of their design can make their users prone to misinterpretation. By nature, they create a strong sense of truth, of precision (ah! those percentages showing two decimals!), and too often encourage swift interpretation that can be very misleading. Sure, people can read the accompanying comments, but too often they are mere footnotes. Dashboards most often expose only the what, not the why. And it’s in the why that value is found, generally buried in the complexity of things.

Maybe we humans sometimes give too much value to the simplest explanation.

[1] By the way, it is better to base those little arrows on some pre-determined acceptable variance, so that you don’t start unnecessary fires whenever the darn metric goes down by 0.03%.
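A minimal sketch of such a rule, assuming a hypothetical `arrow` function and an arbitrary 2% tolerance (the right threshold depends on the metric, and the previous value is assumed to be non-zero):

```python
def arrow(previous, current, tolerance_pct=2.0):
    """Return 'up', 'down' or 'flat' for the change from previous to current.

    Relative changes within +/- tolerance_pct are reported as 'flat'.
    The 2% default is an arbitrary, illustrative choice.
    """
    change = 100.0 * (current - previous) / previous
    if abs(change) <= tolerance_pct:
        return "flat"
    return "up" if change > 0 else "down"

print(arrow(20.0, 19.994))  # a 0.03% dip stays 'flat'
print(arrow(20.0, 18.0))    # a 10% drop is 'down'
```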