Friday 22 January 2016

Why statistics without context are rubbish.

As today's world is quite fast-paced and full of information streams, we hear tons of facts, opinions and bullshit every day. And while maths is supposed to be strict and clear, because you can't really argue with numbers, it is often used to produce a terrible version of that last category. I often hear that "country X is so great, the average monthly salary there is like 10k USD!". Or even better: "Y% of our nation believed in party Z!". The fun fact is that the numbers themselves do not lie, unless people make them up. But the conclusions drawn from them are total crap, most of the time.

I do understand that numbers are psychologically powerful when given a proper context. To paraphrase a quote from "The Little Prince" (one of the children's books that so many adults fail to understand): if I say "I saw a really expensive car today", nobody's going to be impressed. But if I rephrased it as "I saw a million dollar car today", it would probably stir the listeners' imagination quite a bit more. (BTW isn't it a paradox that something as abstract and "emotionless" as a number can stir more emotion than a large bunch of descriptive words?) For that reason, some people who want to appear smarter, or to present themselves as dealing in cold hard facts, juggle statistics, diagrams and other professional-looking data. But if you stopped for just a moment and analyzed them, you would probably discover they carry no information at all. The problem is that most people don't stop.

If you hear a politician saying that they are right to do whatever they want because "more than half of society trusted us", what do you feel? Either a damn lot of people went to exercise their right to vote and most of them actually believed in a single party or, quite probably, this "data" is complete crap. Let's say that 60% of people with the right to vote actually did so (and that would be a damn lot in this century). If a party received 55% of all votes, how many people actually trusted them? And is it really "more than half of society"?
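To put a number on that question, here is a minimal Python sketch using the 60% turnout and 55% vote share from the paragraph above. The function name is just something I made up for illustration.

```python
def share_of_electorate(turnout, vote_share):
    """Fraction of all eligible voters who actually voted for the party."""
    return turnout * vote_share

# Figures from the example above: 60% turnout, 55% of the votes cast.
support = share_of_electorate(turnout=0.60, vote_share=0.55)

print(f"{support:.0%} of eligible voters")  # prints "33% of eligible voters"
```

So roughly one eligible voter in three, which is a long way from "more than half".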

There is even more to the election stuff, because in many systems a party needs at least a few percent of all votes to enter parliament. That means that a single party may receive just 30 or 40% of the votes and still hold much more than half of all seats, as votes for parties that did not pass the threshold are not represented there. So if a party holds, let's say, 60% of the parliament, what does it tell you about how many people trusted them? Next to nothing, really, unless you consider how many people actually went to vote. It may turn out that a party needs less than 20% of the nation to trust 'em to hold a majority in the parliament. I am not going to get into politics, because it is usually far from logical (and why people do not exercise their right to vote is yet another story), but where is that "half of society" again?
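Here is a rough sketch of how that can happen. All the numbers are invented for illustration: a 5% threshold, 50% turnout, and twelve hypothetical parties. I also assume seats are split in pure proportion to the votes of the parties that pass the threshold, which is a simplification; real systems use various allocation formulas.

```python
# Hypothetical election; none of these figures come from a real country.
THRESHOLD = 0.05   # assumed minimum vote share needed to enter parliament
TURNOUT = 0.50     # assumed share of eligible voters who actually voted

vote_shares = {"A": 0.38, "B": 0.22}
# Ten small parties with 4% each, all below the threshold.
vote_shares.update({f"small_{i}": 0.04 for i in range(10)})

# Only parties at or above the threshold take part in the seat split.
passing = {p: v for p, v in vote_shares.items() if v >= THRESHOLD}
total_passing = sum(passing.values())

# Simplifying assumption: seats are strictly proportional among passing parties.
seat_shares = {p: v / total_passing for p, v in passing.items()}

# Share of the whole electorate that voted for party A.
electorate_support = TURNOUT * vote_shares["A"]

print(f"A holds {seat_shares['A']:.0%} of seats")
print(f"A was chosen by {electorate_support:.0%} of eligible voters")
```

Under these made-up assumptions, party A ends up with about 63% of the seats while only 19% of eligible voters picked it.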

There are better examples of why context is king, however. And the most common piece of statistical data that blows my mind is the average salary. People must be rich when working for a company where it is as high as 10k per month, right? Let's see. If that company paid out a million a month in salaries, employed 100 people, and 10 of them were directors and owners earning 700k combined, how much would the regular workers get then? Oh wait, the remaining 300k split among 90 people is less than 3.5k each! I am not saying that is too little or too much. All I am saying is that "average salary" is nearly always complete bullshit, unless given a proper, strict context.
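The same example in a few lines of Python, comparing the mean with the median. I assume here that pay is split evenly within each group, which the example does not actually specify.

```python
from statistics import mean, median

# Assumption: an even split within each group (not stated in the example).
directors = [700_000 / 10] * 10   # 10 directors and owners share 700k
workers = [300_000 / 90] * 90     # 90 regular workers share the remaining 300k
salaries = directors + workers

avg = mean(salaries)
med = median(salaries)

print(f"mean:   {avg:,.0f}")
print(f"median: {med:,.0f}")
```

The mean comes out at 10,000, but the median, the salary of the person in the middle of the list, is about 3,333. That gap is exactly the context that the "average salary" headline leaves out.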

On the flip side, I once heard the example from above being misused by a teenager trying to convince his parents that his grades being below the class average meant nothing. Well, that would not really be true, as school grades are (at least usually) limited to a certain scale. When there are top and bottom boundaries, the average actually tells you something. Such clear limits do not apply to salaries and lots of other real-world data, however, and "average" means very little then.

What I am presenting here are very simple examples. So you may think "why do you bother writing this?" Well, I am surprised every day by how many people, including scientists, engineers and others whose daily job involves working with numbers, are misled by statistical data being interpreted incorrectly. Many journalists and politicians use statistics the wrong way because they do not understand 'em themselves, but there are also some people who want to mesmerize us with numbers, just to trick us. Come on, in order to avoid being deceived by statistics we just need to remember one simple thing: context is always necessary, because reality matters.
