Thursday, May 14, 2020

How to Lie with Statistics

If you haven't already read this little classic, written by Darrell Huff and published in 1954, you need to.  Especially now, you need to.  It draws on examples from everyday life to show us how easily we can be misled, and even influenced, by data and the way it is presented.

One of the reasons I think this book is so informative for everyone is that it was written by a non-statistician.  That’s right: Darrell Huff wasn’t some renowned statistician; he was a journalist.  He made extra money freelancing and wrote several “how to” articles.  Like most writers, Huff is not without his own controversy, but How to Lie with Statistics is where I think he did his best service to community and consumer.  Huff helped the common person better evaluate and understand the data and charts they see and hear about in media and marketing.  Many of the themes of this book remain relevant today, including the ever-popular “Correlation does not imply causation.”
My personal copy



Understanding how statistics can be misleading is even more important today.  Every day we are hit with battles pitting “good” data against “bad” data, “lies” against “truth.”  Today's social media uses memes and fantastic headlines to grab your attention.  But too often, we accept the conclusion (or headline) put forth without questioning or even examining the data that went into the conclusion. 

I encourage you to read the book.  It’s fun, it’s informative, and it’s a quick read to add to your stay-at-home reading list.  I leave you with a lesson from one of my favorite chapters and an example of how it’s playing out today.

Chapter 7.  The Semiattached Figure.  “If you can’t prove what you want to prove, demonstrate something else and pretend that they are the same thing.”

This chapter resonates so strongly with me today, because it seems to be the rule of how some are reporting and lying with statistics around COVID-19.  To keep this blog short, I will talk about just one example.  I’m sure you can come up with many more examples and I’d love to hear about them in the Comments section.

It is very important to many, particularly political figures, that a message of great testing capabilities and performance be communicated.  If you can convince the public that you have testing under control and are doing appropriate amounts of testing, then they will be more likely to return to work and re-start corporate profits. 

Unfortunately, data that would prove you are doing an appropriate amount of testing doesn’t really exist.  However, there is data showing that you have done the most testing of any country.  So you demonstrate that you have conducted the most individual tests, and pretend that’s the same thing as providing an appropriate amount of testing.  You further conclude that with that much testing, if there were a potential problem of virus growth, you would know it.  You are safe, go back to work, resume normal activity.  We’ve done the most testing, so it’s all okay.  Headline - "We're #1!"



The problem is that, while the statistic is correct, it does not answer the question.  To answer the question of whether you have testing under control and are conducting the appropriate amount of testing, you need to look at a “per” number.  You need to put the things you are comparing on an even footing.  You're not just comparing apples, you're comparing a Red Delicious to a Red Delicious.  If you look at the statistic that is needed to answer the question (how many tests per 1,000 people), the data doesn’t support your conclusion.  Headline - "We're not really consistently tracking, but we know we're better than South Korea."



Which is truth and which is a lie?  Both data points are true, and both could also be lies, depending on the context in which they are presented and the conclusions you draw from the data.

Think about it, and look for more examples.  Read the book!  It could also be a great family learning exercise.  You will be amazed, and likely also disgusted, by the view through the glass you now use to evaluate media, marketing and political messages.

Cheers,
MK