We’re inundated with statistics. We hear them in the news. Politicians spout numbers that may or may not be based on statistics or even reality. No matter the source, we need to be able to evaluate and validate the numbers we’re hearing. We need to become The Data Detective, using the Ten Easy Rules to Make Sense of Statistics. Tim Harford walks us through ten ways not just to make sense of statistics, but to validate that what’s being said matches what the data and the statistics actually say.
Flood the Zone with Shit
Before we look at ways to make sense of statistics, we should explain why it’s so hard in the first place. There are three key reasons, and the first is the most pernicious.
The reality is that flooding communication channels is an effective strategy. After the Ball, for instance, explains that jamming is a great strategy for disrupting the status quo. (Jamming is sending other messages into the channel to disrupt the communication.) The Organized Mind emphasizes that we’re living in a world of information overload. Politicians and others are aware that if you flood the zone with shit, you may be able to avoid people discovering the truth until it’s too late.
Tobacco companies did this for years. They questioned the evidence and funded their own competing studies, including studies about interesting but unrelated topics, to create discussions that weren’t about the harmful effects of smoking, all in an effort to keep people distracted from the truth.
Donald Trump popularized the phrase “fake news.” Unfortunately, it’s damaged our already fragile trust, which in America has been in a long decline. In Trust: Human Nature and the Reconstitution of Social Order, we learned that trust has important implications for society. Robert Putnam, in his book, Bowling Alone, spoke of the decline of social capital – and the corresponding loss of trust. His book, Our Kids, further explores the concept and, particularly, the dynamics of inequity. We even see echoes of the erosion of trust in America’s Generations. Where our grandparents expected to work for one company their entire working lives, we don’t have the same expectation. Instead, we speak of the gig economy, where loyalty and trust are ancillary.
Preconceptions
As other resources like Noise, The Signal and the Noise, Superforecasting, and Thinking, Fast and Slow explain, we often overvalue our initial perceptions and our preconceived notions. We cling to perceptions that have been proven incorrect, because we held them before the disconfirming data arrived. When the evidence is unwelcome, instead of asking, “Can I believe this?” we ask, “Must I believe this?” And even then, we resist the change.
At some level, this makes sense. We don’t want to think that we’ve been wrong all this time or consider the countless decisions we’ve made with bad data. However, it often causes us to miss the importance of information that doesn’t corroborate our stories.
Rule 1: Search Your Feelings
Our perception of data is distorted by our feelings in ways we cannot detect without first evaluating those feelings. In How Emotions Are Made, Lisa Feldman Barrett explains how easy it is to misinterpret our bodily sensations and assign incorrect emotions to them. However, we give little space to the awareness that our feelings shape our thoughts.
In The Happiness Hypothesis, Jonathan Haidt explains that what we consider our consciousness is a rational rider sitting on top of an emotional elephant. (See also Switch.) It’s important to recognize that emotions are in control when they want to be – the rider only has the illusion of control. Daniel Kahneman in Thinking, Fast and Slow says that System 1 (the fast, instinctive one) will lie to System 2 by giving it incorrect or partial information. In other words, we don’t know when our feelings are getting the best of us.
In Superforecasting, Phil Tetlock explains how one of the most common acknowledgements from superforecasters when they miss is that they became involved with the outcome, and their judgement was clouded by their feelings. This is consistent with Daniel Kahneman’s work in Thinking, Fast and Slow, where he explains that our immediate processing can cloud the more thoughtful approach.
There’s no foolproof way to prevent it – but a good start is to acknowledge when feelings are involved.
Rule 2: Ponder Your Personal Experience
Fewer than 50% of people accept that the MMR vaccine doesn’t cause autism. Andrew Wakefield – who started the scare – lost his license to practice medicine, and The Lancet printed a retraction. The problem is that the connection is intuitive. MMR is given around the same time that autism is diagnosed, so it “feels” like there should be a connection. It becomes very hard when the statistics don’t match our personal experience.
A colleague of mine was going to a clinic during the heart of the COVID-19 outbreak where they were using ivermectin. Ivermectin is an anti-parasitic, best known to many as a horse dewormer. It makes no sense as a treatment for COVID, but people believed in it. His retort to me was that no one who sought treatment at the clinic had died; his experience was that it cured. I kept thinking “base rate.” On a sample size of 600, I wouldn’t have expected a death – and I wouldn’t be sure they would have known if one had occurred. The clinic wouldn’t necessarily have seen the same patient again, so how would they know whether a patient was cured or died? There’s no reporting requirement that would have ensured the clinic would find out. I kept wondering what would have happened without any treatment. It turns out that the outcome probably would have been about the same.
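To make the base-rate point concrete, here’s a minimal sketch. The fatality rate below is an assumption for illustration only – it isn’t a figure from the book or from the clinic – but it shows why zero deaths among 600 relatively healthy outpatients is unremarkable even if the treatment does nothing.

```python
# A minimal base-rate sketch. The fatality rate is an assumption for
# illustration only -- the point is the shape of the math, not the number.
fatality_rate = 0.002   # assumed rate among relatively healthy outpatients
patients = 600

expected_deaths = fatality_rate * patients
p_zero_deaths = (1 - fatality_rate) ** patients   # chance of zero deaths by luck alone

print(f"Expected deaths with no effective treatment: {expected_deaths:.1f}")
print(f"Chance of seeing zero deaths anyway: {p_zero_deaths:.0%}")   # roughly 30%
```

Zero observed deaths is weak evidence either way; the clinic’s experience can’t distinguish a cure from an ordinary base rate.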
George Washington (yes, the first president of the United States) likely died as much from his treatments – bloodletting – as from the ailment he was fighting. There are cases where we use bloodletting and leeches today, but in much narrower and more careful ways than we used to, because we know when they are and aren’t helpful.
It’s a related mistake to confuse correlation with causation. In my review of The Halo Effect, I detailed the sequence of events that led to the 2008 financial collapse. From the belief that if we could just increase home ownership we could increase economic stability, to the greed of financial services, we led ourselves right into a trap – because it seemed to make sense.
Rule 3: Avoid Premature Enumeration
Amy Edmondson was flustered. She had increased psychological safety in a hospital unit, and instead of the number of errors going down as expected, they went up. (See The Fearless Organization for more.) Luckily for her – and us – she realized that it wasn’t the actual error rate that went up, only the reported error rate. When it looked like things were getting worse, they were actually getting better.
This problem is particularly present in suicide statistics, where the reported rate of suicides in the United States keeps going up. We don’t know if that’s because it’s more acceptable for a coroner to mark suicide on the death certificate or whether it’s an actual change in the rate. (See Postmortem.) It used to be that some coroners required a suicide note, but we know that notes are very rare. (See Clues to Suicide.) We also know that more locales are slowly employing professional medical examiners instead of electing coroners who may or may not have any useful skills. (See Postmortem.) With suicide, everything hinges on intent, which is hard to ascertain in a dead person. (See Autopsy of a Suicidal Mind.)
With other statistics, the intent behind the counting is harder to understand. Consider the discussion about wealth in society. It’s a striking claim that a handful of people (fewer than 100, maybe even fewer than 10) have more wealth than the poorest half of the world. However, as Harford points out, it doesn’t take much to have more wealth than a billion people – he surmises that his son’s piggy bank holding $15 has more than the poorest billion people on the planet combined. His point is that summing zeros still leads to zeros.
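To see why summing zeros leads to zeros, here’s a toy calculation. The net worths below are made up, but the structure is the real point: when most people in the group have nothing, and some owe more than they own, a piggy bank beats the group total.

```python
# Hypothetical net worths (in dollars) for a handful of the world's poorest
# people -- most have nothing, and some owe more than they own.
poorest_net_worths = [0, 0, 0, 0, -250, 0, 0, -1200, 0, 0]

piggy_bank = 15   # the $15 piggy bank from Harford's example

group_total = sum(poorest_net_worths)
print(group_total)               # -1450: summing zeros (and debts) stays at or below zero
print(piggy_bank > group_total)  # True -- $15 outranks the whole group combined
```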
It’s for these reasons that, when we’re looking at statistics, we need to know who is doing the counting – and why. He cites a different perspective on inequity from Thomas Piketty – author of Capital in the Twenty-First Century.
Rule 4: Step Back and Enjoy the View
There’s an old joke about a helicopter pilot who got lost in the fog near Seattle, Washington. He came upon a building and asked an occupant by the window where he was. The response was, “You’re in a helicopter.” The pilot, to the astonishment of the passengers, proceeded immediately to the landing pad for a safe landing. When asked, he explained that the answer was technically accurate and practically useless, and therefore he knew he was speaking with Microsoft support. (My friends who work in Microsoft support aren’t fond of this joke.)
The joke may not be funny, but it’s indicative of a type of statistic that sounds important but means nothing. It’s not that these statistics are inaccurate, it’s just that they’re too narrow. They look at something that runs against the trend or obscures the larger picture – and it’s interesting, so it’s news.
The number of people showing up in the emergency room after having a stroke is (or at least was) increasing. It’s true – and it obscures the larger fact that strokes are going down. The awareness campaigns that lead people to go to the ER at the first sign of stroke result in an increase in ER visits. However, a story about how stroke awareness is up isn’t as catchy.
Rule 5: Get the Backstory
The fundamental premise of Barry Schwartz’s The Paradox of Choice hangs on an experiment with jam. Sheena Iyengar and Mark Lepper set up a stand to sell jam. When they offered six kinds, they sold more than when they offered twenty-four. The conclusion was clear: too many options, and you’d drive customers away. The problem is that when people tried to replicate the experiment, they didn’t get the same result. To be fair, there seems to be some reason to believe that helping people navigate complex choices is valuable. It’s why you see all the comparison charts.
However, I was personally aware of another plausible reason for selling less jam: satiation. The need for jam was satisfied in the first foray into the square, so the second time around, people didn’t really need or want any more jam. I saw this when I had a small cadre of folks selling candy in junior high school. It was going great until people got tired of the candy we were selling. No amount of price discounting mattered.
Part of getting the backstory is recognizing that we only hear about the results that survived to be retold. It’s called survivorship bias, and it impacts more than our perception of the benevolent dolphins that guide humans back to shore. (See Mistakes Were Made (But Not by Me) for more about benevolent dolphins.) Abraham Wald observed the same thing when asked how to improve the survivability of airplanes. Confronted with reams of data about the planes that survived with bullet holes, he was concerned about the data that was missing – all the planes that didn’t make it back. Perhaps the places with bullet holes were survivable, and the places without them weren’t?
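A small simulation, with made-up hit locations and survival odds, illustrates Wald’s insight: if hits to the engine are the ones that bring a plane down, the returning planes will show the fewest holes exactly where the armor belongs.

```python
import random

random.seed(42)

# Assumed, purely illustrative probabilities that a hit in each area downs the plane.
lethality = {"wings": 0.05, "fuselage": 0.10, "engine": 0.60, "cockpit": 0.50}

observed_hits = {area: 0 for area in lethality}
for _ in range(10_000):
    area = random.choice(list(lethality))   # hits land roughly uniformly across the plane
    if random.random() > lethality[area]:   # the plane survives and comes home to be inspected
        observed_hits[area] += 1

# The areas with the fewest recorded holes are the most dangerous ones --
# planes hit there rarely made it back to be counted.
print(observed_hits)
```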
Sometimes, perverse incentives lead us away from the very thing we should be seeking. For instance, I wrote about The Ethics of Encouraging Dishonesty, because the incentives of a particular rubric required people to lie – about using a poor framework. In the world of academic journals, it’s led to an epidemic of replication problems, where one result isn’t enough – you must wait for others to replicate it. And if you don’t find a replication, you don’t know whether it wasn’t attempted or whether the replication failed, since null results rarely get published. It’s even hard to get replications published. So much for leading the way for solid science.
Sometimes the rush to create sensation means that we don’t adhere to the rules that people would reasonably expect. Consider a contest to create the best-performing financial portfolio. Each person enters a portfolio in the hopes of a win. Unbeknownst to the contestants, the organizers entered their own portfolios – not just one per person, but a whole cluster of portfolios. In the end, they got some of the best results with one of their portfolios, and you believe that they’re stock-picking wizards.
Except that anyone might get the best-performing portfolio given twenty or two hundred chances. I can guarantee that I’ll win the lottery if I buy a ticket with every single number combination – but that doesn’t mean I’m good at picking lottery numbers.
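A quick simulation, with random returns standing in for real portfolios, shows how the trick works: enter enough randomly chosen portfolios, and the best of them looks like the work of a wizard.

```python
import random

random.seed(0)

def random_portfolio_return():
    # A made-up market: every portfolio's annual return is pure noise.
    return random.gauss(0.07, 0.15)

honest_entry = random_portfolio_return()
organizer_entries = [random_portfolio_return() for _ in range(200)]

print(f"One honest entry:           {honest_entry:.1%}")
print(f"Best of 200 random entries: {max(organizer_entries):.1%}")
# With 200 tries, the best entry routinely trounces the honest one --
# and skill had nothing to do with it.
```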
Rule 6: Ask Who Is Missing
Solomon Asch created an experiment where people were asked an easy question: which line in a set of three was the same length as a reference line? Left to their own devices, participants were accurate nearly 100% of the time. However, when confederates of the researcher consistently gave a wrong answer, so did the participant. (See Unthink for more on Asch.) There’s more to this story, including that even a small amount of dissension among the confederates caused the effect to disappear. But a more interesting question is: who were Asch’s participants?
Like many researchers, Asch used the raw material of his academic environment: students. This has been a consistent criticism of research that’s WEIRD – Western, Educated, Industrialized, Rich, and Democratic. But Asch’s sample was biased in one other way that wasn’t easy to detect. All his subjects were male. We have no idea how women might have reacted, because he didn’t check. Similarly, Stanley Milgram’s famous shock experiment used only males. (See Influencer for more on Milgram’s experiments.)
The Literary Digest invested in the development of a very large survey. Using automobile registrations and telephone directories, they mailed out a huge number of surveys asking who the recipients would vote for – Franklin D. Roosevelt or Alf Landon. The year was 1936, and they predicted a landslide victory for Landon. The problem was that their sample, though large, had a prosperity bias.
The people who received the surveys – and potentially those who responded – were wealthier than average by the nature of owning a car or having a telephone. Quite predictably, they picked the Republican candidate. George Gallup didn’t have nearly as much data; his sample was tiny in comparison.
However, Gallup paid attention to the makeup – the demographics – of the sample and produced a correct (and better) answer by matching the demographics of those who voted. Sometimes, it’s not just who is excluded from the data that matters but also the weighting of those who are included.
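Here’s a minimal sketch of the difference, using invented numbers rather than the actual 1936 figures: a huge but skewed sample misses badly, while a small sample reweighted to match the electorate lands close to the truth.

```python
# Invented illustration of the 1936 lesson -- not the actual polling data.
# Assume 30% of actual voters are wealthy and that wealthier voters lean Landon.
electorate = {"wealthy": 0.30, "not_wealthy": 0.70}          # true population shares
support_roosevelt = {"wealthy": 0.35, "not_wealthy": 0.71}   # assumed support by group

# The Literary Digest approach: an enormous sample drawn mostly from the well-off.
digest_sample = {"wealthy": 0.75, "not_wealthy": 0.25}
digest_estimate = sum(digest_sample[g] * support_roosevelt[g] for g in digest_sample)

# The Gallup approach: a much smaller sample, reweighted to match the electorate.
gallup_estimate = sum(electorate[g] * support_roosevelt[g] for g in electorate)

print(f"Biased mega-sample estimate:    {digest_estimate:.0%}")  # ~44% -- predicts Landon
print(f"Weighted small-sample estimate: {gallup_estimate:.0%}")  # ~60% -- predicts Roosevelt
```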
Rule 7: Demand Transparency When the Computer Says No
Spurious correlation is what happens when things are correlated for essentially random reasons – relationships that eventually fall apart. Computer scientists with a strong desire to see artificial intelligence succeed turn their heads from the problem. Artificial intelligence uses the computer to find correlations in the data, and proponents suggest that you can trust those correlations – but experience says differently. Google developed a predictor for influenza infection rates using only search data.
Let’s pause for a second and look at what the inputs really are. They’re seeing people searching for symptoms and what to do. This is the earliest signal. When people start feeling bad, they start looking for remedies. It seems good – and it was – until it wasn’t. Google eventually shut down the program but not before seeing the model completely fall apart.
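A short sketch of why a search-only model can fool you: given enough candidate search terms, some will line up with flu rates purely by chance, and those chance alignments tell you nothing about the next season. Everything below is simulated noise, not Google’s data.

```python
import random

random.seed(1)

def correlation(xs, ys):
    # Plain Pearson correlation, written out to keep the sketch dependency-free.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Pretend weekly "flu rates" and 1,000 unrelated search-term series of pure noise.
flu = [random.random() for _ in range(52)]
search_terms = [[random.random() for _ in range(52)] for _ in range(1000)]

best = max(abs(correlation(term, flu)) for term in search_terms)
print(f"Strongest correlation found in pure noise: {best:.2f}")
# Screen enough candidates and a seemingly meaningful correlation always
# appears -- which is exactly the kind of relationship that later falls apart.
```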
In cases where the false positives the algorithms detect aren’t harmful, it doesn’t matter. Consider the case that Charles Duhigg covered in The Power of Habit: Target mailed pregnancy coupons to a man’s daughter, and he complained to the store manager about it – before acknowledging that there were things going on in his house that he hadn’t been aware of. Certainly, it outed a pregnant daughter before she likely wanted her pregnancy known.
Told as a single anecdote, the example sounds spookily good. How did Target know? The answer is that it made an educated guess. People who buy things like prenatal vitamins are often pregnant. Other, less obvious items are also frequently purchased soon after a woman finds out that she’s pregnant. What the story omits is the number of people who got the same packet but who weren’t pregnant. Some set of correlations showed up – but the predictive capacity is well below 100%.
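A rough base-rate calculation, with assumed numbers rather than Target’s, shows why even a good-looking model mails plenty of coupons to people who aren’t pregnant.

```python
# All of these figures are assumptions for illustration -- not Target's data.
shoppers = 1_000_000
base_rate = 0.02            # share of shoppers who are currently pregnant
true_positive_rate = 0.80   # pregnant shoppers the model correctly flags
false_positive_rate = 0.05  # non-pregnant shoppers the model flags anyway

flagged_pregnant = shoppers * base_rate * true_positive_rate
flagged_not_pregnant = shoppers * (1 - base_rate) * false_positive_rate
precision = flagged_pregnant / (flagged_pregnant + flagged_not_pregnant)

print(f"Flagged and pregnant:     {flagged_pregnant:,.0f}")      # 16,000
print(f"Flagged and not pregnant: {flagged_not_pregnant:,.0f}")  # 49,000
print(f"Share of flagged shoppers actually pregnant: {precision:.0%}")  # ~25%
# Most people who receive the mailer were never pregnant -- we just never
# hear their unremarkable stories.
```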
The story is different when the predictive model has more important consequences. Consider the idea of using it to evaluate bail risk. The inequities that exist in the data may be reflected in the results of the system as well. People may be temporarily deprived of their freedom based on a model no one is allowed to examine.
Rule 8: Don’t Take Statistical Bedrock for Granted
It’s important to recognize who is producing the statistics and what motivations they may have – internally or through external pressure – that may impact the numbers. Statistics themselves should be impartial. They should be beyond the influence of politicians who would bend the numbers for their benefit, but they’re not.
When people are threatened – and killed – because they insist on delivering the truth, we can’t expect them to continue to stand firm. In Find Your Courage, the point is made that courage is doing the right thing in the face of fear – but it’s not reasonable to expect courage when you or your family are in danger.
Sometimes, the forces that drive us away from what is true and correct are the forces of commerce. Books like Infonomics draw a clear, bright line connecting information – including statistics – to organizational value. That commercial focus can obscure the true nature of the data.
Sometimes, it’s not the people who make the statistics who are threatened but the very value of their statistics. People call them “fake,” “false,” or other names to try to inject doubt about their validity. The problem is that these attacks don’t just harm the targeted statistics but all statistics. In fact, they erode the very foundation of trust (see Trust: Human Nature and the Reconstitution of Social Order).
Rule 9: Remember that Misinformation Can Be Beautiful, Too
Just because it’s pretty doesn’t make it right. We look at graphs that instantly allow us to understand something without poring through reams of numbers. Infographics have become more popular as we’ve gained cheap and easy access to tools for creating such visualizations. Simple “errors” can make something look dramatically worse than it is. A number doubles, and it’s represented as an area, so both dimensions are doubled – making the image four times as large rather than twice.
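Here’s the arithmetic behind that visual trick: scale both the width and the height of an icon by the growth in the number, and the picture’s area grows by the square of that factor.

```python
growth = 2                 # the statistic doubled
width, height = 1.0, 1.0   # original icon dimensions

new_area = (width * growth) * (height * growth)
old_area = width * height

print(new_area / old_area)  # 4.0 -- the image looks four times as big, not twice
```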
The lesson is that seeing it shouldn’t be believing it when it comes to statistics.
Rule 10: Keep an Open Mind
It’s the takeaway from Superforecasting as well. The people who predicted the best weren’t the hedgehogs who knew one thing but rather the foxes who kept their options and their learning open. (See Good to Great and Should You Be a Fox or a Hedgehog?) The more we’re able to disconnect from the outcomes of the data, the more we can peer clearly into its roots and see what it is – and is not – saying. (See The Happiness Hypothesis for more on detachment.)
The Golden Rule: Be Curious
The dog’s head tilt that indicates confusion is the first step. For humans, it’s an opportunity to start the learning experience. It’s confusion that can lead to curiosity and a desire to understand how something can be. When we encounter something like an Escher drawing that cannot exist in the three-dimensional world, we look to make sense of what we see. It’s this attitude that can make us The Data Detective.