My vote for the foggiest assertion of the pandemic to date is that the U.S. has an abysmally high number of COVID-19 cases — more than 29 million as of March 8, 2021 — because of testing. “If you don’t test, you don’t have any cases,” former President Donald Trump said during a televised White House roundtable on June 15, 2020. “If we stopped testing right now, we’d have very few cases, if any.” I wonder how many people who heard that had the same first thought as I did: Correlation is not causation.
This isn’t to say that correlation — the idea that two or more things are associated in some way — isn’t valuable. Indeed, there is big money in correlation. In order to peddle subscriptions, Pandora doesn’t need to know what causes people who listen to The Grateful Dead to also listen to Phish. To bulk up its sales, Amazon doesn’t need to know what causes people who buy a Paleo diet book to also buy beef jerky.
Though associations gleaned from big data drive recommendation engines and bolster corporate revenues, they have their limitations. Imagine trying to control a viral pandemic by refusing to test people for the virus.
The passive observation of data has limited value, because, as Judea Pearl reminds readers several times in The Book of Why: The New Science of Cause and Effect, data is profoundly dumb. “Data can tell you that the people who took a medicine recovered faster than those who did not take it, but they can’t tell you why,” writes the director of UCLA’s Cognitive Systems Lab. “Maybe those who took the medicine did so because they could afford it and would have recovered just as fast without it.”
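Pearl's medicine example can be made concrete with a toy calculation. The sketch below uses an invented population in which the medicine does nothing at all, but wealthier people both buy it more often and recover faster for other reasons. Every number, and the names `groups` and `mean_recovery`, are assumptions made up for illustration.

```python
# Hypothetical population: (can_afford, took_medicine) -> (people, mean recovery days).
# All numbers are invented; by construction the medicine has no effect.
groups = {
    (True, True): (80, 5.0),
    (True, False): (20, 5.0),
    (False, True): (20, 10.0),
    (False, False): (80, 10.0),
}

def mean_recovery(took_medicine, can_afford=None):
    """Weighted mean recovery time, optionally restricted to one wealth group."""
    rows = [
        (n, days)
        for (afford, took), (n, days) in groups.items()
        if took == took_medicine and (can_afford is None or afford == can_afford)
    ]
    total = sum(n for n, _ in rows)
    return sum(n * days for n, days in rows) / total

# What the raw data "says": medicine takers recover three days sooner.
naive_gap = mean_recovery(True) - mean_recovery(False)

# Comparing like with like: within each wealth group the gap vanishes.
gap_by_wealth = {
    afford: mean_recovery(True, afford) - mean_recovery(False, afford)
    for afford in (True, False)
}

print(naive_gap)      # -3.0
print(gap_by_wealth)  # {True: 0.0, False: 0.0}
```

The three-day advantage is entirely a product of who could afford the medicine, which is exactly the kind of "why" the raw association cannot supply.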
Association, which Pearl, a Turing Award winner, identifies as the first of three steps on his ladder of causation, won’t help executives answer many of the questions they need to ask when formulating corporate strategy, making investment decisions, or setting prices. To answer questions such as, “What will raising prices by 10 percent do to revenues?” you need to start climbing Pearl’s ladder.
Intervention is the second step on the ladder. “Intervention ranks higher than association because it involves not just seeing but changing what is,” Pearl writes. That’s why companies are running scads of randomized controlled experiments these days. (Google ran 10,000 experiments in 2018.) They are changing things on a small scale to figure out what effects an action will produce on a large scale. Real-world experiments aren’t a necessity — you can get a machine to figure out the effects of an intervention without actually changing anything in the real world. But in either case, virtual or real, and unlike with association, which machines can determine on their own, intervention requires the creation of a causal model to determine what effect a 10 percent price hike will have on revenues.
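A randomized price test is one way to climb to this second rung. The simulation below is a minimal sketch, not a description of how any company actually runs experiments: the demand curve, the hidden `loyalty` trait, and the function name `simulate_price_test` are all assumptions. What it illustrates is that a coin flip decides who sees the 10 percent hike, so hidden customer traits are balanced across the two groups and the revenue gap can be read as the effect of the price change.

```python
import random

def simulate_price_test(n_visitors=40_000, base_price=100.0, hike=0.10, seed=0):
    """Randomly assign visitors to the current price or a 10 percent hike.

    The demand curve is a made-up assumption, not real data.
    Returns average revenue per visitor for each arm of the test.
    """
    rng = random.Random(seed)
    revenue = {"control": 0.0, "treatment": 0.0}
    visitors = {"control": 0, "treatment": 0}
    for _ in range(n_visitors):
        loyalty = rng.uniform(0.2, 1.0)               # hidden trait, never observed
        arm = rng.choice(["control", "treatment"])    # the coin flip is the intervention
        price = base_price * (1 + hike) if arm == "treatment" else base_price
        # Assumed demand: loyal visitors buy more often, higher prices deter buyers.
        buy_probability = loyalty * (1.5 - price / base_price)
        visitors[arm] += 1
        if rng.random() < buy_probability:
            revenue[arm] += price
    return {arm: revenue[arm] / visitors[arm] for arm in revenue}

per_visitor = simulate_price_test()
effect = per_visitor["treatment"] - per_visitor["control"]
print(per_visitor, effect)
```

Under these invented assumptions the hike lowers revenue per visitor, because the drop in purchases outweighs the higher price. A different assumed demand curve could just as easily reverse the sign, which is why the experiment has to be run rather than guessed.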
The third and highest rung on Pearl’s causation ladder is counterfactuals. Pursuing causation at this level means determining what would have happened if your company had done something differently in the past. For instance, what would revenues be today if you had cut prices by 10 percent a year ago? This is a brain twister, because you can’t go back and cut prices a year ago, but being able to answer counterfactual queries is essential to scenario planning, among other activities. Pearl describes the benefits of counterfactual reasoning as “flexibility, the ability to reflect and improve on past actions, and, perhaps even more significant, our willingness to take responsibility for past and current actions.”
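For readers who want to see the price-cut question worked through, here is a small sketch of the three-step recipe Pearl gives for counterfactuals: abduction, action, and prediction. It assumes a deliberately simple linear demand model. The constants `BASE_DEMAND` and `SENSITIVITY`, the observed sales figures, and the function names are hypothetical and stand in for a causal model an analyst would have to build and defend.

```python
# Hypothetical structural model: units = BASE_DEMAND - SENSITIVITY * price + shock.
# The coefficients are invented for illustration only.
BASE_DEMAND = 2000.0
SENSITIVITY = 10.0

def units_sold(price, shock):
    """Units sold at a given price, plus whatever the model does not explain."""
    return BASE_DEMAND - SENSITIVITY * price + shock

def counterfactual_revenue(actual_price, actual_units, what_if_price):
    # Step 1 (abduction): infer the unobserved shock from what actually happened.
    shock = actual_units - units_sold(actual_price, shock=0.0)
    # Step 2 (action): swap in the price we wish we had charged.
    # Step 3 (prediction): rerun the model, holding the inferred shock fixed.
    return what_if_price * units_sold(what_if_price, shock)

# Invented history: last year we charged 100 and sold 1,050 units.
actual_revenue = 100.0 * 1050.0
what_if = counterfactual_revenue(100.0, 1050.0, what_if_price=90.0)
print(actual_revenue, what_if)  # 105000.0 103500.0
```

In this made-up world the 10 percent cut would have sold more units but brought in slightly less revenue. The answer is only as good as the assumed model, which is the point: the data alone cannot answer the question.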
Judea Pearl is operating at a level that is far above my embarrassingly limited understanding of mathematics (which ends somewhere in the middle of Calculus I). But his ladder of causation is an accessible and intriguing conceptual tool for leaders. It offers a way of understanding what making a decision entails and how to go about getting answers. If seeing what’s happening is enough to make a decision, maybe the right answer is floating around in your company’s data lake. If you need to know what will happen when you act, turn to an experiment or a causal model. If the question concerns something you can neither observe nor try, you may need bigger analytical guns, like mediation analysis, which estimates how much of an effect flows through each variable in a hypothesized causal chain.
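To give a flavor of what mediation analysis does, the sketch below splits the effect of an ad budget increase into the part that flows through site visits and the part that bypasses them. The causal chain, the coefficients, and the function names are all invented for illustration. The model is linear, which is the simplest case and the only reason the two parts add up neatly to the total.

```python
# Hypothetical linear causal chain: ad spend -> site visits -> sales,
# plus a direct path from ad spend to sales. All coefficients are invented.
VISITS_PER_DOLLAR = 2.0          # ad spend -> visits
SALES_PER_VISIT = 0.5            # visits -> sales
DIRECT_SALES_PER_DOLLAR = 0.25   # ad spend -> sales, bypassing visits

def visits(ad_spend):
    return VISITS_PER_DOLLAR * ad_spend

def sales(ad_spend, visit_count):
    return DIRECT_SALES_PER_DOLLAR * ad_spend + SALES_PER_VISIT * visit_count

def decompose(old_spend, new_spend):
    """Split the sales change into direct and indirect (via visits) parts."""
    baseline = sales(old_spend, visits(old_spend))
    total = sales(new_spend, visits(new_spend)) - baseline
    # Direct effect: change spend but freeze visits at their old level.
    direct = sales(new_spend, visits(old_spend)) - baseline
    # Indirect effect: keep spend fixed but let visits respond to the new spend.
    indirect = sales(old_spend, visits(new_spend)) - baseline
    return {"total": total, "direct": direct, "indirect": indirect}

effects = decompose(100.0, 110.0)
print(effects)  # {'total': 12.5, 'direct': 2.5, 'indirect': 10.0}
```

Here most of the sales lift comes by way of extra visits, something no amount of staring at the sales total would reveal.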
The ladder also offers a way of judging the analytical prowess of your organization. Data-driven management gets a lot of lip service these days, but how well equipped is your company to practice it? Do managers have access to the tools and expertise needed to make decisions that live at the association, intervention, and counterfactual levels of Pearl’s ladder? At a time when companies are making huge investments in data science, these seem like questions that leaders should be able to answer.