Causal Inference: Beyond the Limits of Conventional Machine Learning

May 12, 2023

Prof. Daniel Franks

Introduction

Conventional AI is blind to causality. It is fixated on outcomes, treating every variable as a mere servant to prediction. Variables are tossed into a vortex of black-box machinery that churns out predictions while conflating correlation with causation. Even innovative methods like reinforcement learning succumb to the siren song of spurious correlation.

What if you want to know which variables causally affect the outcome, and by how much? This is the case for most 'why' or 'what if' questions. A common misconception is that explainable AI answers them. It does not: explainable AI tells you how the model arrives at its predictions, which may have little to do with causality. Conventional AI will exploit spurious correlations at every opportunity if they pave an easier path to the predicted outcome. Interpreted causally, these spurious correlations masquerade as cause and effect. This is a problem because the most important questions in business and science are about causation, not meaningless correlation.

Establishing cause and effect is difficult and requires a more principled scientific approach. By embracing causal AI we can reason about cause and effect, yielding insights and better decision-making. In this blog post, we briefly examine some of the most basic reasons why conventional AI fails to establish cause and effect. In later posts we will discuss causal AI, how it solves these problems, and much more.


Causal Challenges to Conventional AI

Confounders: Common causes

A confounder is a variable that affects two or more other variables, fabricating a spurious, non-causal relationship between them. You can think of a confounder as a 'common cause' of other variables. Let's take a toy three-variable example: do ice cream sales cause sunburns? Intuitively, we know they don't. Both are influenced by weather: sunny days boost ice cream sales and increase sunburn risk.



The figure on the left shows a causal diagram, with arrows showing the direction of causality from one variable to another. Here, sunny weather is a confounder: a common cause of both ice cream sales and sunburns. A conventional black-box model, armed only with ice cream sales, would establish a spurious relationship between ice cream sales and sunburn (figure on the right). Increasing ice cream sales in the model would increase the predicted chance of sunburn. But this correlation is produced by weather, the common cause. Controlling for weather, by conditioning on it in the analysis, closes the non-causal path from ice cream sales to sunburn. This is what a causal model would do.
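To make this concrete, here is a minimal simulation sketch in Python (using numpy; all variable names and effect sizes are invented for illustration). Weather drives both ice cream sales and sunburns, while sales have no causal effect at all. A naive regression finds a strong 'effect' of sales on sunburn; conditioning on the confounder makes it vanish.

import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process: sunny weather is a common cause
# of both ice cream sales and sunburns; sales have NO effect on sunburns.
weather = rng.normal(size=n)                    # sunniness
ice_cream = 2.0 * weather + rng.normal(size=n)  # sales driven by weather
sunburn = 1.5 * weather + rng.normal(size=n)    # burns driven by weather

def ols(y, *covariates):
    # Least-squares coefficients of y on an intercept plus the covariates.
    X = np.column_stack([np.ones_like(y), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Naive model: regress sunburn on ice cream sales alone.
print(ols(sunburn, ice_cream)[1])           # ~0.6: a spurious 'effect'

# Causal adjustment: condition on the confounder (weather).
print(ols(sunburn, ice_cream, weather)[1])  # ~0.0: the true (null) effect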


Simpson's Paradox, a special case of confounding, occurs when population-level trends reverse at the group level. The figure above shows an example. Looking at the population as a whole (figure on the right), it appears that exercise increases cholesterol. But within each age category the relationship is reversed: exercise decreases cholesterol (figure on the left). Age is the confounder: a common cause of both cholesterol and exercise.
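Here is the same reversal as a hedged sketch, again with made-up numbers: within every age group the exercise-cholesterol slope is negative, but older people both exercise more and have higher cholesterol, so the pooled slope flips sign.

import numpy as np

rng = np.random.default_rng(1)
groups = []
for age in (20, 40, 60):
    n = 2_000
    # Older people exercise more AND have higher baseline cholesterol;
    # within a group, exercise lowers cholesterol (slope -1).
    exercise = age / 10 + rng.normal(size=n)
    cholesterol = 3.0 * age / 10 - exercise + rng.normal(size=n)
    groups.append((age, exercise, cholesterol))

def slope(x, y):
    # Slope of the least-squares line of y on x.
    return np.polyfit(x, y, 1)[0]

for age, exercise, cholesterol in groups:
    print(f"age {age}: slope = {slope(exercise, cholesterol):+.2f}")  # ~ -1.0

pooled_ex = np.concatenate([g[1] for g in groups])
pooled_ch = np.concatenate([g[2] for g in groups])
print(f"pooled: slope = {slope(pooled_ex, pooled_ch):+.2f}")          # ~ +1.2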

You might stop here to ask: why not just throw all the variables into the model? But that is a perilous approach to causality. The sections below, and our upcoming post on the Mutual Adjustment Fallacy, give part of the answer.

Mediators: Causal intermediaries

Mediators are the go-betweens that relay causality from one variable to another, like weight loss mediating the relationship between exercise and heart health. Depending on our questions, we may seek the total causal effect, the direct effect, or the mediated effect. Causal AI offers tailored answers to suit our specific causal inquiries.


The figure above shows the causal structure (left), where exercise affects health status directly (the arrow from exercise to health) but also indirectly through weight. We often want the total causal effect of exercise on health, in which case closing the path through weight would be wrong. In the conventional AI model on the right, all variables are carelessly thrown into the model, blocking this important causal pathway.
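A minimal sketch of the difference, with invented effect sizes: the direct effect of exercise on health is 0.5 and the indirect effect through weight loss is 1.0 x 0.8 = 0.8, so the total effect is 1.3. Leaving the mediator out of the regression recovers the total effect; carelessly including it blocks the indirect path and returns only the direct effect.

import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Hypothetical structure: exercise -> weight loss -> health, plus a
# direct exercise -> health arrow. All coefficients are made up.
exercise = rng.normal(size=n)
weight_loss = 1.0 * exercise + rng.normal(size=n)
health = 0.5 * exercise + 0.8 * weight_loss + rng.normal(size=n)

def ols(y, *covariates):
    # Least-squares coefficients of y on an intercept plus the covariates.
    X = np.column_stack([np.ones_like(y), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Mediator excluded: the total causal effect, 0.5 + 1.0 * 0.8 = 1.3.
print(ols(health, exercise)[1])               # ~1.3

# Mediator included: the indirect path is blocked, leaving ~0.5.
print(ols(health, exercise, weight_loss)[1])  # ~0.5, direct effect only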

Most of the time we are interested in the total causal effect. But sometimes we are interested in only the direct effect or only the indirect effect. It all depends on your questions, but this is the beauty of causal AI: it can provide answers based on your specific causal questions, rather than the hidden assumptions of a cookie-cutter off-the-shelf AI model.

Colliders: A causal meeting point

Colliders are the junctures where two causal paths converge. The arrows here point in the opposite direction to those of a confounder: into the variable rather than out of it. In contrast to confounders, paths through colliders are already closed, and conditioning on a collider opens a path. Doing so can create a spurious relationship between the collider's two causes. For example, suppose you want to study the causal relationship between talent and hard work. Does talent cause hard work?


Suppose we gather data from university students. Talent and hard work both contribute to admission to elite universities (through factors such as grades), making elite institution status a collider (as shown in the left figure). If a model conditions on institution status, or the data is collected solely from elite institutions, a spurious correlation emerges between talent and hard work, devoid of causal basis (as depicted in the right figure). Very occasionally you do need to condition on a collider. This is legitimate as long as you can still close any non-causal paths that it opens. But much care needs to be taken with colliders.
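A minimal sketch of collider bias, once more with invented numbers: talent and hard work are generated independently, both raise the chance of admission, and restricting the sample to admitted students manufactures a correlation out of nothing. It comes out negative here because, among the admitted, the less talented must have worked harder to get in.

import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Talent and hard work are INDEPENDENT by construction; both feed into
# admission to an elite university (the collider).
talent = rng.normal(size=n)
hard_work = rng.normal(size=n)
admitted = talent + hard_work + rng.normal(size=n) > 1.5

# Full population: no correlation, as it should be.
print(np.corrcoef(talent, hard_work)[0, 1])                      # ~0.0

# Condition on the collider by sampling only admitted students:
# a spurious (negative) correlation appears.
print(np.corrcoef(talent[admitted], hard_work[admitted])[0, 1])  # ~ -0.35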

Conclusion

Every variable occupies a unique position in the web of causality, yet traditional AI remains fixated on outcomes, pressing every variable into the service of prediction. For basic illustration we have focused on short paths of two or three variables, but it gets more difficult, and more interesting, than that. Decoding cause and effect is a formidable task, yet causal AI surpasses traditional AI by tackling questions of real importance. As causal AI's popularity surges, its influence is felt across industries, from healthcare and finance to marketing and policymaking. By championing causal AI, we make better-informed decisions and unlock vital insights. In the game of causation, causal AI emerges victorious, casting a discerning light on the intricate web of cause and effect.

Future posts will showcase the exceptional capabilities of causal inference, exploring more complex scenarios, simulated interventions, counterfactuals, structural causal models, latent (unobserved) confounders/colliders, synthetic control, missing data, causal temporal analysis, measurement error, and much more.