Class 10 Causal Inference and Randomized Controlled Trials

Author: Dr Wei Miao
Affiliation: UCL School of Management
Published: November 1, 2023

1 Causal Inference

1.1 Our Journey So Far

  • Any business activity brings benefits and costs. In all the case studies so far, the benefit information was given to us:

    • PineApple (Week 1): influencer marketing increases sales by 2.5%

    • 1st assignment (Week 4): loyalty program increases retention rate to 96%

  • In reality, this benefit information is often not readily available, and we need to estimate it using causal inference tools.

1.2 Causal Inference Road Map

1.3 Learning Objectives

  • Understand key concepts of causal inference and Rubin’s potential outcome framework

  • Learn the steps to conduct randomized controlled trials (RCTs)

1.4 Why Causal Inference Matters? Example 1

Tom purchases Google banner ads to advertise his new bubble tea shop. Google ads are targeted to individuals who are predicted to have a higher likelihood of being bubble lovers. In the end, some Google users saw no ads (casual bubble users) and some saw the ads (bubble lovers). The purchase rates for each group are shown below.

Question: Can Tom confidently conclude that the Google ads are effective at converting new customers?

1.5 Why Causal Inference Matters? Example 2

Tom bought marketing survey data from a consulting agency. The survey collected prices and store visits (sales) for different bubble tea shops in Canary Wharf. Tom finds that there is a positive correlation between prices and store visits.

Question: Can Tom conclude that he should also raise the prices at his bubble tea shop to increase store visits?

1.6 Why Causal Inference Matters? Example 3

This is a plane that just returned from the battlefield. Red dots are bullet holes.

Which of the parts A, B, and C would you reinforce to increase the pilot’s survival rate?

1.7 Why Causal Inference Matters? Example 4

I have held a secret from you for a long time….

1.8 Nobel Prize in Economics (2021)

[…] the other half jointly to Joshua D. Angrist and Guido W. Imbens “for their methodological contributions to the analysis of causal relationships.”

1.9 Causal Inference

  • Causal inference is the process of estimating the unbiased causal effects of a particular policy intervention on the outcome variable.

  • Correlation != Causation: On rainy days, we observe more umbrellas on the street

    • correct correlation statement: number of umbrellas is positively correlated with rainfall

    • correct causal statement: heavier rain leads to more umbrellas

    • incorrect causal statement: more umbrellas lead to heavier rain

  • Causality becomes more complex in the business world. Managers can easily make mistakes without causal inference training.

    • Imagine that the actual causal effect of Google ads on incremental profit per customer is £1 for Tom, while Tom pays £1.5 for each click
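The figures in this example can be combined into a net return per click. This is only a back-of-the-envelope sketch: it assumes that each converted customer corresponds to exactly one paid click, which the example does not state.

\[ \text{net profit per click} = \pounds 1 - \pounds 1.5 = -\pounds 0.5 \]

Under that assumption, Tom would lose £0.5 on every click, even if a simple comparison of purchase rates made the ads look successful.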

2 Potential Outcome Framework

2.1 Rubin Causal Model and the Potential Outcome Framework

  • The Rubin causal model (RCM), also known as the Potential Outcome Framework, is a widely accepted framework for estimating causal effects.
  • For each customer \(i\), we can define two potential outcomes to evaluate the causal effect of Google ads on their outcomes:
    • \(Y^1_i\): the outcome if the customer sees the ads

    • \(Y^0_i\): the outcome if the customer does not see the ads

    • The causal effect of ads is the difference between the two

  • We can define the individual treatment effect \(\delta_i\) as follows.

\[ \delta_i = Y^1_i - Y^0_i \]

2.2 Examples of Individual Treatment Effects

Tom would like to measure the causal effect of introducing a loyalty program on customer retention. Let’s say we have retrieved all the Infinity Stones from Thanos and have created two parallel universes:

  • In Universe 1, with loyalty program

    • Dr Strange has a retention rate of 70%
  • In Universe 2, without loyalty program

    • Dr Strange has a retention rate of 60%

| Customer | Y1 | Y0 | Treatment effect (Y1 - Y0) |
|---|---|---|---|
| Dr Strange | 0.7 | 0.6 | 0.1 |

2.3 A Motivating Example: A Group of Customers

  • We can collect a sample of customers and work out the individual treatment effect for each of them:

| Customer | Y1 | Y0 | Treatment effect (Y1 - Y0) |
|---|---|---|---|
| Dr Strange | 0.70 | 0.60 | 0.10 |
| Iron Man | 0.55 | 0.50 | 0.05 |
| Thor | 0.80 | 0.72 | 0.08 |
| Hulk | 0.60 | 0.62 | -0.02 |
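
The table can be reproduced with a few lines of Python. This is a minimal sketch that simply applies \(\delta_i = Y^1_i - Y^0_i\) to the four customers listed above; the variable names are illustrative, and it presumes we can see both potential outcomes, which is only possible in the parallel-universe thought experiment.

```python
# Potential outcomes from the table: (Y1 with loyalty program, Y0 without).
customers = {
    "Dr Strange": (0.70, 0.60),
    "Iron Man": (0.55, 0.50),
    "Thor": (0.80, 0.72),
    "Hulk": (0.60, 0.62),
}

# Individual treatment effect: delta_i = Y1_i - Y0_i.
# Rounding to 2 decimals only removes floating-point noise.
effects = {name: round(y1 - y0, 2) for name, (y1, y0) in customers.items()}

# Simple average of the individual effects across the four customers.
average_effect = sum(effects.values()) / len(effects)

for name, delta in effects.items():
    print(f"{name}: {delta:+.2f}")
print(f"Average effect: {average_effect:.4f}")
```

Note that the effect is not the same for everyone: Hulk's retention is slightly lower with the loyalty program, while the other three customers benefit.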

2.4 Fundamental Problem of Causal Inference

To measure the treatment effect of a loyalty program on a customer’s retention rate:

  • We need to compare the outcome of the same customer in parallel universes, with and without the treatment.

  • However, in reality, we only observe one of these two outcomes, but never both.

    • Realized (actual) outcome: \(Y^1_i|D_i = 1\) and \(Y^0_i|D_i = 0\), where \(D_i = 1\) if customer \(i\) receives the treatment and \(D_i = 0\) otherwise

    • Counterfactual outcome: \(Y^0_i|D_i = 1\) and \(Y^1_i|D_i = 0\)

2.5 Fundamental Problem of Causal Inference

  • Example:

| Customer | Treated | Y1 | Y0 | Y1 - Y0 |
|---|---|---|---|---|
| Dr Strange | Yes | 0.7 | ? | ? |
| Iron Man | No | ? | 0.5 | ? |
| Thor | No | ? | 0.72 | ? |
| Hulk | Yes | 0.6 | ? | ? |

  • Since it is impossible to see both potential outcomes at once, one of the potential outcomes is always missing, so we can never observe the individual treatment effects. This dilemma is called the Fundamental Problem of Causal Inference.
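
With only the observed outcomes from this example, the best we can do without further assumptions is compare the treated and untreated customers as groups. The sketch below does exactly that; the variable names are illustrative, and the comparison is shown to highlight its limits rather than as a recommended estimator.

```python
# Observed data from the example: (treated?, observed outcome).
# The other potential outcome of each customer is missing by construction.
observed = {
    "Dr Strange": (True, 0.70),
    "Iron Man": (False, 0.50),
    "Thor": (False, 0.72),
    "Hulk": (True, 0.60),
}

def mean(values):
    return sum(values) / len(values)

treated_outcomes = [y for is_treated, y in observed.values() if is_treated]
control_outcomes = [y for is_treated, y in observed.values() if not is_treated]

# Naive comparison: average treated outcome minus average untreated outcome.
naive_difference = mean(treated_outcomes) - mean(control_outcomes)
print(f"Naive difference in means: {naive_difference:.2f}")
```

If the hidden potential outcomes were the ones listed in section 2.3, the four individual effects would average 0.0525, so this group comparison would not recover that number. Nothing guarantees that the customers who happened to be treated are comparable to those who were not.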

2.6 Average Treatment Effect (ATE) and Randomization

  • As data scientists, we often care more about the average effect on the population, which is referred to as the average treatment effect (ATE). With \(N\) individuals in the population:

\[ ATE = \frac{1}{N} \sum_{i=1}^{N} (Y^1_i - Y^0_i) \]

  • To estimate the ATE, we must randomize who receives the treatment, instead of letting individuals choose the treatment themselves.

  • After randomization, we can estimate the ATE by comparing the average outcomes of the treatment group and the control group. This works because randomization ensures that, in expectation:

    • Selection bias is removed [1]

    • The average treatment effect on the treated (ATT), i.e. the effect on individuals in the treatment group, equals the average treatment effect on the untreated (ATU), i.e. the effect on individuals in the control group, so both equal the ATE.

  • Randomized experiments are the gold standard of causal inference.
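
One way to see why randomization matters is to split the comparison of group averages into two parts. Writing \(E[\cdot]\) for a population average and \(D_i = 1\) for treated individuals (the \(E[\cdot]\) notation is an assumption introduced here for illustration):

\[ E[Y^1_i | D_i = 1] - E[Y^0_i | D_i = 0] = \underbrace{E[Y^1_i - Y^0_i | D_i = 1]}_{ATT} + \underbrace{E[Y^0_i | D_i = 1] - E[Y^0_i | D_i = 0]}_{\text{selection bias}} \]

The simulation below is a minimal sketch of this idea on a made-up population. The retention ranges, the effect sizes, and the rule that customers with high baseline retention opt in are all hypothetical choices, not figures from the course.

```python
import random

def difference_in_means(outcomes, treated):
    """Average outcome of the treated minus average outcome of the untreated."""
    t = [y for y, d in zip(outcomes, treated) if d]
    c = [y for y, d in zip(outcomes, treated) if not d]
    return sum(t) / len(t) - sum(c) / len(c)

def simulate(n=100_000, seed=0):
    rng = random.Random(seed)
    # Hypothetical potential outcomes: baseline retention Y0 and
    # an individual effect between 0 and 0.1 that is unrelated to Y0.
    y0 = [rng.uniform(0.3, 0.9) for _ in range(n)]
    y1 = [base + rng.uniform(0.0, 0.1) for base in y0]
    true_ate = sum(a - b for a, b in zip(y1, y0)) / n

    # Assignment 1: a fair coin flip decides who is treated.
    randomized = [rng.random() < 0.5 for _ in range(n)]
    # Assignment 2: customers with high baseline retention opt in themselves.
    self_selected = [base > 0.6 for base in y0]

    def observed(assignment):
        # We only ever see Y1 for the treated and Y0 for the untreated.
        return [a if d else b for a, b, d in zip(y1, y0, assignment)]

    return {
        "true_ate": true_ate,
        "randomized": difference_in_means(observed(randomized), randomized),
        "self_selected": difference_in_means(observed(self_selected), self_selected),
    }

results = simulate()
for label, value in results.items():
    print(f"{label}: {value:.3f}")
```

In this setup the randomized comparison should land close to the true ATE, while the self-selected comparison is inflated by the pre-existing gap in baseline retention between the two groups.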

3 Randomized Controlled Trials

3.1 Randomized Controlled Trials

A randomized controlled trial (RCT) is an experimental form of impact evaluation in which the population receiving the program or policy intervention is chosen at random from the eligible population, and a control group is also chosen at random from the same eligible population.

3.2 Types of RCTs: Based on Location

3.3 Types of RCTs: Based on Treatment Design

  • A/B testing (treatment group + control group)
    1. Loyalty program
    2. No loyalty program
  • A/B/N testing (multiple treatment groups + control group)
    1. Point-based loyalty program; points can be redeemed for price vouchers
    2. Point-based loyalty program; points can be redeemed for gifts
    3. Point-based loyalty program; points can be redeemed for free top-ups
    4. No loyalty program
  • Factorial design
    • two or more dimensions of treatment, used when we care about the interaction effects [2]
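
The random assignment step for an A/B or A/B/N test can be sketched as follows. This is one simple way to do it, assuming we have a list of customer IDs and want arms of (nearly) equal size; the function name `assign_arms`, the arm labels, and the seed are hypothetical.

```python
import random

def assign_arms(customer_ids, arms, seed=None):
    """Randomly assign each customer to one arm, keeping arm sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)  # random order, so arm membership is random
    # Deal customers to arms in turn; arm sizes differ by at most one.
    return {cid: arms[i % len(arms)] for i, cid in enumerate(shuffled)}

# A/B/N example with three treatment arms and one control arm.
arms = ["vouchers", "gifts", "free top-ups", "control"]
assignment = assign_arms(range(1000), arms, seed=42)

counts = {arm: 0 for arm in arms}
for arm in assignment.values():
    counts[arm] += 1
print(counts)
```

Passing `arms=["treatment", "control"]` gives the plain A/B case. Fixing the seed makes the assignment reproducible, which helps when the experiment needs to be audited later.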

3.4 After-Class Readings

Footnotes

  1. Selection bias refers to the pre-existing difference between the treatment group and the control group that would exist even without the treatment.

  2. Chintagunta, Pradeep K. and Huang, Liqiang and Miao, Wei and Zhang, Wanqing, Measuring Seller Response to Buyer-initiated Disintermediation: Evidence from a Field Experiment on a Service Platform (April 19, 2023). Available at SSRN: https://ssrn.com/abstract=4423917 or http://dx.doi.org/10.2139/ssrn.4423917