Discussion 3. Treatment effect heterogneity in an Experiment
STSCI/INFO/ILRST 3900: Causal Inference
September 10, 2025
You can download the slides. for this week’s discussion.
Get out and Vote Experiment
Last week, we explored an experiment that digs into the mechanisms underlying why people vote. This exercise is based on:
Gerber, Alan S., Donald P. Green, and Christopher W. Larimer. “Social Pressure and Voter Turnout: Evidence from a Large-scale Field Experiment.” American Political Science Review 102.1 (2008): 33-48.
A long-standing theory as to why many people vote is that it is driven by social norms (e.g. the understanding that voting is their civic duty). This theory, while being a dominant theoretical explanation, had very little empirical backing for a long time. This experiment examines this very theory by asking the question: to what extent do social norms cause voter turnout?
Experimental Design
In order to answer this question, approximately 80,000 Michigan households were randomly assigned to treatment and control groups, where the treatment group was randomly assigned to one of four possible treatment arms. These treatment arms varied in the intensity of social pressure that they conveyed, and were defined as follows:
- The first treatment arm was mailed a letter that simply reminded them that voting is a civic duty.
- The second treatment arm was mailed a letter telling them that researchers would be studying their voting turnout based on public records.
- The third treatment arm was mailed a letter stating that their voting turnout would be revealed to all other members of their household.
- The fourth treatment arm was mailed a letter stating that their voting turnout would be revealed to their household and to their neighbors.
Analyze Experiment
Import data
gotv <- read_dta("https://causal3900.github.io/assets/data/social_pressure.dta")
Clean data
First, we construct an age variable describing how old (in number of years) each person was in the year 2006. The yob
variable says which year each person was born in. > For this, we use the mutate
function.
Then, we convert the treatment
variable from it’s numeric representation to the corresponding labels which are
- 0: “Control”
- 1: “Hawthorne” (this is the ‘researchers viewing records via public data’ treatment arm)
- 2: “Civic Duty” (this is the ‘voting is your civic duty’ treatment arm)
- 3: “Neighbors” (this is the ‘voting turnout revealed to neighbors’ treatment arm)
- 4: “Self” (this is the ‘voting turnout revealed to household’ treatment arm)
For this, we use the case_when
function.
Average Causal Effect
Finally, for each treatment group, we calculate the percentage of individuals who got out and voted, as well as the total number of individuals in that group! The solutions below use the function n
which counts the number of observations in the current group for you.
gotv_results <- gotv |>
group_by(treatment) |>
summarise(Per_Voting = mean(voted), num_of_individuals = n())
print(gotv_results)
## # A tibble: 5 × 3
## treatment Per_Voting num_of_individuals
## <chr> <dbl> <int>
## 1 Civic Duty 0.315 38218
## 2 Control 0.297 191243
## 3 Hawthorne 0.322 38204
## 4 Neighbors 0.378 38201
## 5 Self 0.345 38218
Conditional Average Causal Effect
Now, we look into the treatment effect across sub-population, so we can determine if there is treatment effect heterogeneity
First, we assign into age groups and household size groups
gotv <- gotv |>
mutate(ageGroup = cut(age, breaks = c(18, 30, 45, 60, 120))) |>
mutate(hhGroup = cut(hh_size, breaks = c(0,1, 2, 10)))
Examine voting by age group
gotv_results_age <- gotv |>
group_by(ageGroup, treatment) |>
summarise(
Per_Voting = mean(voted),
Count = n(),
.groups = "drop"
) |>
group_by(treatment) |>
mutate( Per_in_AgeGroup = Count / sum(Count))
print(gotv_results_age, n = Inf)
## # A tibble: 20 × 5
## # Groups: treatment [5]
## ageGroup treatment Per_Voting Count Per_in_AgeGroup
## <fct> <chr> <dbl> <int> <dbl>
## 1 (18,30] Civic Duty 0.166 4255 0.111
## 2 (18,30] Control 0.156 20650 0.108
## 3 (18,30] Hawthorne 0.158 4087 0.107
## 4 (18,30] Neighbors 0.193 4189 0.110
## 5 (18,30] Self 0.175 4139 0.108
## 6 (30,45] Civic Duty 0.293 9921 0.260
## 7 (30,45] Control 0.268 49917 0.261
## 8 (30,45] Hawthorne 0.297 10159 0.266
## 9 (30,45] Neighbors 0.356 10026 0.262
## 10 (30,45] Self 0.317 10043 0.263
## 11 (45,60] Civic Duty 0.320 16086 0.421
## 12 (45,60] Control 0.310 80330 0.420
## 13 (45,60] Hawthorne 0.338 15926 0.417
## 14 (45,60] Neighbors 0.391 15735 0.412
## 15 (45,60] Self 0.357 15968 0.418
## 16 (60,120] Civic Duty 0.410 7956 0.208
## 17 (60,120] Control 0.378 40346 0.211
## 18 (60,120] Hawthorne 0.407 8032 0.210
## 19 (60,120] Neighbors 0.474 8251 0.216
## 20 (60,120] Self 0.444 8068 0.211
Examine voting by hh size group
gotv_results_hh <- gotv |>
group_by(hhGroup, treatment) |>
summarise(
Per_Voting = mean(voted),
Count = n(),
.groups = "drop"
) |>
group_by(treatment) |>
mutate( Per_in_hhGroup = Count / sum(Count) )
print(gotv_results_hh, n = Inf)
## # A tibble: 15 × 5
## # Groups: treatment [5]
## hhGroup treatment Per_Voting Count Per_in_hhGroup
## <fct> <chr> <dbl> <int> <dbl>
## 1 (0,1] Civic Duty 0.354 5398 0.141
## 2 (0,1] Control 0.331 26481 0.138
## 3 (0,1] Hawthorne 0.370 5281 0.138
## 4 (0,1] Neighbors 0.423 5364 0.140
## 5 (0,1] Self 0.400 5310 0.139
## 6 (1,2] Civic Duty 0.327 23536 0.616
## 7 (1,2] Control 0.303 119022 0.622
## 8 (1,2] Hawthorne 0.326 23998 0.628
## 9 (1,2] Neighbors 0.391 23738 0.621
## 10 (1,2] Self 0.352 23792 0.623
## 11 (2,10] Civic Duty 0.261 9284 0.243
## 12 (2,10] Control 0.261 45740 0.239
## 13 (2,10] Hawthorne 0.285 8925 0.234
## 14 (2,10] Neighbors 0.318 9099 0.238
## 15 (2,10] Self 0.296 9116 0.239
Questions:
- Does there seem to be heterogeneity in treatment effects across age and/or house hold size?
- Could you improve voting rates by assigning different treatments to different individuals?
- What would you expect the treatment effect for civic duty if we considered a population that was evenly split across the 4 age groups?
To answer these questions it might be useful to slightly rearrange table:
gotv_results_age <- gotv |>
group_by(ageGroup, treatment) |>
summarise(
Per_Voting = mean(voted),
Count = n(),
.groups = "drop"
) |>
group_by(treatment) |>
mutate(
Per_in_AgeGroup = Count / sum(Count)
) |>
group_by(ageGroup) |>
mutate(
Control_Voting = Per_Voting[treatment == "Control"],
Difference_from_Control = Per_Voting - Control_Voting
) |>
ungroup()
gotv_results_age |>
arrange(treatment,ageGroup) |>
kbl() |>
kable_styling(font_size = 12, full_width = FALSE) |>
scroll_box(width = "100%", height = "500px")
ageGroup | treatment | Per_Voting | Count | Per_in_AgeGroup | Control_Voting | Difference_from_Control |
---|---|---|---|---|---|---|
(18,30] | Civic Duty | 0.1661575 | 4255 | 0.1113350 | 0.1562712 | 0.0098863 |
(30,45] | Civic Duty | 0.2933172 | 9921 | 0.2595897 | 0.2679248 | 0.0253925 |
(45,60] | Civic Duty | 0.3197190 | 16086 | 0.4209011 | 0.3095730 | 0.0101460 |
(60,120] | Civic Duty | 0.4098793 | 7956 | 0.2081742 | 0.3782531 | 0.0316262 |
(18,30] | Control | 0.1562712 | 20650 | 0.1079778 | 0.1562712 | 0.0000000 |
(30,45] | Control | 0.2679248 | 49917 | 0.2610135 | 0.2679248 | 0.0000000 |
(45,60] | Control | 0.3095730 | 80330 | 0.4200415 | 0.3095730 | 0.0000000 |
(60,120] | Control | 0.3782531 | 40346 | 0.2109672 | 0.3782531 | 0.0000000 |
(18,30] | Hawthorne | 0.1583068 | 4087 | 0.1069783 | 0.1562712 | 0.0020356 |
(30,45] | Hawthorne | 0.2965843 | 10159 | 0.2659146 | 0.2679248 | 0.0286596 |
(45,60] | Hawthorne | 0.3383147 | 15926 | 0.4168673 | 0.3095730 | 0.0287417 |
(60,120] | Hawthorne | 0.4068725 | 8032 | 0.2102398 | 0.3782531 | 0.0286194 |
(18,30] | Neighbors | 0.1933636 | 4189 | 0.1096568 | 0.1562712 | 0.0370924 |
(30,45] | Neighbors | 0.3561739 | 10026 | 0.2624539 | 0.2679248 | 0.0882492 |
(45,60] | Neighbors | 0.3906578 | 15735 | 0.4119002 | 0.3095730 | 0.0810848 |
(60,120] | Neighbors | 0.4738820 | 8251 | 0.2159891 | 0.3782531 | 0.0956288 |
(18,30] | Self | 0.1751631 | 4139 | 0.1082998 | 0.1562712 | 0.0188919 |
(30,45] | Self | 0.3168376 | 10043 | 0.2627819 | 0.2679248 | 0.0489128 |
(45,60] | Self | 0.3569639 | 15968 | 0.4178136 | 0.3095730 | 0.0473909 |
(60,120] | Self | 0.4442241 | 8068 | 0.2111047 | 0.3782531 | 0.0659710 |
Answers:
- Does there seem to be heterogeneity in treatment effects across age and/or house hold size?
We say there is treatment effect heterogeneity if the treatment effect varies across sub-population. To check if there’s treatment effect heterogeneity across age groups, we look at \(E[Y^{a=j}|L=l]-E[Y^{a=0}|L=l]\) for each age group \(l\), and treatment \(j\). For example, the “Civic Duty” treatment effect for individuals ages 18-30 is \[\begin{align*} E\big[Y^{a="Civic Duty"}|L=(18-30]\big]-&E\big[Y^{a="Control"}|L=(18-30]\big]]\\ &= 0.166-0.156\\&=0.001 \end{align*}\]
These values can be found in the following table,
gotv_results_ageGroup
gotv_results_ageGroup <- gotv |>
group_by(ageGroup, treatment) |>
summarise(
Per_Voting = mean(voted),
Count = n())
A nice way to present this table would be using the variableDifference_from_Control
, created above. This table makes it easier to look at the causal effect across age groups and examining whether the effect is fixed or not
ageGroup Civic Duty Control Hawthorne Neighbors Self 1 (18,30] 0.0098863 0 0.0020356 0.0370924 0.0188919 6 (30,45] 0.0253925 0 0.0286596 0.0882492 0.0489128 11 (45,60] 0.0101460 0 0.0287417 0.0810848 0.0473909 16 (60,120] 0.0316262 0 0.0286194 0.0956288 0.0659710
- What would you expect the treatment effect for civic duty if we considered a population that was evenly split across the 4 age groups?
First, let’s consider the average treatment effect for Civic Duty, which is given by: \[E[Y|a=\text{Civic Duty}]-E[Y|a=\text{Control}]\] Standardization allows us to estimate the ACE by combining estimates from each sub-population \[\sum_l P(L=l)E[Y|a=\text{Civic Duty},L=l]-\sum_l P(L=l)E[Y|a=\text{Control},L=l]\] \[\sum_l P(L=l) \Big(E[Y|a=\text{Civic Duty},L=l]-E[Y|a=\text{Control},L=l]\Big)\] For the age group the ACE looks like: \[\begin{align*} ACE =& 0.111 \times (0.166-0.156) \\ &+0.260 \times (0.293-0.268)\\&+0.421\times (0.320-0.310)\\&+ 0.208\times (0.410-0.378) \end{align*}\]
gotv_results_age |> filter(treatment=="Civic Duty") |> summarise(sum(Per_in_AgeGroup*Difference_from_Control))
## # A tibble: 1 × 1 ## `sum(Per_in_AgeGroup * Difference_from_Control)` ## <dbl> ## 1 0.0185
To estimate the treatment effect for civic duty if the population was evenly split across the 4 age group, we replace the share of each age group with \(0.25\).
## # A tibble: 1 × 1 ## `sum(0.25 * Difference_from_Control)` ## <dbl> ## 1 0.0193