Discussion 5. Directed Acyclic Graphs

STSCI/INFO/ILRST 3900: Causal Inference

September 24, 2025

You can download the slides for this week’s discussion.

For each of the following DAGs, we will check whether \(X_1\) and \(X_3\) are independent, \(X_1 \perp\!\!\!\!\perp X_3\), and whether that changes when we condition on \(X_2\): \(X_1 \perp\!\!\!\!\perp X_3 \mid X_2\).

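In practice, when the relationships are linear (as in the simulations below), we can condition \(Y\) on \(X\) with the linear regression \(Y=\beta_0 + \beta_1 X + \varepsilon\). We ask: “what part of \(Y\) is not explained by \(X\)?” The residuals \(Y-\hat \beta_0 - \hat \beta_1 X\) are the part of \(Y\) that is left after removing the part explained by \(X\). To check whether \(X_1\) and \(X_3\) are associated given \(X_2\), we regress each of them on \(X_2\) and look at the correlation between the two sets of residuals.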
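
The residual idea above can be sketched as a small helper. This is only an illustration: `partial_cor` and the variables `a`, `b`, `z` are hypothetical names introduced here, and the seed and sample size are arbitrary choices. The last lines compare the residual correlation with the standard partial-correlation formula computed from the three pairwise correlations.

```r
# Hypothetical helper (not part of the course code): correlation between the
# parts of a and b that are not linearly explained by z.
partial_cor <- function(a, b, z) {
  res_a <- resid(lm(a ~ z))  # part of a left after regressing on z
  res_b <- resid(lm(b ~ z))  # part of b left after regressing on z
  cor(res_a, res_b)
}

set.seed(3900)             # arbitrary seed, only for reproducibility
z <- rnorm(1000)
a <- 2 * z + rnorm(1000)   # a depends on z
b <- -1 * z + rnorm(1000)  # b depends on z, but not directly on a

cor(a, b)                  # clearly negative: z drives both a and b
partial_cor(a, b, z)       # close to zero once z is regressed out

# The same quantity from the pairwise correlations
r_ab <- cor(a, b)
r_az <- cor(a, z)
r_bz <- cor(b, z)
formula_value <- (r_ab - r_az * r_bz) / sqrt((1 - r_az^2) * (1 - r_bz^2))
all.equal(partial_cor(a, b, z), formula_value)  # agree up to floating-point error
```

A residual correlation near zero means no remaining linear association. It matches conditional independence in linear Gaussian settings like the simulations in this discussion, but not in general.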

DAG 1:

\[X_1 \rightarrow X_2 \rightarrow X_3\]

x1 <- rnorm(n = 100, mean = 5, sd = 2)            # X1 has no parents
x2 <- 2 * x1 + rnorm(n = 100, mean = 0, sd = 1)   # X1 -> X2
x3 <- -1 * x2 + rnorm(n = 100, mean = 0, sd = 1)  # X2 -> X3

plot(x1, x3, pch = 16, main = paste("Cor =", round(cor(x1, x3), 3)))  # marginal association
x1_given_x2 <- resid(lm(x1 ~ x2))  # part of X1 not explained by X2
x3_given_x2 <- resid(lm(x3 ~ x2))  # part of X3 not explained by X2
plot(x1_given_x2, x3_given_x2, pch = 16, main = paste("Conditioning on X2, Cor =", round(cor(x1_given_x2, x3_given_x2), 3)))
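
As a rough check on what the two plots for DAG 1 (a chain) should show, the population correlation implied by the simulation parameters can be worked out by hand. The noise symbols \(\varepsilon_2\) and \(\varepsilon_3\) are notation introduced here for the two independent standard normal terms in the code.

\[
\begin{aligned}
X_3 &= -X_2 + \varepsilon_3 = -2X_1 - \varepsilon_2 + \varepsilon_3,\\
\operatorname{Cov}(X_1, X_3) &= -2\operatorname{Var}(X_1) = -2 \cdot 4 = -8,\\
\operatorname{Var}(X_3) &= 4\operatorname{Var}(X_1) + 1 + 1 = 18,\\
\operatorname{Cor}(X_1, X_3) &= \frac{-8}{\sqrt{4 \cdot 18}} \approx -0.94.
\end{aligned}
\]

Once \(X_2\) is given, the only variation left in \(X_3\) is \(\varepsilon_3\), which is independent of \(X_1\). So \(X_1 \perp\!\!\!\!\perp X_3 \mid X_2\), and the residual correlation should be near zero, up to sampling noise of roughly \(\pm 0.1\) to \(\pm 0.2\) with 100 draws.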

DAG 2:

\[X_1 \leftarrow X_2 \rightarrow X_3\]

x2 <- rnorm(n = 100, mean = 5, sd = 2)            # X2 is the common cause
x1 <- 2 * x2 + rnorm(n = 100, mean = 0, sd = 1)   # X2 -> X1
x3 <- -1 * x2 + rnorm(n = 100, mean = 0, sd = 1)  # X2 -> X3

plot(x1, x3, pch = 16, main = paste("Cor =", round(cor(x1, x3), 3)))  # marginal association
x1_given_x2 <- resid(lm(x1 ~ x2))  # part of X1 not explained by X2
x3_given_x2 <- resid(lm(x3 ~ x2))  # part of X3 not explained by X2
plot(x1_given_x2, x3_given_x2, pch = 16, main = paste("Conditioning on X2, Cor =", round(cor(x1_given_x2, x3_given_x2), 3)))
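
The same hand calculation for DAG 2 (a fork) gives a sense of the expected numbers. Here \(\varepsilon_1\) and \(\varepsilon_3\) are notation introduced for the two independent standard normal noise terms in the code.

\[
\begin{aligned}
X_1 &= 2X_2 + \varepsilon_1, \qquad X_3 = -X_2 + \varepsilon_3,\\
\operatorname{Cov}(X_1, X_3) &= -2\operatorname{Var}(X_2) = -2 \cdot 4 = -8,\\
\operatorname{Var}(X_1) &= 4 \cdot 4 + 1 = 17, \qquad \operatorname{Var}(X_3) = 4 + 1 = 5,\\
\operatorname{Cor}(X_1, X_3) &= \frac{-8}{\sqrt{17 \cdot 5}} \approx -0.87.
\end{aligned}
\]

The marginal association comes entirely from the common cause \(X_2\). Given \(X_2\), what remains of \(X_1\) and \(X_3\) is \(\varepsilon_1\) and \(\varepsilon_3\), which are independent, so \(X_1 \perp\!\!\!\!\perp X_3 \mid X_2\) and the residual correlation should again be near zero, up to sampling noise.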

DAG 3:

\[X_1 \rightarrow X_2 \leftarrow X_3\]

x1 <- rnorm(n = 100, mean = 5, sd = 2)                     # X1 has no parents
x3 <- rnorm(n = 100, mean = 3, sd = 2)                     # X3 has no parents, drawn independently of X1
x2 <- 2 * x1 - 3 * x3 + rnorm(n = 100, mean = 0, sd = 1)   # collider: X1 -> X2 <- X3

plot(x1, x3, pch = 16, main = paste("Cor =", round(cor(x1, x3), 3)))  # marginal association
x1_given_x2 <- resid(lm(x1 ~ x2))  # part of X1 not explained by X2
x3_given_x2 <- resid(lm(x3 ~ x2))  # part of X3 not explained by X2
plot(x1_given_x2, x3_given_x2, pch = 16, main = paste("Conditioning on X2, Cor =", round(cor(x1_given_x2, x3_given_x2), 3)))
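
For DAG 3 (a collider) the pattern reverses, and a hand calculation shows how strong the induced association should be. This sketch uses the standard partial-correlation formula with population values implied by the simulation parameters; \(\rho_{ij}\) is notation introduced here for \(\operatorname{Cor}(X_i, X_j)\).

\[
\begin{aligned}
\operatorname{Var}(X_2) &= 4 \cdot 4 + 9 \cdot 4 + 1 = 53, \qquad \operatorname{Cov}(X_1, X_2) = 8, \qquad \operatorname{Cov}(X_3, X_2) = -12,\\
\rho_{13} &= 0, \qquad \rho_{12} = \frac{8}{\sqrt{4 \cdot 53}} \approx 0.55, \qquad \rho_{32} = \frac{-12}{\sqrt{4 \cdot 53}} \approx -0.82,\\
\rho_{13 \cdot 2} &= \frac{\rho_{13} - \rho_{12}\,\rho_{32}}{\sqrt{(1-\rho_{12}^2)(1-\rho_{32}^2)}} = \frac{96}{\sqrt{148 \cdot 68}} \approx 0.96.
\end{aligned}
\]

So \(X_1\) and \(X_3\) are marginally independent, and the first plot should show a correlation near zero, but conditioning on the collider \(X_2\) induces a strong positive association: \(X_1 \not\!\perp\!\!\!\!\perp X_3 \mid X_2\). Intuitively, once \(X_2 = 2X_1 - 3X_3 + \varepsilon_2\) is fixed, a larger \(X_1\) must be offset by a larger \(X_3\).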