class: center, middle, inverse, title-slide

# A definition of causal effect
## What If: Chapter 1
### Elena Dudukina
### 2020-10-18

---

# INTRODUCTION

## "No book can possibly provide a comprehensive description of all methodologies for causal inference across the sciences. The authors of any Causal Inference book will have to choose which aspects of causal inference methodology they want to emphasize."

![](Screenshot 2020-10-07 at 21.21.24.png)

---

# Why this book?

* ## "...to make causal inferences that are explicit about both the causal question and the assumptions underlying the data analysis"

--

* ## "various data analysis approaches to estimate the causal effect of interest under a particular set of assumptions when data are collected on each individual in a population"

---

# Notation

- `\(A\)`: treatment/exposure (1: treated, 0: untreated)
- `\(Y\)`: outcome (1: death, 0: survival)
- `\(Y^{a}\)`: counterfactual outcome under exposure level `\(a\)`
- `\(Y^{a=1}\)`: potential outcome under exposure level A=1
- `\(Y^{a=0}\)`: potential outcome under exposure level A=0
- `\(Y_{i}^{a=0}\)`: potential outcome for individual `\(i\)` under A=0

---

# Counterfactual outcomes

Some distinguish between potential and counterfactual outcomes. The outcome is potential before it has been observed. Once one outcome is observed, it becomes factual and the unobserved potential outcome becomes counterfactual.
Hernán does not distinguish: "`\(Y^{a=1}\)` and `\(Y^{a=0}\)` are referred to as potential outcomes or as counterfactual outcomes"

---

# Causal effect for an individual

- Not what epidemiology does
- "identifying individual causal effects is generally not possible"
- When the potential outcome under one level of exposure differs from the potential outcome under another level, `\(Y_{i}^{a=1}\not=Y_{i}^{a=0}\)`, exposure A has an effect on the individual's outcome
- Examples:
  - Zeus (individual causal effect): `\(Y_{Zeus}^{a=1}\not=Y_{Zeus}^{a=0}\)`

--

  - Hera (no individual causal effect): `\(Y_{Hera}^{a=1}=Y_{Hera}^{a=0}\)`

---

# Consistency

# One counterfactual outcome is an observed outcome!

For Zeus, who was treated (A=1), `\(Y^{a=1}\)` equals the observed outcome Y=1

**Consistency is an assumption, which allows us to shift from potential outcomes to observed outcomes**

We always observe ONE of the potential outcomes:

- Observed outcomes in treated
  - `\(Y^{a=1}\)` = `\(Y\)` among A=1
- Observed outcomes in untreated
  - `\(Y^{a=0}\)` = `\(Y\)` among A=0

We also assume no interference (an individual's potential outcomes do not depend on other individuals' treatment values)

---

# Average causal effects

- In a population of individuals
- Need
  - Outcome `\(Y\)`
  - Exposure (action/intervention) levels to compare: `\(A\)` (a=1 vs a=0)
  - Outcomes under exposure levels `\(Y^{a=1}\)` and `\(Y^{a=0}\)`

---

# Greek gods dataset

Wonder data on gods, for whom we observed **both** potential outcomes

- `Y_a0` is the potential outcome under no treatment
- `Y_a1` is the potential outcome under treatment

Deterministic counterfactual outcomes (simplification)

---

```r
greek_gods
```

```
## # A tibble: 20 x 3
##    greek       Y_a0  Y_a1
##    <chr>      <dbl> <dbl>
##  1 Rheia          0     1
##  2 Kronos         1     0
##  3 Demeter        0     0
##  4 Hades          0     0
##  5 Hestia         0     0
##  6 Poseidon       1     0
##  7 Hera           0     0
##  8 Zeus           0     1
##  9 Artemis        1     1
## 10 Apollo         1     0
## 11 Leto           0     1
## 12 Ares           1     1
## 13 Athena         1     1
## 14 Hephaestus     0     1
## 15 Aphrodite      0     1
## 16 Cyclope        0     1
## 17 Persephone     1     1
## 18 Hermes         1     0
## 19 Hebe           1     0
## 20 Dionysus       1     0
```

---

# What are the risks of death under treatment levels?

Since we magically observed both potential outcomes, we can contrast the risk in the whole population under A=1 and A=0 directly:

```r
greek_gods %>%
  summarise_if(is.numeric, mean) %>%
  kableExtra::kable(format = "html")
```

<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Y_a0 </th>
   <th style="text-align:right;"> Y_a1 </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0.5 </td>
   <td style="text-align:right;"> 0.5 </td>
  </tr>
</tbody>
</table>

- 20 gods
- 50% (N=10) die if treated and 50% (N=10) die if untreated

---

# Average causal effect

## is an inequality of counterfactual risks at the population level:

`\(Pr[Y^{a=1}=1]\not=Pr[Y^{a=0}=1]\)`

If `\(Pr[Y^{a=1}=1]\)` = `\(Pr[Y^{a=0}=1]\)` **-->** no average causal effect (null effect)

- Greek gods
  - Absolute scale: `\(Pr[Y^{a=1}=1]\)` - `\(Pr[Y^{a=0}=1]\)` = 0.5-0.5 = 0
  - Multiplicative scale: `\(Pr[Y^{a=1}=1]\)` / `\(Pr[Y^{a=0}=1]\)` = 0.5/0.5 = 1

---

# Average causal effects vs. individual causal effects

```r
# no individual causal effect
greek_gods %>%
  filter(Y_a0 == Y_a1)
```

```
## # A tibble: 8 x 3
##   greek       Y_a0  Y_a1
##   <chr>      <dbl> <dbl>
## 1 Demeter        0     0
## 2 Hades          0     0
## 3 Hestia         0     0
## 4 Hera           0     0
## 5 Artemis        1     1
## 6 Ares           1     1
## 7 Athena         1     1
## 8 Persephone     1     1
```

---

```r
# individual causal effect
greek_gods %>%
  filter(Y_a0 != Y_a1) %>%
  arrange(Y_a1)
```

```
## # A tibble: 12 x 3
##    greek       Y_a0  Y_a1
##    <chr>      <dbl> <dbl>
##  1 Kronos         1     0
##  2 Poseidon       1     0
##  3 Apollo         1     0
##  4 Hermes         1     0
##  5 Hebe           1     0
##  6 Dionysus       1     0
##  7 Rheia          0     1
##  8 Zeus           0     1
##  9 Leto           0     1
## 10 Hephaestus     0     1
## 11 Aphrodite      0     1
## 12 Cyclope        0     1
```

---

# Measures of causal effect

- Risk difference
  - `\(Pr[Y^{a=1}=1]\)` - `\(Pr[Y^{a=0}=1]\)`

--

- Risk ratio
  - `\(Pr[Y^{a=1}=1]\)` / `\(Pr[Y^{a=0}=1]\)`

--

- Odds ratio
  - `\(\frac{Pr[Y^{a=1}=1]/Pr[Y^{a=1}=0]}{Pr[Y^{a=0}=1]/Pr[Y^{a=0}=0]}\)`

---

# Random error

* Sampling variability: "working with samples prevents one from obtaining the exact proportion of individuals in the population who had the outcome under certain treatment value, e.g., the probability of death under no treatment cannot be directly computed"
  - Follows the law of large numbers
* Non-deterministic (stochastic) counterfactuals
  Shift from certainly dying under treatment or certainly surviving under no treatment to "90% chance of dying if treated, and a 10% chance of dying if untreated".
  - Does not follow the law of large numbers

---

## Causation vs association

```r
greek_gods_obs
```

```
## # A tibble: 20 x 5
##    greek          A Y_obs  Y_a0  Y_a1
##    <chr>      <dbl> <dbl> <dbl> <dbl>
##  1 Rheia          0     0     0    NA
##  2 Kronos         0     1     1    NA
##  3 Demeter        0     0     0    NA
##  4 Hades          0     0     0    NA
##  5 Hestia         1     0    NA     0
##  6 Poseidon       1     0    NA     0
##  7 Hera           1     0    NA     0
##  8 Zeus           1     1    NA     1
##  9 Artemis        0     1     1    NA
## 10 Apollo         0     1     1    NA
## 11 Leto           0     0     0    NA
## 12 Ares           1     1    NA     1
## 13 Athena         1     1    NA     1
## 14 Hephaestus     1     1    NA     1
## 15 Aphrodite      1     1    NA     1
## 16 Cyclope        1     1    NA     1
## 17 Persephone     1     1    NA     1
## 18 Hermes         1     0    NA     0
## 19 Hebe           1     0    NA     0
## 20 Dionysus       1     0    NA     0
```

---

```r
# observed risk of death within each treatment group
greek_gods_obs %>%
  group_by(A) %>%
  count(Y_obs) %>%
  mutate(
    denominator = sum(n),
    Pr_Y_obs = round(n / denominator, 1)
  ) %>%
  filter(Y_obs == 1)
```

```
## # A tibble: 2 x 5
## # Groups:   A [2]
##       A Y_obs     n denominator Pr_Y_obs
##   <dbl> <dbl> <int>       <int>    <dbl>
## 1     0     1     3           7      0.4
## 2     1     1     7          13      0.5
```

---

Observed (associational) risk difference, risk ratio, and odds ratio:

- Risk difference
  - `\(Pr[Y=1|A=1]\)` - `\(Pr[Y=1|A=0]\)`

--

- Risk ratio
  - `\(Pr[Y=1|A=1]\)` / `\(Pr[Y=1|A=0]\)`

--

- Odds ratio
  - `\(\frac{Pr[Y=1|A=1]/Pr[Y=0|A=1]}{Pr[Y=1|A=0]/Pr[Y=0|A=0]}\)`

--

- What is (observed; associational) vs. What if (counterfactual; causal)

---

# Conceptualization of causation as comparison of **counterfactual** entities
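
A minimal sketch of this comparison in base R, assuming the two tables shown earlier are transcribed by hand into plain vectors. The helper `effect_measures()` and the vector names are illustrative and not part of the original analysis. It applies the risk difference, risk ratio, and odds ratio formulas first to the counterfactual risks and then to the observed risks.

```r
# Potential outcomes for the 20 gods, transcribed from greek_gods (row order kept)
Y_a0 <- c(0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1)
Y_a1 <- c(1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0)

# Observed treatment, transcribed from greek_gods_obs
A <- c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)

# Consistency: the observed outcome is the potential outcome
# under the treatment level actually received
Y_obs <- ifelse(A == 1, Y_a1, Y_a0)

# Hypothetical helper: contrasts risk p1 (treated) with risk p0 (untreated)
effect_measures <- function(p1, p0) {
  c(
    risk_difference = p1 - p0,
    risk_ratio      = p1 / p0,
    odds_ratio      = (p1 / (1 - p1)) / (p0 / (1 - p0))
  )
}

# "What if": counterfactual risks in the whole population
causal <- effect_measures(mean(Y_a1), mean(Y_a0))

# "What is": observed risks within each treatment group
associational <- effect_measures(mean(Y_obs[A == 1]), mean(Y_obs[A == 0]))

print(round(rbind(causal, associational), 2))
```

With these data the causal measures sit at the null (0, 1, 1), while the associational measures, computed from the unrounded risks 7/13 and 3/7, are roughly 0.11, 1.26, and 1.56: association without causation.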