diff --git a/doc/spec/references.rst b/doc/spec/references.rst
index 07ef60507..51e7de5bd 100644
--- a/doc/spec/references.rst
+++ b/doc/spec/references.rst
@@ -17,6 +17,12 @@ References
     Two-Stage Estimation with a High-Dimensional Second Stage.
     2018.
 
+.. [Chernozhukov2022]
+    V. Chernozhukov, C. Cinelli, N. Kallus, W. Newey, A. Sharma, and V. Syrgkanis.
+    Long Story Short: Omitted Variable Bias in Causal Machine Learning.
+    *NBER Working Paper No. 30302*, 2022.
+    URL https://www.nber.org/papers/w30302.
+
 .. [Hartford2017]
     Jason Hartford, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy.
     Deep IV: A flexible approach for counterfactual prediction.
diff --git a/doc/spec/spec.rst b/doc/spec/spec.rst
index 38917c742..2afd6a93d 100644
--- a/doc/spec/spec.rst
+++ b/doc/spec/spec.rst
@@ -13,6 +13,7 @@ EconML User Guide
     estimation_dynamic
     inference
     model_selection
+    validation
     interpretability
     federated_learning
     references
diff --git a/doc/spec/validation.rst b/doc/spec/validation.rst
new file mode 100644
index 000000000..859c2e9b4
--- /dev/null
+++ b/doc/spec/validation.rst
@@ -0,0 +1,68 @@
+Validation
+======================
+
+Validating causal estimates is inherently challenging, as the true counterfactual outcome for a given treatment is
+unobservable. However, there are several checks and tools available in EconML to help assess the credibility of causal
+estimates.
+
+
+Sensitivity Analysis
+---------------------
+
+For many EconML estimators, unobserved confounding can lead to biased causal estimates.
+Moreover, it is impossible to prove the absence of unobserved confounders.
+This is a fundamental problem for observational causal inference.
+
+To mitigate this problem, EconML provides a suite of sensitivity analysis tools,
+based on [Chernozhukov2022]_,
+to assess the robustness of causal estimates to unobserved confounding.
+
+Specifically, select estimators (subclasses of :class:`.DML` and :class:`.DRLearner`)
+have access to ``sensitivity_interval``, ``robustness_value``, and ``sensitivity_summary`` methods.
+
+``sensitivity_interval`` provides an updated confidence interval for the ATE based on a specified level of unobserved confounding.
+
+
+``robustness_value`` computes the minimum level of unobserved confounding required
+so that confidence intervals around the ATE would begin to include the given point (0 by default).
+
+
+``sensitivity_summary`` provides a summary of the two above methods.
+
+DRTester
+----------------
+
+EconML provides the :class:`.DRTester` class, which implements Best Linear Predictor (BLP), calibration r-squared,
+and uplift modeling methods for validation.
+
+See an example notebook `here `__.
+
+Scoring
+-------
+
+Many EconML estimators implement a ``.score`` method to evaluate the goodness-of-fit of the final model. While it may be
+difficult to make direct sense of results from ``.score``, EconML offers the :class:`RScorer` class to facilitate model
+selection based on scoring.
+
+:class:`RScorer` enables comparison and selection among different causal models.
+
+See an example notebook `here
+`__.
+
+Confidence Intervals and Inference
+----------------------------------
+
+Most EconML estimators allow for inference, including standard errors, confidence intervals, and p-values for
+estimated effects. A common validation approach is to check whether the p-values are below a chosen significance level
+(e.g., 0.05). If not, the null hypothesis that the causal effect is zero cannot be rejected.
+
+**Note:** Inference results are only valid if the model specification is correct. For example, if a linear model is used
+but the true data-generating process is nonlinear, the inference may not be reliable. It is generally not possible to
+guarantee correct specification, so p-value inspection should be considered a surface-level check.
+
+DoWhy Refutation Tests
+----------------------
+
+The DoWhy library, which complements EconML, includes several refutation tests for validating causal estimates. These
+tests work by comparing the original causal estimate to estimates obtained from perturbed versions of the data, helping
+to assess the robustness of causal conclusions.
\ No newline at end of file
diff --git a/econml/dml/causal_forest.py b/econml/dml/causal_forest.py
index ab353e9fc..829204537 100644
--- a/econml/dml/causal_forest.py
+++ b/econml/dml/causal_forest.py
@@ -857,7 +857,7 @@ def sensitivity_interval(self, alpha=0.05, c_y=0.05, c_t=0.05, rho=1., interval_
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
@@ -901,7 +901,7 @@ def robustness_value(self, null_hypothesis=0, alpha=0.05, interval_type='ci'):
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
diff --git a/econml/dml/dml.py b/econml/dml/dml.py
index 01e184b74..0606f14cc 100644
--- a/econml/dml/dml.py
+++ b/econml/dml/dml.py
@@ -646,7 +646,7 @@ def sensitivity_interval(self, alpha=0.05, c_y=0.05, c_t=0.05, rho=1., interval_
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
@@ -690,7 +690,7 @@ def robustness_value(self, null_hypothesis=0, alpha=0.05, interval_type='ci'):
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
diff --git a/econml/dr/_drlearner.py b/econml/dr/_drlearner.py
index 3208c8064..e6f35e7e9 100644
--- a/econml/dr/_drlearner.py
+++ b/econml/dr/_drlearner.py
@@ -798,7 +798,7 @@ def sensitivity_interval(self, T, alpha=0.05, c_y=0.05, c_t=0.05, rho=1., interv
         The sensitivity interval is the range of values for the ATE that are consistent with the observed data,
         given a specified level of confounding.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
@@ -848,7 +848,7 @@ def robustness_value(self, T, null_hypothesis=0, alpha=0.05, interval_type='ci')
 
         Returns 0 if the original interval already includes the null_hypothesis.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------