From cf4668ca4b057910c03ed355dfe95f4e775180b2 Mon Sep 17 00:00:00 2001
From: Fabio Vera
Date: Thu, 22 May 2025 15:26:53 -0400
Subject: [PATCH 1/3] initial commit for validation docs

Signed-off-by: Fabio Vera
---
 doc/spec/spec.rst       |  1 +
 doc/spec/validation.rst | 68 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)
 create mode 100644 doc/spec/validation.rst

diff --git a/doc/spec/spec.rst b/doc/spec/spec.rst
index 38917c742..2afd6a93d 100644
--- a/doc/spec/spec.rst
+++ b/doc/spec/spec.rst
@@ -13,6 +13,7 @@ EconML User Guide
     estimation_dynamic
     inference
     model_selection
+    validation
     interpretability
     federated_learning
     references
diff --git a/doc/spec/validation.rst b/doc/spec/validation.rst
new file mode 100644
index 000000000..54280d2d7
--- /dev/null
+++ b/doc/spec/validation.rst
@@ -0,0 +1,68 @@
+Validation
+======================
+
+Validating causal estimates is inherently challenging, as the true counterfactual outcome for a given treatment is
+unobservable. However, there are several checks and tools available in EconML to help assess the credibility of causal
+estimates.
+
+
+Sensitivity Analysis
+---------------------
+
+For many EconML estimators, unobserved confounding can lead to biased causal estimates.
+Moreover, it is impossible to prove the absence of unobserved confounders.
+This is a fundamental problem for observational causal inference.
+
+To mitigate this problem, EconML provides a suite of sensitivity analysis tools,
+based on `Chernozhukov et al. (2022) `_,
+to assess the robustness of causal estimates to unobserved confounding.
+
+Specifically, select estimators (subclasses of :class:`.DML` and :class:`.DRLearner`)
+have access to ``sensitivity_interval``, ``robustness_value``, and ``sensitivity_summary`` methods.
+
+``sensitivity_interval`` provides an updated confidence interval for the ATE based on a specified level of unobserved confounding.
+
+
+``robustness_value`` computes the minimum level of unobserved confounding required
+to make it impossible to reject a null hypothesis (default 0).
+
+
+``sensitivity_summary`` provides a summary of the two above methods.
+
+DRTester
+----------------
+
+EconML provides the :class:`.DRTester` class, which implements Best Linear Predictor (BLP), calibration r-squared,
+and uplift modeling methods for validation.
+
+See an example notebook `here `__.
+
+Scoring
+-------
+
+Many EconML estimators implement a ``.score`` method to evaluate the goodness-of-fit of the final model. While it may be
+difficult to make direct sense of results from ``.score``, EconML offers the :class:`RScorer` class to facilitate model
+selection based on scoring.
+
+:class:`RScorer` enables comparison and selection among different causal models.
+
+See an example notebook `here
+`__.
+
+Confidence Intervals and Inference
+----------------------------------
+
+Most EconML estimators allow for inference, including standard errors, confidence intervals, and p-values for
+estimated effects. A common validation approach is to check whether the p-values are below a chosen significance level
+(e.g., 0.05). If not, the null hypothesis that the causal effect is zero cannot be rejected.
+
+**Note:** Inference results are only valid if the model specification is correct. For example, if a linear model is used
+but the true data-generating process is nonlinear, the inference may not be reliable. It is generally not possible to
+guarantee correct specification, so p-value inspection should be considered a surface-level check.
+
+DoWhy Refutation Tests
+----------------------
+
+The DoWhy library, which complements EconML, includes several refutation tests for validating causal estimates. These
+tests work by comparing the original causal estimate to estimates obtained from perturbed versions of the data, helping
+to assess the robustness of causal conclusions.
\ No newline at end of file

From c1f08100ddd2b0048853859d86f536895ab1bf64 Mon Sep 17 00:00:00 2001
From: Fabio Vera
Date: Thu, 5 Jun 2025 15:20:21 -0400
Subject: [PATCH 2/3] polish references

Signed-off-by: Fabio Vera
---
 doc/spec/references.rst     | 6 ++++++
 doc/spec/validation.rst     | 2 +-
 econml/dml/causal_forest.py | 4 ++--
 econml/dml/dml.py           | 4 ++--
 econml/dr/_drlearner.py     | 4 ++--
 5 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/doc/spec/references.rst b/doc/spec/references.rst
index 07ef60507..51e7de5bd 100644
--- a/doc/spec/references.rst
+++ b/doc/spec/references.rst
@@ -17,6 +17,12 @@ References
     Two-Stage Estimation with a High-Dimensional Second Stage.
     2018.
 
+.. [Chernozhukov2022]
+    V. Chernozhukov, C. Cinelli, N. Kallus, W. Newey, A. Sharma, and V. Syrgkanis.
+    Long Story Short: Omitted Variable Bias in Causal Machine Learning.
+    *NBER Working Paper No. 30302*, 2022.
+    URL https://www.nber.org/papers/w30302.
+
 .. [Hartford2017]
     Jason Hartford, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy.
     Deep IV: A flexible approach for counterfactual prediction.
diff --git a/doc/spec/validation.rst b/doc/spec/validation.rst
index 54280d2d7..7a2177083 100644
--- a/doc/spec/validation.rst
+++ b/doc/spec/validation.rst
@@ -14,7 +14,7 @@ Moreover, it is impossible to prove the absence of unobserved confounders.
 This is a fundamental problem for observational causal inference.
 
 To mitigate this problem, EconML provides a suite of sensitivity analysis tools,
-based on `Chernozhukov et al. (2022) `_,
+based on [Chernozhukov2022]_,
 to assess the robustness of causal estimates to unobserved confounding.
 
 Specifically, select estimators (subclasses of :class:`.DML` and :class:`.DRLearner`)
diff --git a/econml/dml/causal_forest.py b/econml/dml/causal_forest.py
index ab353e9fc..829204537 100644
--- a/econml/dml/causal_forest.py
+++ b/econml/dml/causal_forest.py
@@ -857,7 +857,7 @@ def sensitivity_interval(self, alpha=0.05, c_y=0.05, c_t=0.05, rho=1., interval_
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
@@ -901,7 +901,7 @@ def robustness_value(self, null_hypothesis=0, alpha=0.05, interval_type='ci'):
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
diff --git a/econml/dml/dml.py b/econml/dml/dml.py
index 01e184b74..0606f14cc 100644
--- a/econml/dml/dml.py
+++ b/econml/dml/dml.py
@@ -646,7 +646,7 @@ def sensitivity_interval(self, alpha=0.05, c_y=0.05, c_t=0.05, rho=1., interval_
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
@@ -690,7 +690,7 @@ def robustness_value(self, null_hypothesis=0, alpha=0.05, interval_type='ci'):
 
         Can only be calculated when Y and T are single arrays, and T is binary or continuous.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
diff --git a/econml/dr/_drlearner.py b/econml/dr/_drlearner.py
index 3208c8064..e6f35e7e9 100644
--- a/econml/dr/_drlearner.py
+++ b/econml/dr/_drlearner.py
@@ -798,7 +798,7 @@ def sensitivity_interval(self, T, alpha=0.05, c_y=0.05, c_t=0.05, rho=1., interv
         The sensitivity interval is the range of values for the ATE that are
         consistent with the observed data, given a specified level of confounding.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------
@@ -848,7 +848,7 @@ def robustness_value(self, T, null_hypothesis=0, alpha=0.05, interval_type='ci')
 
         Returns 0 if the original interval already includes the null_hypothesis.
 
-        Based on `Chernozhukov et al. (2022) `_
+        Based on [Chernozhukov2022]_
 
         Parameters
         ----------

From 940d3eae48692f406ad7b96c4d08a6288af2b5b5 Mon Sep 17 00:00:00 2001
From: Fabio Vera
Date: Thu, 5 Jun 2025 15:24:06 -0400
Subject: [PATCH 3/3] update wording

Signed-off-by: Fabio Vera
---
 doc/spec/validation.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/spec/validation.rst b/doc/spec/validation.rst
index 7a2177083..859c2e9b4 100644
--- a/doc/spec/validation.rst
+++ b/doc/spec/validation.rst
@@ -24,7 +24,7 @@ have access to ``sensitivity_interval``, ``robustness_value``, and ``sensitivity
 
 
 ``robustness_value`` computes the minimum level of unobserved confounding required
-to make it impossible to reject a null hypothesis (default 0).
+so that confidence intervals around the ATE would begin to include the given point (0 by default).
 
 
 ``sensitivity_summary`` provides a summary of the two above methods.