
Commit 59859d3

committed
start sensitivity guide
1 parent 7909f43 commit 59859d3

File tree

3 files changed: +137 -3 lines changed


doc/examples/py_double_ml_sensitivity.ipynb

Lines changed: 3 additions & 3 deletions
@@ -48,7 +48,7 @@
48 48
"Both parameters determine the strength of the confounding\n",
49 49
"\n",
50 50
"- `cf_y` measures the proportion of residual variance in the outcome explained by unobserved confounders\n",
51-
"- `cf_d` measires the porportion of residual variance of the Riesz Representer generated by unobserved confounders. In the PLR\n",
51+
"- `cf_d` measires the porportion of residual variance of the Riesz representer generated by unobserved confounders. In the PLR\n",
52 52
"the following representation $$\\text{cf\\_d}=\\frac{\\eta^2_{D\\sim A|X}}{1-\\eta^2_{D\\sim A|X}},$$ where $\\eta^2_{D\\sim A|X}$ is \n",
53 53
"the nonparametric $R^2$ and measures the proportion of residual variation of the treatment explained by unobserved confounders.\n",
54 54
"\n",
@@ -234,7 +234,7 @@
234 234
"### Sensitivity Analysis\n",
235 235
"\n",
236 236
"To perform a sensitivity analysis with the [DoubleML](https://docs.doubleml.org/stable/index.html) package you can use the `sensitivity_analysis()` method. <br>\n",
237-
"The sensitivity analysis is based on the strength of the confounding `cf_y` and `cf_d` (default values $0.03$) and the parameter `rho`, which measures the correlation between the difference of the long and short form of the outcome regression and the Riesz Representer (the default value $1.0$ is conservative and considers adversarial counfounding). To additionally incorporate statistical uncertainty, a significance level (default $0.95$) is used.\n",
237+
"The sensitivity analysis is based on the strength of the confounding `cf_y` and `cf_d` (default values $0.03$) and the parameter `rho`, which measures the correlation between the difference of the long and short form of the outcome regression and the Riesz representer (the default value $1.0$ is conservative and considers adversarial counfounding). To additionally incorporate statistical uncertainty, a significance level (default $0.95$) is used.\n",
238 238
"\n",
239 239
"These input parameters are used to calculate upper and lower bounds (including the corresponding confidence level) on the treatment effect estimate. \n",
240 240
"\n",
@@ -46005,7 +46005,7 @@
46005 46005
"name": "python",
46006 46006
"nbconvert_exporter": "python",
46007 46007
"pygments_lexer": "ipython3",
46008-
"version": "3.10.10"
46008+
"version": "3.11.2"
46009 46009
},
46010 46010
"orig_nbformat": 4
46011 46011
},

doc/guide/guide.rst

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ User guide
18 18
Learners, hyperparameters and hyperparameter tuning <learners>
19 19
Variance estimation and confidence intervals <se_confint>
20 20
Sample-splitting, cross-fitting and repeated cross-fitting <resampling>
21+
Sensitivity analysis <sensitivity>
21 22

22 23

23 24
.. raw:: html

doc/guide/sensitivity.rst

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
1+
.. _sensitivity:
2+
3+
Sensitivity analysis
4+
------------------------
5+
6+
The :ref:`DoubleML <doubleml_package>` package implements sensitivity analysis with respect to omitted variable bias
7+
based on `Chernozhukov et al. (2022) <https://www.nber.org/papers/w30302>`_.
8+
9+
General algorithm
10+
+++++++++++++++++
11+
12+
Currently, the sensitivity analysis is only available for linear models.
13+
14+
Assume that the target parameter :math:`\theta_0` admits the following representation
15+
16+
.. math::
17+
18+
\theta_0 = \mathbb{E}[m(W,g_0)],
19+
20+
where usually :math:`g_0(W) = \mathbb{E}[Y|X, D]`.
21+
As long as :math:`\mathbb{E}[m(W,f)]` is a continuous linear functional of :math:`f`, there exists a unique square
22+
integrable random variable :math:`\alpha_0(W)`, called the Riesz representer
23+
(see `Riesz representation theorem <https://en.wikipedia.org/wiki/Riesz_representation_theorem>`_), such that
24+
25+
.. math::
26+
27+
\theta_0 = \mathbb{E}[g_0(W)\alpha_0(W)].
28+
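For instance, for the average treatment effect in the interactive regression model (a standard example, included here for illustration), the linear functional and its Riesz representer take the form

.. math::

    m(W, g) = g(1, X) - g(0, X), \qquad \alpha_0(W) = \frac{D}{P(D=1|X)} - \frac{1-D}{1-P(D=1|X)},

such that :math:`\mathbb{E}[g_0(W)\alpha_0(W)] = \mathbb{E}[g_0(1, X) - g_0(0, X)] = \theta_0`.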
29+
The target parameter :math:`\theta_0` has the following representation
30+
31+
.. math::
32+
33+
\theta_0 = \mathbb{E}[m(W,g_0) + (Y-g_0(W))\alpha_0(W)],
34+
35+
which corresponds to a Neyman orthogonal score function (orthogonal with respect to nuisance elements :math:`(g, \alpha)`).
36+
To bound the omitted variable bias, the following additional elements are needed.
37+
The variance of the outcome regression
38+
39+
.. math::
40+
41+
\sigma_0^2 := \mathbb{E}[(Y-g_0(W))^2]
42+
43+
and the second moment of the Riesz representer
44+
45+
.. math::
46+
47+
\nu_0^2 := \mathbb{E}[\alpha_0(W)^2] = 2\mathbb{E}[m(W,\alpha_0)] - \mathbb{E}[\alpha_0(W)^2].
48+
49+
Both representations are Neyman orthogonal with respect to :math:`g` and :math:`\alpha`, respectively.
50+
Further, define the corresponding score functions
51+
52+
.. math::
53+
54+
\psi_{\sigma^2}(W, \sigma^2, g) &:= (Y-g(W))^2 - \sigma^2\\
55+
\psi_{\nu^2}(W, \nu^2, \alpha) &:= 2m(W,\alpha) - \alpha(W)^2 - \nu^2.
56+
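As a plug-in sketch of how these two elements could be estimated from cross-fitted nuisance predictions, using the average treatment effect example from above (here ``y`` and ``d`` are the observed outcome and treatment, while ``g_hat`` and ``p_hat`` are hypothetical cross-fitted predictions of the outcome regression at the observed treatment and of the propensity score, not DoubleML API objects):

.. code-block:: python

    import numpy as np

    def sensitivity_elements_ate(y, d, g_hat, p_hat):
        # Riesz representer for the ATE, evaluated at the observed treatment
        alpha_hat = d / p_hat - (1 - d) / (1 - p_hat)
        # sigma^2 = E[(Y - g(W))^2]
        sigma2_hat = np.mean((y - g_hat) ** 2)
        # nu^2 = 2 * E[m(W, alpha)] - E[alpha(W)^2], with m(W, alpha) = alpha(1, X) - alpha(0, X)
        m_alpha = 1.0 / p_hat + 1.0 / (1.0 - p_hat)
        nu2_hat = np.mean(2.0 * m_alpha - alpha_hat ** 2)
        return sigma2_hat, nu2_hat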
57+
Recall that the parameter :math:`\theta_0` is identified via the moment condition
58+
59+
.. math::
60+
61+
\theta_0 = \mathbb{E}[m(W,g_0)].
62+
63+
If :math:`W=(Y, D, X)` does not include all confounding variables, the "true" target parameter :math:`\tilde{\theta}_0`
64+
would only be identified via the extended (or "long") form
65+
66+
.. math::
67+
68+
\tilde{\theta}_0 = \mathbb{E}[m(\tilde{W},\tilde{g}_0)],
69+
70+
where :math:`\tilde{W}=(Y, D, X, A)` includes the unobserved confounders :math:`A`.
71+
In Theorem 2 of their paper, `Chernozhukov et al. (2022) <https://www.nber.org/papers/w30302>`_ bound the omitted variable bias
72+
73+
.. math::
74+
75+
|\tilde{\theta}_0 -\theta_0|^2 = \rho^2 B^2,
76+
77+
where
78+
79+
.. math::
80+
81+
B^2 := \mathbb{E}\Big[\big(g(W) - \tilde{g}(\tilde{W})\big)^2\Big]\mathbb{E}\Big[\big(\alpha(W) - \tilde{\alpha}(\tilde{W})\big)^2\Big],
82+
83+
denotes the product of additional variations in the outcome regression and the Riesz representer generated by omitted confounders and
84+
85+
.. math::
86+
87+
\rho^2 := \textrm{Cor}^2\Big(g(W) - \tilde{g}(\tilde{W}),\alpha(W) - \tilde{\alpha}(\tilde{W})\Big),
88+
89+
denotes the squared correlation between the deviations generated by omitted confounders. Further, the bound can be expressed as
90+
91+
.. math::
92+
93+
B^2 = S^2 C_Y^2 C_D^2,
94+
95+
where
96+
97+
.. math::
98+
99+
S^2 &:= \mathbb{E}\Big[\big(Y - g(W)\big)^2\Big]\mathbb{E}\big[\alpha(W)^2\big]
100+
101+
C_Y^2 &:= R^2_{Y-g \sim \tilde{g}-g}
102+
103+
C_D^2 &:= \frac{1 - R^2_{\tilde{\alpha} \sim \alpha}}{R^2_{\tilde{\alpha} \sim \alpha}}.
104+
105+
Here, for general random variables :math:`U` and :math:`V`
106+
107+
.. math::
108+
109+
R^2_{U \sim V} := \frac{\textrm{Var}(V)}{\textrm{Var}(U)}
110+
111+
is defined as the variance ratio.
112+
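As a numerical illustration with made-up values: for adversarial confounding :math:`\rho^2 = 1`, sensitivity parameters :math:`C_Y^2 = C_D^2 = 0.03` and an estimate :math:`S^2 = 4`, the implied omitted variable bias is

.. math::

    |\tilde{\theta}_0 - \theta_0| = \sqrt{\rho^2 S^2 C_Y^2 C_D^2} = \sqrt{1 \cdot 4 \cdot 0.03 \cdot 0.03} = 0.06.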
113+
Let :math:`\psi(W,\theta,\eta)` denote the (correctly scaled) score function for the target parameter :math:`\theta_0`.
114+
Finally, for specified values of :math:`C_Y^2` and :math:`C_D^2`, these elements can be combined to construct upper and lower bounds (including corresponding confidence bounds) on the target parameter :math:`\theta_0`.
115+
116+
For more detail and interpretations see `Chernozhukov et al. (2022) <https://www.nber.org/papers/w30302>`_.
117+
118+
.. _sensitivity-implementation:
119+
120+
Implemented sensitivity procedures
121+
+++++++++++++++++++++++++++++++++++
122+
123+
This section contains the implementation details for each specific model.
124+
125+
.. _plr-sensitivity:
126+
127+
Partially linear regression model (PLR)
128+
***************************************
129+
130+
.. _irm-sensitivity:
131+
132+
Interactive regression model (IRM)
133+
**********************************
