.. _sensitivity:

Sensitivity analysis
------------------------

The :ref:`DoubleML <doubleml_package>` package implements sensitivity analysis with respect to omitted variable bias
based on `Chernozhukov et al. (2022) <https://www.nber.org/papers/w30302>`_.

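A minimal usage sketch, assuming the ``sensitivity_analysis()`` method and ``sensitivity_summary``
attribute of fitted model objects in recent versions of the Python package (the confounding-strength
values ``cf_y`` and ``cf_d`` are purely illustrative):

.. code-block:: python

    import numpy as np
    import doubleml as dml
    from doubleml.datasets import make_plr_CCDDHNR2018
    from sklearn.ensemble import RandomForestRegressor

    # simulate data from a partially linear model and fit a DoubleML PLR model
    np.random.seed(1234)
    dml_data = make_plr_CCDDHNR2018(alpha=0.5, n_obs=500)
    ml_l = RandomForestRegressor(n_estimators=100)
    ml_m = RandomForestRegressor(n_estimators=100)
    dml_plr = dml.DoubleMLPLR(dml_data, ml_l=ml_l, ml_m=ml_m)
    dml_plr.fit()

    # run the sensitivity analysis for a given confounding scenario
    dml_plr.sensitivity_analysis(cf_y=0.03, cf_d=0.03, rho=1.0)
    print(dml_plr.sensitivity_summary)
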
General algorithm
+++++++++++++++++

Currently, the sensitivity analysis is only available for linear models.

Assume that we can write the model in the following representation

.. math::

    \theta_0 = \mathbb{E}[m(W,g_0)],

where usually :math:`g_0(W) = \mathbb{E}[Y|X, D]`.
As long as :math:`\mathbb{E}[m(W,f)]` is a continuous linear functional of :math:`f`, there exists a unique square
integrable random variable :math:`\alpha_0(W)`, called the Riesz representer
(see `Riesz representation theorem <https://en.wikipedia.org/wiki/Riesz_representation_theorem>`_), such that

.. math::

    \theta_0 = \mathbb{E}[g_0(W)\alpha_0(W)].

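For example (a standard illustration, not specific to the package), for the average treatment effect
:math:`\theta_0 = \mathbb{E}[g_0(1,X) - g_0(0,X)]`, the functional is :math:`m(W,g) = g(1,X) - g(0,X)`
and the Riesz representer takes the closed form

.. math::

    \alpha_0(W) = \frac{D}{P(D=1|X)} - \frac{1-D}{P(D=0|X)}.
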
|
The target parameter :math:`\theta_0` has the following representation

.. math::

    \theta_0 = \mathbb{E}[m(W,g_0) + (Y-g_0(W))\alpha_0(W)],

which corresponds to a Neyman orthogonal score function (orthogonal with respect to the nuisance elements :math:`(g, \alpha)`).

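To make this representation concrete, a self-contained numerical sketch for the average treatment
effect, with the true nuisance functions plugged in (an illustrative simulation, not package code):

.. code-block:: python

    import numpy as np

    # ATE example: m(W, g) = g(1, X) - g(0, X) and the Riesz representer is
    # alpha_0(W) = D / e(X) - (1 - D) / (1 - e(X)) with e(X) = P(D=1|X).
    rng = np.random.default_rng(0)
    n = 100_000
    x = rng.normal(size=n)
    e_x = 1.0 / (1.0 + np.exp(-x))            # propensity score P(D=1|X)
    d = rng.binomial(1, e_x)
    y = 0.5 * d + x + rng.normal(size=n)      # true effect theta_0 = 0.5

    g = 0.5 * d + x                           # g_0(W) = E[Y|X, D]
    m_w = 0.5                                 # m(W, g_0) = g_0(1, X) - g_0(0, X)
    alpha = d / e_x - (1 - d) / (1 - e_x)     # Riesz representer alpha_0(W)

    # theta_0 = E[m(W, g_0) + (Y - g_0(W)) * alpha_0(W)]
    theta_hat = np.mean(m_w + (y - g) * alpha)
    print(theta_hat)                          # close to 0.5
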
To bound the omitted variable bias, the following further elements are needed:
the variance of the main regression

.. math::

    \sigma_0^2 := \mathbb{E}[(Y-g_0(W))^2]

and the second moment of the Riesz representer

.. math::

    \nu_0^2 := \mathbb{E}[\alpha_0(W)^2] = 2\mathbb{E}[m(W,\alpha_0)] - \mathbb{E}[\alpha_0(W)^2].

Both representations are Neyman orthogonal with respect to :math:`g` and :math:`\alpha`, respectively.
Further, define the corresponding score functions

.. math::

    \psi_{\sigma^2}(W, \sigma^2, g) &:= (Y-g(W))^2 - \sigma^2\\
    \psi_{\nu^2}(W, \nu^2, \alpha) &:= 2m(W,\alpha) - \alpha(W)^2 - \nu^2.

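Both moments can be estimated as simple sample averages of these score elements; a self-contained
sketch in the same illustrative ATE setup as above, where
:math:`m(W,\alpha_0) = \alpha_0(1,X) - \alpha_0(0,X) = 1/e(X) + 1/(1-e(X))`:

.. code-block:: python

    import numpy as np

    # plug-in estimates of sigma_0^2 and nu_0^2 in the illustrative ATE example
    rng = np.random.default_rng(1)
    n = 100_000
    x = rng.normal(size=n)
    e_x = 1.0 / (1.0 + np.exp(-x))                # propensity score P(D=1|X)
    d = rng.binomial(1, e_x)
    y = 0.5 * d + x + rng.normal(size=n)

    g = 0.5 * d + x                               # outcome regression g_0
    alpha = d / e_x - (1 - d) / (1 - e_x)         # Riesz representer alpha_0
    m_alpha = 1.0 / e_x + 1.0 / (1.0 - e_x)       # m(W, alpha_0) for the ATE

    sigma2_hat = np.mean((y - g) ** 2)            # E[(Y - g_0(W))^2], here 1
    nu2_hat = np.mean(2 * m_alpha - alpha ** 2)   # E[2 m(W, alpha_0) - alpha_0(W)^2]
    print(sigma2_hat, nu2_hat)
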
|
Recall that the parameter :math:`\theta_0` is identified via the moment condition

.. math::

    \theta_0 = \mathbb{E}[m(W,g_0)].

If :math:`W=(Y, D, X)` does not include all confounding variables, the "true" target parameter :math:`\tilde{\theta}_0`
would only be identified via the extended (or "long") form

.. math::

    \tilde{\theta}_0 = \mathbb{E}[m(\tilde{W},\tilde{g}_0)],

where :math:`\tilde{W}=(Y, D, X, A)` includes the unobserved confounders :math:`A`.
In Theorem 2 of their paper, `Chernozhukov et al. (2022) <https://www.nber.org/papers/w30302>`_ bound the omitted variable bias

.. math::

    |\tilde{\theta}_0 - \theta_0|^2 = \rho^2 B^2,

where

.. math::

    B^2 := \mathbb{E}\Big[\big(g(W) - \tilde{g}(\tilde{W})\big)^2\Big]\mathbb{E}\Big[\big(\alpha(W) - \tilde{\alpha}(\tilde{W})\big)^2\Big]

denotes the product of the additional variations in the outcome regression and the Riesz representer generated by omitted confounders, and

.. math::

    \rho^2 := \textrm{Cor}^2\Big(g(W) - \tilde{g}(\tilde{W}),\alpha(W) - \tilde{\alpha}(\tilde{W})\Big)

denotes the squared correlation between these deviations. Further, the bound can be expressed as

.. math::

    B^2 = S^2 C_Y^2 C_D^2,

where

.. math::

    S^2 &:= \mathbb{E}\Big[\big(Y - g(W)\big)^2\Big]\mathbb{E}\big[\alpha(W)^2\big]

    C_Y^2 &:= R^2_{Y-g \sim \tilde{g}-g}

    C_D^2 &:= \frac{1 - R^2_{\tilde{\alpha} \sim \alpha}}{R^2_{\tilde{\alpha} \sim \alpha}}.

Here, for general random variables :math:`U` and :math:`V`,

.. math::

    R^2_{U \sim V} := \frac{\textrm{Var}(V)}{\textrm{Var}(U)}

is defined as the variance ratio.

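For instance, in a purely hypothetical confounding scenario with :math:`R^2_{Y-g \sim \tilde{g}-g} = 0.03`
and :math:`R^2_{\tilde{\alpha} \sim \alpha} = 0.97`, the formulas above give

.. math::

    C_Y^2 = 0.03, \qquad C_D^2 = \frac{1 - 0.97}{0.97} \approx 0.031.
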
Let :math:`\psi(W,\theta,\eta)` denote the (correctly scaled) score function for the target parameter :math:`\theta_0`.
Finally, for specified values of :math:`C_Y^2` and :math:`C_D^2` (and of :math:`\rho^2`), the bias bound :math:`|\rho| B`
can be estimated, which yields the bounds :math:`\theta_0 \pm |\rho| B` on the "true" target parameter
:math:`\tilde{\theta}_0`; combined with the score :math:`\psi(W,\theta,\eta)`, these bounds can further be equipped with
confidence intervals that account for estimation uncertainty.
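
A sketch of this last step with all inputs as plain numbers (``cf_y`` and ``cf_d`` stand in for
:math:`C_Y^2` and :math:`C_D^2`; every value is hypothetical):

.. code-block:: python

    import math

    def bias_bound(sigma2, nu2, cf_y, cf_d, rho=1.0):
        """Sketch of |rho| * B with B^2 = S^2 * C_Y^2 * C_D^2, S^2 = sigma2 * nu2."""
        b2 = sigma2 * nu2 * cf_y * cf_d
        return abs(rho) * math.sqrt(b2)

    theta_hat = 0.5    # hypothetical point estimate of theta_0
    bound = bias_bound(sigma2=1.0, nu2=4.5, cf_y=0.03, cf_d=0.03, rho=1.0)
    print(theta_hat - bound, theta_hat + bound)   # bounds on the "true" parameter
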
For more detail and interpretations see `Chernozhukov et al. (2022) <https://www.nber.org/papers/w30302>`_.

.. _sensitivity-implementation:

Implemented sensitivity procedures
++++++++++++++++++++++++++++++++++++

This section contains the implementation details for each specific model.

.. _plr-sensitivity:

Partially linear regression model (PLR)
***************************************

.. _irm-sensitivity:

Interactive regression model (IRM)
**********************************