Commit 686bd46

Merge branch 'm-nonlinear-score-mixin' into dev
2 parents afc437e + 4930300 commit 686bd46

3 files changed: 37 additions, 11 deletions

doc/api/api.rst

Lines changed: 13 additions & 1 deletion
@@ -56,4 +56,16 @@ Dataset generators
     datasets.make_irm_data
     datasets.make_iivm_data
     datasets.make_plr_turrell2018
-    datasets.make_pliv_multiway_cluster_CKMS2021
+    datasets.make_pliv_multiway_cluster_CKMS2021
+
+Score mixin classes for double machine learning models
+------------------------------------------------------
+
+.. currentmodule:: doubleml
+
+.. autosummary::
+    :toctree: generated/
+    :template: class.rst
+
+    double_ml_score_mixins.LinearScoreMixin
+    double_ml_score_mixins.NonLinearScoreMixin

doc/guide/scores.rst

Lines changed: 12 additions & 8 deletions
@@ -21,12 +21,8 @@ and that obey the **Neyman orthogonality condition**
 
     \partial_{\eta} \mathbb{E}[ \psi(W; \theta_0, \eta)] \bigg|_{\eta=\eta_0} = 0.
 
-An integral component for the object-oriented (OOP) implementation of
-``DoubleMLPLR``,
-``DoubleMLPLIV``,
-``DoubleMLIRM``,
-and ``DoubleMLIIVM``
-is the linearity of the score function in the parameter :math:`\theta`
+The score functions of many double machine learning models (PLR, PLIV, IRM, IIVM) are linear in the parameter
+:math:`\theta`, i.e.,
 
 .. math::
 
@@ -43,7 +39,14 @@ general way.
 The methods and algorithms to estimate the causal parameters, to estimate their standard errors, to perform a multiplier
 bootstrap, to obtain confidence intervals and many more are implemented in the abstract base class ``DoubleML``.
 The object-oriented architecture therefore allows for easy extension to new model classes for double machine learning.
-This is doable with very minor effort whenever the linearity of the score function is satisfied.
+This is doable with very minor effort.
+
+If the linearity of the score function is not satisfied, the computations are more involved.
+In the Python package ``DoubleML``, the functionality around the score functions is implemented in mixin classes called
+``LinearScoreMixin`` and ``NonLinearScoreMixin``.
+The R package currently only comes with an implementation for linear score functions.
+In case of a non-linear score function, the parameter estimate :math:`\tilde{\theta}_0` is obtained via numerical root
+search of the empirical analog of the moment condition :math:`\mathbb{E}[ \psi(W; \theta_0, \eta_0)] = 0`.
 
 Implementation of the score function and the estimate of the causal parameter
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@@ -106,7 +109,8 @@ stores the estimate :math:`\tilde{\theta}_0` in its ``coef`` attribute.
         print(dml_plr_obj$coef)
 
 The values of the score function components :math:`\psi_a(W_i; \hat{\eta}_0)` and :math:`\psi_b(W_i; \hat{\eta}_0)`
-are stored in the attributes ``psi_a`` and ``psi_b``.
+are stored in the attributes ``psi_elements['psi_a']`` and ``psi_elements['psi_b']`` (Python package ``DoubleML``)
+and ``psi_a`` and ``psi_b`` (R package ``DoubleML``).
 In the attribute ``psi`` the values of the score function :math:`\psi(W_i; \tilde{\theta}_0, \hat{\eta}_0)` are stored.
 
 .. tabbed:: Python
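
The numerical root search that ``doc/guide/scores.rst`` now describes for non-linear scores can be illustrated with a small sketch. It is not the code of ``NonLinearScoreMixin``. It assumes a hypothetical toy score :math:`\psi(W; \theta) = W - \exp(\theta)` without nuisance functions and uses plain bisection as the root finder. The names ``psi``, ``empirical_moment`` and ``solve_by_bisection`` are made up for this illustration.

```python
import math
import random


def psi(w, theta):
    """Toy score that is non-linear in theta (assumption, not a DoubleML score)."""
    return w - math.exp(theta)


def empirical_moment(sample, theta):
    """Empirical analog of E[psi(W; theta)]: the sample mean of the score."""
    return sum(psi(w, theta) for w in sample) / len(sample)


def solve_by_bisection(sample, lower, upper, tol=1e-12, max_iter=200):
    """Find theta with empirical_moment(sample, theta) = 0 by bisection.

    Requires a sign change of the moment function on [lower, upper].
    """
    f_lower = empirical_moment(sample, lower)
    f_upper = empirical_moment(sample, upper)
    if f_lower * f_upper > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lower + upper)
        f_mid = empirical_moment(sample, mid)
        if f_mid == 0.0 or upper - lower < tol:
            return mid
        if f_lower * f_mid < 0:
            upper = mid  # root lies in [lower, mid]
        else:
            lower, f_lower = mid, f_mid  # root lies in [mid, upper]
    return 0.5 * (lower + upper)


random.seed(42)
sample = [random.uniform(1.0, 3.0) for _ in range(1000)]

theta_tilde = solve_by_bisection(sample, lower=-5.0, upper=5.0)

# For this toy score the root is known in closed form: log of the sample mean.
closed_form = math.log(sum(sample) / len(sample))
print(theta_tilde, closed_form)
```

For this toy score the root is available in closed form, which makes it easy to check the numerical solution. For scores without a closed-form root, only the numerical search remains.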

doc/guide/se_confint.rst

Lines changed: 12 additions & 2 deletions
@@ -19,15 +19,25 @@ with mean zero and variance given by
 
     \sigma^2 := J_0^{-2} \mathbb{E}(\psi^2(W; \theta_0, \eta_0)),
 
-    J_0 = \mathbb{E}(\psi_a(W; \eta_0)).
+where :math:`J_0 = \mathbb{E}(\psi_a(W; \eta_0))`, if the score function is linear in the parameter :math:`\theta`.
+If the score is not linear in the parameter :math:`\theta`, then
+:math:`J_0 = \partial_\theta\mathbb{E}(\psi(W; \theta, \eta_0)) \big|_{\theta=\theta_0}`.
 
 Estimates of the variance are obtained by
 
 .. math::
 
     \hat{\sigma}^2 &= \hat{J}_0^{-2} \frac{1}{N} \sum_{k=1}^{K} \sum_{i \in I_k} \big[\psi(W_i; \tilde{\theta}_0, \hat{\eta}_{0,k})\big]^2,
 
-    \hat{J}_0 &= \frac{1}{N} \sum_{k=1}^{K} \sum_{i \in I_k} \psi_a(W_i; \hat{\eta}_{0,k}).
+    \hat{J}_0 &= \frac{1}{N} \sum_{k=1}^{K} \sum_{i \in I_k} \psi_a(W_i; \hat{\eta}_{0,k}),
+
+for score functions being linear in the parameter :math:`\theta`.
+For non-linear score functions, the implementation assumes that derivatives and expectations are interchangeable, so
+that
+
+.. math::
+
+    \hat{J}_0 = \frac{1}{N} \sum_{k=1}^{K} \sum_{i \in I_k} \partial_\theta \psi(W_i; \tilde{\theta}_0, \hat{\eta}_{0,k}).
 
 An approximate confidence interval is given by
 
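
The variance formulas for a non-linear score in ``doc/guide/se_confint.rst`` can be traced with a short sketch. It is not the package's implementation. It reuses the hypothetical toy score :math:`\psi(W; \theta) = W - \exp(\theta)`, ignores nuisance functions and sample splitting (a single fold), and all function and variable names are assumptions made for this illustration. The standard error is taken as :math:`\hat{\sigma} / \sqrt{N}`.

```python
import math
import random


def psi(w, theta):
    """Toy score that is non-linear in theta (assumption, not a DoubleML score)."""
    return w - math.exp(theta)


def dpsi_dtheta(w, theta):
    """Derivative of the toy score with respect to theta."""
    return -math.exp(theta)


random.seed(0)
sample = [random.uniform(1.0, 3.0) for _ in range(2000)]
n_obs = len(sample)
mean_w = sum(sample) / n_obs

# For this toy score the root of the empirical moment condition is log(mean).
theta_tilde = math.log(mean_w)

# J_0 estimate: sample mean of the derivative of the score at theta_tilde.
j_hat = sum(dpsi_dtheta(w, theta_tilde) for w in sample) / n_obs

# Variance estimate: J_hat^{-2} times the sample mean of the squared score.
sigma2_hat = sum(psi(w, theta_tilde) ** 2 for w in sample) / n_obs / j_hat ** 2

# Standard error of theta_tilde under the sqrt(N) scaling.
se = math.sqrt(sigma2_hat / n_obs)
print(j_hat, sigma2_hat, se)
```

In this example the result matches the delta-method variance of the log of a sample mean, namely the sample variance of ``W`` divided by the squared sample mean, which gives a quick plausibility check of the formulas.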
