Adapt nuisance est for `IV-type` score (PLR) & new score `IV-type` for PLIV #151

MalteKurz · 2022-05-20T09:30:31Z

Description

PLR

Nuisance estimation for the IV-type score: This PR adapts the nuisance estimation for the IV-type score in the PLR model to be in line with the DML paper, Chernozhukov et al. (2018).
- Results for the default score='partialling out' (Equation (4.4) in Chernozhukov et al. (2018)) are not affected by the changes in this PR. However, the nuisance learner is renamed from ml_g to ml_l (analogously, predictions g_hat are renamed to l_hat, etc.) to be better in line with Chernozhukov et al. (2018). To make the transition to the new naming smooth, deprecation warnings have been added (see below for an overview of the API changes and examples of the deprecation warnings).
- For score='IV-type' (Equation (4.3) in Chernozhukov et al. (2018)) the implementation now follows the approach described on pp. C31-C33 in Chernozhukov et al. (2018). This means that an initial estimate for theta_0 is obtained via the 'partialling out' score. Then an estimate for g_0(X) is obtained by regressing Y - theta_0 * D on X. Therefore, an additional learner (not needed to evaluate the score) needs to be provided, i.e., the nuisance function l_0(X) (needed for the preliminary theta_0 estimate) is estimated with learner ml_l and g_0(X) with learner ml_g. To make the transition to the new API (additional learner) smooth, deprecation warnings have been added (see below for an overview of the API changes and examples of the deprecation warnings). In particular, if only ml_l is specified but not ml_g, then ml_g = clone(ml_l) is used and a warning is issued.
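The estimation steps for the IV-type score described above can be sketched as follows. This is a minimal illustration, not the package's implementation: it uses simulated data, plain least squares (the hypothetical helper `ols_predict`) as a stand-in for the learners ml_l, ml_m and ml_g, and it omits cross-fitting.

```python
import numpy as np


def ols_predict(X, target):
    """Hypothetical stand-in for a learner: least-squares fit with intercept,
    returning in-sample fitted values."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, target, rcond=None)
    return X1 @ coef


# Simulated PLR data (assumed for illustration): Y = theta_0 * D + g_0(X) + U
rng = np.random.default_rng(42)
n, theta_0 = 2000, 0.5
X = rng.normal(size=(n, 3))
D = X @ np.array([0.8, -0.4, 0.2]) + rng.normal(size=n)
Y = theta_0 * D + X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

# Step 1: nuisance estimates needed for the 'partialling out' score
l_hat = ols_predict(X, Y)  # role of ml_l: E[Y | X]
m_hat = ols_predict(X, D)  # role of ml_m: E[D | X]

# Step 2: preliminary estimate of theta_0 via the 'partialling out' score
v_hat = D - m_hat
theta_init = np.mean(v_hat * (Y - l_hat)) / np.mean(v_hat * v_hat)

# Step 3: estimate g_0(X) by regressing Y - theta_init * D on X (role of ml_g)
g_hat = ols_predict(X, Y - theta_init * D)

# Step 4: evaluate the IV-type score psi = psi_a * theta + psi_b and solve for theta
psi_a = -D * v_hat
psi_b = v_hat * (Y - g_hat)
theta_iv = -np.mean(psi_b) / np.mean(psi_a)
```

With linear in-sample fits as used here, `theta_iv` and `theta_init` coincide up to floating-point error; with flexible learners and cross-fitting the two estimates need not be identical.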

PLIV

This PR implements a new score function for the PLIV model:
- Results for the default score='partialling out' (Equation (4.8) in Chernozhukov et al. (2018)) are not affected by the changes in this PR. However, the nuisance learner is renamed from ml_g to ml_l (analogously, predictions g_hat are renamed to l_hat, etc.) to be better in line with Chernozhukov et al. (2018). To make the transition to the new naming smooth, deprecation warnings have been added (see below for examples).
- A new score='IV-type' (Equation (4.7) in Chernozhukov et al. (2018)) is now available for the PLIV model. The estimation of the nuisance parts follows the approach described on p. C33 in Chernozhukov et al. (2018). This means that an initial estimate for theta_0 is obtained via the 'partialling out' score. Then an estimate for g_0(X) is obtained by regressing Y - theta_0 * D on X. Therefore, two additional learners (not needed to evaluate the score) need to be provided, i.e., the nuisance functions l_0(X) and r_0(X) (needed for the preliminary theta_0 estimate) are estimated with the learners ml_l and ml_r, respectively. g_0(X) is estimated with learner ml_g.
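The same procedure for the PLIV model can be sketched along these lines. Again, this is only an illustration under assumptions: simulated data with one instrument, least squares (the hypothetical helper `ols_predict`) in place of the learners ml_l, ml_r, ml_m and ml_g, and no cross-fitting.

```python
import numpy as np


def ols_predict(X, target):
    """Hypothetical stand-in for a learner: least-squares fit with intercept,
    returning in-sample fitted values."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, target, rcond=None)
    return X1 @ coef


# Simulated PLIV data (assumed for illustration); U and V are correlated,
# so D is endogenous and the instrument Z is needed.
rng = np.random.default_rng(7)
n, theta_0 = 5000, 1.0
X = rng.normal(size=(n, 3))
Z = X @ np.array([0.5, 0.2, -0.1]) + rng.normal(size=n)
V = rng.normal(size=n)
U = 0.6 * V + rng.normal(size=n)
D = 0.7 * Z + X @ np.array([0.3, -0.2, 0.4]) + V
Y = theta_0 * D + X @ np.array([1.0, 0.5, -0.3]) + U

# Step 1: nuisance estimates needed for the 'partialling out' score
l_hat = ols_predict(X, Y)  # role of ml_l: E[Y | X]
r_hat = ols_predict(X, D)  # role of ml_r: E[D | X]
m_hat = ols_predict(X, Z)  # role of ml_m: E[Z | X]

# Step 2: preliminary estimate of theta_0 via the 'partialling out' score
z_res = Z - m_hat
theta_init = np.mean(z_res * (Y - l_hat)) / np.mean(z_res * (D - r_hat))

# Step 3: estimate g_0(X) by regressing Y - theta_init * D on X (role of ml_g)
g_hat = ols_predict(X, Y - theta_init * D)

# Step 4: evaluate the IV-type score psi = psi_a * theta + psi_b and solve for theta
psi_a = -D * z_res
psi_b = z_res * (Y - g_hat)
theta_iv = -np.mean(psi_b) / np.mean(psi_a)
```

Note that l_hat and r_hat enter only through the preliminary estimate `theta_init`; the final score uses m_hat and g_hat.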

API changes

PLR

API changed from DoubleMLPLR(obj_dml_data, ml_g, ml_m [, ...]) to DoubleMLPLR(obj_dml_data, ml_l, ml_m, ml_g [, ...]).
- For score='partialling out' ml_l & ml_m are needed.
- For score='IV-type' ml_l, ml_m & ml_g are needed.
- For callable scores ml_l & ml_m are mandatory and ml_g optional.
The signature of callable scores changed from psi_a, psi_b = score(y, d, g_hat, m_hat, smpls) to psi_a, psi_b = score(y, d, l_hat, m_hat, g_hat, smpls).
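As an illustration of the new signature, a callable that reproduces the 'partialling out' score for the PLR model might look like the sketch below. The function name `partialling_out_score` and the toy arrays are made up for this example; passing `g_hat=None` and `smpls=None` in the direct call is an assumption made only to exercise the function outside the package.

```python
import numpy as np


def partialling_out_score(y, d, l_hat, m_hat, g_hat, smpls):
    """Callable score with the new PLR signature.

    g_hat and smpls are part of the signature but not used by this score.
    Returns the components of the linear score psi = psi_a * theta + psi_b.
    """
    v_hat = d - m_hat
    psi_a = -v_hat * v_hat
    psi_b = v_hat * (y - l_hat)
    return psi_a, psi_b


# Toy inputs, called with keyword arguments only
y = np.array([1.0, 2.0, 3.0])
d = np.array([0.5, 1.0, 2.0])
l_hat = np.array([0.5, 1.5, 2.0])
m_hat = np.array([0.0, 0.5, 1.0])

psi_a, psi_b = partialling_out_score(
    y=y, d=d, l_hat=l_hat, m_hat=m_hat, g_hat=None, smpls=None
)
theta_hat = -psi_b.mean() / psi_a.mean()  # solves mean(psi_a) * theta + mean(psi_b) = 0
```

Such a function would then be passed as the score argument, e.g. `score=partialling_out_score`, when constructing a DoubleMLPLR object.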

PLIV

API changed from DoubleMLPLIV(obj_dml_data, ml_g, ml_m, ml_r [, ...]) to DoubleMLPLIV(obj_dml_data, ml_l, ml_m, ml_r, ml_g [, ...]).
- For score='partialling out' ml_l, ml_m & ml_r are needed.
- For score='IV-type' ml_l, ml_m, ml_r & ml_g are needed.
- For callable scores ml_l, ml_m & ml_r are mandatory and ml_g optional.
The signature of callable scores changed from psi_a, psi_b = score(y, z, d, g_hat, m_hat, r_hat, smpls) to psi_a, psi_b = score(y, z, d, l_hat, m_hat, r_hat, g_hat, smpls).
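For the PLIV model, a callable with the new signature that evaluates the IV-type score could be sketched as follows. The function name `pliv_iv_type_score` and the toy arrays are hypothetical and serve only to show the argument order and the returned components.

```python
import numpy as np


def pliv_iv_type_score(y, z, d, l_hat, m_hat, r_hat, g_hat, smpls):
    """Callable score with the new PLIV signature.

    l_hat, r_hat and smpls are part of the signature but not used by the
    IV-type score. Returns the components of psi = psi_a * theta + psi_b.
    """
    z_res = z - m_hat
    psi_a = -d * z_res
    psi_b = z_res * (y - g_hat)
    return psi_a, psi_b


# Toy inputs, called with keyword arguments only
y = np.array([2.0, 3.0, 5.0])
z = np.array([1.0, 2.0, 4.0])
d = np.array([1.0, 2.0, 3.0])
m_hat = np.array([0.5, 1.0, 2.0])
g_hat = np.array([1.0, 1.0, 2.0])

psi_a, psi_b = pliv_iv_type_score(
    y=y, z=z, d=d, l_hat=None, m_hat=m_hat, r_hat=None, g_hat=g_hat, smpls=None
)
theta_hat = -psi_b.mean() / psi_a.mean()
```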

Deprecation warnings for the API changes for `DoubleMLPLR` and `DoubleMLPLIV`

Initialization code for the following code examples:

import numpy as np
import doubleml as dml
from doubleml.datasets import make_plr_CCDDHNR2018, make_pliv_CHS2015
from sklearn.ensemble import RandomForestRegressor
from sklearn.base import clone

learner = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
ml_l = clone(learner)
ml_m = clone(learner)
ml_r = clone(learner)
ml_g = clone(learner)
plr_data = make_plr_CCDDHNR2018(n_obs=500)
pliv_data = make_pliv_CHS2015(n_obs=500)

For PLR & PLIV with score='partialling out' and if the learners are provided as positional arguments, nothing changed.

dml_plr_obj = dml.DoubleMLPLR(plr_data, ml_l, ml_m, score='partialling out')
dml_pliv_obj = dml.DoubleMLPLIV(pliv_data, ml_l, ml_m, ml_r, score='partialling out')

--> Note, however, that if arguments other than the learners have also been passed positionally, the changed API raises exceptions, because the additional learner was added as the fourth (PLR) / fifth (PLIV) argument.

For PLR with score='partialling out' and keyword arguments ml_g and ml_m (old API naming), the learner provided for ml_g is used for ml_l and a warning is issued.

dml_plr_obj = dml.DoubleMLPLR(plr_data, ml_g=ml_g, ml_m=ml_m, score='partialling out')

DeprecationWarning: The required positional argument ml_g was renamed to ml_l. Please adapt the argument name accordingly. ml_g is redirected to ml_l. The redirection will be removed in a future version.

For PLR with score='IV-type' and keyword arguments ml_g and ml_m (old API naming), the learner provided for ml_g is also used for ml_l and a warning is issued. (Note that it is first redirected to ml_l and then cloned to ml_g.)

dml_plr_obj = dml.DoubleMLPLR(plr_data, ml_g=ml_g, ml_m=ml_m, score='IV-type')

DeprecationWarning: The required positional argument ml_g was renamed to ml_l. Please adapt the argument name accordingly. ml_g is redirected to ml_l. The redirection will be removed in a future version.
UserWarning: For score = 'IV-type', learners ml_l and ml_g should be specified. Set ml_g = clone(ml_l).

For PLR with score='IV-type' and only two learners as positional arguments, the learner provided for ml_l is cloned and used for ml_g, and a warning is issued.

dml_plr_obj = dml.DoubleMLPLR(plr_data, ml_l, ml_m, score='IV-type')

UserWarning: For score = 'IV-type', learners ml_l and ml_g should be specified. Set ml_g = clone(ml_l).
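The redirect-and-clone behaviour shown in the examples above can be mimicked with a small stand-alone sketch. The helper `resolve_plr_learners` is hypothetical and is not the code used in the package; `copy.deepcopy` stands in for `sklearn.base.clone`, and plain dictionaries stand in for learners.

```python
import warnings
from copy import deepcopy


def resolve_plr_learners(ml_l=None, ml_m=None, ml_g=None, score='partialling out'):
    """Hypothetical helper that mimics the described transition logic."""
    if ml_l is None and ml_g is not None:
        # Old API naming: ml_g was given where ml_l is expected
        warnings.warn('The required positional argument ml_g was renamed to ml_l. '
                      'ml_g is redirected to ml_l.', DeprecationWarning)
        ml_l, ml_g = ml_g, None
    if score == 'IV-type' and ml_g is None:
        warnings.warn("For score = 'IV-type', learners ml_l and ml_g should be "
                      "specified. Set ml_g = clone(ml_l).", UserWarning)
        ml_g = deepcopy(ml_l)  # stand-in for sklearn.base.clone
    return ml_l, ml_m, ml_g


old_g = {'name': 'forest for the outcome regression'}  # stand-in learner
old_m = {'name': 'forest for the treatment regression'}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # Old-style keyword call combined with the IV-type score
    ml_l_res, ml_m_res, ml_g_res = resolve_plr_learners(
        ml_g=old_g, ml_m=old_m, score='IV-type'
    )

categories = [w.category for w in caught]
```

In this sketch the old-style call triggers both warnings in sequence, first the redirect to ml_l and then the clone to ml_g, matching the order of the two messages listed above.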

For PLR & PLIV with score='partialling out', the methods set_ml_nuisance_params and tune redirect ml_g to ml_l.

dml_plr_obj = dml.DoubleMLPLR(plr_data, ml_l, ml_m, score='partialling out')
dml_plr_obj.set_ml_nuisance_params('ml_g', 'd', {'n_estimators':100, 'max_features':20})

DeprecationWarning: Learner ml_g was renamed to ml_l. Please adapt the argument learner accordingly. The provided parameters are set for ml_l. The redirection will be removed in a future version.

Miscellaneous

When the score is set to a callable, it will in the future be called with keyword arguments only (instead of positional arguments). This is "safer" and, to a certain degree, indirectly checks that the signature of the function is as expected (see the documentation entry of the argument score for the expected signature). This was implemented for all model classes PLR, PLIV, IRM & IIVM.
The website, user guide, etc. will be updated to reflect the changes of this PR, see "Update of the basics of DML article; new score for PLIV; adaption due to changed API of DoubleMLPLR & DoubleMLPLIV" (doubleml-docs#73).
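A small stand-alone sketch of why keyword-only calls are safer: a score function written against the old signature fails loudly, and a function whose parameters are declared in a different order still receives the right inputs. All three functions and the scalar inputs are made up for illustration.

```python
def new_score(y, d, l_hat, m_hat, g_hat, smpls):
    return -(d - m_hat) ** 2, (d - m_hat) * (y - l_hat)


def old_score(y, d, g_hat, m_hat, smpls):  # outdated signature without l_hat
    return -(d - m_hat) ** 2, (d - m_hat) * (y - g_hat)


def swapped_order_score(y, d, m_hat, l_hat, g_hat, smpls):  # m_hat and l_hat swapped
    return -(d - m_hat) ** 2, (d - m_hat) * (y - l_hat)


inputs = dict(y=3.0, d=2.0, l_hat=1.0, m_hat=0.5, g_hat=None, smpls=None)

# Keyword call: works for the expected signature
expected = new_score(**inputs)

# Keyword call: an outdated signature raises instead of silently misusing inputs
try:
    old_score(**inputs)
    old_signature_rejected = False
except TypeError:
    old_signature_rejected = True

# Keyword call is independent of the declared parameter order ...
by_keyword = swapped_order_score(**inputs)
# ... whereas a positional call in the order (y, d, l_hat, m_hat, g_hat, smpls)
# would silently feed l_hat into m_hat and vice versa.
by_position = swapped_order_score(3.0, 2.0, 1.0, 0.5, None, None)
```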

PR Checklist

The title of the pull request summarizes the changes made.
The PR contains a detailed description of all changes and additions.
The code passes all (unit) tests.
Enhancements or new features are equipped with unit tests.
The changes adhere to the PEP8 standards.

PhilippBach

LGTM

MalteKurz added 30 commits February 10, 2022 16:35

adapt the nuisance estimation for the IV type score in the PLR model

c86b332

some fixes for the adaption of nuisance learning for the IV-type score implemented in c86b332

174ebad

align unit tests with the adapted implementation of nuisance learning for IV-type score c86b332

dd9ae4c

refactor the functional PLR implementation

64866df

update the documentation for the PLR model

d99ff13

fix documentation and error message

f5287e6

fix unit tests after refactoring in 64866df

acc4da6

fix unit tests after refactoring in 64866df

f562087

Merge branch 'master' of github.com:DoubleML/doubleml-for-py into m-iv-type-score

a3f56f1

renamed ml_g to ml_l; Add additional learner ml_g for IV-type score

bb870cc

update documentation

4e6ecaf

simplify code to set learner and predict_method

81edf45

additional changes for the new API with ml_g and ml_l

405dc58

fix deprecation warning and add corresponding unit tests

93ad699

change order of predictions to be consistently l_hat, m_hat, g_hat

43a74bb

add unit test with callable for IV-type score

27980cf

increase number of observations in dummy test cases

16a90d4

added some additional unit tests for non-orth scores via callable scores

a7b6972

renamed ml_g into ml_l

7d0e0b7

also rename g_hat into l_hat; start implementing the IV-type score

871b762

renaming of ml_g to ml_l also in the unit tests

b93e0e8

also align names in the tuning parts with the renamed learner ml_g to ml_l

f504590

adapt unit test after renaming ml_g to ml_l

22a2a72

implementation of the IV-type score for the PLIV model

80a0c6e

remove assert; score could also be a callable

ff944df

a couple more renamings ml_g to ml_l for consistency

dc0ef90

Merge branch 'm-plr-api' of github.com:DoubleML/doubleml-for-py into m-pliv-iv-type

c718590

refactor the check and set learner part in the initializer

58a4674

pass n_folds as keyword argument to fix unit tests

1dadf61

Merge branch 'm-plr-api' of github.com:DoubleML/doubleml-for-py into m-pliv-iv-type

a563dd0

MalteKurz added 8 commits May 5, 2022 16:12

fix notation in docu

577154f

provide all arguments as keyword arguments to callable scores

1fca759

Merge branch 'master' of github.com:DoubleML/doubleml-for-py into m-iv-type-score

7067cf7

Merge branch 'm-iv-type-score' of github.com:DoubleML/doubleml-for-py into m-plr-api

ae90022

Merge branch 'm-plr-api' of github.com:DoubleML/doubleml-for-py into m-pliv-iv-type

9aecbae

docu update

51d64c1

IV type score is new for PLIV; no need to warn; instead throw an exception if not all four learners ml_l, ml_m, ml_r and ml_g are set

bfdb9fc

add an additional exception handling unit test for pliv IV-type with learner ml_g=None

9a513f0

MalteKurz added the enhancement extension of existing feature label May 20, 2022

MalteKurz added 6 commits May 20, 2022 11:37

pep8 fixes

3ce617f

add ignore for pylint in exception handling unit test

4cc0320

add ignore for pylint in exception handling unit test

76aeea8

with IV-type score, three learners should be specified

efcdd13

added another exception handling / warnings unit test

bc7c2c7

tune g only if it is necessary

7dcbceb

MalteKurz mentioned this pull request May 20, 2022

Update of the basics of DML article; new score for PLIV; adaption due to changed API of DoubleMLPLR & DoubleMLPLIV DoubleML/doubleml-docs#73

Merged

MalteKurz requested a review from PhilippBach May 23, 2022 08:59

MalteKurz added 3 commits June 10, 2022 08:32

add a warning if a learner ml_g is specified (but not needed) with score partialling out

db74e05

fix unit tests after the newly introduced warning if ml_g is specified with score ml_g (db74e05)

8eb4de7

added a unit test for the new warning (see db74e05)

8875602

PhilippBach mentioned this pull request Jun 10, 2022

Reminder: Adapt new example notebooks according to change in 'IV-type' score DoubleML/doubleml-docs#78

Open

Merge branch 'master' of github.com:DoubleML/doubleml-for-py into m-pliv-iv-type

4346b23

MalteKurz mentioned this pull request Jun 13, 2022

Release notes for the R and Python pkg in version 0.5.0 DoubleML/doubleml-docs#79

Merged

PhilippBach approved these changes Jun 13, 2022

MalteKurz merged commit 47fb25d into master Jun 14, 2022

PhilippBach added a commit to DoubleML/doubleml-docs that referenced this pull request Jun 14, 2022

da20407

upd preprocessing notebook according to changes in DoubleML/doubleml-for-py#151

PhilippBach mentioned this pull request Jun 14, 2022

Update Preprocessing Notebook for Demand Elasticity Example DoubleML/doubleml-docs#80

Merged

MalteKurz deleted the m-pliv-iv-type branch June 15, 2022 07:31
