|
.. _se_confint:

-Variance estimation and confidence intervals for a causal parameter of interest
---------------------------------------------------------------------------------
+Variance estimation and confidence intervals
+---------------------------------------------

Variance estimation
+++++++++++++++++++
@@ -168,3 +168,135 @@ string-representation of the object.

print(dml_plr_obj)

+.. _sim_inf:
+
+Confidence bands and multiplier bootstrap for valid simultaneous inference
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+:ref:`DoubleML <doubleml_package>` provides methods to perform valid simultaneous inference for multiple treatment variables.
+As an example, consider a PLR with :math:`p_1` causal parameters of interest :math:`\theta_{0,1}, \ldots, \theta_{0,p_1}` associated with
+treatment variables :math:`D_1, \ldots, D_{p_1}`. Inference on multiple target coefficients can be performed by iteratively applying the DML inference procedure over the target variables of
+interest: each of the coefficients of interest, :math:`\theta_{0,j}`, with :math:`j \in \lbrace 1, \ldots, p_1 \rbrace`, solves a corresponding moment condition
+
+.. math::
+
+    \mathbb{E}[\psi_j(W; \theta_{0,j}, \eta_{0,j})] = 0.
+
+Analogously to the case with a single parameter of interest, the PLR model with multiple treatment variables includes two regression steps to achieve orthogonality.
+First, the main regression is given by
+
+.. math::
+
+    Y = D_j \theta_{0,j} + g_{0,j}([D_k, X]) + \zeta_j, \quad \mathbb{E}(\zeta_j | D, X) = 0,
+
+with :math:`[D_k, X]` being a matrix comprising the confounders, :math:`X`, and all remaining treatment variables
+:math:`D_k` with :math:`k \in \lbrace 1, \ldots, p_1 \rbrace \setminus j`, by default.
+Second, the relationship between the treatment variable :math:`D_j` and the remaining explanatory variables is determined by the equation
+
+.. math::
+
+    D_j = m_{0,j}([D_k, X]) + V_j, \quad \mathbb{E}(V_j | D_k, X) = 0.
+
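+To make the iterative procedure concrete, the following sketch estimates each :math:`\theta_{0,j}` by partialling out the remaining
+treatments and the confounders with a lasso learner and then running a residual-on-residual regression. It is a minimal illustration of
+the two regression steps above, not the :ref:`DoubleML <doubleml_package>` implementation: the simulated data and variable names are made
+up for this example and, for brevity, sample splitting and cross-fitting are omitted.
+
+.. code-block:: python
+
+    import numpy as np
+    from sklearn.linear_model import LassoCV
+
+    # Illustrative data: three treatment variables D and twenty confounders X
+    rng = np.random.default_rng(42)
+    n_obs, n_treat, n_conf = 500, 3, 20
+    D = rng.normal(size=(n_obs, n_treat))
+    X = rng.normal(size=(n_obs, n_conf))
+    y = D @ np.array([0.5, 1.0, 1.5]) + X[:, 0] + rng.normal(size=n_obs)
+
+    theta_hat = np.empty(n_treat)
+    for j in range(n_treat):
+        # controls for treatment j: all remaining treatments plus the confounders, [D_k, X]
+        controls = np.column_stack([np.delete(D, j, axis=1), X])
+        # partial out the controls from Y and from D_j (nuisance regressions)
+        res_y = y - LassoCV().fit(controls, y).predict(controls)
+        res_d = D[:, j] - LassoCV().fit(controls, D[:, j]).predict(controls)
+        # residual-on-residual regression solves the empirical moment condition for theta_j
+        theta_hat[j] = np.sum(res_d * res_y) / np.sum(res_d ** 2)
+
+    print(theta_hat)
+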
+For further details, we refer to Belloni et al. (2018). Simultaneous inference can be based on a multiplier bootstrap procedure introduced in Chernozhukov et al. (2013, 2014).
+Alternatively, traditional correction approaches, for example the Bonferroni correction, can be used to adjust p-values.
+
+The ``bootstrap()`` method provides an implementation of a multiplier bootstrap for double machine learning models.
+For :math:`b=1, \ldots, B`, weights :math:`\xi_{i, b}` are generated according to a normal (Gaussian), wild or
+exponential bootstrap.
+The number of bootstrap samples is provided as input ``n_rep_boot``, and for ``method`` one can choose ``'Bayes'``,
+``'normal'`` or ``'wild'``.
+Based on the estimates of the standard errors :math:`\hat{\sigma}_j`
+and :math:`\hat{J}_{0,j} = \mathbb{E}_N(\psi_{a,j}(W; \eta_{0,j}))`
+that are obtained from DML, we construct bootstrap coefficients
+:math:`\theta^{*,b}_j` and bootstrap t-statistics :math:`t^{*,b}_j`
+for :math:`j=1, \ldots, p_1`:
+
+.. math::
+
+    \theta^{*,b}_{j} &= \frac{1}{\sqrt{N} \hat{J}_{0,j}} \sum_{k=1}^{K} \sum_{i \in I_k} \xi_{i}^b \cdot \psi_j(W_i; \tilde{\theta}_{0,j}, \hat{\eta}_{0,j;k}),
+
+    t^{*,b}_{j} &= \frac{1}{\sqrt{N} \hat{J}_{0,j} \hat{\sigma}_{j}} \sum_{k=1}^{K} \sum_{i \in I_k} \xi_{i}^b \cdot \psi_j(W_i; \tilde{\theta}_{0,j}, \hat{\eta}_{0,j;k}).
+
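+The weights :math:`\xi_{i, b}` are mean-zero random multipliers. As a rough illustration of how such weights and the resulting bootstrap
+t-statistics could be computed, consider the sketch below; the helper names, the exact weight definitions and the array layout are
+assumptions made for this example and do not describe the package internals.
+
+.. code-block:: python
+
+    import numpy as np
+
+    def multiplier_weights(n_obs, n_rep_boot, method, rng):
+        # mean-zero multipliers xi_{i,b}; the exact definitions are illustrative
+        # choices in the spirit of the literature cited above
+        if method == 'normal':
+            return rng.standard_normal((n_rep_boot, n_obs))
+        if method == 'Bayes':
+            return rng.exponential(size=(n_rep_boot, n_obs)) - 1.0
+        if method == 'wild':
+            u = rng.standard_normal((n_rep_boot, n_obs))
+            v = rng.standard_normal((n_rep_boot, n_obs))
+            return u / np.sqrt(2) + (v ** 2 - 1) / 2
+        raise ValueError(f"unknown method {method}")
+
+    def bootstrap_t_stats(psi, j_hat, sigma_hat, n_rep_boot=500, method='normal', seed=0):
+        # psi: (n_obs, p_1) array of score values; j_hat, sigma_hat: (p_1,) arrays
+        n_obs = psi.shape[0]
+        xi = multiplier_weights(n_obs, n_rep_boot, method, np.random.default_rng(seed))
+        # t*_{b,j} = sum_i xi_{i,b} * psi_{ij} / (sqrt(N) * J_j * sigma_j)
+        return (xi @ psi) / (np.sqrt(n_obs) * j_hat * sigma_hat)
+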
+The output of the multiplier bootstrap can be used to determine the constant :math:`c_{1-\alpha}` that is required for the construction of a
+simultaneous :math:`(1-\alpha)` confidence band
+
+.. math::
+
+    \left[\tilde{\theta}_{0,j} \pm c_{1-\alpha} \cdot \hat{\sigma}_j / \sqrt{N} \right].
+
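+In terms of the bootstrap output, :math:`c_{1-\alpha}` is the empirical :math:`(1-\alpha)` quantile of the maximal absolute bootstrap
+t-statistic :math:`\max_{j} |t^{*,b}_{j}|`. A minimal sketch of how a joint band could be assembled from these quantities (again with
+illustrative names rather than the package API) is:
+
+.. code-block:: python
+
+    import numpy as np
+
+    def joint_confint(theta_hat, se_hat, boot_t_stats, alpha=0.05):
+        # boot_t_stats: (n_rep_boot, p_1) array of bootstrap t-statistics t*_{b,j}
+        # c_{1-alpha}: empirical (1 - alpha) quantile of max_j |t*_{b,j}|
+        max_abs_t = np.max(np.abs(boot_t_stats), axis=1)
+        crit = np.quantile(max_abs_t, 1 - alpha)
+        # se_hat is assumed to equal sigma_hat_j / sqrt(N) for each coefficient
+        return theta_hat - crit * se_hat, theta_hat + crit * se_hat
+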
+To demonstrate the bootstrap, we simulate data from a sparse partially linear regression model.
+Then we estimate the PLR model and perform the multiplier bootstrap.
+Joint confidence intervals based on the multiplier bootstrap are obtained by setting the option ``joint``
+when calling the method ``confint``.
+
+Moreover, p-values from a high-dimensional model can be adjusted for multiple hypotheses testing with
+the method ``p_adjust``. By default, :ref:`DoubleML <doubleml_package>` performs a version of the Romano-Wolf stepdown adjustment,
+which is based on the multiplier bootstrap. Alternatively, ``p_adjust`` allows users to apply traditional corrections
+via the option ``method``.
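+
+The stepdown idea can be sketched as follows: hypotheses are ordered by decreasing absolute t-statistic, each initial p-value is computed
+from the distribution of the maximal bootstrap t-statistic over the hypotheses not yet rejected, and monotonicity is enforced afterwards.
+The helper below is a simplified illustration under these assumptions, not the ``p_adjust`` implementation.
+
+.. code-block:: python
+
+    import numpy as np
+
+    def romano_wolf_p_adjust(t_stat, boot_t_stats):
+        # t_stat: (p_1,) observed t-statistics; boot_t_stats: (n_rep_boot, p_1)
+        p_adj = np.empty(len(t_stat))
+        order = np.argsort(-np.abs(t_stat))      # hypotheses sorted by decreasing |t|
+        running_max = 0.0
+        for step, j in enumerate(order):
+            remaining = order[step:]             # hypotheses not yet rejected
+            # maximal bootstrap |t*| over the remaining hypotheses
+            boot_max = np.max(np.abs(boot_t_stats[:, remaining]), axis=1)
+            p_init = np.mean(boot_max >= np.abs(t_stat[j]))
+            running_max = max(running_max, p_init)   # enforce monotone adjusted p-values
+            p_adj[j] = running_max
+        return p_adj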
+
+.. tabbed:: Python
+
+    .. ipython:: python
+
+        import doubleml as dml
+        import numpy as np
+        from sklearn.base import clone
+        from sklearn.linear_model import LassoCV
+
+        # Simulate data
+        np.random.seed(1234)
+        n_obs = 500
+        n_vars = 100
+        X = np.random.normal(size=(n_obs, n_vars))
+        theta = np.array([3., 3., 3.])
+        y = np.dot(X[:, :3], theta) + np.random.standard_normal(size=(n_obs,))
+
+        dml_data = dml.DoubleMLData.from_arrays(X[:, 10:], y, X[:, :10])
+
+        learner = LassoCV()
+        ml_l = clone(learner)
+        ml_m = clone(learner)
+        dml_plr = dml.DoubleMLPLR(dml_data, ml_l, ml_m)
+
+        print(dml_plr.fit().bootstrap().confint(joint=True))
+        print(dml_plr.p_adjust())
+        print(dml_plr.p_adjust(method='bonferroni'))
+
+.. tabbed:: R
+
+    .. jupyter-execute::
+
+        library(DoubleML)
+        library(mlr3)
+        library(mlr3learners)
+        library(data.table)
+        lgr::get_logger("mlr3")$set_threshold("warn")
+
+        set.seed(3141)
+        n_obs = 500
+        n_vars = 100
+        theta = rep(3, 3)
+        X = matrix(stats::rnorm(n_obs * n_vars), nrow = n_obs, ncol = n_vars)
+        y = X[, 1:3, drop = FALSE] %*% theta + stats::rnorm(n_obs)
+        dml_data = double_ml_data_from_matrix(X = X[, 11:n_vars], y = y, d = X[, 1:10])
+
+        learner = lrn("regr.cv_glmnet", s = "lambda.min")
+        ml_l = learner$clone()
+        ml_m = learner$clone()
+        dml_plr = DoubleMLPLR$new(dml_data, ml_l, ml_m)
+
+        dml_plr$fit()
+        dml_plr$bootstrap()
+        dml_plr$confint(joint = TRUE)
+        dml_plr$p_adjust()
+        dml_plr$p_adjust(method = "bonferroni")
+
+
+References
+++++++++++
+
+* Belloni, A., Chernozhukov, V., Chetverikov, D., Wei, Y. (2018), Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework. The Annals of Statistics, 46 (6B): 3643-75, `doi: 10.1214/17-AOS1671 <https://dx.doi.org/10.1214%2F17-AOS1671>`_.
+
+* Chernozhukov, V., Chetverikov, D., Kato, K. (2013), Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. The Annals of Statistics, 41 (6): 2786-2819, `doi: 10.1214/13-AOS1161 <https://dx.doi.org/10.1214/13-AOS1161>`_.
+
+* Chernozhukov, V., Chetverikov, D., Kato, K. (2014), Gaussian approximation of suprema of empirical processes. The Annals of Statistics, 42 (4): 1564-97, `doi: 10.1214/14-AOS1230 <https://dx.doi.org/10.1214/14-AOS1230>`_.