Add summary to GATE and CATE examples

SvenKlaassen · SvenKlaassen · commit bc82d0001ef9 · 2022-12-08T11:01:14.000+01:00
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1 @@
+*idea
diff --git a/doc/examples/py_double_ml_cate.ipynb b/doc/examples/py_double_ml_cate.ipynb
diff --git a/doc/examples/py_double_ml_gate.ipynb b/doc/examples/py_double_ml_gate.ipynb
@@ -8,10 +8,7 @@
     "In this simple example, we illustrate how the [DoubleML](https://docs.doubleml.org/stable/index.html) package can be used to estimate group average treatment effects."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -22,20 +19,14 @@
     "The data will be generated with a simple data generating process to enable us to know the true group effects."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
    "cell_type": "code",
    "execution_count": 1,
    "metadata": {
-    "collapsed": true,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": true
    },
    "outputs": [],
    "source": [
@@ -51,10 +42,7 @@
     "For simplicity, the treatment effect within each group is generated to be constant, such that it corresponds to the group average treatment effect."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -72,10 +60,7 @@
     "    return te"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -100,10 +85,7 @@
     " \\end{cases}.$$"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -169,10 +151,7 @@
     "    return data, covariates"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -181,10 +160,7 @@
     "We will consider a quite small number of covariates to ensure fast calculation."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -206,10 +182,7 @@
     "                                 x_cols=covariates)"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -219,10 +192,7 @@
     "The first step is to fit a [DoubleML IRM Model](https://docs.doubleml.org/stable/guide/models.html#interactive-regression-model-irm) to the data."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -238,7 +208,7 @@
     },
     {
      "data": {
-      "text/plain": "<doubleml.double_ml_irm.DoubleMLIRM at 0x25f62a87430>"
+      "text/plain": "<doubleml.double_ml_irm.DoubleMLIRM at 0x2ca8ff66a40>"
      },
      "execution_count": 5,
      "metadata": {},
@@ -262,10 +232,7 @@
     "dml_irm.fit(store_predictions=True)"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -275,10 +242,7 @@
     "Next, we can specify the groups as [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with boolean columns."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -292,10 +256,7 @@
     "             columns=['Group 1', 'Group 2', 'Group 3'])"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -316,10 +277,7 @@
     "groups.head()"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -328,10 +286,7 @@
     "To calculate GATEs just call the ``gate()`` method and supply the DataFrame with the group definitions and the ``level`` (with default of ``0.95``)."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -354,10 +309,7 @@
     "print(gate.confint(level=0.95))"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -366,10 +318,7 @@
     "The confidence intervals above are point-wise, but joint confidence intervals can be obtained by setting the option ``joint`` and providing a number of bootstrap repetitions ``n_rep_boot``."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -392,10 +341,7 @@
     "print(ci)"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -404,10 +350,7 @@
     "Finally, let us plot the estimates together with the true effect within each group."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -444,12 +387,9 @@
    ],
    "metadata": {
     "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    },
     "tags": [
-        "nbsphinx-thumbnail"
-       ]
+     "nbsphinx-thumbnail"
+    ]
    }
   },
   {
@@ -458,10 +398,7 @@
     "It is also possible to supply disjoint groups as a single vector (still as a data frame). Note the slightly different name."
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -491,10 +428,7 @@
     "groups.head()"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   },
   {
@@ -518,10 +452,71 @@
     "print(ci)"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "The coefficients of the best linear predictor can be seen via the summary (the values can be accessed through the underlying model ``.blp_model``)."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 32,
+   "outputs": [
+    {
+     "data": {
+      "text/plain": "             coef   std err          t         P>|t|    [0.025    0.975]\nGroup_1  3.126939  0.149758  20.879973  5.621183e-70  2.832703  3.421176\nGroup_2  1.234062  0.142020   8.689369  5.300034e-17  0.955029  1.513095\nGroup_3  0.026815  0.117683   0.227859  8.198497e-01 -0.204402  0.258032",
+      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>coef</th>\n      <th>std err</th>\n      <th>t</th>\n      <th>P&gt;|t|</th>\n      <th>[0.025</th>\n      <th>0.975]</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>Group_1</th>\n      <td>3.126939</td>\n      <td>0.149758</td>\n      <td>20.879973</td>\n      <td>5.621183e-70</td>\n      <td>2.832703</td>\n      <td>3.421176</td>\n    </tr>\n    <tr>\n      <th>Group_2</th>\n      <td>1.234062</td>\n      <td>0.142020</td>\n      <td>8.689369</td>\n      <td>5.300034e-17</td>\n      <td>0.955029</td>\n      <td>1.513095</td>\n    </tr>\n    <tr>\n      <th>Group_3</th>\n      <td>0.026815</td>\n      <td>0.117683</td>\n      <td>0.227859</td>\n      <td>8.198497e-01</td>\n      <td>-0.204402</td>\n      <td>0.258032</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
+     },
+     "execution_count": 32,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "gate.summary"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "Note that the confidence intervals are slightly narrower, since they are not based on White's heteroskedasticity-robust standard errors."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "================== DoubleMLBLP Object ==================\n",
+      "\n",
+      "------------------ Fit summary ------------------\n",
+      "             coef   std err          t         P>|t|    [0.025    0.975]\n",
+      "Group_1  3.126939  0.149758  20.879973  5.621183e-70  2.832703  3.421176\n",
+      "Group_2  1.234062  0.142020   8.689369  5.300034e-17  0.955029  1.513095\n",
+      "Group_3  0.026815  0.117683   0.227859  8.198497e-01 -0.204402  0.258032\n"
+     ]
     }
+   ],
+   "source": [
+    "print(gate)"
+   ],
+   "metadata": {
+    "collapsed": false
    }
   },
   {
@@ -552,10 +547,7 @@
     "_ =  plt.ylabel('Effect and 95%-CI')"
    ],
    "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
+    "collapsed": false
    }
   }
  ],
@@ -580,4 +572,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 0
-}
+}