Commit c960e68

Delete KernelExplainerWrapper and remove importing LogitLink and IdentityLink (#1603)
1 parent 0fc7201 commit c960e68

2 files changed

+13
-69
lines changed

autopilot/model-explainability/explaining_customer_churn_model.ipynb

Lines changed: 13 additions & 12 deletions
@@ -8,7 +8,7 @@
 "\n",
 "Kernel `Python 3 (Data Science)` works well with this notebook.\n",
 "\n",
-"_This notebook was created and tested on an ml.m5.large notebook instance._\n",
+"_This notebook was created and tested on an ml.m5.xlarge notebook instance._\n",
 "\n",
 "## Table of Contents\n",
 "\n",
@@ -101,9 +101,8 @@
 "source": [
 "import shap\n",
 "\n",
-"from kernel_explainer_wrapper import KernelExplainerWrapper\n",
+"from shap import KernelExplainer\n",
 "from shap import sample\n",
-"from shap.common import LogitLink, IdentityLink\n",
 "from scipy.special import expit\n",
 "\n",
 "# Initialize plugin to make plots interactive.\n",
@@ -235,7 +234,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"churn_data = pd.read_csv('./Data sets/churn.txt')\n",
+"churn_data = pd.read_csv('../Data sets/churn.txt')\n",
 "data_without_target = churn_data.drop(columns=['Churn?'])\n",
 "\n",
 "background_data = sample(data_without_target, 50)"
@@ -252,7 +251,10 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Next, we create the `KernelExplainer`. Note that since it's a black box explainer, `KernelExplainer` only requires a handle to the predict (or predict_proba) function and does not require any other information about the model. For classification it is recommended to derive feature importance scores in the log-odds space since additivity is a more natural assumption there thus we use `LogitLink`. For regression `IdentityLink` should be used."
+"Next, we create the `KernelExplainer`. Note that since it's a black box explainer, `KernelExplainer` only requires a handle to the\n",
+"predict (or predict_proba) function and does not require any other information about the model. For classification it is recommended to\n",
+"derive feature importance scores in the log-odds space since additivity is a more natural assumption there thus we use `logit`. For\n",
+"regression `identity` should be used."
 ]
 },
 {
@@ -263,17 +265,16 @@
 "source": [
 "# Derive link function \n",
 "problem_type = automl_job.describe_auto_ml_job(job_name=automl_job_name)['ResolvedAttributes']['ProblemType'] \n",
-"link_fn = IdentityLink if problem_type == 'Regression' else LogitLink \n",
+"link = \"identity\" if problem_type == 'Regression' else \"logit\"\n",
 "\n",
-"# the handle to predict_proba is passed to KernelExplainerWrapper since KernelSHAP requires the class probability\n",
-"explainer = KernelExplainerWrapper(automl_estimator.predict_proba, background_data, link=link_fn())"
+"# the handle to predict_proba is passed to KernelExplainer since KernelSHAP requires the class probability\n",
+"explainer = KernelExplainer(automl_estimator.predict_proba, background_data, link=link)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Currently, `shap.KernelExplainer` only supports numeric data. A version of SHAP that supports text will become available soon. A workaround is provided by our wrapper `KernelExplainerWrapper`. Once a new version of SHAP is released, `shap.KernelExplainer` should be used instead of `KernelExplainerWrapper`.\n",
 "\n",
 "By analyzing the background data `KernelExplainer` provides us with `explainer.expected_value` which is the model prediction with all features missing. Considering a customer for which we have no data at all (i.e. all features are missing) this should theoretically be the model prediction."
 ]
@@ -326,7 +327,7 @@
 "outputs": [],
 "source": [
 "# Since shap_values are provided in the log-odds space, we convert them back to the probability space by using LogitLink\n",
-"shap.force_plot(explainer.expected_value, shap_values, x, link=link_fn())"
+"shap.force_plot(explainer.expected_value, shap_values, x, link=link)"
 ]
 },
 {
@@ -348,7 +349,7 @@
 "source": [
 "with ManagedEndpoint(ep_name) as mep:\n",
 "    shap_values = explainer.shap_values(x, nsamples='auto', l1_reg='num_features(5)')\n",
-"shap.force_plot(explainer.expected_value, shap_values, x, link=link_fn())"
+"shap.force_plot(explainer.expected_value, shap_values, x, link=link)"
 ]
 },
 {
@@ -396,7 +397,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"shap.force_plot(explainer.expected_value, shap_values, X, link=link_fn())"
+"shap.force_plot(explainer.expected_value, shap_values, X, link=link)"
 ]
 },
 {

autopilot/model-explainability/kernel_explainer_wrapper.py

Lines changed: 0 additions & 57 deletions
This file was deleted.
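
Note on the link change: the notebook now passes the strings "identity" or "logit" instead of `IdentityLink()` / `LogitLink()` instances. The following is a minimal standard-library sketch of what the logit link implies for the numbers, not the notebook's or SHAP's actual code. It assumes the base value is the link applied to the mean model output over the background sample. The helper `choose_link`, the problem type string "BinaryClassification", the background probabilities, and the attribution values are all hypothetical and chosen only for illustration.

```python
import math


def choose_link(problem_type: str) -> str:
    """Same rule as the notebook: identity for regression, logit otherwise."""
    return "identity" if problem_type == "Regression" else "logit"


def logit(p: float) -> float:
    """Map a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def expit(z: float) -> float:
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical churn probabilities predicted for three background rows.
background_probs = [0.10, 0.20, 0.30]

# Any value other than "Regression" selects the logit link.
link = choose_link("BinaryClassification")

# Assumption: base value = link(mean prediction over the background data).
mean_prob = sum(background_probs) / len(background_probs)
base_value = logit(mean_prob) if link == "logit" else mean_prob

# Hypothetical per-feature attributions, expressed in log-odds units.
attributions = {"feature_a": 0.9, "feature_b": 0.6, "feature_c": -0.3}

# Additivity holds in the link space: base value plus all attributions.
explained_log_odds = base_value + sum(attributions.values())

# Converting back to a probability is what the force plot's link option does.
explained_prob = expit(explained_log_odds)

print(f"link={link} base={base_value:.3f} prob={explained_prob:.3f}")
```

Because the attributions are summed in log-odds space and only then mapped through `expit`, the explained probability always stays between 0 and 1, which is the reason the notebook prefers the logit link for classification.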
