@@ -12,7 +12,7 @@ SLEP018: Pandas Output for Transformers with set_output
Abstract
--------

- This SLEP proposes a ``set_output`` method to configure the output container of
+ This SLEP proposes a ``set_output`` method to configure the output data container of
scikit-learn transformers.

Detailed description
@@ -32,9 +32,10 @@ The index of the output DataFrame must match the index of the input. If the
transformer does not support ``transform="pandas"``, then it must raise a
``ValueError`` stating that it does not support the feature.

- For this SLEP, ``set_output`` will only configure the output for dense data. If
- the transformer returns sparse data, then ``transform`` will raise a
- ``ValueError`` if ``set_output(transform="pandas")``.
+ This SLEP's only focus is dense data for ``set_output``. If a transformer returns
+ sparse data, e.g. ``OneHotEncoder(sparse=True)``, then ``transform`` will raise a
+ ``ValueError`` if ``set_output(transform="pandas")``. Dealing with sparse output
+ might be the scope of another future SLEP.

For a pipeline, calling ``set_output`` on the pipeline will configure all steps
in the pipeline::
@@ -44,6 +45,9 @@ in the pipeline::

   # X_trans_df is a pandas DataFrame
   X_trans_df = num_preprocessor.fit_transform(X_df)
+
+    # X_trans_df is again a pandas DataFrame
+    X_trans_df = num_preprocessor[0].transform(X_df)

Meta-estimators that support ``set_output`` are required to configure all inner
transformers by calling ``set_output``. If an inner transformer does not define
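The propagation rule for meta-estimators can be sketched with a toy example that uses only the standard library. Everything in it is hypothetical: ``Table`` stands in for a pandas DataFrame, ``ToyScaler`` and ``ToyPipeline`` are not scikit-learn classes, and ``transform="table"`` plays the role of ``transform="pandas"``. It is a minimal illustration of the idea under these assumptions, not the implementation proposed in the SLEP.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical stand-in for a pandas DataFrame."""
    columns: list
    index: list
    rows: list


class ToyScaler:
    """Hypothetical transformer: multiplies every value by ``factor``."""

    def __init__(self, factor):
        self.factor = factor
        self._output = "default"

    def set_output(self, *, transform=None):
        # None leaves the current configuration unchanged.
        if transform is not None:
            self._output = transform
        return self

    def transform(self, X):
        rows = X.rows if isinstance(X, Table) else X
        out = [[value * self.factor for value in row] for row in rows]
        if self._output == "table" and isinstance(X, Table):
            # Reuse the input's column names and index for the output.
            return Table(X.columns, X.index, out)
        return out


class ToyPipeline:
    """Hypothetical meta-estimator that chains transformers."""

    def __init__(self, *steps):
        self.steps = list(steps)

    def set_output(self, *, transform=None):
        # Forward the configuration to every inner transformer.
        for step in self.steps:
            step.set_output(transform=transform)
        return self

    def __getitem__(self, i):
        return self.steps[i]

    def transform(self, X):
        for step in self.steps:
            X = step.transform(X)
        return X


X = Table(columns=["a", "b"], index=[10, 11], rows=[[1.0, 2.0], [3.0, 4.0]])
pipe = ToyPipeline(ToyScaler(2.0), ToyScaler(0.5)).set_output(transform="table")

X_out = pipe.transform(X)      # Table with the same columns and index as X
X_step = pipe[0].transform(X)  # a single step is configured as well
```

Because the pipeline forwards the setting to each step, indexing into it and calling ``transform`` on a single step yields the same container type as calling ``transform`` on the whole pipeline, which mirrors the ``num_preprocessor[0].transform(X_df)`` line added in this hunk.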
@@ -52,7 +56,7 @@ transformers by calling ``set_output``. If an inner transformer does not define
Global Configuration
....................

- This SLEP proposes a global configuration flag that sets the output for all
+ For ease of use, this SLEP proposes a global configuration flag that sets the output for all
transformers::

   import sklearn
@@ -64,7 +68,7 @@ determines the output container.
Implementation
--------------

- The implementation of this SLEP is in :pr:`23734`.
+ A possible implementation of this SLEP is worked out in :pr:`23734`.

Backward compatibility
----------------------
@@ -99,7 +103,7 @@ A list of issues discussing Pandas output are: `#14315

Future Extensions
-----------------
-
+ For information only!
Sparse Data
...........
