Commit 98ac799

Add docs for dataframe output feature
1 parent 500bcd6 commit 98ac799

1 file changed: +23 -0 lines changed

README.rst

Lines changed: 23 additions & 0 deletions
@@ -102,6 +102,29 @@ Now that the transformation is trained, we confirm that it works on new data::
     >>> np.round(mapper.transform(sample), 2)
     array([[ 1. , 0. , 0. , 1.04]])

+
+Outputting a dataframe
+**********************
+
+By default, the output of the dataframe mapper is a numpy array, because most sklearn estimators expect a numpy array as input. If, however, we want the output of the mapper to be a dataframe, we can do so using the parameter ``df_out`` when creating the mapper::
+
+    >>> mapper_df = DataFrameMapper([
+    ...     ('pet', sklearn.preprocessing.LabelBinarizer()),
+    ...     (['children'], sklearn.preprocessing.StandardScaler())
+    ... ], df_out=True)
+    >>> np.round(mapper_df.fit_transform(data.copy()), 2)
+       pet_cat  pet_dog  pet_fish  children
+    0      1.0      0.0       0.0      0.21
+    1      0.0      1.0       0.0      1.88
+    2      0.0      1.0       0.0     -0.63
+    3      0.0      0.0       1.0     -0.63
+    4      1.0      0.0       0.0     -1.46
+    5      0.0      1.0       0.0     -0.63
+    6      1.0      0.0       0.0      1.04
+    7      0.0      0.0       1.0      0.21
+
+Note that this does not work together with the ``default=True`` or ``sparse=True`` arguments to the mapper.
+
 Transform Multiple Columns
 **************************

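
To make the dataframe output in the diff concrete, here is a rough sketch that builds an equivalent frame by hand with plain pandas and scikit-learn, without calling ``DataFrameMapper``. It is not the library's implementation. The helper name ``manual_df_out`` is hypothetical, the ``pet`` values are read off the output table in the diff, the ``children`` values are an assumption chosen to reproduce the scaled column shown there, and the column naming rule (source column name plus class label) is inferred from that same table.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelBinarizer, StandardScaler

# Assumed sample data: 'pet' is inferred from the output table in the diff;
# 'children' is an assumption picked to match the scaled values shown there.
data = pd.DataFrame({
    'pet': ['cat', 'dog', 'dog', 'fish', 'cat', 'dog', 'cat', 'fish'],
    'children': [4., 6, 3, 3, 2, 3, 5, 4],
})


def manual_df_out(df):
    """Hypothetical helper: apply the two transformers and label the columns."""
    binarizer = LabelBinarizer()
    pet = binarizer.fit_transform(df['pet'])  # one 0/1 column per class
    # Inferred naming rule: '<column>_<class label>' for each binarized class.
    pet_cols = ['pet_%s' % label for label in binarizer.classes_]

    # StandardScaler expects 2-D input, hence the double brackets.
    children = StandardScaler().fit_transform(df[['children']])

    # Stack the transformed blocks side by side; the result is a float array.
    values = np.hstack([pet, children])
    return pd.DataFrame(values, columns=pet_cols + ['children'], index=df.index)


result = manual_df_out(data).round(2)
print(result)
```

Keeping the input index and attaching column names is what makes the result convenient to inspect or join back to the original data, which a bare numpy array does not allow.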

0 commit comments
