Commit cb33d70

Address comments in aws#1776
1 parent 0231d76 commit cb33d70

2 files changed: 18 additions, 4 deletions
doc/v2.rst

Lines changed: 7 additions & 0 deletions
@@ -207,6 +207,8 @@ The follow serializer/deserializer classes have been renamed and/or moved:
 | ``sagemaker.predictor._JsonDeserializer`` | ``sagemaker.deserializers.JSONDeserializer`` |
 +--------------------------------------------------------+-------------------------------------------------------+

+``sagemaker.serializers.LibSVMSerializer`` has been added in v2.0.
+
 ``distributions``
 ~~~~~~~~~~~~~~~~~

@@ -269,6 +271,11 @@ TensorFlow Serving Predictor
 ``sagemaker.tensorflow.serving.Predictor`` has been renamed to :class:`sagemaker.tensorflow.model.TensorFlowPredictor`.
 (For the previous implementation of that class, see `Deprecate Legacy TensorFlow <#deprecate-legacy-tensorflow>`_).

+XGBoost Predictor
+~~~~~~~~~~~~~~~~~
+
+The default serializer of ``sagemaker.xgboost.model.XGBoostPredictor`` has been changed from ``NumpySerializer`` to ``LibSVMSerializer``.
+

 Airflow
 -------

src/sagemaker/serializers.py

Lines changed: 11 additions & 4 deletions
@@ -54,7 +54,7 @@ def CONTENT_TYPE(self):


 class CSVSerializer(BaseSerializer):
-    """Searilize data of various formats to a CSV-formatted string."""
+    """Serialize data of various formats to a CSV-formatted string."""

     CONTENT_TYPE = "text/csv"

@@ -102,7 +102,7 @@ def _serialize_row(self, data):
             csv_writer.writerow(data)
             return csv_buffer.getvalue().rstrip("\r\n")

-        raise ValueError("Unable to handle input format: ", type(data))
+        raise ValueError("Unable to handle input format: %s" % type(data))

     def _is_sequence_like(self, data):
         """Returns true if obj is iterable and subscriptable."""
@@ -244,7 +244,14 @@ def serialize(self, data):


 class LibSVMSerializer(BaseSerializer):
-    """Searilize data of various formats to a LibSVM-formatted string."""
+    """Serialize data of various formats to a LibSVM-formatted string.
+
+    The data must already be in LIBSVM file format:
+    <label> <index1>:<value1> <index2>:<value2> ...
+
+    It is suitable for sparse datasets since it does not store zero-valued
+    features.
+    """

     CONTENT_TYPE = "text/libsvm"
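
Note on the docstring above: it requires input that is already in LIBSVM format, `<label> <index1>:<value1> ...`, with zero-valued features omitted. The following is a minimal sketch of producing such a line from a dense row using only the standard library. The helper name `to_libsvm_line` is hypothetical and not part of the SageMaker SDK, and the 1-based feature indices are an assumption based on the common LIBSVM convention.

```python
def to_libsvm_line(label, features):
    """Format one dense feature row as a LIBSVM line, skipping zero values.

    Hypothetical helper for illustration; not part of the SageMaker SDK.
    Feature indices are 1-based here (assumed, common LIBSVM convention).
    """
    pairs = [
        "%d:%s" % (index, value)
        for index, value in enumerate(features, start=1)
        if value != 0  # zero-valued features are not stored
    ]
    return " ".join([str(label)] + pairs)


line = to_libsvm_line(1, [0.5, 0, 0, 2.0])
print(line)  # prints "1 1:0.5 4:2.0"
```

A string built this way matches the shape the docstring describes; whether a given endpoint accepts it depends on the model container.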

@@ -264,4 +271,4 @@ def serialize(self, data):
         if hasattr(data, "read"):
             return data.read()

-        raise ValueError("Unable to handle input format: ", type(data))
+        raise ValueError("Unable to handle input format: %s" % type(data))
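
Note on the `ValueError` change: passing the type as a second positional argument stores it in the exception's `args` tuple, so the rendered message is the tuple's repr instead of a formatted sentence. The sketch below illustrates the difference with a stand-in value; the variable names are illustrative only and do not come from the SDK.

```python
data = 42  # stand-in for an unsupported input

# Old form: two positional arguments, rendered as a tuple.
try:
    raise ValueError("Unable to handle input format: ", type(data))
except ValueError as error:
    old_message = str(error)

# New form: the type is interpolated into a single message string.
try:
    raise ValueError("Unable to handle input format: %s" % type(data))
except ValueError as error:
    new_message = str(error)

print(old_message)  # ('Unable to handle input format: ', <class 'int'>)
print(new_message)  # Unable to handle input format: <class 'int'>
```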
