Commit 706c5dc

committed
fixups
1 parent b2aef95 commit 706c5dc

3 files changed: +45 -11 lines changed

doc/source/user_guide/sparse.rst

Lines changed: 42 additions & 10 deletions
@@ -36,12 +36,20 @@ large, mostly NA ``DataFrame``:
    df = pd.DataFrame(np.random.randn(10000, 4))
    df.iloc[:9998] = np.nan
    sdf = df.astype(pd.SparseDtype("float", np.nan))
-   sdf
+   sdf.head()
+   sdf.dtypes
    sdf.sparse.density
 
 As you can see, the density (% of values that have not been "compressed") is
 extremely low. This sparse object takes up much less memory on disk (pickled)
-and in the Python interpreter. Functionally, their behavior should be nearly
+and in the Python interpreter.
+
+.. ipython:: python
+
+   print('dense : {:0.2f} bytes'.format(df.memory_usage().sum() / 1e3))
+   print('sparse: {:0.2f} bytes'.format(sdf.memory_usage().sum() / 1e3))
+
+Functionally, their behavior should be nearly
 identical to their dense counterparts.
 
 .. _sparse.array:
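As a rough, standalone illustration of the memory comparison this hunk adds, the sketch below rebuilds the mostly-NaN frame and compares the two footprints. The fixed random seed and the variable names `dense_bytes` and `sparse_bytes` are assumptions made for the example and are not part of the commit. Dividing by 1e3 yields kilobytes, so the sketch labels its output KB.

```python
import numpy as np
import pandas as pd

# Build a mostly-NaN frame like the one in the guide (seeded for repeatability).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10000, 4)))
df.iloc[:9998] = np.nan

# Convert every column to a sparse float dtype with NaN as the fill value.
sdf = df.astype(pd.SparseDtype("float", np.nan))

# memory_usage() reports bytes per column (plus the index); sum them up.
dense_bytes = int(df.memory_usage().sum())
sparse_bytes = int(sdf.memory_usage().sum())

# Dividing by 1e3 converts bytes to kilobytes.
print("dense : {:0.2f} KB".format(dense_bytes / 1e3))
print("sparse: {:0.2f} KB".format(sparse_bytes / 1e3))
print("density:", sdf.sparse.density)
```

Only the two non-NaN rows per column are physically stored, which is why the sparse frame should come out far smaller than the dense one.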
@@ -73,6 +81,12 @@ The :attr:`SparseArray.dtype` property stores two pieces of information
 1. The dtype of the non-sparse values
 2. The scalar fill value
 
+
+.. ipython:: python
+
+   sparr.dtype
+
+
 A :class:`SparseDtype` may be constructed by passing each of these
 
 .. ipython:: python
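The two pieces of information listed in this hunk can be read back from a dtype directly. The following is a minimal sketch, assuming a recent pandas where the array class is exposed as `pd.arrays.SparseArray` (the commit itself uses the older `pd.SparseArray` spelling). The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# A small sparse array; NaN is the default fill value for float data.
sparr = pd.arrays.SparseArray([1.0, np.nan, np.nan, 2.0])

dtype = sparr.dtype
print(dtype)             # the combined SparseDtype
print(dtype.subtype)     # 1. the dtype of the non-sparse values
print(dtype.fill_value)  # 2. the scalar fill value

# The same dtype can be built explicitly from those two pieces.
explicit = pd.SparseDtype(np.dtype("float64"), np.nan)
```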
@@ -118,7 +132,7 @@ class itself for creating a Series with sparse data from a scipy COO matrix with
 .. versionadded:: 0.25.0
 
 A ``.sparse`` accessor has been added for :class:`DataFrame` as well.
-See :ref:`api.dataframe.sparse` for more.
+See :ref:`api.frame.sparse` for more.
 
 .. _sparse.calculation:
 
@@ -160,11 +174,6 @@ This section provides some guidance on migrating your code to the new style. As
 you can use the python warnings module to control warnings. But we recommend modifying
 your code, rather than ignoring the warning.
 
-**General Differences**
-
-In a SparseDataFrame, *all* columns were sparse. A :class:`DataFrame` can have a mixture of
-sparse and dense columns.
-
 **Construction**
 
 From an array-like, use the regular :class:`Series` or
@@ -195,7 +204,7 @@ From a SciPy sparse matrix, use :meth:`DataFrame.sparse.from_spmatrix`,
    from scipy import sparse
    mat = sparse.eye(3)
    df = pd.DataFrame.sparse.from_spmatrix(mat, columns=['A', 'B', 'C'])
-   df
+   df.dtypes
 
 **Conversion**
 
@@ -205,7 +214,6 @@ From sparse to dense, use the ``.sparse`` accessors
 
    df.sparse.to_dense()
    df.sparse.to_coo()
-   df['A']
 
 From dense to sparse, use :meth:`DataFrame.astype` with a :class:`SparseDtype`.
 
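The conversion directions named in this hunk can be sketched as a round trip. This is an illustrative example only: the column names, the sample values, and the choice of `0` as the fill value are assumptions, not taken from the commit.

```python
import pandas as pd

# A small dense frame that is mostly zeros.
dense = pd.DataFrame({"A": [0, 0, 1], "B": [0, 2, 0]})

# Dense to sparse: astype with a SparseDtype (here int64 values, fill value 0).
sparse = dense.astype(pd.SparseDtype("int64", 0))
print(sparse.dtypes)

# Sparse to dense: the .sparse accessor on the DataFrame.
roundtrip = sparse.sparse.to_dense()
print(roundtrip.dtypes)
```

Choosing the most common value as the fill value keeps the number of physically stored entries small.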
@@ -223,6 +231,30 @@ Sparse-specific properties, like ``density``, are available on the ``.sparse`` a
 
    df.sparse.density
 
+**General Differences**
+
+In a SparseDataFrame, *all* columns were sparse. A :class:`DataFrame` can have a mixture of
+sparse and dense columns. As a consequence, assigning new columns to a DataFrame with sparse
+values will not automatically convert the input to be sparse.
+
+.. code-block::
+
+   # Previous Way
+   df = pd.SparseDataFrame({"A": [0, 1]})
+   df['B'] = [0, 0] # implicitly becomes Sparse
+   df['B'].dtype
+   Sparse[int64, nan]
+
+Instead, you'll need to ensure that the values being assigned are sparse
+
+.. ipython:: python
+
+   df = pd.DataFrame({"A": pd.SparseArray([0, 1])})
+   df['B'] = [0, 0] # remains dense
+   df['B'].dtype
+   df['B'] = pd.SparseArray([0, 0])
+   df['B'].dtype
+
 The ``SparseDataFrame.default_kind`` and ``SparseDataFrame.default_fill_value`` attributes
 have no replacement.
 
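The column-assignment behavior described under **General Differences** can be checked with a short script. This sketch assumes a recent pandas, where the array class is spelled `pd.arrays.SparseArray` rather than the `pd.SparseArray` used in the commit. The names `b_dense` and `b_sparse` are introduced only for the example.

```python
import pandas as pd

# Start from a frame whose only column is sparse.
df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})

# Assigning a plain list creates an ordinary dense column.
df["B"] = [0, 0]
b_dense = df["B"].dtype

# Assigning a SparseArray is what makes the column sparse.
df["B"] = pd.arrays.SparseArray([0, 0])
b_sparse = df["B"].dtype

print(b_dense, b_sparse)
```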
pandas/core/sparse/frame.py

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@
 _shared_doc_kwargs = dict(klass='SparseDataFrame')
 depr_msg = """\
 SparseDataFrame is deprecated and will be removed in a future version.
-Use a DataFrame with sparse values instead.
+Use a regular DataFrame whose columns are SparseArrays instead.
 
 See http://pandas.pydata.org/pandas-docs/stable/\
 user_guide/sparse.html#migrating for more.

pandas/core/sparse/series.py

Lines changed: 2 additions & 0 deletions
@@ -36,6 +36,8 @@
 SparseSeries is deprecated and will be removed in a future version.
 Use a Series with sparse values instead.
 
+>>> series = pd.Series(pd.SparseArray(...))
+
 See http://pandas.pydata.org/pandas-docs/stable/\
 user_guide/sparse.html#migrating for more.
 """
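The replacement suggested by the updated deprecation message might look like the sketch below. The commit's `pd.SparseArray(...)` argument is left elided in the source, so the values used here are purely illustrative, and `pd.arrays.SparseArray` is the spelling assumed for recent pandas versions.

```python
import pandas as pd

# A regular Series backed by a SparseArray instead of a SparseSeries.
# For integer data the default fill value is 0.
series = pd.Series(pd.arrays.SparseArray([0, 0, 1, 0]))

print(series.dtype)
print(series.sparse.density)  # fraction of values that are physically stored

# Convert back to an ordinary dense Series when needed.
dense = series.sparse.to_dense()
```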
