
Commit b2aef95

Merge remote-tracking branch 'upstream/master' into depr-sparse-depr
2 parents b043243 + 1263e1a commit b2aef95

20 files changed: 116 additions and 88 deletions

.travis.yml

Lines changed: 8 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -86,6 +86,14 @@ install:
8686
- ci/submit_cython_cache.sh
8787
- echo "install done"
8888

89+
before_script:
90+
# display server (for clipboard functionality) needs to be started here,
91+
# does not work if done in install:setup_env.sh (GH-26103)
92+
- export DISPLAY=":99.0"
93+
- echo "sh -e /etc/init.d/xvfb start"
94+
- sh -e /etc/init.d/xvfb start
95+
- sleep 3
96+
8997
script:
9098
- echo "script start"
9199
- source activate pandas-dev

ci/azure/windows.yml

Lines changed: 4 additions & 1 deletion
@@ -19,10 +19,13 @@ jobs:
1919
steps:
2020
- powershell: Write-Host "##vso[task.prependpath]$env:CONDA\Scripts"
2121
displayName: Add conda to PATH
22-
- script: conda env create --file ci\\deps\\azure-windows-$(CONDA_PY).yaml
22+
- script: conda update -q -n base conda
23+
displayName: Update conda
24+
- script: conda env create -q --file ci\\deps\\azure-windows-$(CONDA_PY).yaml
2325
displayName: Create anaconda environment
2426
- script: |
2527
call activate pandas-dev
28+
call conda list
2629
ci\\incremental\\build.cmd
2730
displayName: 'Build'
2831
- script: |

ci/deps/azure-windows-37.yaml

Lines changed: 2 additions & 0 deletions
@@ -1,9 +1,11 @@
11
name: pandas-dev
22
channels:
33
- defaults
4+
- conda-forge
45
dependencies:
56
- beautifulsoup4
67
- bottleneck
8+
- gcsfs
79
- html5lib
810
- jinja2
911
- lxml

ci/setup_env.sh

Lines changed: 0 additions & 6 deletions
@@ -118,16 +118,10 @@ echo "conda list"
118118
conda list
119119

120120
# Install DB for Linux
121-
export DISPLAY=":99."
122121
if [ ${TRAVIS_OS_NAME} == "linux" ]; then
123122
echo "installing dbs"
124123
mysql -e 'create database pandas_nosetest;'
125124
psql -c 'create database pandas_nosetest;' -U postgres
126-
127-
echo
128-
echo "sh -e /etc/init.d/xvfb start"
129-
sh -e /etc/init.d/xvfb start
130-
sleep 3
131125
else
132126
echo "not using dbs on non-linux"
133127
fi

doc/source/user_guide/sparse.rst

Lines changed: 11 additions & 27 deletions
@@ -15,8 +15,7 @@ Sparse data structures
1515
Pandas provides data structures for efficiently storing sparse data.
1616
These are not necessarily sparse in the typical "mostly 0" sense. Rather, you can view these
1717
objects as being "compressed" where any data matching a specific value (``NaN`` / missing value, though any value
18-
can be chosen, including 0) is omitted. A special ``SparseIndex`` object tracks where data has been
19-
"sparsified". For example,
18+
can be chosen, including 0) is omitted. The compressed values are not actually stored in the array.
2019

2120
.. ipython:: python
2221
@@ -121,21 +120,13 @@ class itself for creating a Series with sparse data from a scipy COO matrix with
121120
A ``.sparse`` accessor has been added for :class:`DataFrame` as well.
122121
See :ref:`api.dataframe.sparse` for more.
123122

124-
SparseIndex objects
125-
-------------------
126-
127-
Two kinds of ``SparseIndex`` are implemented, ``block`` and ``integer``. We
128-
recommend using ``block`` as it's more memory efficient. The ``integer`` format
129-
keeps an arrays of all of the locations where the data are not equal to the
130-
fill value. The ``block`` format tracks only the locations and sizes of blocks
131-
of data.
132-
133123
.. _sparse.calculation:
134124

135125
Sparse Calculation
136126
------------------
137127

138-
You can apply NumPy *ufuncs* to ``SparseArray`` and get a ``SparseArray`` as a result.
128+
You can apply NumPy `ufuncs <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>`_
129+
to ``SparseArray`` and get a ``SparseArray`` as a result.
139130

140131
.. ipython:: python
141132
@@ -165,21 +156,14 @@ sparse values instead.
165156
**There's no performance or memory penalty to using a Series or DataFrame with sparse values,
166157
rather than a SparseSeries or SparseDataFrame**.
167158

168-
This section provides some guidance on migrating your code to the new style. As a reminder, you can
169-
use the python warnings module to control warnings. If you wish to ignore the warnings,
170-
171-
.. code-block:: python
172-
173-
>>> import warnings
159+
This section provides some guidance on migrating your code to the new style. As a reminder,
160+
you can use the Python ``warnings`` module to control warnings. But we recommend modifying
161+
your code, rather than ignoring the warning.
174162

175-
>>> warnings.filterwarnings('ignore', 'Sparse', FutureWarning)
176-
>>> pd.SparseSeries() # No warning message
177-
Series([], dtype: Sparse[float64, nan])
178-
BlockIndex
179-
Block locations: array([], dtype=int32)
180-
Block lengths: array([], dtype=int32)
163+
**General Differences**
181164

182-
But we recommend modifying your code, rather than ignoring the warning.
165+
In a SparseDataFrame, *all* columns were sparse. A :class:`DataFrame` can have a mixture of
166+
sparse and dense columns.
183167

184168
**Construction**
185169

@@ -188,7 +172,7 @@ From an array-like, use the regular :class:`Series` or
188172

189173
.. code-block:: python
190174
191-
# Old way
175+
# Previous way
192176
>>> pd.SparseDataFrame({"A": [0, 1]})
193177
194178
.. ipython:: python
@@ -200,7 +184,7 @@ From a SciPy sparse matrix, use :meth:`DataFrame.sparse.from_spmatrix`,
200184

201185
.. code-block:: python
202186
203-
# Old way
187+
# Previous way
204188
>>> from scipy import sparse
205189
>>> mat = sparse.eye(3)
206190
>>> df = pd.SparseDataFrame(mat, columns=['A', 'B', 'C'])
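
The user guide text in this file describes sparse objects as "compressed": values equal to a chosen fill value are omitted rather than stored. A minimal pure-Python sketch of that idea follows. The helper names `compress` and `expand` are hypothetical and are not pandas APIs, and the sketch assumes a fill value of 0 (a ``NaN`` fill value would need an `isnan`-style check instead of `!=`).

```python
def compress(values, fill_value=0):
    """Keep only entries that differ from fill_value, plus their positions."""
    positions = [i for i, v in enumerate(values) if v != fill_value]
    stored = [values[i] for i in positions]
    return positions, stored


def expand(positions, stored, length, fill_value=0):
    """Rebuild the dense list from the compressed form."""
    dense = [fill_value] * length
    for i, v in zip(positions, stored):
        dense[i] = v
    return dense


dense = [0, 0, 1, 2, 0, 0, 0, 3]
positions, stored = compress(dense)

# Only three of the eight values are kept; the rest are implied by fill_value.
print(positions, stored)
print(expand(positions, stored, len(dense)) == dense)
```

The saving comes from storing `len(stored)` values plus their positions instead of `len(dense)` values, which only pays off when most entries equal the fill value.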

doc/source/whatsnew/v0.11.0.rst

Lines changed: 1 addition & 6 deletions
@@ -238,14 +238,9 @@ Enhancements
238238

239239
- support ``read_hdf/to_hdf`` API similar to ``read_csv/to_csv``
240240

241-
.. ipython:: python
242-
:suppress:
243-
244-
from pandas.compat import lrange
245-
246241
.. ipython:: python
247242
248-
df = pd.DataFrame({'A': lrange(5), 'B': lrange(5)})
243+
df = pd.DataFrame({'A': range(5), 'B': range(5)})
249244
df.to_hdf('store.h5', 'table', append=True)
250245
pd.read_hdf('store.h5', 'table', where=['index > 2'])
251246

doc/source/whatsnew/v0.12.0.rst

Lines changed: 1 addition & 6 deletions
@@ -83,13 +83,8 @@ API changes
8383
``iloc`` API to be *purely* positional based.
8484

8585
.. ipython:: python
86-
:suppress:
8786
88-
from pandas.compat import lrange
89-
90-
.. ipython:: python
91-
92-
df = pd.DataFrame(lrange(5), list('ABCDE'), columns=['a'])
87+
df = pd.DataFrame(range(5), index=list('ABCDE'), columns=['a'])
9388
mask = (df.a % 2 == 0)
9489
mask
9590

doc/source/whatsnew/v0.25.0.rst

Lines changed: 28 additions & 2 deletions
@@ -255,9 +255,35 @@ Other API Changes
255255
Deprecations
256256
~~~~~~~~~~~~
257257

258+
Sparse Subclasses
259+
^^^^^^^^^^^^^^^^^
260+
261+
The ``SparseSeries`` and ``SparseDataFrame`` subclasses are deprecated. Their functionality is better provided
262+
by a ``Series`` or ``DataFrame`` with sparse values.
263+
264+
**Previous Way**
265+
266+
.. ipython:: python
267+
:okwarning:
268+
269+
df = pd.SparseDataFrame({"A": [0, 0, 1, 2]})
270+
df.dtypes
271+
272+
**New Way**
273+
274+
.. ipython:: python
275+
276+
df = pd.DataFrame({"A": pd.SparseArray([0, 0, 1, 2])})
277+
df.dtypes
278+
279+
The memory usage of the two approaches is identical. See :ref:`sparse.migration` for more (:issue:`19239`).
280+
281+
Other Deprecations
282+
^^^^^^^^^^^^^^^^^^
283+
258284
- Deprecated the ``units=M`` (months) and ``units=Y`` (year) parameters for ``units`` of :func:`pandas.to_timedelta`, :func:`pandas.Timedelta` and :func:`pandas.TimedeltaIndex` (:issue:`16344`)
259285
- The functions :func:`pandas.to_datetime` and :func:`pandas.to_timedelta` have deprecated the ``box`` keyword. Instead, use :meth:`to_numpy` or :meth:`Timestamp.to_datetime64` or :meth:`Timedelta.to_timedelta64`. (:issue:`24416`)
260-
- The ``SparseSeries`` and ``SparseDataFrame`` subclasses are deprecated. Use a ``DataFrame`` or ``Series`` with sparse values instead. See :ref:`sparse.migration` for more (:issue:`19239`).
286+
- The :meth:`DataFrame.compound` and :meth:`Series.compound` methods are deprecated and will be removed in a future version.
261287

262288
.. _whatsnew_0250.prior_deprecations:
263289

@@ -374,7 +400,7 @@ Indexing
374400
- Improved exception message when calling :meth:`DataFrame.iloc` with a list of non-numeric objects (:issue:`25753`).
375401
- Bug in :meth:`DataFrame.loc` and :meth:`Series.loc` where ``KeyError`` was not raised for a ``MultiIndex`` when the key was less than or equal to the number of levels in the :class:`MultiIndex` (:issue:`14885`).
376402
- Bug in which :meth:`DataFrame.append` produced an erroneous warning indicating that a ``KeyError`` will be thrown in the future when the data to be appended contains new columns (:issue:`22252`).
377-
-
403+
- Bug in which :meth:`DataFrame.to_csv` caused a segfault for a reindexed data frame, when the indices were single-level :class:`MultiIndex` (:issue:`26303`).
378404

379405

380406
Missing

mypy.ini

Lines changed: 0 additions & 6 deletions
@@ -23,15 +23,9 @@ ignore_errors=True
2323
[mypy-pandas.core.internals.blocks]
2424
ignore_errors=True
2525

26-
[mypy-pandas.core.ops]
27-
ignore_errors=True
28-
2926
[mypy-pandas.core.panel]
3027
ignore_errors=True
3128

32-
[mypy-pandas.core.resample]
33-
ignore_errors=True
34-
3529
[mypy-pandas.core.reshape.merge]
3630
ignore_errors=True
3731

pandas/_libs/tslibs/period.pyx

Lines changed: 2 additions & 6 deletions
@@ -2208,10 +2208,6 @@ cdef class _Period:
22082208
def now(cls, freq=None):
22092209
return Period(datetime.now(), freq=freq)
22102210

2211-
# HACK IT UP AND YOU BETTER FIX IT SOON
2212-
def __str__(self):
2213-
return self.__unicode__()
2214-
22152211
@property
22162212
def freqstr(self):
22172213
return self.freq.freqstr
@@ -2221,9 +2217,9 @@ cdef class _Period:
22212217
formatted = period_format(self.ordinal, base)
22222218
return "Period('%s', '%s')" % (formatted, self.freqstr)
22232219

2224-
def __unicode__(self):
2220+
def __str__(self):
22252221
"""
2226-
Return a unicode string representation for a particular DataFrame
2222+
Return a string representation for a particular Period
22272223
"""
22282224
base, mult = get_freq_code(self.freq)
22292225
formatted = period_format(self.ordinal, base)

pandas/core/dtypes/dtypes.py

Lines changed: 7 additions & 10 deletions
@@ -126,14 +126,11 @@ class PandasExtensionDtype(ExtensionDtype):
126126
isnative = 0
127127
_cache = {} # type: Dict[str_type, 'PandasExtensionDtype']
128128

129-
def __unicode__(self):
130-
return self.name
131-
132129
def __str__(self):
133130
"""
134131
Return a string representation for a particular Object
135132
"""
136-
return self.__unicode__()
133+
return self.name
137134

138135
def __bytes__(self):
139136
"""
@@ -142,7 +139,7 @@ def __bytes__(self):
142139
from pandas._config import get_option
143140

144141
encoding = get_option("display.encoding")
145-
return self.__unicode__().encode(encoding, 'replace')
142+
return str(self).encode(encoding, 'replace')
146143

147144
def __repr__(self):
148145
"""
@@ -707,7 +704,7 @@ def construct_from_string(cls, string):
707704

708705
raise TypeError("Could not construct DatetimeTZDtype")
709706

710-
def __unicode__(self):
707+
def __str__(self):
711708
return "datetime64[{unit}, {tz}]".format(unit=self.unit, tz=self.tz)
712709

713710
@property
@@ -837,12 +834,12 @@ def construct_from_string(cls, string):
837834
pass
838835
raise TypeError("could not construct PeriodDtype")
839836

840-
def __unicode__(self):
841-
return str(self.name)
837+
def __str__(self):
838+
return self.name
842839

843840
@property
844841
def name(self):
845-
return str("period[{freq}]".format(freq=self.freq.freqstr))
842+
return "period[{freq}]".format(freq=self.freq.freqstr)
846843

847844
@property
848845
def na_value(self):
@@ -1007,7 +1004,7 @@ def construct_from_string(cls, string):
10071004
def type(self):
10081005
return Interval
10091006

1010-
def __unicode__(self):
1007+
def __str__(self):
10111008
if self.subtype is None:
10121009
return "interval"
10131010
return "interval[{subtype}]".format(subtype=self.subtype)
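
The dtype changes in this file collapse the Python 2 era `__unicode__` / `__str__` pair into a single `__str__`, with `__bytes__` encoding the result of `str(self)`. The sketch below shows that pattern in isolation. `ExampleDtype` and `ExamplePeriodDtype` are hypothetical classes, and the `encoding` attribute is an assumed stand-in for the `get_option("display.encoding")` lookup used in pandas.

```python
class ExampleDtype:
    """Hypothetical dtype-like class; not part of pandas."""

    def __init__(self, name, encoding="utf-8"):
        self.name = name
        # Stand-in for get_option("display.encoding") in the real code.
        self.encoding = encoding

    def __str__(self):
        # The single source of truth for the text representation.
        return self.name

    def __bytes__(self):
        # Derive bytes from str(self) so subclasses only override __str__.
        return str(self).encode(self.encoding, 'replace')


class ExamplePeriodDtype(ExampleDtype):
    """Hypothetical subclass that builds its name from a frequency string."""

    def __init__(self, freqstr):
        super().__init__("period[{freq}]".format(freq=freqstr))


dtype = ExamplePeriodDtype("D")
print(str(dtype))
print(bytes(dtype))
```

Because `__bytes__` goes through `str(self)`, a subclass that overrides only `__str__` gets a consistent bytes representation without further changes.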

pandas/core/generic.py

Lines changed: 5 additions & 2 deletions
@@ -10177,11 +10177,14 @@ def mad(self, axis=None, skipna=None, level=None):
1017710177
nanops.nanstd)
1017810178

1017910179
@Substitution(desc="Return the compound percentage of the values for "
10180-
"the requested axis.", name1=name, name2=name2,
10181-
axis_descr=axis_descr,
10180+
"the requested axis.\n\n.. deprecated:: 0.25.0",
10181+
name1=name, name2=name2, axis_descr=axis_descr,
1018210182
min_count='', see_also='', examples='')
1018310183
@Appender(_num_doc)
1018410184
def compound(self, axis=None, skipna=None, level=None):
10185+
msg = ("The 'compound' method is deprecated and will be "
10186+
"removed in a future version.")
10187+
warnings.warn(msg, FutureWarning, stacklevel=2)
1018510188
if skipna is None:
1018610189
skipna = True
1018710190
return (1 + self).prod(axis=axis, skipna=skipna, level=level) - 1
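
The deprecation pattern used for `compound` can be sketched outside pandas as follows. `Returns` is a hypothetical stand-in class, not pandas code, and its loop replaces the `(1 + self).prod(...) - 1` expression for plain Python lists. Note that adjacent string literals are concatenated with no separator, so the first literal needs a trailing space for the message to read correctly.

```python
import warnings


class Returns:
    """Hypothetical stand-in for a Series-like container (not pandas code)."""

    def __init__(self, values):
        self.values = list(values)

    def compound(self):
        # Adjacent literals are joined as-is, so keep the space after "be".
        msg = ("The 'compound' method is deprecated and will be "
               "removed in a future version.")
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(msg, FutureWarning, stacklevel=2)
        total = 1.0
        for v in self.values:
            total *= 1 + v
        return total - 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = Returns([0.1, 0.2]).compound()

message = str(caught[0].message)
print(message)
```

Recording warnings with `warnings.catch_warnings(record=True)` is also a convenient way to assert in a test that the deprecation is actually emitted.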

pandas/core/groupby/grouper.py

Lines changed: 3 additions & 1 deletion
@@ -3,6 +3,7 @@
33
split-apply-combine paradigm.
44
"""
55

6+
from typing import Tuple
67
import warnings
78

89
import numpy as np
@@ -84,7 +85,8 @@ class Grouper:
8485
8586
>>> df.groupby(Grouper(level='date', freq='60s', axis=1))
8687
"""
87-
_attributes = ('key', 'level', 'freq', 'axis', 'sort')
88+
_attributes = ('key', 'level', 'freq', 'axis',
89+
'sort') # type: Tuple[str, ...]
8890

8991
def __new__(cls, *args, **kwargs):
9092
if kwargs.get('freq') is not None:
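
The commit does not state why `_attributes` is annotated as `Tuple[str, ...]`. One plausible reason, offered here as an assumption, is that a type checker would otherwise infer a fixed-length tuple type for the base class, which a subclass with a longer tuple would then conflict with. The class names below are hypothetical.

```python
from typing import Tuple


class BaseGrouper:
    """Hypothetical base class mirroring the annotated attribute."""
    # Variable-length tuple type: any number of str entries is acceptable.
    _attributes = ('key', 'level', 'freq', 'axis',
                   'sort')  # type: Tuple[str, ...]


class TimeLikeGrouper(BaseGrouper):
    """Hypothetical subclass that extends the attribute tuple."""
    # Longer tuple than the base class; still compatible with Tuple[str, ...].
    _attributes = BaseGrouper._attributes + ('closed', 'label')


print(len(BaseGrouper._attributes), len(TimeLikeGrouper._attributes))
```

The type comment has no runtime effect; it only informs static checkers such as mypy.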

pandas/core/indexes/multi.py

Lines changed: 3 additions & 1 deletion
@@ -946,7 +946,9 @@ def _format_native_types(self, na_rep='nan', **kwargs):
946946
new_codes.append(level_codes)
947947

948948
if len(new_levels) == 1:
949-
return Index(new_levels[0])._format_native_types()
949+
# a single-level multi-index
950+
return Index(new_levels[0].take(
951+
new_codes[0]))._format_native_types()
950952
else:
951953
# reconstruct the multi-index
952954
mi = MultiIndex(levels=new_levels, codes=new_codes,
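
The single-level fix above hinges on the difference between a level (the unique labels) and the codes (one integer per row pointing into the level). Formatting the level alone yields one entry per unique label rather than one per row. The sketch below uses a hypothetical `expand_level` helper as a pure-Python stand-in for the positional lookup that `take` performs; it is not the pandas implementation.

```python
def expand_level(level_values, codes):
    """Return one value per row by looking up each code in the level."""
    return [level_values[code] for code in codes]


level = ['a', 'b', 'c']   # unique labels stored once
codes = [2, 0, 0, 1]      # one code per row, indexing into `level`

rows = expand_level(level, codes)

# The level has 3 entries but the index has 4 rows, in a different order.
print(rows)
```

Any consumer that expects one formatted value per row needs the expanded form. That is consistent with the mismatch described in the linked `to_csv` bug (GH 26303), though the sketch does not reproduce the crash itself.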

pandas/core/ops.py

Lines changed: 3 additions & 4 deletions
@@ -6,6 +6,7 @@
66
import datetime
77
import operator
88
import textwrap
9+
from typing import Dict, Optional
910
import warnings
1011

1112
import numpy as np
@@ -625,15 +626,13 @@ def _get_op_name(op, special):
625626
'desc': 'Greater than or equal to',
626627
'reverse': None,
627628
'series_examples': None}
628-
}
629+
} # type: Dict[str, Dict[str, Optional[str]]]
629630

630631
_op_names = list(_op_descriptions.keys())
631632
for key in _op_names:
632-
_op_descriptions[key]['reversed'] = False
633633
reverse_op = _op_descriptions[key]['reverse']
634634
if reverse_op is not None:
635635
_op_descriptions[reverse_op] = _op_descriptions[key].copy()
636-
_op_descriptions[reverse_op]['reversed'] = True
637636
_op_descriptions[reverse_op]['reverse'] = key
638637

639638
_flex_doc_SERIES = """
@@ -1010,7 +1009,7 @@ def _make_flex_doc(op_name, typ):
10101009
op_name = op_name.replace('__', '')
10111010
op_desc = _op_descriptions[op_name]
10121011

1013-
if op_desc['reversed']:
1012+
if op_name.startswith('r'):
10141013
equiv = 'other ' + op_desc['op'] + ' ' + typ
10151014
else:
10161015
equiv = typ + ' ' + op_desc['op'] + ' other'
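
With the boolean `'reversed'` key removed, every value in the table fits `Optional[str]`, and reversed operators are detected from their name instead. The sketch below reproduces that logic on a trimmed-down, hypothetical table. `op_descriptions` and `equiv_expression` are illustrative names, and the approach assumes that no forward operator name starts with `'r'`.

```python
from typing import Dict, Optional

# Hypothetical, trimmed-down table; the real one in pandas has more entries.
op_descriptions = {
    'add': {'op': '+', 'desc': 'Addition', 'reverse': 'radd'},
    'sub': {'op': '-', 'desc': 'Subtraction', 'reverse': 'rsub'},
    'eq': {'op': '==', 'desc': 'Equal to', 'reverse': None},
}  # type: Dict[str, Dict[str, Optional[str]]]

# Snapshot the keys first, because the loop adds new entries to the dict.
for key in list(op_descriptions.keys()):
    reverse_op = op_descriptions[key]['reverse']
    if reverse_op is not None:
        op_descriptions[reverse_op] = op_descriptions[key].copy()
        op_descriptions[reverse_op]['reverse'] = key


def equiv_expression(op_name, typ):
    """Build the 'equivalent to' expression used in a flex-method docstring."""
    op_name = op_name.replace('__', '')
    op_desc = op_descriptions[op_name]
    if op_name.startswith('r'):
        # Reversed operator: the other operand comes first.
        return 'other ' + op_desc['op'] + ' ' + typ
    return typ + ' ' + op_desc['op'] + ' other'


print(equiv_expression('__radd__', 'series'))
print(equiv_expression('sub', 'dataframe'))
```

The name-based check is simpler to type, but it would misclassify any future forward operator whose name happened to begin with `'r'`.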
