
Sync Fork from Upstream Repo #35


Merged: 44 commits, merged Feb 6, 2020
Commits (44, showing changes from all commits)
06384b7
CLN: Unreachable branch in Loc._getitem_iterable (#31636)
jbrockmendel Feb 4, 2020
feee467
PERF: avoid is_bool_indexer check where possible (#31399)
jbrockmendel Feb 4, 2020
73ea6ca
DOC: Fix typo in Getting Started docs (#31642)
Feb 4, 2020
01582c4
REGR: Categorical with np.str_ categories (#31528)
jbrockmendel Feb 4, 2020
d28db65
BUG: Fixed IntervalArray[int].shift (#31502)
TomAugspurger Feb 4, 2020
3b2b0d3
PERF: Cache MultiIndex.levels (#31651)
TomAugspurger Feb 4, 2020
79633f9
whatsnew (#31665)
TomAugspurger Feb 4, 2020
c3e32d7
REGR: Fix TypeError in groupby min / max of period column (#31477)
dsaxton Feb 4, 2020
1996b17
BUG: Block.iget not wrapping timedelta64/datetime64 (#31666)
jbrockmendel Feb 5, 2020
d3b7e64
REGR: Fixed handling of Categorical in cython ops (#31668)
TomAugspurger Feb 5, 2020
678bfae
CLN: assorted cleanups in indexes/ (#31674)
jbrockmendel Feb 5, 2020
3a4c310
CLN: Remove deprecations (#31675)
topper-123 Feb 5, 2020
69fe3c0
CLN: misc tslibs (#31673)
jbrockmendel Feb 5, 2020
1d5d5a3
REF: make _convert_scalar_indexer require a scalar (#31676)
jbrockmendel Feb 5, 2020
a1e0752
CLN: MultiIndex.get_value is a hive of scum and villainy (#31662)
jbrockmendel Feb 5, 2020
531a430
Replaced .format with f- strings (#31660)
leandermaben Feb 5, 2020
8e47971
check first and last points' labels are correct (#31659)
MarcoGorelli Feb 5, 2020
be9ee6d
BUG: avoid specifying default coerce_timestamps in to_parquet (#31652)
jorisvandenbossche Feb 5, 2020
bacd48b
CLN Replace format in test_repr_info (#31639)
thomasjpfan Feb 5, 2020
f5409cb
REF: call _maybe_cast_indexer upfront, better exception messages (#31…
jbrockmendel Feb 5, 2020
d84f9eb
BUG: Series.xs boxing datetime64 incorrectly (#31630)
jbrockmendel Feb 5, 2020
a89f7fd
Add test for gh 31605 (#31621)
fjetter Feb 5, 2020
2862b3d
CLN: convert_list_indexer is always kind=loc (#31599)
jbrockmendel Feb 5, 2020
9bfdb0d
REF: implement tests/indexes/objects/ (#31597)
jbrockmendel Feb 5, 2020
42065cd
REF: parametrize indexing tests (#31592)
jbrockmendel Feb 5, 2020
33e86bf
REF: simplify DTI._parse_string_to_bounds (#31519)
jbrockmendel Feb 5, 2020
aa47971
REF: _convert_scalar_indexer up-front (#31655)
jbrockmendel Feb 5, 2020
236f7e6
REF: move convert_scalar out of cython (#31672)
jbrockmendel Feb 5, 2020
881d0b7
REGR: fix non-reduction apply with tz-aware objects (#31614)
jorisvandenbossche Feb 5, 2020
c6c86dd
DOC: fixup v1.0.1 whatsnew entries (#31686)
jorisvandenbossche Feb 5, 2020
2f70e41
TST: add test for regression in groupby with empty MultiIndex level (…
jorisvandenbossche Feb 5, 2020
2f9a446
BUG: read_csv used in file like object RawIOBase is not recognize enc…
paihu Feb 5, 2020
d73ded0
DOC: Update 1.0.1 release notes (#31699)
TomAugspurger Feb 5, 2020
2bf618f
REGR: Fixed AssertionError in groupby (#31616)
TomAugspurger Feb 5, 2020
f0b00f8
TYP: remove type:ignore from pandas/io/common.py (#31700)
simonjayhawkins Feb 5, 2020
e2a6f6b
WEB: Link from the website to the docs (#30891)
datapythonista Feb 5, 2020
0b6debf
DOC: fix contributors listing for v1.0.1 (#31704)
jorisvandenbossche Feb 5, 2020
cc4a3e9
DOC: add plotting backends in visualization.rst (#31066)
rushabh-v Feb 5, 2020
daf5bb4
added type annotation to JSONDtype.na_value (#31718)
SaturnFromTitan Feb 5, 2020
935b6f4
DOC Adds newline to dataframe melt (#31712)
thomasjpfan Feb 5, 2020
120f35f
DOC: Add 1.0.2 whatsnew (#31723)
alimcmaster1 Feb 6, 2020
a2a35a8
TST/CLN: dtype test_construct_from_string (#31727)
simonjayhawkins Feb 6, 2020
9ef92ee
Update travis-37.yaml (#31745)
alimcmaster1 Feb 6, 2020
808004a
CLN: inconsistent kwarg name (#31721)
jbrockmendel Feb 6, 2020
1 change: 0 additions & 1 deletion ci/deps/travis-37.yaml
@@ -2,7 +2,6 @@ name: pandas-dev
channels:
- defaults
- conda-forge
- c3i_test
dependencies:
- python=3.7.*

2 changes: 1 addition & 1 deletion doc/source/getting_started/basics.rst
@@ -1973,7 +1973,7 @@ Pandas has two ways to store strings.
1. ``object`` dtype, which can hold any Python object, including strings.
2. :class:`StringDtype`, which is dedicated to strings.

Generally, we recommend using :class:`StringDtype`. See :ref:`text.types` fore more.
Generally, we recommend using :class:`StringDtype`. See :ref:`text.types` for more.

Finally, arbitrary objects may be stored using the ``object`` dtype, but should
be avoided to the extent possible (for performance and interoperability with
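As a quick illustration of the two string-storage options described in this hunk (a sketch, assuming a stock pandas >= 1.0 install):

    import pandas as pd

    # object dtype: the historical default, can hold any Python object
    s_obj = pd.Series(["pandas", "strings"])

    # StringDtype: the dedicated string dtype, requested explicitly
    s_str = pd.Series(["pandas", "strings"], dtype="string")

    print(s_obj.dtype)  # object
    print(s_str.dtype)  # string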
43 changes: 43 additions & 0 deletions doc/source/user_guide/visualization.rst
@@ -1641,3 +1641,46 @@ when plotting a large number of points.
:suppress:

plt.close('all')

Plotting backends
-----------------

Starting in version 0.25, pandas can be extended with third-party plotting backends. The
main idea is to let users select a plotting backend other than the default one, which
is based on Matplotlib.

This can be done by passing 'backend.module' as the ``backend`` argument to the ``plot``
function. For example:

.. code-block:: python

>>> pd.Series([1, 2, 3]).plot(backend='backend.module')

Alternatively, you can set this option globally, so you don't need to specify
the keyword in each ``plot`` call. For example:

.. code-block:: python

>>> pd.set_option('plotting.backend', 'backend.module')
>>> pd.Series([1, 2, 3]).plot()

Or:

.. code-block:: python

>>> pd.options.plotting.backend = 'backend.module'
>>> pd.Series([1, 2, 3]).plot()

This would be more or less equivalent to:

.. code-block:: python

>>> import backend.module
>>> backend.module.plot(pd.Series([1, 2, 3]))

The backend module can then use other visualization tools (Bokeh, Altair, hvplot,...)
to generate the plots. Some libraries implementing a backend for pandas are listed
on the ecosystem :ref:`ecosystem.visualization` page.

The developers guide can be found at
https://dev.pandas.io/docs/development/extending.html#plotting-backends
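If needed, the default backend can be restored by setting the option back to
``matplotlib`` (a sketch, assuming Matplotlib is installed):

.. code-block:: python

    >>> pd.set_option('plotting.backend', 'matplotlib')
    >>> pd.Series([1, 2, 3]).plot()  # rendered with Matplotlib again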
1 change: 1 addition & 0 deletions doc/source/whatsnew/index.rst
@@ -26,6 +26,7 @@ Version 1.0

v1.0.0
v1.0.1
v1.0.2

Version 0.25
------------
18 changes: 16 additions & 2 deletions doc/source/whatsnew/v1.0.1.rst
@@ -1,7 +1,7 @@
.. _whatsnew_101:

What's new in 1.0.1 (??)
------------------------
What's new in 1.0.1 (February 5, 2020)
--------------------------------------

These are the changes in pandas 1.0.1. See :ref:`release` for a full changelog
including other versions of pandas.
@@ -19,13 +19,22 @@ Fixed regressions
- Fixed regression when indexing a ``Series`` or ``DataFrame`` indexed by ``DatetimeIndex`` with a slice containg a :class:`datetime.date` (:issue:`31501`)
- Fixed regression in ``DataFrame.__setitem__`` raising an ``AttributeError`` with a :class:`MultiIndex` and a non-monotonic indexer (:issue:`31449`)
- Fixed regression in :class:`Series` multiplication when multiplying a numeric :class:`Series` with >10000 elements with a timedelta-like scalar (:issue:`31457`)
- Fixed regression in ``.groupby().agg()`` raising an ``AssertionError`` for some reductions like ``min`` on object-dtype columns (:issue:`31522`)
- Fixed regression in ``.groupby()`` aggregations with categorical dtype using Cythonized reduction functions (e.g. ``first``) (:issue:`31450`)
- Fixed regression in :meth:`GroupBy.apply` if called with a function which returned a non-pandas non-scalar object (e.g. a list or numpy array) (:issue:`31441`)
- Fixed regression in :meth:`DataFrame.groupby` whereby taking the minimum or maximum of a column with period dtype would raise a ``TypeError``. (:issue:`31471`)
- Fixed regression in :meth:`DataFrame.groupby` with an empty DataFrame grouping by a level of a MultiIndex (:issue:`31670`).
- Fixed regression in :meth:`DataFrame.apply` with object dtype and non-reducing function (:issue:`31505`)
- Fixed regression in :meth:`to_datetime` when parsing non-nanosecond resolution datetimes (:issue:`31491`)
- Fixed regression in :meth:`~DataFrame.to_csv` where specifying an ``na_rep`` might truncate the values written (:issue:`31447`)
- Fixed regression in :class:`Categorical` construction with ``numpy.str_`` categories (:issue:`31499`)
- Fixed regression in :meth:`DataFrame.loc` and :meth:`DataFrame.iloc` when selecting a row containing a single ``datetime64`` or ``timedelta64`` column (:issue:`31649`)
- Fixed regression where setting :attr:`pd.options.display.max_colwidth` was not accepting a negative integer. In addition, this behavior has been deprecated in favor of using ``None`` (:issue:`31532`)
- Fixed regression: return-type warning in ``objToJSON.c`` (:issue:`31463`)
- Fixed regression in :meth:`qcut` when passed a nullable integer. (:issue:`31389`)
- Fixed regression in assigning to a :class:`Series` using a nullable integer dtype (:issue:`31446`)
- Fixed performance regression when indexing a ``DataFrame`` or ``Series`` with a :class:`MultiIndex` for the index using a list of labels (:issue:`31648`)
- Fixed regression in :meth:`read_csv` not recognizing the ``encoding`` option when reading from a file-like object derived from ``RawIOBase`` (:issue:`31575`)

.. ---------------------------------------------------------------------------

@@ -56,10 +65,15 @@ Bug fixes

- Plotting tz-aware timeseries no longer gives UserWarning (:issue:`31205`)

**Interval**

- Bug in :meth:`Series.shift` with ``interval`` dtype raising a ``TypeError`` when shifting an interval array of integers or datetimes (:issue:`34195`)

.. ---------------------------------------------------------------------------

.. _whatsnew_101.contributors:

Contributors
~~~~~~~~~~~~

.. contributors:: v1.0.0..v1.0.1|HEAD
38 changes: 38 additions & 0 deletions doc/source/whatsnew/v1.0.2.rst
@@ -0,0 +1,38 @@
.. _whatsnew_102:

What's new in 1.0.2 (February ??, 2020)
---------------------------------------

These are the changes in pandas 1.0.2. See :ref:`release` for a full changelog
including other versions of pandas.

{{ header }}

.. ---------------------------------------------------------------------------

.. _whatsnew_102.regressions:

Fixed regressions
~~~~~~~~~~~~~~~~~

-
-

.. ---------------------------------------------------------------------------

.. _whatsnew_102.bug_fixes:

Bug fixes
~~~~~~~~~

-
-

.. ---------------------------------------------------------------------------

.. _whatsnew_102.contributors:

Contributors
~~~~~~~~~~~~

.. contributors:: v1.0.1..v1.0.2|HEAD
5 changes: 4 additions & 1 deletion doc/source/whatsnew/v1.1.0.rst
@@ -156,6 +156,7 @@ Indexing
- Bug in :meth:`Series.at` and :meth:`DataFrame.at` not matching ``.loc`` behavior when looking up an integer in a :class:`Float64Index` (:issue:`31329`)
- Bug in :meth:`PeriodIndex.is_monotonic` incorrectly returning ``True`` when containing leading ``NaT`` entries (:issue:`31437`)
- Bug in :meth:`DatetimeIndex.get_loc` raising ``KeyError`` with converted-integer key instead of the user-passed key (:issue:`31425`)
- Bug in :meth:`Series.xs` incorrectly returning ``Timestamp`` instead of ``datetime64`` in some object-dtype cases (:issue:`31630`)

Missing
^^^^^^^
@@ -180,7 +181,9 @@ I/O
- Bug in :meth:`read_json` where integer overflow was occuring when json contains big number strings. (:issue:`30320`)
- `read_csv` will now raise a ``ValueError`` when the arguments `header` and `prefix` both are not `None`. (:issue:`27394`)
- Bug in :meth:`DataFrame.to_json` was raising ``NotFoundError`` when ``path_or_buf`` was an S3 URI (:issue:`28375`)
-
- Bug in :meth:`DataFrame.to_parquet` overwriting pyarrow's default for
``coerce_timestamps``; following pyarrow's default allows writing nanosecond
timestamps with ``version="2.0"`` (:issue:`31652`).

Plotting
^^^^^^^^
10 changes: 10 additions & 0 deletions doc/sphinxext/announce.py
@@ -57,6 +57,16 @@ def get_authors(revision_range):
pat = "^.*\\t(.*)$"
lst_release, cur_release = [r.strip() for r in revision_range.split("..")]

if "|" in cur_release:
# e.g. v1.0.1|HEAD
maybe_tag, head = cur_release.split("|")
assert head == "HEAD"
if maybe_tag in this_repo.tags:
cur_release = maybe_tag
else:
cur_release = head
revision_range = f"{lst_release}..{cur_release}"

# authors, in current release and previous to current release.
cur = set(re.findall(pat, this_repo.git.shortlog("-s", revision_range), re.M))
pre = set(re.findall(pat, this_repo.git.shortlog("-s", lst_release), re.M))
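A standalone sketch of the revision-range handling added above; the tag set here is assumed for illustration (the real code consults ``this_repo.tags``), and no git repository is required:

    revision_range = "v1.0.0..v1.0.1|HEAD"
    lst_release, cur_release = [r.strip() for r in revision_range.split("..")]

    if "|" in cur_release:
        # e.g. "v1.0.1|HEAD": prefer the tag if it already exists, else use HEAD
        maybe_tag, head = cur_release.split("|")
        existing_tags = {"v1.0.0"}  # assumed for the example
        cur_release = maybe_tag if maybe_tag in existing_tags else head

    print(f"{lst_release}..{cur_release}")  # -> v1.0.0..HEAD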
8 changes: 7 additions & 1 deletion doc/sphinxext/contributors.py
@@ -6,7 +6,13 @@

This will be replaced with a message indicating the number of
code contributors and commits, and then list each contributor
individually.
individually. For development versions (before a tag is available)
use::

.. contributors:: v0.23.0..v0.23.1|HEAD

While the v0.23.1 tag does not exist, that will use the HEAD of the
branch as the end of the revision range.
"""
from announce import build_components
from docutils import nodes
8 changes: 6 additions & 2 deletions pandas/_libs/hashtable_class_helper.pxi.in
@@ -670,7 +670,9 @@ cdef class StringHashTable(HashTable):
val = values[i]

if isinstance(val, str):
v = get_c_string(val)
# GH#31499 if we have a np.str_ get_c_string wont recognize
# it as a str, even though isinstance does.
v = get_c_string(<str>val)
else:
v = get_c_string(self.na_string_sentinel)
vecs[i] = v
@@ -703,7 +705,9 @@
val = values[i]

if isinstance(val, str):
v = get_c_string(val)
# GH#31499 if we have a np.str_ get_c_string wont recognize
# it as a str, even though isinstance does.
v = get_c_string(<str>val)
else:
v = get_c_string(self.na_string_sentinel)
vecs[i] = v
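The underlying issue (GH 31499, the ``Categorical``/``numpy.str_`` regression noted in the 1.0.1 release notes) can be seen from pure Python: ``numpy.str_`` passes the ``isinstance`` check but is not exactly ``str``, which is what the typed ``<str>`` cast above now accounts for. A minimal illustration (not the Cython code itself):

    import numpy as np

    val = np.str_("a")
    print(isinstance(val, str))  # True: np.str_ subclasses str
    print(type(val) is str)      # False: it is not exactly the built-in str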
55 changes: 0 additions & 55 deletions pandas/_libs/index.pyx
@@ -535,61 +535,6 @@ cdef class PeriodEngine(Int64Engine):
return super(PeriodEngine, self).get_indexer_non_unique(ordinal_array)


cpdef convert_scalar(ndarray arr, object value):
# we don't turn integers
# into datetimes/timedeltas

# we don't turn bools into int/float/complex

if arr.descr.type_num == NPY_DATETIME:
if util.is_array(value):
pass
elif isinstance(value, (datetime, np.datetime64, date)):
return Timestamp(value).to_datetime64()
elif util.is_timedelta64_object(value):
# exclude np.timedelta64("NaT") from value != value below
pass
elif value is None or value != value:
return np.datetime64("NaT", "ns")
raise ValueError("cannot set a Timestamp with a non-timestamp "
f"{type(value).__name__}")

elif arr.descr.type_num == NPY_TIMEDELTA:
if util.is_array(value):
pass
elif isinstance(value, timedelta) or util.is_timedelta64_object(value):
value = Timedelta(value)
if value is NaT:
return np.timedelta64("NaT", "ns")
return value.to_timedelta64()
elif util.is_datetime64_object(value):
# exclude np.datetime64("NaT") which would otherwise be picked up
# by the `value != value check below
pass
elif value is None or value != value:
return np.timedelta64("NaT", "ns")
raise ValueError("cannot set a Timedelta with a non-timedelta "
f"{type(value).__name__}")

else:
validate_numeric_casting(arr.dtype, value)

return value


cpdef validate_numeric_casting(dtype, object value):
# Note: we can't annotate dtype as cnp.dtype because that cases dtype.type
# to integer
if issubclass(dtype.type, (np.integer, np.bool_)):
if util.is_float_object(value) and value != value:
raise ValueError("Cannot assign nan to integer series")

if (issubclass(dtype.type, (np.integer, np.floating, np.complex)) and
not issubclass(dtype.type, np.bool_)):
if util.is_bool_object(value):
raise ValueError("Cannot assign bool to float/integer series")


cdef class BaseMultiIndexCodesEngine:
"""
Base class for MultiIndexUIntEngine and MultiIndexPyIntEngine, which
2 changes: 1 addition & 1 deletion pandas/_libs/parsers.pyx
@@ -638,7 +638,7 @@ cdef class TextReader:
raise ValueError(f'Unrecognized compression type: '
f'{self.compression}')

if self.encoding and isinstance(source, io.BufferedIOBase):
if self.encoding and isinstance(source, (io.BufferedIOBase, io.RawIOBase)):
source = io.TextIOWrapper(
source, self.encoding.decode('utf-8'), newline='')

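For context, a hedged sketch of the scenario covered by this fix (GH 31575): reading through an unbuffered, ``RawIOBase``-derived file object with an explicit ``encoding``. The file name and contents are illustrative only:

    import io
    import pandas as pd

    raw = io.FileIO("data.csv", mode="r")    # FileIO derives from io.RawIOBase
    df = pd.read_csv(raw, encoding="utf-8")  # the encoding option is now honoured
    print(df.head())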
3 changes: 2 additions & 1 deletion pandas/_libs/reduction.pyx
@@ -114,7 +114,8 @@ cdef class Reducer:
if self.typ is not None:
# In this case, we also have self.index
name = labels[i]
cached_typ = self.typ(chunk, index=self.index, name=name)
cached_typ = self.typ(
chunk, index=self.index, name=name, dtype=arr.dtype)

# use the cached_typ if possible
if cached_typ is not None:
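The 1.0.1 notes above list a fixed ``AssertionError`` in ``.groupby().agg()`` for reductions such as ``min`` on object-dtype columns (GH 31522). A hedged reproduction of that scenario (whether this exact hunk is the fix is not stated here; the data is made up):

    import pandas as pd

    df = pd.DataFrame({"key": [1, 1, 2], "val": ["x", "y", "z"]})  # object-dtype column
    # Reductions such as "min" previously raised an AssertionError here
    print(df.groupby("key")["val"].agg("min"))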
18 changes: 9 additions & 9 deletions pandas/_libs/tslibs/resolution.pyx
@@ -27,7 +27,7 @@ cdef:

# ----------------------------------------------------------------------

cpdef resolution(int64_t[:] stamps, tz=None):
cpdef resolution(const int64_t[:] stamps, tz=None):
cdef:
Py_ssize_t i, n = len(stamps)
npy_datetimestruct dts
@@ -38,7 +38,7 @@ cpdef resolution(int64_t[:] stamps, tz=None):
return _reso_local(stamps, tz)


cdef _reso_local(int64_t[:] stamps, object tz):
cdef _reso_local(const int64_t[:] stamps, object tz):
cdef:
Py_ssize_t i, n = len(stamps)
int reso = RESO_DAY, curr_reso
@@ -106,7 +106,7 @@ cdef inline int _reso_stamp(npy_datetimestruct *dts):
return RESO_DAY


def get_freq_group(freq):
def get_freq_group(freq) -> int:
"""
Return frequency code group of given frequency str or offset.

@@ -189,7 +189,7 @@ class Resolution:
_freq_reso_map = {v: k for k, v in _reso_freq_map.items()}

@classmethod
def get_str(cls, reso):
def get_str(cls, reso: int) -> str:
"""
Return resolution str against resolution code.

@@ -201,7 +201,7 @@ return cls._reso_str_map.get(reso, 'day')
return cls._reso_str_map.get(reso, 'day')

@classmethod
def get_reso(cls, resostr):
def get_reso(cls, resostr: str) -> int:
"""
Return resolution str against resolution code.

@@ -216,7 +216,7 @@ return cls._str_reso_map.get(resostr, cls.RESO_DAY)
return cls._str_reso_map.get(resostr, cls.RESO_DAY)

@classmethod
def get_freq_group(cls, resostr):
def get_freq_group(cls, resostr: str) -> int:
"""
Return frequency str against resolution str.

@@ -228,7 +228,7 @@ return get_freq_group(cls.get_freq(resostr))
return get_freq_group(cls.get_freq(resostr))

@classmethod
def get_freq(cls, resostr):
def get_freq(cls, resostr: str) -> str:
"""
Return frequency str against resolution str.

@@ -240,7 +240,7 @@ return cls._reso_freq_map[resostr]
return cls._reso_freq_map[resostr]

@classmethod
def get_str_from_freq(cls, freq):
def get_str_from_freq(cls, freq: str) -> str:
"""
Return resolution str against frequency str.

@@ -252,7 +252,7 @@ return cls._freq_reso_map.get(freq, 'day')
return cls._freq_reso_map.get(freq, 'day')

@classmethod
def get_reso_from_freq(cls, freq):
def get_reso_from_freq(cls, freq: str) -> int:
"""
Return resolution code against frequency str.

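For orientation, a hedged sketch of how these internal helpers map between resolution strings, codes, and frequency strings; the exact return values are assumptions based on the mappings in this module, and the API is internal:

    from pandas._libs.tslibs.resolution import Resolution

    print(Resolution.get_freq("second"))      # expected "S"
    print(Resolution.get_str_from_freq("H"))  # expected "hour"
    print(Resolution.get_reso("second"))      # an integer resolution code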