Commit c200f65

Merge pull request #67 from pandas-dev/master
Sync Fork from Upstream Repo
2 parents fed8144 + b512ed5 commit c200f65

102 files changed: +1041 -891 lines changed

doc/make.py

Lines changed: 2 additions & 2 deletions
@@ -83,8 +83,8 @@ def _process_single_doc(self, single_doc):
                 obj = pandas  # noqa: F821
                 for name in single_doc.split("."):
                     obj = getattr(obj, name)
-            except AttributeError:
-                raise ImportError(f"Could not import {single_doc}")
+            except AttributeError as err:
+                raise ImportError(f"Could not import {single_doc}") from err
             else:
                 return single_doc[len("pandas.") :]
         else:
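
The hunk above chains the new `ImportError` to the `AttributeError` that triggered it. A minimal, standard-library-only sketch of the same pattern follows. `resolve_dotted` is a hypothetical helper written for illustration and is not the code in `doc/make.py`.

```python
import importlib


def resolve_dotted(path):
    """Resolve a dotted name such as 'os.path.join' to the object it names.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    module_name, *attrs = path.split(".")
    obj = importlib.import_module(module_name)
    try:
        for name in attrs:
            obj = getattr(obj, name)
    except AttributeError as err:
        # "from err" stores the original error on __cause__, so the
        # traceback reports both the failed lookup and the ImportError.
        raise ImportError(f"Could not import {path}") from err
    return obj


try:
    resolve_dotted("os.path.no_such_name")
except ImportError as exc:
    print(type(exc.__cause__).__name__)  # prints AttributeError
```

Without `from err`, Python would still record the original exception on `__context__`, but the traceback would describe it as an error that occurred while handling another one rather than as the direct cause.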

doc/source/user_guide/merging.rst

Lines changed: 21 additions & 0 deletions
@@ -724,6 +724,27 @@ either the left or right tables, the values in the joined table will be
           labels=['left', 'right'], vertical=False);
    plt.close('all');

+You can merge a multi-indexed Series and a DataFrame, if the names of
+the MultiIndex correspond to the columns from the DataFrame. Transform
+the Series to a DataFrame using :meth:`Series.reset_index` before merging,
+as shown in the following example.
+
+.. ipython:: python
+
+   df = pd.DataFrame({"Let": ["A", "B", "C"], "Num": [1, 2, 3]})
+   df
+
+   ser = pd.Series(
+       ["a", "b", "c", "d", "e", "f"],
+       index=pd.MultiIndex.from_arrays(
+           [["A", "B", "C"] * 2, [1, 2, 3, 4, 5, 6]], names=["Let", "Num"]
+       ),
+   )
+   ser
+
+   result = pd.merge(df, ser.reset_index(), on=['Let', 'Num'])
+
+
 Here is another example with duplicate join keys in DataFrames:

 .. ipython:: python
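
The documentation added in this hunk can be tried outside Sphinx. The sketch below is a plain-Python version of the same merge. Giving the Series the name `val` is an assumption made here so the value column has a readable label; the documented example leaves the Series unnamed.

```python
import pandas as pd

df = pd.DataFrame({"Let": ["A", "B", "C"], "Num": [1, 2, 3]})
ser = pd.Series(
    ["a", "b", "c", "d", "e", "f"],
    index=pd.MultiIndex.from_arrays(
        [["A", "B", "C"] * 2, [1, 2, 3, 4, 5, 6]], names=["Let", "Num"]
    ),
    name="val",  # assumption: added for a readable column label
)

# reset_index() turns the index levels "Let" and "Num" into ordinary
# columns, which gives merge() the key columns it needs.
right = ser.reset_index()
result = pd.merge(df, right, on=["Let", "Num"])
print(result)
```

Only the three index pairs that also appear as rows of `df` survive the default inner join.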

doc/source/whatsnew/index.rst

Lines changed: 3 additions & 3 deletions
@@ -7,7 +7,7 @@ Release Notes
 *************

 This is the list of changes to pandas between each release. For full details,
-see the commit logs at https://github.com/pandas-dev/pandas. For install and
+see the `commit logs <https://github.com/pandas-dev/pandas/commits/>`_. For install and
 upgrade instructions, see :ref:`install`.

 Version 1.1
@@ -24,9 +24,9 @@ Version 1.0
 .. toctree::
    :maxdepth: 2

-   v1.0.0
-   v1.0.1
    v1.0.2
+   v1.0.1
+   v1.0.0

 Version 0.25
 ------------

doc/source/whatsnew/v1.1.0.rst

Lines changed: 72 additions & 2 deletions
@@ -90,7 +90,76 @@ Backwards incompatible API changes
   now raise a ``TypeError`` if a not-accepted keyword argument is passed into it.
   Previously a ``UnsupportedFunctionCall`` was raised (``AssertionError`` if ``min_count`` passed into :meth:`~DataFrameGroupby.median``) (:issue:`31485`)
 - :meth:`DataFrame.at` and :meth:`Series.at` will raise a ``TypeError`` instead of a ``ValueError`` if an incompatible key is passed, and ``KeyError`` if a missing key is passed, matching the behavior of ``.loc[]`` (:issue:`31722`)
--
+
+.. _whatsnew_110.api_breaking.indexing_raises_key_errors:
+
+Failed Label-Based Lookups Always Raise KeyError
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Label lookups ``series[key]``, ``series.loc[key]`` and ``frame.loc[key]``
+used to raise either ``KeyError`` or ``TypeError`` depending on the type of
+key and type of :class:`Index`. These now consistently raise ``KeyError`` (:issue:`31867`)
+
+.. ipython:: python
+
+    ser1 = pd.Series(range(3), index=[0, 1, 2])
+    ser2 = pd.Series(range(3), index=pd.date_range("2020-02-01", periods=3))
+
+*Previous behavior*:
+
+.. code-block:: ipython
+
+    In [3]: ser1[1.5]
+    ...
+    TypeError: cannot do label indexing on Int64Index with these indexers [1.5] of type float
+
+    In [4]: ser1["foo"]
+    ...
+    KeyError: 'foo'
+
+    In [5]: ser1.loc[1.5]
+    ...
+    TypeError: cannot do label indexing on Int64Index with these indexers [1.5] of type float
+
+    In [6]: ser1.loc["foo"]
+    ...
+    KeyError: 'foo'
+
+    In [7]: ser2.loc[1]
+    ...
+    TypeError: cannot do label indexing on DatetimeIndex with these indexers [1] of type int
+
+    In [8]: ser2.loc[pd.Timestamp(0)]
+    ...
+    KeyError: Timestamp('1970-01-01 00:00:00')
+
+*New behavior*:
+
+.. code-block:: ipython
+
+    In [3]: ser1[1.5]
+    ...
+    KeyError: 1.5
+
+    In [4]: ser1["foo"]
+    ...
+    KeyError: 'foo'
+
+    In [5]: ser1.loc[1.5]
+    ...
+    KeyError: 1.5
+
+    In [6]: ser1.loc["foo"]
+    ...
+    KeyError: 'foo'
+
+    In [7]: ser2.loc[1]
+    ...
+    KeyError: 1
+
+    In [8]: ser2.loc[pd.Timestamp(0)]
+    ...
+    KeyError: Timestamp('1970-01-01 00:00:00')

 .. ---------------------------------------------------------------------------

@@ -126,9 +195,10 @@ Bug fixes

 Categorical
 ^^^^^^^^^^^
+
+- Bug where :func:`merge` was unable to join on non-unique categorical indices (:issue:`28189`)
 - Bug when passing categorical data to :class:`Index` constructor along with ``dtype=object`` incorrectly returning a :class:`CategoricalIndex` instead of object-dtype :class:`Index` (:issue:`32167`)
 -
--

 Datetimelike
 ^^^^^^^^^^^^
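
To see which exception a given pandas installation raises for the lookups described in the whatsnew entry, a small probe like the one below may help. `lookup_error` is a hypothetical helper written for this sketch. The printed names depend on the installed pandas version, so no expected output is shown.

```python
import pandas as pd

ser1 = pd.Series(range(3), index=[0, 1, 2])
ser2 = pd.Series(range(3), index=pd.date_range("2020-02-01", periods=3))


def lookup_error(func):
    """Return the name of the exception raised by func(), or None.

    Hypothetical helper for illustration only.
    """
    try:
        func()
    except (KeyError, TypeError) as err:
        return type(err).__name__
    return None


# Each lambda wraps one of the failing lookups from the release note.
print(lookup_error(lambda: ser1.loc["foo"]))
print(lookup_error(lambda: ser1.loc[1.5]))
print(lookup_error(lambda: ser2.loc[1]))
```

Code that must run on versions both before and after this change can catch `(KeyError, TypeError)` together, as the helper does.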

pandas/__init__.py

Lines changed: 7 additions & 5 deletions
@@ -37,7 +37,7 @@
         f"C extension: {module} not built. If you want to import "
         "pandas from the source directory, you may need to run "
         "'python setup.py build_ext --inplace --force' to build the C extensions first."
-    )
+    ) from e

 from pandas._config import (
     get_option,
@@ -290,8 +290,8 @@ def __getattr__(self, item):

             try:
                 return getattr(self.np, item)
-            except AttributeError:
-                raise AttributeError(f"module numpy has no attribute {item}")
+            except AttributeError as err:
+                raise AttributeError(f"module numpy has no attribute {item}") from err

     np = __numpy()

@@ -306,8 +306,10 @@ def __getattr__(cls, item):

             try:
                 return getattr(cls.datetime, item)
-            except AttributeError:
-                raise AttributeError(f"module datetime has no attribute {item}")
+            except AttributeError as err:
+                raise AttributeError(
+                    f"module datetime has no attribute {item}"
+                ) from err

         def __instancecheck__(cls, other):
             return isinstance(other, cls.datetime)

pandas/_config/config.py

Lines changed: 2 additions & 2 deletions
@@ -213,8 +213,8 @@ def __getattr__(self, key: str):
         prefix += key
         try:
             v = object.__getattribute__(self, "d")[key]
-        except KeyError:
-            raise OptionError("No such option")
+        except KeyError as err:
+            raise OptionError("No such option") from err
         if isinstance(v, dict):
             return DictWrapper(v, prefix)
         else:

pandas/_libs/hashtable.pyx

Lines changed: 6 additions & 0 deletions
@@ -86,7 +86,10 @@ cdef class Factorizer:
         self, ndarray[object] values, sort=False, na_sentinel=-1, na_value=None
     ):
         """
+        Examples
+        --------
         Factorize values with nans replaced by na_sentinel
+
         >>> factorize(np.array([1,2,np.nan], dtype='O'), na_sentinel=20)
         array([ 0, 1, 20])
         """
@@ -131,7 +134,10 @@ cdef class Int64Factorizer:
     def factorize(self, const int64_t[:] values, sort=False,
                   na_sentinel=-1, na_value=None):
         """
+        Examples
+        --------
         Factorize values with nans replaced by na_sentinel
+
         >>> factorize(np.array([1,2,np.nan], dtype='O'), na_sentinel=20)
         array([ 0, 1, 20])
         """
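
The docstring example above maps each distinct value to an integer code and writes `na_sentinel` where the value is missing. A pure-Python sketch of that idea follows. It is an illustrative stand-in, not the Cython `Factorizer`. Returning the list of uniques alongside the codes is an assumption made here for readability.

```python
import math


def factorize(values, na_sentinel=-1):
    """Map each value to the integer code of its first appearance.

    Illustrative stand-in only; missing values (None or NaN) receive
    na_sentinel instead of a code.
    """
    codes = []
    seen = {}  # value -> code, in order of first appearance
    for value in values:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            codes.append(na_sentinel)
            continue
        if value not in seen:
            seen[value] = len(seen)
        codes.append(seen[value])
    return codes, list(seen)


codes, uniques = factorize([1, 2, float("nan")], na_sentinel=20)
print(codes)    # [0, 1, 20]
print(uniques)  # [1, 2]
```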

pandas/_libs/index.pyx

Lines changed: 0 additions & 2 deletions
@@ -115,8 +115,6 @@ cdef class IndexEngine:
     cdef _maybe_get_bool_indexer(self, object val):
         cdef:
             ndarray[uint8_t, ndim=1, cast=True] indexer
-            ndarray[intp_t, ndim=1] found
-            int count

         indexer = self._get_index_values() == val
         return self._unpack_bool_indexer(indexer, val)

pandas/_libs/internals.pyx

Lines changed: 2 additions & 4 deletions
@@ -105,8 +105,7 @@ cdef class BlockPlacement:
             Py_ssize_t start, stop, end, _
         if not self._has_array:
             start, stop, step, _ = slice_get_indices_ex(self._as_slice)
-            self._as_array = np.arange(start, stop, step,
-                                       dtype=np.int64)
+            self._as_array = np.arange(start, stop, step, dtype=np.int64)
             self._has_array = True
         return self._as_array

@@ -283,8 +282,7 @@ cdef slice_getitem(slice slc, ind):
     s_start, s_stop, s_step, s_len = slice_get_indices_ex(slc)

     if isinstance(ind, slice):
-        ind_start, ind_stop, ind_step, ind_len = slice_get_indices_ex(ind,
-                                                                      s_len)
+        ind_start, ind_stop, ind_step, ind_len = slice_get_indices_ex(ind, s_len)

         if ind_step > 0 and ind_len == s_len:
             # short-cut for no-op slice

pandas/_libs/interval.pyx

Lines changed: 3 additions & 4 deletions
@@ -481,8 +481,7 @@ cdef class Interval(IntervalMixin):

 @cython.wraparound(False)
 @cython.boundscheck(False)
-def intervals_to_interval_bounds(ndarray intervals,
-                                 bint validate_closed=True):
+def intervals_to_interval_bounds(ndarray intervals, bint validate_closed=True):
     """
     Parameters
     ----------
@@ -502,14 +501,14 @@ def intervals_to_interval_bounds(ndarray intervals,
     """
     cdef:
         object closed = None, interval
-        int64_t n = len(intervals)
+        Py_ssize_t i, n = len(intervals)
         ndarray left, right
         bint seen_closed = False

     left = np.empty(n, dtype=intervals.dtype)
     right = np.empty(n, dtype=intervals.dtype)

-    for i in range(len(intervals)):
+    for i in range(n):
         interval = intervals[i]
         if interval is None or util.is_nan(interval):
             left[i] = np.nan

pandas/_libs/join.pyx

Lines changed: 18 additions & 12 deletions
@@ -254,6 +254,8 @@ ctypedef fused join_t:
     float64_t
     float32_t
     object
+    int8_t
+    int16_t
     int32_t
     int64_t
     uint64_t
@@ -815,18 +817,22 @@ def asof_join_nearest_on_X_by_Y(asof_t[:] left_values,
     right_indexer = np.empty(left_size, dtype=np.int64)

     # search both forward and backward
-    bli, bri = asof_join_backward_on_X_by_Y(left_values,
-                                            right_values,
-                                            left_by_values,
-                                            right_by_values,
-                                            allow_exact_matches,
-                                            tolerance)
-    fli, fri = asof_join_forward_on_X_by_Y(left_values,
-                                           right_values,
-                                           left_by_values,
-                                           right_by_values,
-                                           allow_exact_matches,
-                                           tolerance)
+    bli, bri = asof_join_backward_on_X_by_Y(
+        left_values,
+        right_values,
+        left_by_values,
+        right_by_values,
+        allow_exact_matches,
+        tolerance,
+    )
+    fli, fri = asof_join_forward_on_X_by_Y(
+        left_values,
+        right_values,
+        left_by_values,
+        right_by_values,
+        allow_exact_matches,
+        tolerance,
+    )

     for i in range(len(bri)):
         # choose timestamp from right with smaller difference
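
The comments in this hunk describe the nearest-match strategy: search backward and forward, then keep whichever right-hand value is closer. The sketch below illustrates that idea in pure Python with `bisect`. It is a simplified illustration, not the Cython routine: it ignores the `by` groups, `tolerance`, and `allow_exact_matches`, assumes both inputs are sorted ascending, and its tie-breaking toward the backward match is an assumption of this sketch.

```python
from bisect import bisect_left, bisect_right


def asof_nearest(left_values, right_values):
    """Return, for each left value, the position of the nearest right value.

    Illustrative sketch only. A position of -1 means no match exists,
    which here can only happen when right_values is empty.
    """
    result = []
    for value in left_values:
        # backward match: last position with right_values[pos] <= value
        back = bisect_right(right_values, value) - 1
        # forward match: first position with right_values[pos] >= value
        fwd = bisect_left(right_values, value)
        if fwd >= len(right_values):
            fwd = -1
        if back != -1 and fwd != -1:
            back_diff = value - right_values[back]
            fwd_diff = right_values[fwd] - value
            # choose the right-hand value with the smaller difference
            result.append(back if back_diff <= fwd_diff else fwd)
        else:
            result.append(back if back != -1 else fwd)
    return result


print(asof_nearest([1, 5, 10], [2, 3, 7, 20]))  # [0, 1, 2]
```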
