Commit 24c00ad

Merge pull request #39 from pandas-dev/master
Sync Fork from Upstream Repo
2 parents 26acb66 + bf4e74d commit 24c00ad

72 files changed, +1753 -1465 lines changed

doc/redirects.csv

Lines changed: 15 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -271,21 +271,21 @@ generated/pandas.core.window.Expanding.skew,../reference/api/pandas.core.window.
271271
generated/pandas.core.window.Expanding.std,../reference/api/pandas.core.window.Expanding.std
272272
generated/pandas.core.window.Expanding.sum,../reference/api/pandas.core.window.Expanding.sum
273273
generated/pandas.core.window.Expanding.var,../reference/api/pandas.core.window.Expanding.var
274-
generated/pandas.core.window.Rolling.aggregate,../reference/api/pandas.core.window.Rolling.aggregate
275-
generated/pandas.core.window.Rolling.apply,../reference/api/pandas.core.window.Rolling.apply
276-
generated/pandas.core.window.Rolling.corr,../reference/api/pandas.core.window.Rolling.corr
277-
generated/pandas.core.window.Rolling.count,../reference/api/pandas.core.window.Rolling.count
278-
generated/pandas.core.window.Rolling.cov,../reference/api/pandas.core.window.Rolling.cov
279-
generated/pandas.core.window.Rolling.kurt,../reference/api/pandas.core.window.Rolling.kurt
280-
generated/pandas.core.window.Rolling.max,../reference/api/pandas.core.window.Rolling.max
281-
generated/pandas.core.window.Rolling.mean,../reference/api/pandas.core.window.Rolling.mean
282-
generated/pandas.core.window.Rolling.median,../reference/api/pandas.core.window.Rolling.median
283-
generated/pandas.core.window.Rolling.min,../reference/api/pandas.core.window.Rolling.min
284-
generated/pandas.core.window.Rolling.quantile,../reference/api/pandas.core.window.Rolling.quantile
285-
generated/pandas.core.window.Rolling.skew,../reference/api/pandas.core.window.Rolling.skew
286-
generated/pandas.core.window.Rolling.std,../reference/api/pandas.core.window.Rolling.std
287-
generated/pandas.core.window.Rolling.sum,../reference/api/pandas.core.window.Rolling.sum
288-
generated/pandas.core.window.Rolling.var,../reference/api/pandas.core.window.Rolling.var
274+
generated/pandas.core.window.Rolling.aggregate,../reference/api/pandas.core.window.rolling.Rolling.aggregate
275+
generated/pandas.core.window.Rolling.apply,../reference/api/pandas.core.window.rolling.Rolling.apply
276+
generated/pandas.core.window.Rolling.corr,../reference/api/pandas.core.window.rolling.Rolling.corr
277+
generated/pandas.core.window.Rolling.count,../reference/api/pandas.core.window.rolling.Rolling.count
278+
generated/pandas.core.window.Rolling.cov,../reference/api/pandas.core.window.rolling.Rolling.cov
279+
generated/pandas.core.window.Rolling.kurt,../reference/api/pandas.core.window.rolling.Rolling.kurt
280+
generated/pandas.core.window.Rolling.max,../reference/api/pandas.core.window.rolling.Rolling.max
281+
generated/pandas.core.window.Rolling.mean,../reference/api/pandas.core.window.rolling.Rolling.mean
282+
generated/pandas.core.window.Rolling.median,../reference/api/pandas.core.window.rolling.Rolling.median
283+
generated/pandas.core.window.Rolling.min,../reference/api/pandas.core.window.rolling.Rolling.min
284+
generated/pandas.core.window.Rolling.quantile,../reference/api/pandas.core.window.rolling.Rolling.quantile
285+
generated/pandas.core.window.Rolling.skew,../reference/api/pandas.core.window.rolling.Rolling.skew
286+
generated/pandas.core.window.Rolling.std,../reference/api/pandas.core.window.rolling.Rolling.std
287+
generated/pandas.core.window.Rolling.sum,../reference/api/pandas.core.window.rolling.Rolling.sum
288+
generated/pandas.core.window.Rolling.var,../reference/api/pandas.core.window.rolling.Rolling.var
289289
generated/pandas.core.window.Window.mean,../reference/api/pandas.core.window.Window.mean
290290
generated/pandas.core.window.Window.sum,../reference/api/pandas.core.window.Window.sum
291291
generated/pandas.crosstab,../reference/api/pandas.crosstab

doc/source/development/code_style.rst

Lines changed: 6 additions & 6 deletions
@@ -119,14 +119,14 @@ For example:
119119
.. code-block:: python
120120
121121
value = str
122-
f"Unknown recived value, got: {repr(value)}"
122+
f"Unknown received value, got: {repr(value)}"
123123
124124
**Good:**
125125

126126
.. code-block:: python
127127
128128
value = str
129-
f"Unknown recived type, got: '{type(value).__name__}'"
129+
f"Unknown received type, got: '{type(value).__name__}'"
130130
131131
132132
Imports (aim for absolute)
@@ -135,11 +135,11 @@ Imports (aim for absolute)
135135
In Python 3, absolute imports are recommended. In absolute import doing something
136136
like ``import string`` will import the string module rather than ``string.py``
137137
in the same directory. As much as possible, you should try to write out
138-
absolute imports that show the whole import chain from toplevel pandas.
138+
absolute imports that show the whole import chain from top-level pandas.
139139

140-
Explicit relative imports are also supported in Python 3. But it is not
141-
recommended to use it. Implicit relative imports should never be used
142-
and is removed in Python 3.
140+
Explicit relative imports are also supported in Python 3 but it is not
141+
recommended to use them. Implicit relative imports should never be used
142+
and are removed in Python 3.
143143

144144
For example:
145145

doc/source/getting_started/10min.rst

Lines changed: 8 additions & 8 deletions
@@ -70,17 +70,17 @@ will be completed:
7070
df2.abs df2.boxplot
7171
df2.add df2.C
7272
df2.add_prefix df2.clip
73-
df2.add_suffix df2.clip_lower
74-
df2.align df2.clip_upper
75-
df2.all df2.columns
73+
df2.add_suffix df2.columns
74+
df2.align df2.copy
75+
df2.all df2.count
7676
df2.any df2.combine
77-
df2.append df2.combine_first
78-
df2.apply df2.consolidate
79-
df2.applymap
80-
df2.D
77+
df2.append df2.D
78+
df2.apply df2.describe
79+
df2.applymap df2.diff
80+
df2.B df2.duplicated
8181

8282
As you can see, the columns ``A``, ``B``, ``C``, and ``D`` are automatically
83-
tab completed. ``E`` is there as well; the rest of the attributes have been
83+
tab completed. ``E`` and ``F`` are there as well; the rest of the attributes have been
8484
truncated for brevity.
8585

8686
Viewing data

doc/source/whatsnew/v1.0.2.rst

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Fixed regressions
1616
~~~~~~~~~~~~~~~~~
1717

1818
- Fixed regression in :meth:`DataFrame.to_excel` when ``columns`` kwarg is passed (:issue:`31677`)
19+
- Fixed regression in :meth:`Series.align` when ``other`` is a DataFrame and ``method`` is not None (:issue:`31785`)
1920
-
2021

2122
.. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.1.0.rst

Lines changed: 6 additions & 1 deletion
@@ -72,7 +72,7 @@ Backwards incompatible API changes
7272

7373
Deprecations
7474
~~~~~~~~~~~~
75-
75+
- Lookups on a :class:`Series` with a single-item list containing a slice (e.g. ``ser[[slice(0, 4)]]``) are deprecated, will raise in a future version. Either convert the list to tuple, or pass the slice directly instead (:issue:`31333`)
7676
-
7777
-
7878

@@ -185,6 +185,7 @@ I/O
185185
- Bug in :meth:`DataFrame.to_parquet` overwriting pyarrow's default for
186186
``coerce_timestamps``; following pyarrow's default allows writing nanosecond
187187
timestamps with ``version="2.0"`` (:issue:`31652`).
188+
- Bug in :class:`HDFStore` that caused it to set to ``int64`` the dtype of a ``datetime64`` column when reading a DataFrame in Python 3 from fixed format written in Python 2 (:issue:`31750`)
188189

189190
Plotting
190191
^^^^^^^^
@@ -207,8 +208,10 @@ Reshaping
207208
- Bug in :meth:`DataFrame.pivot_table` when ``margin`` is ``True`` and only ``column`` is defined (:issue:`31016`)
208209
- Fix incorrect error message in :meth:`DataFrame.pivot` when ``columns`` is set to ``None``. (:issue:`30924`)
209210
- Bug in :func:`crosstab` when inputs are two Series and have tuple names, the output will keep dummy MultiIndex as columns. (:issue:`18321`)
211+
- :meth:`DataFrame.pivot` can now take lists for ``index`` and ``columns`` arguments (:issue:`21425`)
210212
- Bug in :func:`concat` where the resulting indices are not copied when ``copy=True`` (:issue:`29879`)
211213

214+
212215
Sparse
213216
^^^^^^
214217

@@ -227,6 +230,8 @@ Other
227230
- Appending a dictionary to a :class:`DataFrame` without passing ``ignore_index=True`` will raise ``TypeError: Can only append a dict if ignore_index=True``
228231
instead of ``TypeError: Can only append a Series if ignore_index=True or if the Series has a name`` (:issue:`30871`)
229232
- Set operations on an object-dtype :class:`Index` now always return object-dtype results (:issue:`31401`)
233+
- Bug in :meth:`AbstractHolidayCalendar.holidays` when no rules were defined (:issue:`31415`)
234+
-
230235

231236
.. ---------------------------------------------------------------------------
232237

pandas/_libs/index.pyx

Lines changed: 23 additions & 0 deletions
@@ -12,6 +12,7 @@ cnp.import_array()
1212

1313
cimport pandas._libs.util as util
1414

15+
from pandas._libs.tslibs import Period
1516
from pandas._libs.tslibs.nattype cimport c_NaT as NaT
1617
from pandas._libs.tslibs.c_timestamp cimport _Timestamp
1718

@@ -466,6 +467,28 @@ cdef class TimedeltaEngine(DatetimeEngine):
466467

467468
cdef class PeriodEngine(Int64Engine):
468469

470+
cdef int64_t _unbox_scalar(self, scalar) except? -1:
471+
if scalar is NaT:
472+
return scalar.value
473+
if isinstance(scalar, Period):
474+
# NB: we assume that we have the correct freq here.
475+
# TODO: potential optimize by checking for _Period?
476+
return scalar.ordinal
477+
raise TypeError(scalar)
478+
479+
cpdef get_loc(self, object val):
480+
# NB: the caller is responsible for ensuring that we are called
481+
# with either a Period or NaT
482+
cdef:
483+
int64_t conv
484+
485+
try:
486+
conv = self._unbox_scalar(val)
487+
except TypeError:
488+
raise KeyError(val)
489+
490+
return Int64Engine.get_loc(self, conv)
491+
469492
cdef _get_index_values(self):
470493
return super(PeriodEngine, self).vgetter().view("i8")
471494
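The `get_loc` added to `PeriodEngine` above unboxes the key to its integer ordinal and converts a failed unboxing into a `KeyError`. A minimal pure-Python sketch of that pattern follows. `FakePeriod` and `SketchEngine` are hypothetical stand-ins invented for illustration, not pandas classes, and the dict lookup only approximates what `Int64Engine.get_loc` does.

```python
class FakePeriod:
    """Hypothetical stand-in for a Period; it only carries an ordinal."""

    def __init__(self, ordinal):
        self.ordinal = ordinal


class SketchEngine:
    """Hypothetical engine mapping integer ordinals to positions."""

    def __init__(self, ordinals):
        self._positions = {ordinal: pos for pos, ordinal in enumerate(ordinals)}

    def _unbox_scalar(self, scalar):
        # Only the supported scalar type can be reduced to an integer.
        if isinstance(scalar, FakePeriod):
            return scalar.ordinal
        raise TypeError(scalar)

    def get_loc(self, val):
        # A key of the wrong type is reported as "not found", not as a type error.
        try:
            conv = self._unbox_scalar(val)
        except TypeError:
            raise KeyError(val)
        return self._positions[conv]


engine = SketchEngine([10, 11, 12])
loc = engine.get_loc(FakePeriod(11))
```

Raising `KeyError` for an unsupported key type keeps the lookup contract uniform, so callers only need to handle one exception for "key not present".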

pandas/_libs/reduction.pyx

Lines changed: 1 addition & 2 deletions
@@ -309,8 +309,7 @@ cdef class SeriesGrouper(_BaseGrouper):
309309
def __init__(self, object series, object f, object labels,
310310
Py_ssize_t ngroups, object dummy):
311311

312-
# in practice we always pass either obj[:0] or the
313-
# safer obj._get_values(slice(None, 0))
312+
# in practice we always pass obj.iloc[:0] or equivalent
314313
assert dummy is not None
315314

316315
if len(series) == 0:

pandas/core/arrays/datetimelike.py

Lines changed: 23 additions & 4 deletions
@@ -702,12 +702,31 @@ def take(self, indices, allow_fill=False, fill_value=None):
702702

703703
@classmethod
704704
def _concat_same_type(cls, to_concat):
705-
dtypes = {x.dtype for x in to_concat}
706-
assert len(dtypes) == 1
707-
dtype = list(dtypes)[0]
705+
706+
# do not pass tz to set because tzlocal cannot be hashed
707+
dtypes = {str(x.dtype) for x in to_concat}
708+
if len(dtypes) != 1:
709+
raise ValueError("to_concat must have the same dtype (tz)", dtypes)
710+
711+
obj = to_concat[0]
712+
dtype = obj.dtype
708713

709714
values = np.concatenate([x.asi8 for x in to_concat])
710-
return cls(values, dtype=dtype)
715+
716+
if is_period_dtype(to_concat[0].dtype):
717+
new_freq = obj.freq
718+
else:
719+
# GH 3232: If the concat result is evenly spaced, we can retain the
720+
# original frequency
721+
new_freq = None
722+
to_concat = [x for x in to_concat if len(x)]
723+
724+
if obj.freq is not None and all(x.freq == obj.freq for x in to_concat):
725+
pairs = zip(to_concat[:-1], to_concat[1:])
726+
if all(pair[0][-1] + obj.freq == pair[1][0] for pair in pairs):
727+
new_freq = obj.freq
728+
729+
return cls._simple_new(values, dtype=dtype, freq=new_freq)
711730

712731
def copy(self):
713732
values = self.asi8.copy()
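The frequency-retention branch of `_concat_same_type` above keeps the original `freq` only when each non-empty piece starts exactly one step after the previous piece ends. A rough sketch of that check on plain integers follows. `retained_step` is a hypothetical helper, each list is assumed to be evenly spaced by `step` already, and the per-piece `freq` equality test from the diff is omitted.

```python
def retained_step(chunks, step):
    """Return step if concatenating chunks keeps even spacing, else None.

    Hypothetical helper: each chunk is a list of ints assumed to be
    internally spaced by `step`, standing in for arrays that share one freq.
    """
    # Empty chunks cannot break the spacing, so drop them first.
    non_empty = [chunk for chunk in chunks if len(chunk)]
    pairs = zip(non_empty[:-1], non_empty[1:])
    # Each chunk must start exactly one step after the previous one ends.
    if all(left[-1] + step == right[0] for left, right in pairs):
        return step
    return None


contiguous = retained_step([[0, 1, 2], [], [3, 4]], 1)
gapped = retained_step([[0, 1, 2], [5, 6]], 1)
```

Here `contiguous` keeps the step because 2 + 1 == 3, while `gapped` loses it because 2 + 1 != 5.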

pandas/core/arrays/timedeltas.py

Lines changed: 4 additions & 1 deletion
@@ -195,9 +195,12 @@ def __init__(self, values, dtype=_TD_DTYPE, freq=None, copy=False):
195195
def _simple_new(cls, values, freq=None, dtype=_TD_DTYPE):
196196
assert dtype == _TD_DTYPE, dtype
197197
assert isinstance(values, np.ndarray), type(values)
198+
if values.dtype != _TD_DTYPE:
199+
assert values.dtype == "i8"
200+
values = values.view(_TD_DTYPE)
198201

199202
result = object.__new__(cls)
200-
result._data = values.view(_TD_DTYPE)
203+
result._data = values
201204
result._freq = to_offset(freq)
202205
result._dtype = _TD_DTYPE
203206
return result

pandas/core/frame.py

Lines changed: 44 additions & 12 deletions
@@ -3011,17 +3011,12 @@ def _set_value(self, index, col, value, takeable: bool = False):
30113011
col : column label
30123012
value : scalar
30133013
takeable : interpret the index/col as indexers, default False
3014-
3015-
Returns
3016-
-------
3017-
DataFrame
3018-
If label pair is contained, will be reference to calling DataFrame,
3019-
otherwise a new object.
30203014
"""
30213015
try:
30223016
if takeable is True:
30233017
series = self._iget_item_cache(col)
3024-
return series._set_value(index, value, takeable=True)
3018+
series._set_value(index, value, takeable=True)
3019+
return
30253020

30263021
series = self._get_item_cache(col)
30273022
engine = self.index._engine
@@ -3031,7 +3026,6 @@ def _set_value(self, index, col, value, takeable: bool = False):
30313026
series._values[loc] = value
30323027
# Note: trying to use series._set_value breaks tests in
30333028
# tests.frame.indexing.test_indexing and tests.indexing.test_partial
3034-
return self
30353029
except (KeyError, TypeError):
30363030
# set using a non-recursive method & reset the cache
30373031
if takeable:
@@ -3040,8 +3034,6 @@ def _set_value(self, index, col, value, takeable: bool = False):
30403034
self.loc[index, col] = value
30413035
self._item_cache.pop(col, None)
30423036

3043-
return self
3044-
30453037
def _ensure_valid_index(self, value):
30463038
"""
30473039
Ensure that if we don't have an index, that we can create one from the
@@ -5897,11 +5889,19 @@ def groupby(
58975889
58985890
Parameters
58995891
----------%s
5900-
index : str or object, optional
5892+
index : str or object or a list of str, optional
59015893
Column to use to make new frame's index. If None, uses
59025894
existing index.
5903-
columns : str or object
5895+
5896+
.. versionchanged:: 1.1.0
5897+
Also accept list of index names.
5898+
5899+
columns : str or object or a list of str
59045900
Column to use to make new frame's columns.
5901+
5902+
.. versionchanged:: 1.1.0
5903+
Also accept list of columns names.
5904+
59055905
values : str, object or a list of the previous, optional
59065906
Column(s) to use for populating new frame's values. If not
59075907
specified, all remaining columns will be used and the result will
@@ -5968,6 +5968,38 @@ def groupby(
59685968
one 1 2 3 x y z
59695969
two 4 5 6 q w t
59705970
5971+
You could also assign a list of column names or a list of index names.
5972+
5973+
>>> df = pd.DataFrame({
5974+
... "lev1": [1, 1, 1, 2, 2, 2],
5975+
... "lev2": [1, 1, 2, 1, 1, 2],
5976+
... "lev3": [1, 2, 1, 2, 1, 2],
5977+
... "lev4": [1, 2, 3, 4, 5, 6],
5978+
... "values": [0, 1, 2, 3, 4, 5]})
5979+
>>> df
5980+
lev1 lev2 lev3 lev4 values
5981+
0 1 1 1 1 0
5982+
1 1 1 2 2 1
5983+
2 1 2 1 3 2
5984+
3 2 1 2 4 3
5985+
4 2 1 1 5 4
5986+
5 2 2 2 6 5
5987+
5988+
>>> df.pivot(index="lev1", columns=["lev2", "lev3"],values="values")
5989+
lev2 1 2
5990+
lev3 1 2 1 2
5991+
lev1
5992+
1 0.0 1.0 2.0 NaN
5993+
2 4.0 3.0 NaN 5.0
5994+
5995+
>>> df.pivot(index=["lev1", "lev2"], columns=["lev3"],values="values")
5996+
lev3 1 2
5997+
lev1 lev2
5998+
1 1 0.0 1.0
5999+
2 2.0 NaN
6000+
2 1 4.0 3.0
6001+
2 NaN 5.0
6002+
59716003
A ValueError is raised if there are any duplicates.
59726004
59736005
>>> df = pd.DataFrame({"foo": ['one', 'one', 'two', 'two'],

pandas/core/generic.py

Lines changed: 22 additions & 4 deletions
@@ -8360,9 +8360,7 @@ def _align_frame(
83608360
left = self._ensure_type(
83618361
left.fillna(method=method, axis=fill_axis, limit=limit)
83628362
)
8363-
right = self._ensure_type(
8364-
right.fillna(method=method, axis=fill_axis, limit=limit)
8365-
)
8363+
right = right.fillna(method=method, axis=fill_axis, limit=limit)
83668364

83678365
# if DatetimeIndex have different tz, convert to UTC
83688366
if is_datetime64tz_dtype(left.index):
@@ -9961,7 +9959,7 @@ def _add_numeric_operations(cls):
99619959
see_also="",
99629960
examples="",
99639961
)
9964-
@Appender(_num_doc)
9962+
@Appender(_num_doc_mad)
99659963
def mad(self, axis=None, skipna=None, level=None):
99669964
if skipna is None:
99679965
skipna = True
@@ -10330,6 +10328,26 @@ def _doc_parms(cls):
1033010328
%(examples)s
1033110329
"""
1033210330

10331+
_num_doc_mad = """
10332+
%(desc)s
10333+
10334+
Parameters
10335+
----------
10336+
axis : %(axis_descr)s
10337+
Axis for the function to be applied on.
10338+
skipna : bool, default None
10339+
Exclude NA/null values when computing the result.
10340+
level : int or level name, default None
10341+
If the axis is a MultiIndex (hierarchical), count along a
10342+
particular level, collapsing into a %(name1)s.
10343+
10344+
Returns
10345+
-------
10346+
%(name1)s or %(name2)s (if level specified)\
10347+
%(see_also)s\
10348+
%(examples)s
10349+
"""
10350+
1033310351
_num_ddof_doc = """
1033410352
%(desc)s
1033510353
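The `_num_doc_mad` template added above uses `%(name)s` placeholders and trailing backslashes to join optional sections onto the `Returns` line. A small sketch of how such a template can be filled with Python's `%` operator follows. The substituted values are made up for illustration, the template is shortened, and this does not reproduce how pandas' `Substitution` and `Appender` decorators apply it.

```python
# Shortened copy of the template shape; a trailing backslash inside a
# non-raw triple-quoted string continues the line without a newline.
TEMPLATE = """
%(desc)s

Returns
-------
%(name1)s or %(name2)s (if level specified)\
%(see_also)s\
%(examples)s
"""

# Illustrative values only; the real ones are supplied elsewhere in pandas.
filled = TEMPLATE % {
    "desc": "Return the mean absolute deviation of the values.",
    "name1": "Series",
    "name2": "DataFrame",
    "see_also": "",
    "examples": "",
}
```

With empty `see_also` and `examples` values, the backslash continuations leave no blank lines behind, so the filled text ends right after the `Returns` entry.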
