Sync Fork from Upstream Repo #8


Merged: 8 commits, Jan 13, 2020
26 changes: 26 additions & 0 deletions doc/source/development/code_style.rst
@@ -127,3 +127,29 @@ For example:

value = str
f"Unknown received type, got: '{type(value).__name__}'"


Imports (aim for absolute)
==========================

In Python 3, absolute imports are recommended. With an absolute import, a
statement like ``import string`` imports the standard library ``string``
module rather than a ``string.py`` file in the same directory. As much as
possible, you should write absolute imports that show the whole import chain
from the top-level ``pandas`` package.

Explicit relative imports are also supported in Python 3, but they are not
recommended. Implicit relative imports should never be used; they were
removed in Python 3.

For example:

::

# preferred
import pandas.core.common as com

# not preferred
from .common import test_base

# wrong
from common import test_base
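To see the distinction concretely, here is a small standalone sketch (not part of the diff) showing how an absolute import resolves:

```python
# In Python 2, a module inside a package doing ``import string`` could
# silently pick up a sibling string.py (an implicit relative import).
# In Python 3 the same statement is absolute: it resolves via sys.path
# to the standard library module.
import string

print(string.__name__)
assert hasattr(string, "ascii_lowercase")
```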
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.25.3.rst
@@ -19,4 +19,4 @@ Groupby/resample/rolling
Contributors
~~~~~~~~~~~~

.. contributors:: v0.25.2..HEAD
.. contributors:: v0.25.2..v0.25.3
6 changes: 4 additions & 2 deletions doc/source/whatsnew/v1.0.0.rst
@@ -218,15 +218,13 @@ Other enhancements
now preserve those data types with pyarrow >= 0.16.0 (:issue:`20612`, :issue:`28371`).
- The ``partition_cols`` argument in :meth:`DataFrame.to_parquet` now accepts a string (:issue:`27117`)
- :func:`pandas.read_json` now parses ``NaN``, ``Infinity`` and ``-Infinity`` (:issue:`12213`)
- The ``pandas.np`` submodule is now deprecated. Import numpy directly instead (:issue:`30296`)
- :func:`to_parquet` now appropriately handles the ``schema`` argument for user defined schemas in the pyarrow engine. (:issue:`30270`)
- DataFrame constructor preserve `ExtensionArray` dtype with `ExtensionArray` (:issue:`11363`)
- :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` have gained ``ignore_index`` keyword to be able to reset index after sorting (:issue:`30114`)
- :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` have gained ``ignore_index`` keyword to reset index (:issue:`30114`)
- :meth:`DataFrame.drop_duplicates` has gained ``ignore_index`` keyword to reset index (:issue:`30114`)
- Added new writer for exporting Stata dta files in version 118, ``StataWriter118``. This format supports exporting strings containing Unicode characters (:issue:`23573`)
- :meth:`Series.map` now accepts ``collections.abc.Mapping`` subclasses as a mapper (:issue:`29733`)
- The ``pandas.datetime`` class is now deprecated. Import from ``datetime`` instead (:issue:`30296`)
- Added an experimental :attr:`~DataFrame.attrs` for storing global metadata about a dataset (:issue:`29062`)
- :meth:`Timestamp.fromisocalendar` is now compatible with python 3.8 and above (:issue:`28115`)
- :meth:`DataFrame.to_pickle` and :func:`read_pickle` now accept URL (:issue:`30163`)
@@ -707,6 +705,8 @@ Deprecations
- ``pandas.SparseArray`` has been deprecated. Use ``pandas.arrays.SparseArray`` (:class:`arrays.SparseArray`) instead. (:issue:`30642`)
- The parameter ``is_copy`` of :meth:`DataFrame.take` has been deprecated and will be removed in a future version. (:issue:`27357`)
- Support for multi-dimensional indexing (e.g. ``index[:, None]``) on a :class:`Index` is deprecated and will be removed in a future version, convert to a numpy array before indexing instead (:issue:`30588`)
- The ``pandas.np`` submodule is now deprecated. Import numpy directly instead (:issue:`30296`)
- The ``pandas.datetime`` class is now deprecated. Import from ``datetime`` instead (:issue:`30610`)
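The two entries just moved here describe a straightforward migration; a minimal sketch, assuming pandas >= 1.0 (where the deprecated aliases still work but emit a warning):

```python
# Deprecated spellings (still functional in pandas 1.0, with a FutureWarning):
#     pandas.np.array([1, 2, 3])
#     pandas.datetime(2020, 1, 13)
# Preferred: import the underlying libraries directly.
import numpy as np
from datetime import datetime

arr = np.array([1, 2, 3])
ts = datetime(2020, 1, 13)
print(arr.sum(), ts.date())
```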

**Selecting Columns from a Grouped DataFrame**

@@ -1177,3 +1177,5 @@ Other

Contributors
~~~~~~~~~~~~

.. contributors:: v0.25.3..v1.0.0rc0
9 changes: 3 additions & 6 deletions pandas/_libs/algos.pyx
@@ -914,8 +914,7 @@ def rank_1d(rank_t[:] in_arr, ties_method='average',
ranks[argsorted[j]] = i + 1
elif tiebreak == TIEBREAK_FIRST:
if rank_t is object:
raise ValueError('first not supported for '
'non-numeric data')
raise ValueError('first not supported for non-numeric data')
else:
for j in range(i - dups + 1, i + 1):
ranks[argsorted[j]] = j + 1
@@ -971,8 +970,7 @@ def rank_1d(rank_t[:] in_arr, ties_method='average',
ranks[argsorted[j]] = i + 1
elif tiebreak == TIEBREAK_FIRST:
if rank_t is object:
raise ValueError('first not supported for '
'non-numeric data')
raise ValueError('first not supported for non-numeric data')
else:
for j in range(i - dups + 1, i + 1):
ranks[argsorted[j]] = j + 1
@@ -1137,8 +1135,7 @@ def rank_2d(rank_t[:, :] in_arr, axis=0, ties_method='average',
ranks[i, argsorted[i, z]] = j + 1
elif tiebreak == TIEBREAK_FIRST:
if rank_t is object:
raise ValueError('first not supported '
'for non-numeric data')
raise ValueError('first not supported for non-numeric data')
else:
for z in range(j - dups + 1, j + 1):
ranks[i, argsorted[i, z]] = z + 1
3 changes: 1 addition & 2 deletions pandas/_libs/groupby.pyx
@@ -686,8 +686,7 @@ def _group_ohlc(floating[:, :] out,
raise ValueError('Output array must have 4 columns')

if K > 1:
raise NotImplementedError("Argument 'values' must have only "
"one dimension")
raise NotImplementedError("Argument 'values' must have only one dimension")
out[:] = np.nan

with nogil:
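``_group_ohlc`` backs the public ``ohlc`` aggregation; a quick sketch of its one-column-in, four-columns-out contract through ``resample``:

```python
import pandas as pd

idx = pd.date_range("2020-01-13", periods=6, freq="min")
s = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0, 6.0], index=idx)

# One value column per 3-minute bin becomes open/high/low/close columns.
ohlc = s.resample("3min").ohlc()
print(ohlc)
```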
11 changes: 7 additions & 4 deletions pandas/_libs/hashing.pyx
@@ -51,8 +51,9 @@ def hash_object_array(object[:] arr, object key, object encoding='utf8'):
k = <bytes>key.encode(encoding)
kb = <uint8_t *>k
if len(k) != 16:
raise ValueError("key should be a 16-byte string encoded, "
f"got {k} (len {len(k)})")
raise ValueError(
f"key should be a 16-byte string encoded, got {k} (len {len(k)})"
)

n = len(arr)

@@ -77,8 +78,10 @@ def hash_object_array(object[:] arr, object key, object encoding='utf8'):
hash(val)
data = <bytes>str(val).encode(encoding)
else:
raise TypeError(f"{val} of type {type(val)} is not a valid type "
"for hashing, must be string or null")
raise TypeError(
f"{val} of type {type(val)} is not a valid type for hashing, "
"must be string or null"
)

l = len(data)
lens[i] = l
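The 16-byte key requirement surfaces through ``pandas.util.hash_pandas_object`` when hashing object-dtype data. A sketch (the default key is already 16 bytes, so most callers never see this error):

```python
import pandas as pd

s = pd.Series(["a", "b", "c"], dtype=object)  # object dtype -> hash_object_array
hashed = pd.util.hash_pandas_object(s, hash_key="0123456789123456")
print(hashed.dtype)  # uint64 hashes, one per row

try:
    pd.util.hash_pandas_object(s, hash_key="too-short")
except ValueError as exc:
    print(f"rejected: {exc}")
```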
5 changes: 3 additions & 2 deletions pandas/_libs/indexing.pyx
@@ -18,6 +18,7 @@ cdef class _NDFrameIndexerBase:
if ndim is None:
ndim = self._ndim = self.obj.ndim
if ndim > 2:
raise ValueError("NDFrameIndexer does not support "
"NDFrame objects with ndim > 2")
raise ValueError(
"NDFrameIndexer does not support NDFrame objects with ndim > 2"
)
return ndim
6 changes: 3 additions & 3 deletions pandas/_libs/sparse.pyx
@@ -72,9 +72,9 @@ cdef class IntIndex(SparseIndex):
"""

if self.npoints > self.length:
msg = (f"Too many indices. Expected "
f"{self.length} but found {self.npoints}")
raise ValueError(msg)
raise ValueError(
f"Too many indices. Expected {self.length} but found {self.npoints}"
)

# Indices are vacuously ordered and non-negative
# if the sequence of indices is empty.
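The ``IntIndex`` invariant checked above (no more stored points than the logical length) can be seen through the public sparse API; a small sketch:

```python
import pandas as pd

# Fill value 0 is inferred; only the non-fill positions are stored.
arr = pd.arrays.SparseArray([0, 0, 1, 2, 0])
print(arr.sp_index)   # IntIndex over the non-fill positions
print(arr.sp_values)  # the stored values themselves
```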
6 changes: 3 additions & 3 deletions pandas/_libs/testing.pyx
@@ -127,9 +127,9 @@ cpdef assert_almost_equal(a, b,
# classes can't be the same, to raise error
assert_class_equal(a, b, obj=obj)

assert has_length(a) and has_length(b), ("Can't compare objects without "
"length, one or both is invalid: "
f"({a}, {b})")
assert has_length(a) and has_length(b), (
f"Can't compare objects without length, one or both is invalid: ({a}, {b})"
)

if a_is_ndarray and b_is_ndarray:
na, nb = a.size, b.size
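``assert_almost_equal`` sits underneath the public ``pandas.testing`` helpers; a sketch of the user-facing layer, including the length mismatch case the assertion above guards:

```python
import pandas as pd
import pandas.testing as tm

# Equal series: passes silently.
tm.assert_series_equal(pd.Series([1.0, 2.0]), pd.Series([1.0, 2.0]))

try:
    tm.assert_series_equal(pd.Series([1.0]), pd.Series([1.0, 2.0]))
except AssertionError as exc:
    print(f"mismatch: {exc}")
```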
74 changes: 47 additions & 27 deletions pandas/_libs/tslibs/timestamps.pyx
@@ -161,8 +161,7 @@ def round_nsint64(values, mode, freq):

# if/elif above should catch all rounding modes defined in enum 'RoundTo':
# if flow of control arrives here, it is a bug
raise ValueError("round_nsint64 called with an unrecognized "
"rounding mode")
raise ValueError("round_nsint64 called with an unrecognized rounding mode")


# ----------------------------------------------------------------------
@@ -324,8 +323,10 @@ class Timestamp(_Timestamp):

Function is not implemented. Use pd.to_datetime().
"""
raise NotImplementedError("Timestamp.strptime() is not implemented."
"Use to_datetime() to parse date strings.")
raise NotImplementedError(
"Timestamp.strptime() is not implemented. "
"Use to_datetime() to parse date strings."
)

@classmethod
def combine(cls, date, time):
@@ -381,8 +382,9 @@ class Timestamp(_Timestamp):
if tzinfo is not None:
if not PyTZInfo_Check(tzinfo):
# tzinfo must be a datetime.tzinfo object, GH#17690
raise TypeError(f'tzinfo must be a datetime.tzinfo object, '
f'not {type(tzinfo)}')
raise TypeError(
f"tzinfo must be a datetime.tzinfo object, not {type(tzinfo)}"
)
elif tz is not None:
raise ValueError('Can provide at most one of tz, tzinfo')

@@ -393,8 +395,10 @@
# User passed a date string to parse.
# Check that the user didn't also pass a date attribute kwarg.
if any(arg is not None for arg in _date_attributes):
raise ValueError('Cannot pass a date attribute keyword '
'argument when passing a date string')
raise ValueError(
"Cannot pass a date attribute keyword "
"argument when passing a date string"
)

elif ts_input is _no_input:
# User passed keyword arguments.
@@ -578,8 +582,10 @@ timedelta}, default 'raise'
@tz.setter
def tz(self, value):
# GH 3746: Prevent localizing or converting the index by setting tz
raise AttributeError("Cannot directly set timezone. Use tz_localize() "
"or tz_convert() as appropriate")
raise AttributeError(
"Cannot directly set timezone. "
"Use tz_localize() or tz_convert() as appropriate"
)

def __setstate__(self, state):
self.value = state[0]
@@ -598,9 +604,10 @@

if self.tz is not None:
# GH#21333
warnings.warn("Converting to Period representation will "
"drop timezone information.",
UserWarning)
warnings.warn(
"Converting to Period representation will drop timezone information.",
UserWarning,
)

if freq is None:
freq = self.freq
@@ -810,13 +817,13 @@ default 'raise'
if ambiguous == 'infer':
raise ValueError('Cannot infer offset with only one time.')

nonexistent_options = ('raise', 'NaT', 'shift_forward',
'shift_backward')
nonexistent_options = ('raise', 'NaT', 'shift_forward', 'shift_backward')
if nonexistent not in nonexistent_options and not isinstance(
nonexistent, timedelta):
raise ValueError("The nonexistent argument must be one of 'raise', "
"'NaT', 'shift_forward', 'shift_backward' or "
"a timedelta object")
raise ValueError(
"The nonexistent argument must be one of 'raise', "
"'NaT', 'shift_forward', 'shift_backward' or a timedelta object"
)

if self.tzinfo is None:
# tz naive, localize
@@ -833,8 +840,9 @@
value = tz_convert_single(self.value, UTC, self.tz)
return Timestamp(value, tz=tz, freq=self.freq)
else:
raise TypeError('Cannot localize tz-aware Timestamp, use '
'tz_convert for conversions')
raise TypeError(
"Cannot localize tz-aware Timestamp, use tz_convert for conversions"
)

def tz_convert(self, tz):
"""
@@ -857,17 +865,28 @@
"""
if self.tzinfo is None:
# tz naive, use tz_localize
raise TypeError('Cannot convert tz-naive Timestamp, use '
'tz_localize to localize')
raise TypeError(
"Cannot convert tz-naive Timestamp, use tz_localize to localize"
)
else:
# Same UTC timestamp, different time zone
return Timestamp(self.value, tz=tz, freq=self.freq)

astimezone = tz_convert
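The two TypeErrors above enforce a simple rule: tz-naive timestamps must be localized, tz-aware ones converted. A sketch (assumes the ``US/Eastern`` zone data is available through pandas' timezone dependency):

```python
import pandas as pd

naive = pd.Timestamp("2020-01-13 12:00")
try:
    naive.tz_convert("UTC")  # wrong verb for a naive timestamp
except TypeError as exc:
    print(f"rejected: {exc}")

aware = naive.tz_localize("UTC")          # attach a zone: same wall clock
eastern = aware.tz_convert("US/Eastern")  # same instant, new wall clock
print(eastern)                            # 2020-01-13 07:00:00-05:00
```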

def replace(self, year=None, month=None, day=None,
hour=None, minute=None, second=None, microsecond=None,
nanosecond=None, tzinfo=object, fold=0):
def replace(
self,
year=None,
month=None,
day=None,
hour=None,
minute=None,
second=None,
microsecond=None,
nanosecond=None,
tzinfo=object,
fold=0,
):
"""
implements datetime.replace, handles nanoseconds.

@@ -910,8 +929,9 @@ default 'raise'
def validate(k, v):
""" validate integers """
if not is_integer_object(v):
raise ValueError(f"value must be an integer, received "
f"{type(v)} for {k}")
raise ValueError(
f"value must be an integer, received {type(v)} for {k}"
)
return v

if year is not None:
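``Timestamp.replace`` mirrors ``datetime.replace`` but adds ``nanosecond`` support and the integer validation shown above. A sketch (the exception type raised for bad values may differ between pandas versions):

```python
import pandas as pd

ts = pd.Timestamp("2020-01-13 12:30:45.123456789")
# Components are swapped in place; everything else is preserved.
print(ts.replace(hour=0, nanosecond=0))

try:
    ts.replace(hour="noon")  # validate() rejects non-integer components
except (ValueError, TypeError) as exc:
    print(f"rejected: {exc}")
```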
3 changes: 1 addition & 2 deletions pandas/_libs/window/aggregations.pyx
@@ -1871,8 +1871,7 @@ def ewmcov(float64_t[:] input_x, float64_t[:] input_y,
bint is_observation

if <Py_ssize_t>len(input_y) != N:
raise ValueError(f"arrays are of different lengths "
f"({N} and {len(input_y)})")
raise ValueError(f"arrays are of different lengths ({N} and {len(input_y)})")

output = np.empty(N, dtype=float)
if N == 0:
2 changes: 1 addition & 1 deletion pandas/core/generic.py
@@ -177,7 +177,7 @@ class NDFrame(PandasObject, SelectionMixin, indexing.IndexingMixin):
]
_internal_names_set: Set[str] = set(_internal_names)
_accessors: Set[str] = set()
_deprecations: FrozenSet[str] = frozenset(["get_values", "ix"])
_deprecations: FrozenSet[str] = frozenset(["get_values"])
_metadata: List[str] = []
_is_copy = None
_data: BlockManager