Commit a5f5ba1

Merge remote-tracking branch 'upstream/master' into hist_legend
2 parents cbfc167 + 1ce9f0c commit a5f5ba1

204 files changed, +7399 -5068 lines changed
.travis.yml

Lines changed: 13 additions & 5 deletions
@@ -14,6 +14,8 @@ cache:
 
 env:
 global:
+# Variable for test workers
+- PYTEST_WORKERS="auto"
 # create a github personal access token
 # cd pandas-dev/pandas
 # travis encrypt 'PANDAS_GH_TOKEN=personal_access_token' -r pandas-dev/pandas
@@ -38,6 +40,10 @@ matrix:
 - env:
 - JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network and not clipboard)"
 
+- arch: arm64
+env:
+- JOB="3.7, arm64" PYTEST_WORKERS=8 ENV_FILE="ci/deps/travis-37-arm64.yaml" PATTERN="(not slow and not network and not clipboard)"
+
 - env:
 - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network and not clipboard) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"
 services:
@@ -59,15 +65,17 @@ matrix:
 - mysql
 - postgresql
 allow_failures:
-- dist: bionic
-python: 3.9-dev
-env:
+- arch: arm64
+env:
+- JOB="3.7, arm64" PYTEST_WORKERS=8 ENV_FILE="ci/deps/travis-37-arm64.yaml" PATTERN="(not slow and not network and not clipboard)"
+- dist: bionic
+python: 3.9-dev
+env:
 - JOB="3.9-dev" PATTERN="(not slow and not network)"
 
 before_install:
 - echo "before_install"
-# set non-blocking IO on travis
-# https://github.com/travis-ci/travis-ci/issues/8920#issuecomment-352661024
+# Use blocking IO on travis. Ref: https://github.com/travis-ci/travis-ci/issues/8920#issuecomment-352661024
 - python -c 'import os,sys,fcntl; flags = fcntl.fcntl(sys.stdout, fcntl.F_GETFL); fcntl.fcntl(sys.stdout, fcntl.F_SETFL, flags&~os.O_NONBLOCK);'
 - source ci/travis_process_gbq_encryption.sh
 - export PATH="$HOME/miniconda3/bin:$PATH"
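
The `python -c` one-liner in `before_install` is dense. Expanded as a minimal sketch, it reads the status flags of a file descriptor and writes them back with `O_NONBLOCK` cleared, which puts the descriptor in blocking mode. The function name `force_blocking` is illustrative and is not part of the commit; the sketch assumes a POSIX system, since `fcntl` is unavailable on Windows.

```python
import fcntl
import os


def force_blocking(fd: int) -> None:
    """Clear O_NONBLOCK on a file descriptor, leaving all other flags untouched."""
    # Read the current file status flags.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    # Write them back with the O_NONBLOCK bit masked out.
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)
```

Calling `force_blocking(sys.stdout.fileno())` would correspond to what the one-liner does for standard output.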

README.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@
 [![Downloads](https://anaconda.org/conda-forge/pandas/badges/downloads.svg)](https://pandas.pydata.org)
 [![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/pydata/pandas)
 [![Powered by NumFOCUS](https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](https://numfocus.org)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 
 ## What is it?
 

azure-pipelines.yml

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,9 @@ trigger:
 pr:
 - master
 
+variables:
+PYTEST_WORKERS: auto
+
 jobs:
 # Mac and Linux use the same template
 - template: ci/azure/posix.yml

ci/deps/azure-37-numpydev.yaml

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ dependencies:
 - pip:
 - cython==0.29.16 # GH#34014
 - "git+git://github.com/dateutil/dateutil.git"
-- "-f https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com"
+- "--extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple"
 - "--pre"
 - "numpy"
 - "scipy"

ci/deps/travis-37-arm64.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+name: pandas-dev
+channels:
+- defaults
+- conda-forge
+dependencies:
+- python=3.7.*
+
+# tools
+- cython>=0.29.13
+- pytest>=5.0.1
+- pytest-xdist>=1.21
+- hypothesis>=3.58.0
+
+# pandas dependencies
+- botocore>=1.11
+- numpy
+- python-dateutil
+- pytz
+- pip
+- pip:
+- moto

ci/run_tests.sh

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ if [[ $(uname) == "Linux" && -z $DISPLAY ]]; then
 XVFB="xvfb-run "
 fi
 
-PYTEST_CMD="${XVFB}pytest -m \"$PATTERN\" -n auto --dist=loadfile -s --strict --durations=10 --junitxml=test-data.xml $TEST_ARGS $COVERAGE pandas"
+PYTEST_CMD="${XVFB}pytest -m \"$PATTERN\" -n $PYTEST_WORKERS --dist=loadfile -s --strict --durations=30 --junitxml=test-data.xml $TEST_ARGS $COVERAGE pandas"
 
 echo $PYTEST_CMD
 sh -c "$PYTEST_CMD"

ci/setup_env.sh

Lines changed: 10 additions & 2 deletions
@@ -41,9 +41,17 @@ else
 exit 1
 fi
 
-wget -q "https://repo.continuum.io/miniconda/Miniconda3-latest-$CONDA_OS.sh" -O miniconda.sh
+if [ "${TRAVIS_CPU_ARCH}" == "arm64" ]; then
+sudo apt-get -y install xvfb
+CONDA_URL="https://github.com/conda-forge/miniforge/releases/download/4.8.2-1/Miniforge3-4.8.2-1-Linux-aarch64.sh"
+else
+CONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-$CONDA_OS.sh"
+fi
+wget -q $CONDA_URL -O miniconda.sh
 chmod +x miniconda.sh
-./miniconda.sh -b
+
+# Installation path is required for ARM64 platform as miniforge script installs in path $HOME/miniforge3.
+./miniconda.sh -b -p $MINICONDA_DIR
 
 export PATH=$MINICONDA_DIR/bin:$PATH
 

doc/source/development/contributing.rst

Lines changed: 2 additions & 2 deletions
@@ -110,7 +110,7 @@ version control to allow many people to work together on the project.
 Some great resources for learning Git:
 
 * the `GitHub help pages <https://help.github.com/>`_.
-* the `NumPy's documentation <https://docs.scipy.org/doc/numpy/dev/index.html>`_.
+* the `NumPy's documentation <https://numpy.org/doc/stable/dev/index.html>`_.
 * Matthew Brett's `Pydagogue <https://matthew-brett.github.com/pydagogue/>`_.
 
 Getting started with Git
@@ -974,7 +974,7 @@ it is worth getting in the habit of writing tests ahead of time so this is never
 Like many packages, pandas uses `pytest
 <https://docs.pytest.org/en/latest/>`_ and the convenient
 extensions in `numpy.testing
-<https://docs.scipy.org/doc/numpy/reference/routines.testing.html>`_.
+<https://numpy.org/doc/stable/reference/routines.testing.html>`_.
 
 .. note::
 

doc/source/development/extending.rst

Lines changed: 1 addition & 1 deletion
@@ -219,7 +219,7 @@ and re-boxes it if necessary.
 
 If applicable, we highly recommend that you implement ``__array_ufunc__`` in your
 extension array to avoid coercion to an ndarray. See
-`the numpy documentation <https://docs.scipy.org/doc/numpy/reference/generated/numpy.lib.mixins.NDArrayOperatorsMixin.html>`__
+`the numpy documentation <https://numpy.org/doc/stable/reference/generated/numpy.lib.mixins.NDArrayOperatorsMixin.html>`__
 for an example.
 
 As part of your implementation, we require that you defer to pandas when a pandas

doc/source/ecosystem.rst

Lines changed: 9 additions & 9 deletions
@@ -30,7 +30,7 @@ substantial projects that you feel should be on this list, please let us know.
 Data cleaning and validation
 ----------------------------
 
-`pyjanitor <https://github.com/ericmjl/pyjanitor/>`__
+`Pyjanitor <https://github.com/ericmjl/pyjanitor/>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Pyjanitor provides a clean API for cleaning data, using method chaining.
@@ -115,7 +115,7 @@ It is very similar to the matplotlib plotting backend, but provides interactive
 web-based charts and maps.
 
 
-`seaborn <https://seaborn.pydata.org>`__
+`Seaborn <https://seaborn.pydata.org>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Seaborn is a Python visualization library based on
@@ -136,7 +136,7 @@ provides a powerful, declarative and extremely general way to generate bespoke p
 Various implementations to other languages are available.
 A good implementation for Python users is `has2k1/plotnine <https://github.com/has2k1/plotnine/>`__.
 
-`IPython Vega <https://github.com/vega/ipyvega>`__
+`IPython vega <https://github.com/vega/ipyvega>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 `IPython Vega <https://github.com/vega/ipyvega>`__ leverages `Vega
@@ -147,7 +147,7 @@ A good implementation for Python users is `has2k1/plotnine <https://github.com/h
 
 `Plotly’s <https://plot.ly/>`__ `Python API <https://plot.ly/python/>`__ enables interactive figures and web shareability. Maps, 2D, 3D, and live-streaming graphs are rendered with WebGL and `D3.js <https://d3js.org/>`__. The library supports plotting directly from a pandas DataFrame and cloud-based collaboration. Users of `matplotlib, ggplot for Python, and Seaborn <https://plot.ly/python/matplotlib-to-plotly-tutorial/>`__ can convert figures into interactive web-based plots. Plots can be drawn in `IPython Notebooks <https://plot.ly/ipython-notebooks/>`__ , edited with R or MATLAB, modified in a GUI, or embedded in apps and dashboards. Plotly is free for unlimited sharing, and has `cloud <https://plot.ly/product/plans/>`__, `offline <https://plot.ly/python/offline/>`__, or `on-premise <https://plot.ly/product/enterprise/>`__ accounts for private use.
 
-`QtPandas <https://github.com/draperjames/qtpandas>`__
+`Qtpandas <https://github.com/draperjames/qtpandas>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Spun off from the main pandas library, the `qtpandas <https://github.com/draperjames/qtpandas>`__
@@ -187,7 +187,7 @@ See :ref:`Options and Settings <options>` and
 :ref:`Available Options <options.available>`
 for pandas ``display.`` settings.
 
-`quantopian/qgrid <https://github.com/quantopian/qgrid>`__
+`Quantopian/qgrid <https://github.com/quantopian/qgrid>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 qgrid is "an interactive grid for sorting and filtering
@@ -249,12 +249,12 @@ The following data feeds are available:
 * Stooq Index Data
 * MOEX Data
 
-`quandl/Python <https://github.com/quandl/Python>`__
+`Quandl/Python <https://github.com/quandl/Python>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Quandl API for Python wraps the Quandl REST API to return
 Pandas DataFrames with timeseries indexes.
 
-`pydatastream <https://github.com/vfilimonov/pydatastream>`__
+`Pydatastream <https://github.com/vfilimonov/pydatastream>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 PyDatastream is a Python interface to the
 `Refinitiv Datastream (DWS) <https://www.refinitiv.com/en/products/datastream-macroeconomic-analysis>`__
@@ -384,7 +384,7 @@ Pandas provides an interface for defining
 system. The following libraries implement that interface to provide types not
 found in NumPy or pandas, which work well with pandas' data containers.
 
-`cyberpandas`_
+`Cyberpandas`_
 ~~~~~~~~~~~~~~
 
 Cyberpandas provides an extension type for storing arrays of IP Addresses. These
@@ -411,4 +411,4 @@ Library Accessor Classes Description
 .. _pdvega: https://altair-viz.github.io/pdvega/
 .. _Altair: https://altair-viz.github.io/
 .. _pandas_path: https://github.com/drivendataorg/pandas-path/
-.. _pathlib.Path: https://docs.python.org/3/library/pathlib.html
+.. _pathlib.Path: https://docs.python.org/3/library/pathlib.html

doc/source/getting_started/intro_tutorials/02_read_write.rst

Lines changed: 3 additions & 3 deletions
@@ -23,7 +23,7 @@
 <div class="card-body">
 <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 - PassengerId: Id of every passenger.
@@ -61,7 +61,7 @@ How do I read and write tabular data?
 <ul class="task-bullet">
 <li>
 
-I want to analyse the titanic passenger data, available as a CSV file.
+I want to analyze the Titanic passenger data, available as a CSV file.
 
 .. ipython:: python
 
@@ -134,7 +134,7 @@ strings (``object``).
 <ul class="task-bullet">
 <li>
 
-My colleague requested the titanic data as a spreadsheet.
+My colleague requested the Titanic data as a spreadsheet.
 
 .. ipython:: python
 

doc/source/getting_started/intro_tutorials/03_subset_data.rst

Lines changed: 1 addition & 1 deletion
@@ -330,7 +330,7 @@ When using the column names, row labels or a condition expression, use
 the ``loc`` operator in front of the selection brackets ``[]``. For both
 the part before and after the comma, you can use a single label, a list
 of labels, a slice of labels, a conditional expression or a colon. Using
-a colon specificies you want to select all rows or columns.
+a colon specifies you want to select all rows or columns.
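
The `loc` paragraph touched by this hunk can be illustrated with a minimal sketch. The small `titanic` frame below is a hypothetical stand-in for the tutorial's data set, not the real file.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's Titanic table.
titanic = pd.DataFrame(
    {
        "Name": ["Allen", "Braund", "Cumings", "Heikkinen"],
        "Age": [35.0, 22.0, 38.0, 26.0],
        "Pclass": [3, 3, 1, 3],
    }
)

# Condition before the comma, single label after it:
# the names of passengers older than 30.
adult_names = titanic.loc[titanic["Age"] > 30, "Name"]

# A colon before the comma selects all rows; a list of labels picks the columns.
all_rows = titanic.loc[:, ["Name", "Age"]]

# A colon after the comma selects all columns; a label slice picks the rows.
# Label slices in loc include both endpoints, so this returns rows 1 and 2.
middle_rows = titanic.loc[1:2, :]
```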
 
 .. raw:: html
 

doc/source/getting_started/intro_tutorials/06_calculate_statistics.rst

Lines changed: 4 additions & 4 deletions
@@ -23,7 +23,7 @@
 <div class="card-body">
 <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 - PassengerId: Id of every passenger.
@@ -72,7 +72,7 @@ Aggregating statistics
 <ul class="task-bullet">
 <li>
 
-What is the average age of the titanic passengers?
+What is the average age of the Titanic passengers?
 
 .. ipython:: python
 
@@ -95,7 +95,7 @@ across rows by default.
 <ul class="task-bullet">
 <li>
 
-What is the median age and ticket fare price of the titanic passengers?
+What is the median age and ticket fare price of the Titanic passengers?
 
 .. ipython:: python
 
@@ -148,7 +148,7 @@ Aggregating statistics grouped by category
 <ul class="task-bullet">
 <li>
 
-What is the average age for male versus female titanic passengers?
+What is the average age for male versus female Titanic passengers?
 
 .. ipython:: python
 

doc/source/getting_started/intro_tutorials/07_reshape_table_layout.rst

Lines changed: 8 additions & 8 deletions
@@ -23,7 +23,7 @@
 <div class="card-body">
 <p class="card-text">
 
-This tutorial uses the titanic data set, stored as CSV. The data
+This tutorial uses the Titanic data set, stored as CSV. The data
 consists of the following data columns:
 
 - PassengerId: Id of every passenger.
@@ -122,7 +122,7 @@ Sort table rows
 <ul class="task-bullet">
 <li>
 
-I want to sort the titanic data according to the age of the passengers.
+I want to sort the Titanic data according to the age of the passengers.
 
 .. ipython:: python
 
@@ -138,7 +138,7 @@ I want to sort the titanic data according to the age of the passengers.
 <ul class="task-bullet">
 <li>
 
-I want to sort the titanic data according to the cabin class and age in descending order.
+I want to sort the Titanic data according to the cabin class and age in descending order.
 
 .. ipython:: python
 
@@ -282,7 +282,7 @@ For more information about :meth:`~DataFrame.pivot_table`, see the user guide se
 </div>
 
 .. note::
-If case you are wondering, :meth:`~DataFrame.pivot_table` is indeed directly linked
+In case you are wondering, :meth:`~DataFrame.pivot_table` is indeed directly linked
 to :meth:`~DataFrame.groupby`. The same result can be derived by grouping on both
 ``parameter`` and ``location``:
 
@@ -338,7 +338,7 @@ newly created column.
 
 The solution is the short version on how to apply :func:`pandas.melt`. The method
 will *melt* all columns NOT mentioned in ``id_vars`` together into two
-columns: A columns with the column header names and a column with the
+columns: A column with the column header names and a column with the
 values itself. The latter column gets by default the name ``value``.
 
 The :func:`pandas.melt` method can be defined in more detail:
@@ -357,8 +357,8 @@ The result in the same, but in more detail defined:
 
 - ``value_vars`` defines explicitly which columns to *melt* together
 - ``value_name`` provides a custom column name for the values column
-instead of the default columns name ``value``
-- ``var_name`` provides a custom column name for the columns collecting
+instead of the default column name ``value``
+- ``var_name`` provides a custom column name for the column collecting
 the column header names. Otherwise it takes the index name or a
 default ``variable``
 
@@ -383,7 +383,7 @@ Conversion from wide to long format with :func:`pandas.melt` is explained in the
 <h4>REMEMBER</h4>
 
 - Sorting by one or more columns is supported by ``sort_values``
-- The ``pivot`` function is purely restructuring of the data,
+- The ``pivot`` function is purely restructuring of the data,
 ``pivot_table`` supports aggregations
 - The reverse of ``pivot`` (long to wide format) is ``melt`` (wide to
 long format)
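
The `melt`, `pivot` and `pivot_table` relationships described in the hunks above can be sketched on a tiny table. The station names and readings below are made up for illustration and are not taken from the tutorial's data files.

```python
import pandas as pd

# Hypothetical wide table: one column per measurement station.
wide = pd.DataFrame(
    {
        "date": ["2019-04-09", "2019-04-10"],
        "BETR801": [22.5, 53.5],
        "FR04014": [24.4, 27.4],
    }
)

# melt: every column not listed in id_vars is stacked into two columns,
# named here through var_name and value_name instead of the defaults
# "variable" and "value".
long = wide.melt(id_vars="date", var_name="location", value_name="NO2")

# pivot: pure restructuring back to the wide layout, with no aggregation.
restored = long.pivot(index="date", columns="location", values="NO2")

# pivot_table: aggregates the rows that fall into the same group.
mean_per_location = long.pivot_table(values="NO2", index="location", aggfunc="mean")
```

Here `restored` holds the same values as `wide`, with `date` as the index, which is the sense in which `melt` and `pivot` reverse each other.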

doc/source/getting_started/intro_tutorials/08_combine_dataframes.rst

Lines changed: 1 addition & 1 deletion
@@ -305,7 +305,7 @@ More information on join/merge of tables is provided in the user guide section o
 <div class="shadow gs-callout gs-callout-remember">
 <h4>REMEMBER</h4>
 
-- Multiple tables can be concatenated both column as row wise using
+- Multiple tables can be concatenated both column-wise and row-wise using
 the ``concat`` function.
 - For database-like merging/joining of tables, use the ``merge``
 function.
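
A minimal sketch of the two points in the REMEMBER box above, using made-up measurement tables (the frames and their values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical measurement tables with identical columns.
no2 = pd.DataFrame({"location": ["BETR801", "FR04014"], "value": [22.5, 24.4]})
pm25 = pd.DataFrame({"location": ["BETR801", "FR04014"], "value": [18.0, 12.0]})

# Row-wise (axis=0, the default): stack the tables on top of each other.
rows = pd.concat([no2, pm25], axis=0, ignore_index=True)

# Column-wise (axis=1): place the tables side by side, aligned on the index.
cols = pd.concat([no2, pm25], axis=1)

# Database-style join: add station metadata by matching on a key column.
stations = pd.DataFrame(
    {"location": ["BETR801", "FR04014"], "city": ["Antwerpen", "Paris"]}
)
merged = pd.merge(rows, stations, how="left", on="location")
```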

doc/source/getting_started/intro_tutorials/09_timeseries.rst

Lines changed: 3 additions & 3 deletions
@@ -78,7 +78,7 @@ provide any datetime operations (e.g. extract the year, day of the
 week,…). By applying the ``to_datetime`` function, pandas interprets the
 strings and convert these to datetime (i.e. ``datetime64[ns, UTC]``)
 objects. In pandas we call these datetime objects similar to
-``datetime.datetime`` from the standard library a :class:`pandas.Timestamp`.
+``datetime.datetime`` from the standard library as :class:`pandas.Timestamp`.
 
 .. raw:: html
 
@@ -99,7 +99,7 @@ objects. In pandas we call these datetime objects similar to
 Why are these :class:`pandas.Timestamp` objects useful? Let’s illustrate the added
 value with some example cases.
 
-What is the start and end date of the time series data set working
+What is the start and end date of the time series data set we are working
 with?
 
 .. ipython:: python
@@ -214,7 +214,7 @@ Plot the typical :math:`NO_2` pattern during the day of our time series of all s
 
 Similar to the previous case, we want to calculate a given statistic
 (e.g. mean :math:`NO_2`) **for each hour of the day** and we can use the
-split-apply-combine approach again. For this case, the datetime property ``hour``
+split-apply-combine approach again. For this case, we use the datetime property ``hour``
 of pandas ``Timestamp``, which is also accessible by the ``dt`` accessor.
 
 .. raw:: html
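
The time series passages edited in this file describe converting strings with `to_datetime`, finding the start and end of the series, and grouping by the hour of the day through the `dt` accessor. A minimal sketch of those steps follows, with a made-up `air_quality` frame standing in for the tutorial's data:

```python
import pandas as pd

# Hypothetical readings; the column names are assumptions for illustration.
air_quality = pd.DataFrame(
    {
        "datetime": [
            "2019-05-07 01:00:00+00:00",
            "2019-05-07 02:00:00+00:00",
            "2019-05-08 01:00:00+00:00",
            "2019-05-08 02:00:00+00:00",
        ],
        "value": [20.0, 30.0, 40.0, 50.0],
    }
)

# The strings become timezone-aware pandas.Timestamp values.
air_quality["datetime"] = pd.to_datetime(air_quality["datetime"])

# Start and end of the series.
start = air_quality["datetime"].min()
end = air_quality["datetime"].max()

# Split-apply-combine on the hour of the day, reached through the dt accessor.
hourly_mean = air_quality.groupby(air_quality["datetime"].dt.hour)["value"].mean()
```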
