pandas-dev
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
Lines changed: 24 additions & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 3 additions & 2 deletions
diff --git a/ci/environment-dev.yaml b/ci/environment-dev.yaml
Lines changed: 1 addition & 0 deletions
diff --git a/ci/requirements-3.6_DOC.run b/ci/requirements-3.6_DOC.run
Lines changed: 1 addition & 1 deletion
diff --git a/ci/requirements_dev.txt b/ci/requirements_dev.txt
Lines changed: 2 additions & 1 deletion
diff --git a/doc/source/categorical.rst b/doc/source/categorical.rst
Lines changed: 1 addition & 1 deletion
diff --git a/doc/source/comparison_with_sas.rst b/doc/source/comparison_with_sas.rst
Lines changed: 22 additions & 22 deletions
diff --git a/doc/source/conf.py b/doc/source/conf.py
Lines changed: 9 additions & 0 deletions
diff --git a/doc/source/contributing.rst b/doc/source/contributing.rst
Lines changed: 4 additions & 2 deletions
diff --git a/doc/source/io.rst b/doc/source/io.rst
Lines changed: 6 additions & 0 deletions
diff --git a/doc/source/whatsnew/v0.23.0.txt b/doc/source/whatsnew/v0.23.0.txt
Lines changed: 5 additions & 0 deletions
diff --git a/pandas/_libs/algos_rank_helper.pxi.in b/pandas/_libs/algos_rank_helper.pxi.in
Lines changed: 8 additions & 2 deletions
diff --git a/pandas/core/apply.py b/pandas/core/apply.py
Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,27 @@
+Checklist for the pandas documentation sprint (ignore this if you are doing
+an unrelated PR):
+
+- [ ] PR title is "DOC: update the <your-function-or-method> docstring"
+- [ ] The validation script passes: `scripts/validate_docstrings.py <your-function-or-method>`
+- [ ] The PEP8 style check passes: `git diff upstream/master -u -- "*.py" | flake8 --diff`
+- [ ] The html version looks good: `python doc/make.py --single <your-function-or-method>`
+- [ ] It has been proofread on language by another sprint participant
+
+Please include the output of the validation script below between the "```" ticks:
+
+```
+# paste output of "scripts/validate_docstrings.py <your-function-or-method>" here
+# between the "```" (remove this comment, but keep the "```")
+
+```
+
+If the validation script still gives errors, but you think there is a good reason
+to deviate in this case (and there are certainly such cases), please state this
+explicitly.
+
+
+Checklist for other PRs (remove this part if you are doing a PR for the pandas documentation sprint):
+
 - [ ] closes #xxxx
 - [ ] tests added / passed
 - [ ] passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
 
@@ -88,8 +88,9 @@ scikits
 *.c
 *.cpp
 
-# Performance Testing #
-#######################
+# Unit / Performance Testing #
+##############################
+.pytest_cache/
 asv_bench/env/
 asv_bench/html/
 asv_bench/results/
 
@@ -5,6 +5,7 @@ channels:
 dependencies:
   - Cython
   - NumPy
+  - flake8
   - moto
   - pytest>=3.1
   - python-dateutil>=2.5.0
 
@@ -5,7 +5,7 @@ sphinx
 nbconvert
 nbformat
 notebook
-matplotlib
+matplotlib=2.1*
 seaborn
 scipy
 lxml
 
@@ -2,9 +2,10 @@
 # Do not modify directly
 Cython
 NumPy
+flake8
 moto
 pytest>=3.1
 python-dateutil>=2.5.0
 pytz
 setuptools>=3.3
-sphinx
+sphinx
@@ -177,7 +177,7 @@ are consistent among all columns.
 .. note::
 
     To perform table-wise conversion, where all labels in the entire ``DataFrame`` are used as
-    categories for each column, the ``categories`` parameter can be determined programatically by
+    categories for each column, the ``categories`` parameter can be determined programmatically by
     ``categories = pd.unique(df.values.ravel())``.
 
 If you already have ``codes`` and ``categories``, you can use the 
 
@@ -25,7 +25,7 @@ As is customary, we import pandas and NumPy as follows:
    This is often used in interactive work (e.g. `Jupyter notebook
    <https://jupyter.org/>`_ or terminal) - the equivalent in SAS would be:
 
-   .. code-block:: none
+   .. code-block:: sas
 
       proc print data=df(obs=5);
       run;
@@ -65,7 +65,7 @@ in the ``DATA`` step.
 
 Every ``DataFrame`` and ``Series`` has an ``Index`` - which are labels on the
 *rows* of the data. SAS does not have an exactly analogous concept. A data set's
-row are essentially unlabeled, other than an implicit integer index that can be
+rows are essentially unlabeled, other than an implicit integer index that can be
 accessed during the ``DATA`` step (``_N_``).
 
 In pandas, if no index is specified, an integer index is also used by default
@@ -87,7 +87,7 @@ A SAS data set can be built from specified values by
 placing the data after a ``datalines`` statement and
 specifying the column names.
 
-.. code-block:: none
+.. code-block:: sas
 
    data df;
        input x y;
@@ -121,7 +121,7 @@ will be used in many of the following examples.
 
 SAS provides ``PROC IMPORT`` to read csv data into a data set.
 
-.. code-block:: none
+.. code-block:: sas
 
    proc import datafile='tips.csv' dbms=csv out=tips replace;
        getnames=yes;
@@ -156,7 +156,7 @@ Exporting Data
 
 The inverse of ``PROC IMPORT`` in SAS is ``PROC EXPORT``
 
-.. code-block:: none
+.. code-block:: sas
 
    proc export data=tips outfile='tips2.csv' dbms=csv;
    run;
@@ -178,7 +178,7 @@ Operations on Columns
 In the ``DATA`` step, arbitrary math expressions can
 be used on new or existing columns.
 
-.. code-block:: none
+.. code-block:: sas
 
    data tips;
        set tips;
@@ -207,7 +207,7 @@ Filtering
 Filtering in SAS is done with an ``if`` or ``where`` statement, on one
 or more columns.
 
-.. code-block:: none
+.. code-block:: sas
 
    data tips;
        set tips;
@@ -233,7 +233,7 @@ If/Then Logic
 
 In SAS, if/then logic can be used to create new columns.
 
-.. code-block:: none
+.. code-block:: sas
 
    data tips;
        set tips;
@@ -262,7 +262,7 @@ Date Functionality
 SAS provides a variety of functions to do operations on
 date/datetime columns.
 
-.. code-block:: none
+.. code-block:: sas
 
    data tips;
        set tips;
@@ -307,7 +307,7 @@ Selection of Columns
 SAS provides keywords in the ``DATA`` step to select,
 drop, and rename columns.
 
-.. code-block:: none
+.. code-block:: sas
 
    data tips;
        set tips;
@@ -343,7 +343,7 @@ Sorting by Values
 
 Sorting in SAS is accomplished via ``PROC SORT``
 
-.. code-block:: none
+.. code-block:: sas
 
    proc sort data=tips;
        by sex total_bill;
@@ -369,7 +369,7 @@ SAS determines the length of a character string with the
 and `LENGTHC <http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002283942.htm>`__ 
 functions. ``LENGTHN`` excludes trailing blanks and ``LENGTHC`` includes trailing blanks.
 
-.. code-block:: none
+.. code-block:: sas
 
    data _null_;
    set tips;
@@ -395,7 +395,7 @@ SAS determines the position of a character in a string with the
 ``FINDW`` takes the string defined by the first argument and searches for the first position of the substring 
 you supply as the second argument.
 
-.. code-block:: none
+.. code-block:: sas
 
    data _null_;
    set tips;
@@ -419,7 +419,7 @@ Substring
 SAS extracts a substring from a string based on its position with the 
 `SUBSTR <http://www2.sas.com/proceedings/sugi25/25/cc/25p088.pdf>`__ function. 
 
-.. code-block:: none
+.. code-block:: sas
 
    data _null_;
    set tips;
@@ -442,7 +442,7 @@ The SAS `SCAN <http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/def
 function returns the nth word from a string. The first argument is the string you want to parse and the 
 second argument specifies which word you want to extract.
 
-.. code-block:: none
+.. code-block:: sas
 
    data firstlast;
    input String $60.;
@@ -474,7 +474,7 @@ The SAS `UPCASE <http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/d
 `PROPCASE <http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/a002598106.htm>`__ 
 functions change the case of the argument.
 
-.. code-block:: none
+.. code-block:: sas
 
    data firstlast;
    input String $60.;
@@ -516,7 +516,7 @@ types of joins are accomplished using the ``in=`` dummy
 variables to track whether a match was found in one or both
 input frames.
 
-.. code-block:: none
+.. code-block:: sas
 
    proc sort data=df1;
        by key;
@@ -572,7 +572,7 @@ operations, and is ignored by default for aggregations.
 One difference is that missing data cannot be compared to its sentinel value.
 For example, in SAS you could do this to filter missing values.
 
-.. code-block:: none
+.. code-block:: sas
 
    data outer_join_nulls;
        set outer_join;
@@ -615,7 +615,7 @@ SAS's PROC SUMMARY can be used to group by one or
 more key variables and compute aggregations on
 numeric columns.
 
-.. code-block:: none
+.. code-block:: sas
 
    proc summary data=tips nway;
        class sex smoker;
@@ -640,7 +640,7 @@ In SAS, if the group aggregations need to be used with
 the original frame, it must be merged back together.  For
 example, to subtract the mean for each observation by smoker group.
 
-.. code-block:: none
+.. code-block:: sas
 
    proc summary data=tips missing nway;
        class smoker;
@@ -679,7 +679,7 @@ replicate most other by group processing from SAS. For example,
 this ``DATA`` step reads the data by sex/smoker group and filters to
 the first entry for each.
 
-.. code-block:: none
+.. code-block:: sas
 
    proc sort data=tips;
       by sex smoker;
@@ -719,7 +719,7 @@ Data Interop
 pandas provides a :func:`read_sas` method that can read SAS data saved in
 the XPORT or SAS7BDAT binary format.
 
-.. code-block:: none
+.. code-block:: sas
 
    libname xportout xport 'transport-file.xpt';
    data xportout.tips;
 
@@ -63,6 +63,7 @@
               'ipython_sphinxext.ipython_console_highlighting',
               # lowercase didn't work
               'IPython.sphinxext.ipython_console_highlighting',
+              'matplotlib.sphinxext.plot_directive',
               'sphinx.ext.intersphinx',
               'sphinx.ext.coverage',
               'sphinx.ext.mathjax',
@@ -85,6 +86,14 @@
 if any(re.match("\s*api\s*", l) for l in index_rst_lines):
     autosummary_generate = True
 
+# matplotlib plot directive
+plot_include_source = True
+plot_formats = [("png", 90)]
+plot_html_show_formats = False
+plot_html_show_source_link = False
+plot_pre_code = """import numpy as np
+import pandas as pd"""
+
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['../_templates']
 
 
@@ -351,8 +351,10 @@ Some other important things to know about the docs:
 
       pandoc doc/source/contributing.rst -t markdown_github > CONTRIBUTING.md
 
-The utility script ``scripts/api_rst_coverage.py`` can be used to compare
-the list of methods documented in ``doc/source/api.rst`` (which is used to generate
+The utility script ``scripts/validate_docstrings.py`` can be used to get a csv
+summary of the API documentation. It can also check for common errors in the docstring
+of a specific class, function or method. The summary also compares the list of
+methods documented in ``doc/source/api.rst`` (which is used to generate
 the `API Reference <http://pandas.pydata.org/pandas-docs/stable/api.html>`_ page)
 and the actual public methods.
 This will identify methods documented in ``doc/source/api.rst`` that are not actually
 
@@ -4711,6 +4711,12 @@ writes ``data`` to the database in batches of 1000 rows at a time:
 
     data.to_sql('data_chunked', engine, chunksize=1000)
 
+.. note::
+
+    The function :func:`~pandas.DataFrame.to_sql` will perform a multivalue
+    insert if the engine dialect ``supports_multivalues_insert``. This will
+    greatly speed up the insert in some cases.
+
 SQL data types
 ++++++++++++++
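
The note added in this hunk refers to a "multivalue insert". As a rough illustration of what that means at the SQL level, the sketch below uses only the standard-library ``sqlite3`` module; it does not show how pandas or SQLAlchemy build the statement. The table name ``data`` and its columns are hypothetical, chosen only for this example.

```python
import sqlite3

# Hypothetical rows to insert; not taken from the pandas docs.
rows = [(1, "a"), (2, "b"), (3, "c")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, label TEXT)")

# Row by row: one INSERT statement is executed per row.
for row in rows:
    conn.execute("INSERT INTO data (id, label) VALUES (?, ?)", row)

# Multi-values: a single INSERT statement carries every row.
placeholders = ", ".join(["(?, ?)"] * len(rows))
flat_params = [value for row in rows for value in row]
conn.execute(
    f"INSERT INTO data (id, label) VALUES {placeholders}", flat_params
)

# Both approaches stored the same three rows, so the table holds six.
row_count = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
print(row_count)
conn.close()
```

The second form sends fewer statements to the database, which is where a speed-up can come from; whether it helps in practice depends on the driver and the database.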
 
 
@@ -338,8 +338,11 @@ Other Enhancements
 - For subclassed ``DataFrames``, :func:`DataFrame.apply` will now preserve the ``Series`` subclass (if defined) when passing the data to the applied function (:issue:`19822`)
 - :func:`DataFrame.from_dict` now accepts a ``columns`` argument that can be used to specify the column names when ``orient='index'`` is used (:issue:`18529`)
 - Added option ``display.html.use_mathjax`` so `MathJax <https://www.mathjax.org/>`_ can be disabled when rendering tables in ``Jupyter`` notebooks (:issue:`19856`, :issue:`19824`)
+- :func:`DataFrame.replace` now supports the ``method`` parameter, which can be used to specify the replacement method when ``to_replace`` is a scalar, list or tuple and ``value`` is ``None`` (:issue:`19632`)
 - :meth:`Timestamp.month_name`, :meth:`DatetimeIndex.month_name`, and :meth:`Series.dt.month_name` are now available (:issue:`12805`)
 - :meth:`Timestamp.day_name` and :meth:`DatetimeIndex.day_name` are now available to return day names with a specified locale (:issue:`12806`)
+- :meth:`DataFrame.to_sql` now performs a multivalue insert if the underlying connection supports it, rather than inserting row by row.
+  ``SQLAlchemy`` dialects supporting multivalue inserts include: ``mysql``, ``postgresql``, ``sqlite`` and any dialect with ``supports_multivalues_insert``. (:issue:`14315`, :issue:`8953`)
 
 .. _whatsnew_0230.api_breaking:
 
@@ -904,6 +907,7 @@ Offsets
 
 Numeric
 ^^^^^^^
+- Bug in :meth:`DataFrame.rank` and :meth:`Series.rank` when ``method='dense'`` and ``pct=True`` in which percentile ranks were not computed from the number of distinct observations (:issue:`15630`)
 - Bug in :class:`Series` constructor with an int or float list where specifying ``dtype=str``, ``dtype='str'`` or ``dtype='U'`` failed to convert the data elements to strings (:issue:`16605`)
 - Bug in :class:`Index` multiplication and division methods where operating with a ``Series`` would return an ``Index`` object instead of a ``Series`` object (:issue:`19042`)
 - Bug in the :class:`DataFrame` constructor in which data containing very large positive or very large negative numbers was causing ``OverflowError`` (:issue:`18584`)
@@ -1015,6 +1019,7 @@ Reshaping
 - Bug in :func:`DataFrame.iterrows`, which would infer strings not compliant with `ISO8601 <https://en.wikipedia.org/wiki/ISO_8601>`_ to datetimes (:issue:`19671`)
 - Bug in :class:`Series` constructor with ``Categorical`` where a ``ValueError`` is not raised when an index of different length is given (:issue:`19342`)
 - Bug in :meth:`DataFrame.astype` where column metadata is lost when converting to categorical or a dictionary of dtypes (:issue:`19920`)
+- Bug in :func:`cut` and :func:`qcut` where timezone information was dropped (:issue:`19872`)
 
 Other
 ^^^^^
 
@@ -213,7 +213,10 @@ def rank_1d_{{dtype}}(object in_arr, ties_method='average', ascending=True,
                 sum_ranks = dups = 0
     {{endif}}
     if pct:
-        return ranks / count
+        if tiebreak == TIEBREAK_DENSE:
+            return ranks / total_tie_count
+        else:
+            return ranks / count
     else:
         return ranks
 
@@ -385,7 +388,10 @@ def rank_2d_{{dtype}}(object in_arr, axis=0, ties_method='average',
                         ranks[i, argsorted[i, z]] = total_tie_count
                 sum_ranks = dups = 0
         if pct:
-            ranks[i, :] /= count
+            if tiebreak == TIEBREAK_DENSE:
+                ranks[i, :] /= total_tie_count
+            else:
+                ranks[i, :] /= count
     if axis == 0:
         return ranks.T
     else:
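
The two hunks above change the divisor used for ``pct=True`` when the tie-breaking method is dense. A minimal pure-Python sketch of the difference follows, assuming the input has no missing values and that ``total_tie_count`` ends up equal to the number of distinct values. The function names are hypothetical and are not part of pandas.

```python
def dense_ranks(values):
    """Dense ranks: ties share a rank and ranks have no gaps."""
    distinct = sorted(set(values))
    rank_of = {value: position + 1 for position, value in enumerate(distinct)}
    return [rank_of[value] for value in values]


def dense_pct_by_count(values):
    """Divisor before the change: total number of observations."""
    return [rank / len(values) for rank in dense_ranks(values)]


def dense_pct_by_distinct(values):
    """Divisor after the change: number of distinct values."""
    n_distinct = len(set(values))
    return [rank / n_distinct for rank in dense_ranks(values)]


data = [10, 10, 20, 30]
old = dense_pct_by_count(data)
new = dense_pct_by_distinct(data)
print(old)  # [0.25, 0.25, 0.5, 0.75]
print(new)  # the largest value now maps to 1.0
```

With the old divisor the top-ranked value could never reach 1.0 when ties were present; dividing by the number of distinct values restores that property.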
 
@@ -191,7 +191,7 @@ def apply_broadcast(self, target):
 
         for i, col in enumerate(target.columns):
             res = self.f(target[col])
-            ares = np. asarray(res).ndim
+            ares = np.asarray(res).ndim
 
             # must be a scalar or 1d
             if ares > 1: