Commit 7a8ac7c

API: str.cat will align on index (collected)
1 parent 70468df commit 7a8ac7c

5 files changed: +645 -87 lines changed

doc/source/text.rst

Lines changed: 67 additions & 10 deletions
@@ -247,27 +247,84 @@ Missing values on either side will result in missing values in the result as wel
     s.str.cat(t)
     s.str.cat(t, na_rep='-')

-Series are *not* aligned on their index before concatenation:
+Concatenating a Series and something array-like into a Series
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. versionadded:: 0.23.0
+
+The parameter ``others`` can also be two-dimensional. In this case, the number of rows must match the length of the calling ``Series`` (or ``Index``).

 .. ipython:: python

-    u = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
-    # without alignment
+    d = pd.concat([t, s], axis=1)
+    d
+    s.str.cat(d, na_rep='-')
+
+Concatenating a Series and an indexed object into a Series, with alignment
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. versionadded:: 0.23.0
+
+For concatenation with a ``Series`` or ``DataFrame``, it is possible to align the respective indexes before concatenation by setting
+the ``join``-keyword, which controls the manner of alignment.
+
+.. ipython:: python
+
+    u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
     s.str.cat(u)
-    # with separate alignment
-    v, w = s.align(u)
-    v.str.cat(w, na_rep='-')
+    s.str.cat(u, join='left')
+
+.. warning::
+
+    If the ``join`` keyword is not passed, the method :meth:`~Series.str.cat` will currently fall back to the behavior before version 0.23.0 (i.e. no alignment),
+    but a ``FutureWarning`` will be raised, since this default will change to ``join='left'`` in a future version.
+
+The usual options are available for ``join`` (one of ``'left', 'outer', 'inner', 'right'``).
+In particular, alignment also means that the different lengths do not need to coincide anymore.
+
+.. ipython:: python
+
+    v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])
+    s.str.cat(v, join='left', na_rep='-')
+    s.str.cat(v, join='outer', na_rep='-')
+
+The same alignment can be used when ``others`` is a ``DataFrame``:
+
+.. ipython:: python
+
+    f = d.loc[[3, 2, 1, 0], :]
+    f
+    s.str.cat(f, join='left', na_rep='-')

 Concatenating a Series and many objects into a Series
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-List-likes (excluding iterators, ``dict``-views, etc.) can be arbitrarily combined in a list.
-All elements of the list must match in length to the calling ``Series`` (resp. ``Index``):
+All list-likes (as well as ``DataFrame`` and two-dimensional ``ndarray``) can be arbitrarily combined in a list-like container:
+
+.. ipython:: python
+
+    s.str.cat([u, t.values, ['A', 'B', 'C', 'D'], d.values, f], na_rep='-')
+
+All elements must match in length to the calling ``Series``, except those having an index if ``join`` is not None:

 .. ipython:: python

-    x = pd.Series([1, 2, 3, 4], index=['A', 'B', 'C', 'D'])
-    s.str.cat([['A', 'B', 'C', 'D'], s, s.values, x.index])
+    s.str.cat([u, v, ['A', 'B', 'C', 'D'], d.values, f.loc[[1]]],
+              join='outer', na_rep='-')
+
+If using ``join='right'`` on a list of ``others`` that contains different indexes,
+the union of these indexes will be used as the basis for the final concatenation:
+
+.. ipython:: python
+
+    s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')
+
+Finally, the surrounding container can also be an :obj:`Iterable` other than a ``list`` (e.g. an iterator, or a ``dict``-view, etc.):
+
+.. ipython:: python
+
+    from collections import OrderedDict
+    s.str.cat(d.to_dict('series', into=OrderedDict).values(), na_rep='-')

 Indexing with ``.str``
 ----------------------
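
Reviewer note: the alignment behaviour documented in this hunk can be checked with a standalone script. The following is a minimal sketch, assuming pandas 0.23.0 or later is installed. The definition of ``s`` is an assumption, since it lies outside the hunk shown here (it mirrors the ``s`` used in the whatsnew entry of this commit); ``u`` and ``v`` are copied from the added documentation.

```python
import pandas as pd

# Assumed calling Series; its definition lies outside the hunk above.
s = pd.Series(['a', 'b', 'c', 'd'])

# Same values as s, but stored in a different index order.
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])

# join='left' reindexes u to the index of s before concatenating.
left = s.str.cat(u, join='left')
print(left.tolist())  # ['aa', 'bb', 'cc', 'dd']

# With alignment, the lengths no longer have to match.
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# join='left' keeps the index of s; label 2 is absent from v,
# so na_rep fills in for the missing value.
left_v = s.str.cat(v, join='left', na_rep='-')
print(left_v.tolist())  # ['aa', 'bb', 'c-', 'dd']

# join='outer' uses the union of both indexes; sort_index() is only
# there to make the printed order explicit.
outer_v = s.str.cat(v, join='outer', na_rep='-').sort_index()
print(outer_v.index.tolist())  # [-1, 0, 1, 2, 3, 4]
print(outer_v.tolist())        # ['-z', 'aa', 'bb', 'c-', 'dd', '-e']
```

Passing ``join`` explicitly keeps the script independent of the default, which this commit describes as changing from no alignment to ``'left'`` in a future version.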

doc/source/whatsnew/v0.23.0.txt

Lines changed: 33 additions & 0 deletions
@@ -308,6 +308,39 @@ The :func:`DataFrame.assign` now accepts dependent keyword arguments for python

     df.assign(A=df.A+1, C= lambda df: df.A* -1)

+.. _whatsnew_0230.enhancements.str_cat_align:
+
+``Series.str.cat`` has gained the ``join`` kwarg
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Previously, :meth:`Series.str.cat` did not -- in contrast to most of ``pandas`` -- align :class:`Series` on their index before concatenation (see :issue:`18657`).
+The method has now gained a keyword ``join`` to control the manner of alignment. In v.0.23 it will default to None (meaning no alignment), but this default will change
+to ``'left'`` in a future version of pandas.
+
+.. ipython:: python
+
+    s = pd.Series(['a', 'b', 'c', 'd'])
+    t = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
+    s.str.cat(t)
+    s.str.cat(t, join='left', na_rep='-')
+
+In particular, ``others`` does not need to be of the same length as the calling ``Series`` (if both have an index and ``join is not None``).
+For more examples, see :ref:`here <text.concatenate>`.
+
+Additionally, ``str.cat`` now allows ``others`` to be a ``DataFrame`` or two-dimensional ``np.ndarray``.
+
+.. ipython:: python
+
+    u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
+    d = pd.concat([s, u], axis=1)
+    t.str.cat(d.values)
+    s.str.cat(d, join='left', na_rep='-')
+
+Furthermore, any combination of "concatenateable" arguments can be passed in a list-like container (e.g. an iterator).
+
+For categorical data, it is now possible to call :meth:`Series.str.cat` for ``CategoricalIndex`` as well (previously raised a ``ValueError``).
+Finally, if ``others is not None``, the resulting ``Series``/``Index`` will now remain categorical if the calling
+``Series``/``Index`` is categorical.

 .. _whatsnew_0230.enhancements.astype_category:

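
Reviewer note: the whatsnew entry states that ``others`` may be a ``DataFrame`` or a two-dimensional ``np.ndarray``. The difference between the two is that only the ``DataFrame`` carries an index to align on. The sketch below illustrates this, assuming pandas 0.23.0 or later; the frame ``f`` and its column names are hypothetical and chosen only to make the row order visible.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])

# Hypothetical frame whose rows are stored in reverse index order.
f = pd.DataFrame({'x': ['d', 'c', 'b', 'a'],
                  'y': ['D', 'C', 'B', 'A']},
                 index=[3, 2, 1, 0])

# A DataFrame has an index, so join='left' aligns its rows to s.
aligned = s.str.cat(f, join='left')
print(aligned.tolist())  # ['aaA', 'bbB', 'ccC', 'ddD']

# The underlying 2-D ndarray has no index, so rows are matched by
# position; the number of rows must equal len(s).
positional = s.str.cat(f.values)
print(positional.tolist())  # ['adD', 'bcC', 'cbB', 'daA']
```

The two results differ only because the array drops the index labels, which is why the documentation requires two-dimensional array input to match the calling ``Series`` in length.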
