Commit aef1162

update release note
1 parent ef38660 commit aef1162

1 file changed

+61
-11
lines changed

doc/source/whatsnew/v1.3.0.rst

Lines changed: 61 additions & 11 deletions
@@ -179,20 +179,16 @@ We've enhanced the :class:`StringDtype`, an extension type dedicated to string d
179179
(:issue:`39908`)
180180

181181
It is now possible to specify a ``storage`` keyword option to :class:`StringDtype`, use
182-
pandas options or specify the dtype using ``dtype='string[pyarrow]'``
182+
pandas options or specify the dtype using ``dtype='string[pyarrow]'`` to allow the
183+
StringArray to be backed by a PyArrow array instead of a NumPy array of Python objects.
184+
185+
The PyArrow backed StringArray requires pyarrow 1.0.0 or greater to be installed.
183186

184187
.. warning::
185188

186189
``string[pyarrow]`` is currently considered experimental. The implementation
187190
and parts of the API may change without warning.
188191

189-
The ``'string[pyarrow]'`` extension type solves several issues with NumPy backed arrays:
190-
191-
1.
192-
2.
193-
3.
194-
195-
196192
.. ipython:: python
197193
198194
pd.Series(['abc', None, 'def'], dtype=pd.StringDtype(storage="pyarrow"))
@@ -212,8 +208,8 @@ You can also create a PyArrow backed string array using pandas options.
212208
s = pd.Series(['abc', None, 'def'], dtype="string")
213209
s
214210
215-
The usual string accessor methods work. Where appropriate, the return type
216-
of the Series or columns of a DataFrame will also have string dtype.
211+
The usual string accessor methods work. Where appropriate, the return type of the Series
212+
or columns of a DataFrame will also have string dtype.
217213

218214
.. ipython:: python
219215
@@ -226,7 +222,61 @@ String accessor methods returning integers will return a value with :class:`Int6
226222
227223
s.str.count("a")
228224
229-
See :ref:`text.types` for more.
225+
Some string accessor methods use native PyArrow string kernels operating directly on the
226+
PyArrow memory; others fall back to converting to a NumPy array of Python objects and
227+
using the native Python string functions. String methods using PyArrow kernels are
228+
generally much more performant.
229+
230+
Some PyArrow string kernels are implemented in later versions of pyarrow than the
231+
minimum version required to create a PyArrow backed StringArray. In these cases, the
232+
string accessor will fall back to the Python implementations.
233+
234+
Some string accessor methods accept arguments controlling their behaviour which are not
235+
supported by the PyArrow kernels. These cases will also fall back to object mode.
236+
237+
+--------------------------------+----------+------------------------------------------+
238+
| Accessor                       | Minimum  | Limitations (otherwise fall back to      |
239+
| Method                         | PyArrow  | object mode)                             |
240+
|                                | Version  |                                          |
241+
+================================+==========+==========================================+
242+
| :meth:`~Series.str.contains`   | 1.0.0    | The ``flags`` argument is not supported. |
243+
|                                |          | If ``regex=True``, pyarrow 4.0.0 is      |
244+
|                                |          | required and ``case=False`` is not       |
245+
|                                |          | supported.                               |
246+
+--------------------------------+----------+------------------------------------------+
247+
| :meth:`~Series.str.startswith` | 4.0.0    |                                          |
248+
| :meth:`~Series.str.endswith`   |          |                                          |
249+
+--------------------------------+----------+------------------------------------------+
250+
| :meth:`~Series.str.replace`    | 4.0.0    | The ``flags`` argument, ``case=False``,  |
251+
|                                |          | passing a callable for the ``repl``      |
252+
|                                |          | argument or passing a compiled regex are |
253+
|                                |          | not supported.                           |
254+
+--------------------------------+----------+------------------------------------------+
255+
| :meth:`~Series.str.match`      | 4.0.0    |                                          |
256+
| :meth:`~Series.str.fullmatch`  |          |                                          |
257+
+--------------------------------+----------+------------------------------------------+
258+
| :meth:`~Series.str.isalnum`    | 1.0.0    |                                          |
259+
| :meth:`~Series.str.isalpha`    |          |                                          |
260+
| :meth:`~Series.str.isdecimal`  |          |                                          |
261+
| :meth:`~Series.str.isdigit`    |          |                                          |
262+
| :meth:`~Series.str.islower`    |          |                                          |
263+
| :meth:`~Series.str.isnumeric`  |          |                                          |
264+
| :meth:`~Series.str.istitle`    |          |                                          |
265+
| :meth:`~Series.str.isupper`    |          |                                          |
266+
+--------------------------------+----------+------------------------------------------+
267+
| :meth:`~Series.str.isspace`    | 2.0.0    |                                          |
268+
+--------------------------------+----------+------------------------------------------+
269+
| :meth:`~Series.str.len`        | 4.0.0    |                                          |
270+
+--------------------------------+----------+------------------------------------------+
271+
| :meth:`~Series.str.lower`      | 1.0.0    |                                          |
272+
| :meth:`~Series.str.upper`      |          |                                          |
273+
+--------------------------------+----------+------------------------------------------+
274+
| :meth:`~Series.str.strip`      | 4.0.0    |                                          |
275+
| :meth:`~Series.str.lstrip`     |          |                                          |
276+
| :meth:`~Series.str.rstrip`     |          |                                          |
277+
+--------------------------------+----------+------------------------------------------+
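
The fallback rules in the table can be read as a small decision procedure. The
sketch below is illustrative only and is not part of pandas: the names
``MIN_PYARROW``, ``parse_version`` and ``uses_pyarrow_kernel`` are hypothetical,
the keyword defaults are chosen for the sketch rather than copied from the pandas
signatures, and methods missing from the table are simply assumed to fall back to
object mode.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Minimum pyarrow version per accessor method, transcribed from the table.
MIN_PYARROW: Dict[str, Version] = {
    "contains": (1, 0, 0),
    "startswith": (4, 0, 0), "endswith": (4, 0, 0),
    "replace": (4, 0, 0),
    "match": (4, 0, 0), "fullmatch": (4, 0, 0),
    "isalnum": (1, 0, 0), "isalpha": (1, 0, 0), "isdecimal": (1, 0, 0),
    "isdigit": (1, 0, 0), "islower": (1, 0, 0), "isnumeric": (1, 0, 0),
    "istitle": (1, 0, 0), "isupper": (1, 0, 0),
    "isspace": (2, 0, 0),
    "len": (4, 0, 0),
    "lower": (1, 0, 0), "upper": (1, 0, 0),
    "strip": (4, 0, 0), "lstrip": (4, 0, 0), "rstrip": (4, 0, 0),
}


def parse_version(text: str) -> Version:
    """Naive parser for plain 'X.Y.Z' strings (no rc/dev suffix handling)."""
    return tuple(int(part) for part in text.split(".")[:3])


def uses_pyarrow_kernel(
    method: str,
    pyarrow_version: str,
    *,
    regex: bool = False,            # sketch default, not the pandas default
    case: bool = True,
    flags: int = 0,
    callable_repl: bool = False,    # True if ``repl`` is a callable
    compiled_pattern: bool = False, # True if the pattern is a compiled regex
) -> bool:
    """Return True if, per the table, a native PyArrow kernel would be used."""
    minimum = MIN_PYARROW.get(method)
    if minimum is None:
        # Assumption: anything not listed falls back to object mode.
        return False
    installed = parse_version(pyarrow_version)
    if installed < minimum:
        return False
    if method == "contains":
        if flags:
            return False
        # Regex matching needs pyarrow 4.0.0 and a case-sensitive search.
        if regex and (installed < (4, 0, 0) or not case):
            return False
    if method == "replace":
        if flags or not case or callable_repl or compiled_pattern:
            return False
    return True
```

For example, ``uses_pyarrow_kernel("contains", "1.0.0", regex=True)`` returns
``False`` because regex matching requires pyarrow 4.0.0, while the same call with
``"4.0.0"`` returns ``True``.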
278+
279+
230280

231281
Centered Datetime-Like Rolling Windows
232282
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
