Commit b95b561

Issue20284: Implement PEP461
1 parent 8861502 commit b95b561

10 files changed

+1185
-158
lines changed

Doc/library/stdtypes.rst

Lines changed: 191 additions & 0 deletions
@@ -3057,6 +3057,197 @@ place, and instead produce new objects.
30573057
always produces a new object, even if no changes were made.
30583058

30593059

3060+
.. _bytes-formatting:
3061+
3062+
``printf``-style Bytes Formatting
3063+
----------------------------------
3064+
3065+
.. index::
3066+
single: formatting, bytes (%)
3067+
single: formatting, bytearray (%)
3068+
single: interpolation, bytes (%)
3069+
single: interpolation, bytearray (%)
3070+
single: bytes; formatting
3071+
single: bytearray; formatting
3072+
single: bytes; interpolation
3073+
single: bytearray; interpolation
3074+
single: printf-style formatting
3075+
single: sprintf-style formatting
3076+
single: % formatting
3077+
single: % interpolation
3078+
3079+
.. note::
3080+
3081+
The formatting operations described here exhibit a variety of quirks that
3082+
lead to a number of common errors (such as failing to display tuples and
3083+
dictionaries correctly). If the value being printed may be a tuple or
3084+
dictionary, wrap it in a tuple.
3085+
3086+
Bytes objects (``bytes``/``bytearray``) have one unique built-in operation:
3087+
the ``%`` operator (modulo).
3088+
This is also known as the bytes *formatting* or *interpolation* operator.
3089+
Given ``format % values`` (where *format* is a bytes object), ``%`` conversion
3090+
specifications in *format* are replaced with zero or more elements of *values*.
3091+
The effect is similar to calling :c:func:`sprintf` in the C language.
3092+
3093+
If *format* requires a single argument, *values* may be a single non-tuple
3094+
object. [5]_ Otherwise, *values* must be a tuple with exactly the number of
3095+
items specified by the format bytes object, or a single mapping object (for
3096+
example, a dictionary).
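
The three accepted shapes of *values* can be sketched as follows. This is a minimal illustration, assuming CPython 3.5 or later (the first release with this feature); the variable names are made up for the example.

```python
# A single non-tuple value fills a format that has exactly one specifier.
single = b'id=%d' % 7

# Several specifiers need a tuple with exactly that many items.
pair = b'%b=%d' % (b'port', 8080)

# To format a tuple itself, wrap it in a one-item tuple.
wrapped = b'%a' % ((1, 2),)

# A bare tuple is spread over the specifiers instead, so one specifier
# with two items raises TypeError.
try:
    b'%a' % (1, 2)
    raised = False
except TypeError:
    raised = True

print(single, pair, wrapped, raised)
```

The last case is the quirk that the note at the top of the section warns about.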
3097+
3098+
A conversion specifier contains two or more characters and has the following
3099+
components, which must occur in this order:
3100+
3101+
#. The ``'%'`` character, which marks the start of the specifier.
3102+
3103+
#. Mapping key (optional), consisting of a parenthesised sequence of characters
3104+
(for example, ``(somename)``).
3105+
3106+
#. Conversion flags (optional), which affect the result of some conversion
3107+
types.
3108+
3109+
#. Minimum field width (optional). If specified as an ``'*'`` (asterisk), the
3110+
actual width is read from the next element of the tuple in *values*, and the
3111+
object to convert comes after the minimum field width and optional precision.
3112+
3113+
#. Precision (optional), given as a ``'.'`` (dot) followed by the precision. If
3114+
specified as ``'*'`` (an asterisk), the actual precision is read from the next
3115+
element of the tuple in *values*, and the value to convert comes after the
3116+
precision.
3117+
3118+
#. Length modifier (optional).
3119+
3120+
#. Conversion type.
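
As a hedged illustration of the ``'*'`` forms in the list above (assuming CPython 3.5 or later), the width and precision are consumed from the tuple before the value they apply to:

```python
# '*' as width: the tuple supplies the width (6) first, then the value (42).
padded = b'%*d' % (6, 42)

# '*' for both width and precision: width (8), precision (2), then the value.
rounded = b'%*.*f' % (8, 2, 3.14159)

# The same result with the numbers written directly in the format.
fixed = b'%8.2f' % 3.14159

print(padded, rounded, fixed)
```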
3121+
3122+
When the right argument is a dictionary (or other mapping type), then the
3123+
formats in the bytes object *must* include a parenthesised mapping key into that
3124+
dictionary inserted immediately after the ``'%'`` character. The mapping key
3125+
selects the value to be formatted from the mapping. For example:
3126+
3127+
>>> print(b'%(language)s has %(number)03d quote types.' %
3128+
... {b'language': b"Python", b"number": 2})
3129+
b'Python has 002 quote types.'
3130+
3131+
In this case no ``*`` specifiers may occur in a format (since they require a
3132+
sequential parameter list).
3133+
3134+
The conversion flag characters are:
3135+
3136+
+---------+---------------------------------------------------------------------+
3137+
| Flag | Meaning |
3138+
+=========+=====================================================================+
3139+
| ``'#'`` | The value conversion will use the "alternate form" (where defined |
3140+
| | below). |
3141+
+---------+---------------------------------------------------------------------+
3142+
| ``'0'`` | The conversion will be zero padded for numeric values. |
3143+
+---------+---------------------------------------------------------------------+
3144+
| ``'-'`` | The converted value is left adjusted (overrides the ``'0'`` |
3145+
| | conversion if both are given). |
3146+
+---------+---------------------------------------------------------------------+
3147+
| ``' '`` | (a space) A blank should be left before a positive number (or empty |
3148+
| | string) produced by a signed conversion. |
3149+
+---------+---------------------------------------------------------------------+
3150+
| ``'+'`` | A sign character (``'+'`` or ``'-'``) will precede the conversion |
3151+
| | (overrides a "space" flag). |
3152+
+---------+---------------------------------------------------------------------+
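
The flags in the table can be tried out with a short sketch like the one below, assuming CPython 3.5 or later; the trailing ``|`` in two of the formats is only there to make the padding visible.

```python
zero_padded = b'%05d' % 42       # '0' pads numeric values with zeros
left_adjusted = b'%-5d|' % 42    # '-' pads on the right instead
minus_wins = b'%-05d|' % 42      # '-' overrides '0' when both are given
space_sign = b'% d' % 42         # ' ' leaves a blank before a positive number
plus_sign = b'%+d' % 42          # '+' always emits a sign
plus_wins = b'%+ d' % 42         # '+' overrides the space flag
alternate = b'%#x' % 255         # '#' selects the alternate form

print(zero_padded, left_adjusted, minus_wins,
      space_sign, plus_sign, plus_wins, alternate)
```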
3153+
3154+
A length modifier (``h``, ``l``, or ``L``) may be present, but is ignored as it
3155+
is not necessary for Python -- so e.g. ``%ld`` is identical to ``%d``.
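
A quick check of that claim, assuming CPython 3.5 or later: each length modifier is parsed and then discarded, so all four formats below give the same bytes.

```python
# 'h', 'l' and 'L' are accepted for C compatibility but have no effect.
results = [b'%d' % 7, b'%hd' % 7, b'%ld' % 7, b'%Ld' % 7]
all_same = len(set(results)) == 1

print(results, all_same)
```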
3156+
3157+
The conversion types are:
3158+
3159+
+------------+-----------------------------------------------------+-------+
3160+
| Conversion | Meaning | Notes |
3161+
+============+=====================================================+=======+
3162+
| ``'d'`` | Signed integer decimal. | |
3163+
+------------+-----------------------------------------------------+-------+
3164+
| ``'i'`` | Signed integer decimal. | |
3165+
+------------+-----------------------------------------------------+-------+
3166+
| ``'o'`` | Signed octal value. | \(1) |
3167+
+------------+-----------------------------------------------------+-------+
3168+
| ``'u'`` | Obsolete type -- it is identical to ``'d'``. | \(7) |
3169+
+------------+-----------------------------------------------------+-------+
3170+
| ``'x'`` | Signed hexadecimal (lowercase). | \(2) |
3171+
+------------+-----------------------------------------------------+-------+
3172+
| ``'X'`` | Signed hexadecimal (uppercase). | \(2) |
3173+
+------------+-----------------------------------------------------+-------+
3174+
| ``'e'`` | Floating point exponential format (lowercase). | \(3) |
3175+
+------------+-----------------------------------------------------+-------+
3176+
| ``'E'`` | Floating point exponential format (uppercase). | \(3) |
3177+
+------------+-----------------------------------------------------+-------+
3178+
| ``'f'`` | Floating point decimal format. | \(3) |
3179+
+------------+-----------------------------------------------------+-------+
3180+
| ``'F'`` | Floating point decimal format. | \(3) |
3181+
+------------+-----------------------------------------------------+-------+
3182+
| ``'g'`` | Floating point format. Uses lowercase exponential | \(4) |
3183+
| | format if exponent is less than -4 or not less than | |
3184+
| | precision, decimal format otherwise. | |
3185+
+------------+-----------------------------------------------------+-------+
3186+
| ``'G'`` | Floating point format. Uses uppercase exponential | \(4) |
3187+
| | format if exponent is less than -4 or not less than | |
3188+
| | precision, decimal format otherwise. | |
3189+
+------------+-----------------------------------------------------+-------+
3190+
| ``'c'`` | Single byte (accepts integer or single | |
3191+
| | byte objects). | |
3192+
+------------+-----------------------------------------------------+-------+
3193+
| ``'b'`` | Bytes (any object that follows the | \(5) |
3194+
| | :ref:`buffer protocol <bufferobjects>` or has | |
3195+
| | :meth:`__bytes__`). | |
3196+
+------------+-----------------------------------------------------+-------+
3197+
| ``'s'`` | ``'s'`` is an alias for ``'b'`` and should only | \(6) |
3198+
| | be used for Python2/3 code bases. | |
3199+
+------------+-----------------------------------------------------+-------+
3200+
| ``'a'`` | Bytes (converts any Python object using | \(5) |
3201+
| | ``repr(obj).encode('ascii','backslashreplace')``). | |
3202+
+------------+-----------------------------------------------------+-------+
3203+
| ``'%'`` | No argument is converted, results in a ``'%'`` | |
3204+
| | character in the result. | |
3205+
+------------+-----------------------------------------------------+-------+
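
The bytes-specific conversions in the table (``'c'``, ``'b'``, ``'a'`` and ``'%'``) can be sketched like this, assuming CPython 3.5 or later. The ``Packet`` class is a hypothetical example type, not part of this commit.

```python
class Packet:
    """Hypothetical type that supports %b through __bytes__."""

    def __bytes__(self):
        return b'\x01\x02'


from_int = b'%c' % 65                        # integer in range(256)
from_byte = b'%c' % b'A'                     # single-byte bytes object
from_buffer = b'%b' % (bytearray(b'xy'),)    # buffer-protocol object
from_dunder = b'%b' % (Packet(),)            # object with __bytes__
ascii_repr = b'%a' % ('caf\xe9',)            # repr(), then ASCII with escapes
percent = b'%d%%' % 5                        # '%%' emits a literal percent sign

# str neither supports the buffer protocol nor defines __bytes__,
# so %b rejects it.
try:
    b'%b' % ('text',)
    rejected = False
except TypeError:
    rejected = True

print(from_int, from_byte, from_buffer, from_dunder,
      ascii_repr, percent, rejected)
```

Text therefore has to be encoded explicitly (for example with ``str.encode``) before it is interpolated with ``%b``.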
3206+
3207+
Notes:
3208+
3209+
(1)
3210+
The alternate form causes a leading octal specifier (``'0o'``) to be inserted
3211+
before the first digit of the number; for example, ``b'%#o' % 8`` produces
3212+
``b'0o10'``.
3213+
3214+
(2)
3215+
The alternate form causes a leading ``'0x'`` or ``'0X'`` (depending on whether
3216+
the ``'x'`` or ``'X'`` format was used) to be inserted before the first digit of
3217+
the number, even when that digit is zero; for example, ``b'%#x' % 0`` produces
3218+
``b'0x0'``.
3219+
3220+
(3)
3221+
The alternate form causes the result to always contain a decimal point, even if
3222+
no digits follow it.
3223+
3224+
The precision determines the number of digits after the decimal point and
3225+
defaults to 6.
3226+
3227+
(4)
3228+
The alternate form causes the result to always contain a decimal point, and
3229+
trailing zeroes are not removed as they would otherwise be.
3230+
3231+
The precision determines the number of significant digits before and after the
3232+
decimal point and defaults to 6.
3233+
3234+
(5)
3235+
If precision is ``N``, the output is truncated to ``N`` characters.
3236+
3237+
(6)
3238+
``b'%s'`` is deprecated, but will not be removed during the 3.x series.
3239+
3240+
(7)
3241+
See :pep:`237`.
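
The notes above can be checked against an interpreter with a sketch like this one. The values shown are what CPython 3.5 or later produces; the variable names are illustrative only.

```python
alt_octal = b'%#o' % 8           # alternate octal form is prefixed with '0o'
alt_hex = b'%#X' % 255           # alternate hex form follows the case of 'X'
alt_float = b'%#.0f' % 3.0       # decimal point kept although no digits follow
plain_float = b'%.0f' % 3.0      # without '#', the point is dropped
default_prec = b'%f' % 1.5       # precision defaults to 6 digits
g_exponent = b'%g' % 0.00001     # exponent below -4 selects exponential format
g_decimal = b'%g' % 0.0001       # exponent of -4 still uses decimal format
alt_g = b'%#g' % 1.5             # '#' keeps trailing zeros (6 significant digits)
truncated = b'%.3b' % b'abcdef'  # precision truncates %b output

print(alt_octal, alt_hex, alt_float, plain_float, default_prec,
      g_exponent, g_decimal, alt_g, truncated)
```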
3242+
3243+
.. note::
3244+
3245+
The bytearray version of this method does *not* operate in place - it
3246+
always produces a new object, even if no changes were made.
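
A minimal sketch of that behaviour, assuming CPython 3.5 or later and mirroring the ``test_mod`` cases added to ``Lib/test/test_bytes.py`` in this commit:

```python
template = bytearray(b'hello, %b!')
greeting = template % b'world'

# The result is a new bytearray; the template itself is left untouched.
result_type = type(greeting)
template_after = bytes(template)

# Even a format with no specifiers yields a distinct object.
plain = bytearray(b'plain')
copy = plain % ()
is_distinct = copy is not plain

print(greeting, result_type, template_after, copy, is_distinct)
```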
3247+
3248+
.. seealso:: :pep:`461`.
3249+
.. versionadded:: 3.5
3250+
30603251
.. _typememoryview:
30613252

30623253
Memory Views

Include/bytesobject.h

Lines changed: 1 addition & 0 deletions
@@ -62,6 +62,7 @@ PyAPI_FUNC(void) PyBytes_Concat(PyObject **, PyObject *);
6262
PyAPI_FUNC(void) PyBytes_ConcatAndDel(PyObject **, PyObject *);
6363
#ifndef Py_LIMITED_API
6464
PyAPI_FUNC(int) _PyBytes_Resize(PyObject **, Py_ssize_t);
65+
PyAPI_FUNC(PyObject *) _PyBytes_Format(PyObject *, PyObject *);
6566
#endif
6667
PyAPI_FUNC(PyObject *) PyBytes_DecodeEscape(const char *, Py_ssize_t,
6768
const char *, Py_ssize_t,

Include/unicodeobject.h

Lines changed: 2 additions & 0 deletions
@@ -2245,6 +2245,8 @@ PyAPI_FUNC(Py_UNICODE*) Py_UNICODE_strrchr(
22452245
Py_UNICODE c
22462246
);
22472247

2248+
PyAPI_FUNC(PyObject*) _PyUnicode_FormatLong(PyObject *, int, int, int);
2249+
22482250
/* Create a copy of a unicode string ending with a nul character. Return NULL
22492251
and raise a MemoryError exception on memory allocation failure, otherwise
22502252
return a new allocated buffer (use PyMem_Free() to free the buffer). */

Lib/test/test_bytes.py

Lines changed: 44 additions & 0 deletions
@@ -461,6 +461,28 @@ def test_rindex(self):
461461
self.assertEqual(b.rindex(i, 3, 9), 7)
462462
self.assertRaises(ValueError, b.rindex, w, 1, 3)
463463

464+
def test_mod(self):
465+
b = b'hello, %b!'
466+
orig = b
467+
b = b % b'world'
468+
self.assertEqual(b, b'hello, world!')
469+
self.assertEqual(orig, b'hello, %b!')
470+
self.assertFalse(b is orig)
471+
b = b'%s / 100 = %d%%'
472+
a = b % (b'seventy-nine', 79)
473+
self.assertEqual(a, b'seventy-nine / 100 = 79%')
474+
475+
def test_imod(self):
476+
b = b'hello, %b!'
477+
orig = b
478+
b %= b'world'
479+
self.assertEqual(b, b'hello, world!')
480+
self.assertEqual(orig, b'hello, %b!')
481+
self.assertFalse(b is orig)
482+
b = b'%s / 100 = %d%%'
483+
b %= (b'seventy-nine', 79)
484+
self.assertEqual(b, b'seventy-nine / 100 = 79%')
485+
464486
def test_replace(self):
465487
b = self.type2test(b'mississippi')
466488
self.assertEqual(b.replace(b'i', b'a'), b'massassappa')
@@ -990,6 +1012,28 @@ def test_setslice_trap(self):
9901012
b[8:] = b
9911013
self.assertEqual(b, bytearray(list(range(8)) + list(range(256))))
9921014

1015+
def test_mod(self):
1016+
b = bytearray(b'hello, %b!')
1017+
orig = b
1018+
b = b % b'world'
1019+
self.assertEqual(b, b'hello, world!')
1020+
self.assertEqual(orig, bytearray(b'hello, %b!'))
1021+
self.assertFalse(b is orig)
1022+
b = bytearray(b'%s / 100 = %d%%')
1023+
a = b % (b'seventy-nine', 79)
1024+
self.assertEqual(a, bytearray(b'seventy-nine / 100 = 79%'))
1025+
1026+
def test_imod(self):
1027+
b = bytearray(b'hello, %b!')
1028+
orig = b
1029+
b %= b'world'
1030+
self.assertEqual(b, b'hello, world!')
1031+
self.assertEqual(orig, bytearray(b'hello, %b!'))
1032+
self.assertFalse(b is orig)
1033+
b = bytearray(b'%s / 100 = %d%%')
1034+
b %= (b'seventy-nine', 79)
1035+
self.assertEqual(b, bytearray(b'seventy-nine / 100 = 79%'))
1036+
9931037
def test_iconcat(self):
9941038
b = bytearray(b"abc")
9951039
b1 = b
