Commit 562b9d2

Merge branch 'main' into gh-94808/improve-coverage-pyobject-print
2 parents 7055a43 + bded5ed commit 562b9d2

37 files changed: 679 additions & 331 deletions

Doc/data/stable_abi.dat

Lines changed: 3 additions & 0 deletions
Some generated files are not rendered by default.

Doc/howto/perf_profiling.rst

Lines changed: 38 additions & 31 deletions
Original file line number | Diff line number | Diff line change
@@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler
88

99
:author: Pablo Galindo
1010

11-
The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
12-
obtain information about the performance of your application. ``perf`` also has
13-
a very vibrant ecosystem of tools that aid with the analysis of the data that it
14-
produces.
11+
`The Linux perf profiler <https://perf.wiki.kernel.org>`_
12+
is a very powerful tool that allows you to profile and obtain
13+
information about the performance of your application.
14+
``perf`` also has a very vibrant ecosystem of tools
15+
that aid with the analysis of the data that it produces.
1516

1617
The main problem with using the ``perf`` profiler with Python applications is that
1718
``perf`` only allows to get information about native symbols, this is, the names of
@@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
2526
relationship between this piece of code and the associated Python function using
2627
`perf map files`_.
2728

28-
.. warning::
29+
.. note::
2930

3031
Support for the ``perf`` profiler is only currently available for Linux on
3132
selected architectures. Check the output of the configure build step or
@@ -51,11 +52,11 @@ For example, consider the following script:
5152
if __name__ == "__main__":
5253
baz(1000000)
5354
54-
We can run perf to sample CPU stack traces at 9999 Hertz:
55+
We can run ``perf`` to sample CPU stack traces at 9999 Hertz::
5556

5657
$ perf record -F 9999 -g -o perf.data python my_script.py
5758

58-
Then we can use perf report to analyze the data:
59+
Then we can use ``perf`` report to analyze the data:
5960

6061
.. code-block:: shell-session
6162
@@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
101102
functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
102103
bytecode-evaluating function.
103104

104-
Instead, if we run the same experiment with perf support activated we get:
105+
Instead, if we run the same experiment with ``perf`` support enabled we get:
105106

106107
.. code-block:: shell-session
107108
@@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:
147148
148149
149150
150-
Enabling perf profiling mode
151-
----------------------------
151+
How to enable ``perf`` profiling support
152+
----------------------------------------
152153

153-
There are two main ways to activate the perf profiling mode. If you want it to be
154-
active since the start of the Python interpreter, you can use the ``-Xperf`` option:
154+
``perf`` profiling support can either be enabled from the start using
155+
the environment variable :envvar:`PYTHONPERFSUPPORT` or the
156+
:option:`-X perf <-X>` option,
157+
or dynamically using :func:`sys.activate_stack_trampoline` and
158+
:func:`sys.deactivate_stack_trampoline`.
155159

156-
$ python -Xperf my_script.py
160+
The :mod:`!sys` functions take precedence over the :option:`!-X` option,
161+
the :option:`!-X` option takes precedence over the environment variable.
157162

158-
You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
159-
profiling mode globally.
163+
Example, using the environment variable::
160164

161-
There is also support for dynamically activating and deactivating the perf
162-
profiling mode by using the APIs in the :mod:`sys` module:
165+
$ PYTHONPERFSUPPORT=1
166+
$ python script.py
167+
$ perf report -g -i perf.data
163168

164-
.. code-block:: python
165-
166-
import sys
167-
sys.activate_stack_trampoline("perf")
169+
Example, using the :option:`!-X` option::
168170

169-
# Run some code with Perf profiling active
171+
$ python -X perf script.py
172+
$ perf report -g -i perf.data
170173

171-
sys.deactivate_stack_trampoline()
174+
Example, using the :mod:`sys` APIs in file :file:`example.py`:
172175

173-
# Perf profiling is not active anymore
176+
.. code-block:: python
174177
175-
These APIs can be handy if you want to activate/deactivate profiling mode in
176-
response to a signal or other communication mechanism with your process.
178+
import sys
177179
180+
sys.activate_stack_trampoline("perf")
181+
do_profiled_stuff()
182+
sys.deactivate_stack_trampoline()
178183
184+
non_profiled_stuff()
179185
180-
Now we can analyze the data with ``perf report``:
186+
...then::
181187

182-
$ perf report -g -i perf.data
188+
$ python ./example.py
189+
$ perf report -g -i perf.data
183190

184191

185192
How to obtain the best results
186-
-------------------------------
193+
------------------------------
187194

188195
For the best results, Python should be compiled with
189196
``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
190197
profilers to unwind using only the frame pointer and not on DWARF debug
191-
information. This is because as the code that is interposed to allow perf
198+
information. This is because as the code that is interposed to allow ``perf``
192199
support is dynamically generated it doesn't have any DWARF debugging information
193200
available.
194201

195-
You can check if you system has been compiled with this flag by running:
202+
You can check if your system has been compiled with this flag by running::
196203

197204
$ python -m sysconfig | grep 'no-omit-frame-pointer'
198205
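Illustration (not part of the commit): a minimal sketch of how the ``sys`` trampoline APIs documented above might be wrapped so that only one region of a program is profiled. The helper name ``run_with_perf_trampoline`` is hypothetical. The ``getattr`` guards assume the code may also run on interpreters older than 3.12, and the broad ``except`` assumes activation can fail on platforms where the trampoline is unavailable.

```python
import sys


def run_with_perf_trampoline(func, *args, **kwargs):
    """Call func with the "perf" trampoline active when the build supports it.

    Hypothetical helper: falls back to a plain call when the sys APIs are
    missing (Python < 3.12) or when activation fails on this platform.
    """
    activate = getattr(sys, "activate_stack_trampoline", None)
    deactivate = getattr(sys, "deactivate_stack_trampoline", None)
    is_active = getattr(sys, "is_stack_trampoline_active", None)

    if activate is None or deactivate is None:
        return func(*args, **kwargs)

    # Leave an already-active trampoline (e.g. from -X perf) untouched.
    if is_active is not None and is_active():
        return func(*args, **kwargs)

    try:
        activate("perf")
    except Exception:
        # Assumption: unsupported platforms raise instead of silently passing.
        return func(*args, **kwargs)

    try:
        return func(*args, **kwargs)
    finally:
        deactivate()


if __name__ == "__main__":
    print(run_with_perf_trampoline(sum, range(10)))
```

Run under ``perf record`` as in the how-to; on interpreters without trampoline support the helper simply calls the function.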

Doc/library/dis.rst

Lines changed: 12 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -413,6 +413,15 @@ The Python compiler currently generates the following bytecode instructions.
413413
Removes the top-of-stack (TOS) item.
414414

415415

416+
.. opcode:: END_FOR
417+
418+
Removes the top two values from the stack.
419+
Equivalent to POP_TOP; POP_TOP.
420+
Used to clean up at the end of loops, hence the name.
421+
422+
.. versionadded:: 3.12
423+
424+
416425
.. opcode:: COPY (i)
417426

418427
Push the *i*-th item to the top of the stack. The item is not removed from its
@@ -1088,9 +1097,11 @@ iterations of the loop.
10881097

10891098
TOS is an :term:`iterator`. Call its :meth:`~iterator.__next__` method. If
10901099
this yields a new value, push it on the stack (leaving the iterator below
1091-
it). If the iterator indicates it is exhausted, TOS is popped, and the byte
1100+
it). If the iterator indicates it is exhausted then the byte
10921101
code counter is incremented by *delta*.
10931102

1103+
.. versionchanged:: 3.12
1104+
Up until 3.11 the iterator was popped when it was exhausted.
10941105

10951106
.. opcode:: LOAD_GLOBAL (namei)
10961107
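Illustration (not part of the commit): a small sketch for checking whether the running interpreter emits the ``END_FOR`` opcode described above. The function ``total`` is a made-up example. Per the ``versionadded`` note, ``END_FOR`` should appear only on 3.12 and later, so the script reports what it finds rather than assuming a result.

```python
import dis


def total(items):
    """Sum items with an explicit for loop so the compiler emits FOR_ITER."""
    acc = 0
    for item in items:
        acc += item
    return acc


# Collect the opcode names the compiler generated for the loop.
opnames = [ins.opname for ins in dis.get_instructions(total)]
has_for_iter = "FOR_ITER" in opnames
has_end_for = "END_FOR" in opnames

if __name__ == "__main__":
    print("FOR_ITER present:", has_for_iter)
    print("END_FOR present:", has_end_for)
```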

Doc/library/sys.rst

Lines changed: 32 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1555,6 +1555,38 @@ always available.
15551555
This function has been added on a provisional basis (see :pep:`411`
15561556
for details.) Use it only for debugging purposes.
15571557

1558+
.. function:: activate_stack_trampoline(backend, /)
1559+
1560+
Activate the stack profiler trampoline *backend*.
1561+
The only supported backend is ``"perf"``.
1562+
1563+
.. availability:: Linux.
1564+
1565+
.. versionadded:: 3.12
1566+
1567+
.. seealso::
1568+
1569+
* :ref:`perf_profiling`
1570+
* https://perf.wiki.kernel.org
1571+
1572+
.. function:: deactivate_stack_trampoline()
1573+
1574+
Deactivate the current stack profiler trampoline backend.
1575+
1576+
If no stack profiler is activated, this function has no effect.
1577+
1578+
.. availability:: Linux.
1579+
1580+
.. versionadded:: 3.12
1581+
1582+
.. function:: is_stack_trampoline_active()
1583+
1584+
Return ``True`` if a stack profiler trampoline is active.
1585+
1586+
.. availability:: Linux.
1587+
1588+
.. versionadded:: 3.12
1589+
15581590
.. function:: _enablelegacywindowsfsencoding()
15591591

15601592
Changes the :term:`filesystem encoding and error handler` to 'mbcs' and

Doc/reference/expressions.rst

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -154,7 +154,7 @@ tuple may or may not yield the same object).
154154
single: , (comma)
155155

156156
Note that tuples are not formed by the parentheses, but rather by use of the
157-
comma operator. The exception is the empty tuple, for which parentheses *are*
157+
comma. The exception is the empty tuple, for which parentheses *are*
158158
required --- allowing unparenthesized "nothing" in expressions would cause
159159
ambiguities and allow common typos to pass uncaught.
160160

Doc/using/cmdline.rst

Lines changed: 10 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -538,12 +538,11 @@ Miscellaneous options
538538
development (running from the source tree) then the default is "off".
539539
Note that the "importlib_bootstrap" and "importlib_bootstrap_external"
540540
frozen modules are always used, even if this flag is set to "off".
541-
* ``-X perf`` to activate compatibility mode with the ``perf`` profiler.
542-
When this option is activated, the Linux ``perf`` profiler will be able to
541+
* ``-X perf`` enables support for the Linux ``perf`` profiler.
542+
When this option is provided, the ``perf`` profiler will be able to
543543
report Python calls. This option is only available on some platforms and
544544
will do nothing if is not supported on the current system. The default value
545-
is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`
546-
for more information.
545+
is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`.
547546

548547
It also allows passing arbitrary values and retrieving them through the
549548
:data:`sys._xoptions` dictionary.
@@ -1048,9 +1047,13 @@ conflict.
10481047

10491048
.. envvar:: PYTHONPERFSUPPORT
10501049

1051-
If this variable is set to a nonzero value, it activates compatibility mode
1052-
with the ``perf`` profiler so Python calls can be detected by it. See the
1053-
:ref:`perf_profiling` section for more information.
1050+
If this variable is set to a nonzero value, it enables support for
1051+
the Linux ``perf`` profiler so Python calls can be detected by it.
1052+
1053+
If set to ``0``, disable Linux ``perf`` profiler support.
1054+
1055+
See also the :option:`-X perf <-X>` command-line option
1056+
and :ref:`perf_profiling`.
10541057

10551058
.. versionadded:: 3.12
10561059
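Illustration (not part of the commit): a hedged sketch for observing the effect of ``PYTHONPERFSUPPORT`` from a child interpreter. The helper ``probe_perf_support`` and the ``PROBE`` snippet are hypothetical. The child prints ``None`` when ``sys.is_stack_trampoline_active`` does not exist (Python older than 3.12). Only the ``0`` setting is exercised here, since behaviour with a nonzero value depends on platform support.

```python
import os
import subprocess
import sys

# Code run in the child: report trampoline state, or None if the API is absent.
PROBE = (
    "import sys; "
    "f = getattr(sys, 'is_stack_trampoline_active', None); "
    "print(f() if f else None)"
)


def probe_perf_support(env_value):
    """Start a child interpreter with PYTHONPERFSUPPORT=env_value and
    return what it printed about the trampoline state."""
    env = dict(os.environ, PYTHONPERFSUPPORT=env_value)
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("PYTHONPERFSUPPORT=0 ->", probe_perf_support("0"))
```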

Doc/whatsnew/3.12.rst

Lines changed: 22 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -74,6 +74,15 @@ Important deprecations, removals or restrictions:
7474
New Features
7575
============
7676

77+
* Add :ref:`perf_profiling` through the new
78+
environment variable :envvar:`PYTHONPERFSUPPORT`,
79+
the new command-line option :option:`-X perf <-X>`,
80+
as well as the new :func:`sys.activate_stack_trampoline`,
81+
:func:`sys.deactivate_stack_trampoline`,
82+
and :func:`sys.is_stack_trampoline_active` APIs.
83+
(Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes
84+
with contributions from Gregory P. Smith [Google] and Mark Shannon
85+
in :gh:`96123`.)
7786

7887

7988
Other Language Changes
@@ -194,6 +203,19 @@ tempfile
194203
The :class:`tempfile.NamedTemporaryFile` function has a new optional parameter
195204
*delete_on_close* (Contributed by Evgeny Zorin in :gh:`58451`.)
196205

206+
sys
207+
---
208+
209+
* Add :func:`sys.activate_stack_trampoline` and
210+
:func:`sys.deactivate_stack_trampoline` for activating and deactivating
211+
stack profiler trampolines,
212+
and :func:`sys.is_stack_trampoline_active` for querying if stack profiler
213+
trampolines are active.
214+
(Contributed by Pablo Galindo and Christian Heimes
215+
with contributions from Gregory P. Smith [Google] and Mark Shannon
216+
in :gh:`96123`.)
217+
218+
197219
Optimizations
198220
=============
199221

Include/abstract.h

Lines changed: 16 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -238,6 +238,22 @@ PyAPI_FUNC(Py_ssize_t) PyVectorcall_NARGS(size_t nargsf);
238238
"tuple" and keyword arguments "dict". "dict" may also be NULL */
239239
PyAPI_FUNC(PyObject *) PyVectorcall_Call(PyObject *callable, PyObject *tuple, PyObject *dict);
240240

241+
#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030C0000
242+
#define PY_VECTORCALL_ARGUMENTS_OFFSET \
243+
(_Py_STATIC_CAST(size_t, 1) << (8 * sizeof(size_t) - 1))
244+
245+
/* Perform a PEP 590-style vector call on 'callable' */
246+
PyAPI_FUNC(PyObject *) PyObject_Vectorcall(
247+
PyObject *callable,
248+
PyObject *const *args,
249+
size_t nargsf,
250+
PyObject *kwnames);
251+
252+
/* Call the method 'name' on args[0] with arguments in args[1..nargsf-1]. */
253+
PyAPI_FUNC(PyObject *) PyObject_VectorcallMethod(
254+
PyObject *name, PyObject *const *args,
255+
size_t nargsf, PyObject *kwnames);
256+
#endif
241257

242258
/* Implemented elsewhere:
243259
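Illustration (not part of the commit): the ``PY_VECTORCALL_ARGUMENTS_OFFSET`` macro moved above sets the top bit of a ``size_t``, and ``PyVectorcall_NARGS`` recovers the argument count by masking that bit off. The arithmetic can be sketched in Python. The function names below are hypothetical stand-ins, and ``ctypes`` is used only to obtain the platform's ``size_t`` width.

```python
import ctypes

# Width of size_t on this platform, in bits.
SIZE_T_BITS = 8 * ctypes.sizeof(ctypes.c_size_t)

# Mirrors: (size_t)1 << (8 * sizeof(size_t) - 1)
PY_VECTORCALL_ARGUMENTS_OFFSET = 1 << (SIZE_T_BITS - 1)


def with_offset_flag(nargs):
    """Combine an argument count with the offset flag, as a caller would."""
    return nargs | PY_VECTORCALL_ARGUMENTS_OFFSET


def vectorcall_nargs(nargsf):
    """Strip the flag bit to recover the positional argument count."""
    return nargsf & ~PY_VECTORCALL_ARGUMENTS_OFFSET


if __name__ == "__main__":
    nargsf = with_offset_flag(3)
    print(hex(nargsf), vectorcall_nargs(nargsf))
```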

Include/cpython/abstract.h

Lines changed: 0 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -50,9 +50,6 @@ PyAPI_FUNC(PyObject *) _PyObject_MakeTpCall(
5050
PyObject *const *args, Py_ssize_t nargs,
5151
PyObject *keywords);
5252

53-
#define PY_VECTORCALL_ARGUMENTS_OFFSET \
54-
(_Py_STATIC_CAST(size_t, 1) << (8 * sizeof(size_t) - 1))
55-
5653
// PyVectorcall_NARGS() is exported as a function for the stable ABI.
5754
// Here (when we are not using the stable ABI), the name is overridden to
5855
// call a static inline function for best performance.
@@ -65,12 +62,6 @@ _PyVectorcall_NARGS(size_t n)
6562

6663
PyAPI_FUNC(vectorcallfunc) PyVectorcall_Function(PyObject *callable);
6764

68-
PyAPI_FUNC(PyObject *) PyObject_Vectorcall(
69-
PyObject *callable,
70-
PyObject *const *args,
71-
size_t nargsf,
72-
PyObject *kwnames);
73-
7465
// Backwards compatibility aliases for API that was provisional in Python 3.8
7566
#define _PyObject_Vectorcall PyObject_Vectorcall
7667
#define _PyObject_VectorcallMethod PyObject_VectorcallMethod
@@ -96,10 +87,6 @@ PyAPI_FUNC(PyObject *) _PyObject_FastCall(
9687

9788
PyAPI_FUNC(PyObject *) PyObject_CallOneArg(PyObject *func, PyObject *arg);
9889

99-
PyAPI_FUNC(PyObject *) PyObject_VectorcallMethod(
100-
PyObject *name, PyObject *const *args,
101-
size_t nargsf, PyObject *kwnames);
102-
10390
static inline PyObject *
10491
PyObject_CallMethodNoArgs(PyObject *self, PyObject *name)
10592
{
