@@ -254,13 +254,28 @@ files in the current directory which are ELF images for all the JIT trampolines
that were created by Python.

.. warning ::
- Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+ When using ``--call-graph dwarf``, the ``perf`` tool will take
snapshots of the stack of the process being profiled and save the
- information in the ``perf.data`` file. By default the size of the stack dump
- is 8192 bytes but the user can change the size by passing the size after
- comma like ``--call-graph dwarf,4096``. The size of the stack dump is
- important because if the size is too small ``perf`` will not be able to
- unwind the stack and the output will be incomplete. On the other hand, if
- the size is too big, then ``perf`` won't be able to sample the process as
- frequently as it would like as the overhead will be higher.
+ information in the ``perf.data`` file. By default, the size of the stack dump
+ is 8192 bytes, but you can change the size by passing it after
+ a comma like ``--call-graph dwarf,16384``.

+ The size of the stack dump is important because if the size is too small,
+ ``perf`` will not be able to unwind the stack and the output will be
+ incomplete. On the other hand, if the size is too big, then ``perf`` won't
+ be able to sample the process as frequently as it would like, because the
+ overhead will be higher.
+
+ The stack size is particularly important when profiling Python code compiled
+ with low optimization levels (like ``-O0``), as these builds tend to have
+ larger stack frames. If you are compiling Python with ``-O0`` and not seeing
+ Python functions in your profiling output, try increasing the stack dump
+ size to 65528 bytes (the maximum)::
+
+    $ perf record -F 9999 -g -k 1 --call-graph dwarf,65528 -o perf.data python -Xperf_jit my_script.py
+
+ Different compilation flags can significantly impact stack sizes:
+
+ - Builds with ``-O0`` typically have much larger stack frames than those with ``-O1`` or higher
+ - Adding optimizations (``-O1``, ``-O2``, etc.) typically reduces stack size
+ - Frame pointers (``-fno-omit-frame-pointer``) generally provide more reliable stack unwinding
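
Reviewer note: to apply the advice in this hunk, it helps to know which flags the interpreter being profiled was built with. The following is a minimal sketch, not part of the patch. It assumes a POSIX build where ``sysconfig`` exposes ``PY_CFLAGS`` or ``CFLAGS`` (these may be missing on other platforms), and ``summarize_flags`` and ``suggest_call_graph`` are hypothetical helper names introduced only for illustration. The 65528 value is the maximum quoted in the text above.

```python
import shlex
import sysconfig

# Maximum stack dump size quoted in the documentation text above.
PERF_MAX_STACK_DUMP = 65528


def summarize_flags(cflags: str) -> dict:
    """Return the effective -O level and frame-pointer setting in cflags."""
    opt_level = None        # None means no -O flag was found
    frame_pointers = None   # None means neither flag was given explicitly
    for flag in shlex.split(cflags):
        if flag.startswith("-O"):
            # GCC and Clang honor the last -O option on the command line.
            opt_level = flag
        elif flag == "-fno-omit-frame-pointer":
            frame_pointers = True
        elif flag == "-fomit-frame-pointer":
            frame_pointers = False
    return {"opt_level": opt_level, "frame_pointers": frame_pointers}


def suggest_call_graph(cflags: str) -> str:
    """Suggest a --call-graph option following the advice in the text."""
    if summarize_flags(cflags)["opt_level"] == "-O0":
        # -O0 builds tend to have larger frames, so ask for the largest dump.
        return f"--call-graph dwarf,{PERF_MAX_STACK_DUMP}"
    # Otherwise keep perf's default dump size (8192 bytes per the text).
    return "--call-graph dwarf"


if __name__ == "__main__":
    # Assumption: one of these config variables is populated for this build.
    cflags = (
        sysconfig.get_config_var("PY_CFLAGS")
        or sysconfig.get_config_var("CFLAGS")
        or ""
    )
    print(summarize_flags(cflags))
    print("perf record", suggest_call_graph(cflags))
```

The suggestion is only a starting point: a larger dump raises per-sample overhead, so an optimized build that already unwinds correctly is better left at the default.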