Commit a121044

LuaJIT platform profiler documentation
Introduce a new document on LuaJIT platform profiler

* LuaJIT platform profiler is a new feature implemented in Tarantool 2.10.0. The document describes the profiler's behavior as of this and next Tarantool versions.
* The document is placed in the Tooling chapter.

Closes #2587
1 parent 6f21161 commit a121044

2 files changed

+321
-0
lines changed

doc/tooling/index.rst

Lines changed: 1 addition & 0 deletions
@@ -11,4 +11,5 @@ to work with Tarantool.
 tcm/index
 interactive_console
 luajit_memprof
+luajit_sysprof
 luajit_getmetrics

doc/tooling/luajit_sysprof.rst

Lines changed: 320 additions & 0 deletions
@@ -0,0 +1,320 @@
1+
.. _luajit_sysprof:
2+
3+
LuaJIT platform profiler
4+
========================
5+
6+
The default profiling options for LuaJIT are not fine-grained enough to
7+
get an understanding of performance. For example, ``perf`` is only
8+
able to show the host stack, so all Lua calls are seen as a single
9+
``pcall()``. Conversely, the ``jit.p`` module provided with LuaJIT
10+
is not able to give any information about the host stack.
11+
12+
Starting from version :doc:`2.10.0 </release/2.10.0>`, Tarantool
13+
has a built-in module called ``misc.sysprof`` that implements a
14+
LuaJIT sampling profiler (which we will just call *the profiler*
15+
in this section). The profiler is able to capture both guest and
16+
host stacks simultaneously, along with virtual machine states, so
17+
it can show the whole picture.
18+
19+
The following profiling modes are available:
20+
21+
* **Default**: only virtual machine state counters.
22+
* **Leaf**: shows the last frame on the stack.
23+
* **Callchain**: performs a complete stack dump.
24+
25+
The profiler comes with the default parser, which produces output in
26+
a ``flamegraph.pl``-suitable format.
27+
28+
.. contents::
29+
:local:
30+
:depth: 2
31+
32+
.. _profiler_usage:
33+
34+
Working with the profiler
35+
-------------------------
36+
37+
Usage of the profiler involves two steps:
38+
39+
1. :ref:`Collecting <profiler_usage_get>` a binary profile of
40+
stacks (hereafter, *binary sampling profile* or *binary profile*
41+
for short).
42+
2. :ref:`Parsing <profiler_usage_parse>` the collected binary
43+
profile to get a human-readable profiling report.
44+
45+
.. _profiler_usage_get:
46+
47+
Collecting binary profile
48+
~~~~~~~~~~~~~~~~~~~~~~~~~
49+
50+
To collect a binary profile for a particular part of the Lua and C code,
51+
you need to place this part between two ``misc.sysprof`` functions,
52+
namely, ``misc.sysprof.start()`` and ``misc.sysprof.stop()``, and
53+
then execute the code under Tarantool.
54+
55+
Below is a chunk of Lua code named ``test.lua`` to illustrate this.
56+
57+
.. _profiler_usage_example01:
58+
59+
.. code-block:: lua
60+
:linenos:
61+
62+
local function payload()
63+
local function fib(n)
64+
if n <= 1 then
65+
return n
66+
end
67+
return fib(n - 1) + fib(n - 2)
68+
end
69+
return fib(32)
70+
end
71+
72+
payload()
73+
74+
local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})
75+
assert(res, err)
76+
77+
payload()
78+
79+
res, err = misc.sysprof.stop()
80+
assert(res, err)
81+
82+
The Lua code for starting the profiler -- as in line 13 of the
83+
``test.lua`` example above -- is:
84+
85+
.. code-block:: lua
86+
87+
local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})
88+
89+
where ``mode`` is the profiling mode, ``interval`` is the sampling interval in milliseconds,
90+
and ``sysprof.bin`` is the name of the binary file where
91+
profiling events are written.
92+
93+
If the operation fails, for example if it is not possible to open
94+
a file for writing or if the profiler is already running,
95+
``misc.sysprof.start()`` returns ``nil`` as the first result,
96+
an error-message string as the second result,
97+
and a system-dependent error code number as the third result.
98+
If the operation succeeds, ``misc.sysprof.start()`` returns ``true``.
99+
100+
The Lua code for stopping the profiler -- as in line 18 of the
101+
``test.lua`` example above -- is:
102+
103+
.. code-block:: lua
104+
105+
local res, err = misc.sysprof.stop()
106+
107+
If the operation fails, for example if there is an error when the
108+
file descriptor is being closed or if there is a failure during
109+
reporting, ``misc.sysprof.stop()`` returns ``nil`` as the first
110+
result, an error-message string as the second result,
111+
and a system-dependent error code number as the third result.
112+
If the operation succeeds, ``misc.sysprof.stop()`` returns ``true``.
113+
114+
.. _profiler_usage_generate:
115+
116+
To generate the file with the platform profile in binary format
117+
(in the :ref:`test.lua code example above <profiler_usage_example01>`
118+
the file name is ``sysprof.bin``), execute the code under Tarantool:
119+
120+
.. code-block:: console
121+
122+
$ tarantool test.lua
123+
124+
Tarantool collects the profiling events in ``sysprof.bin``, puts
125+
the file in its :ref:`working directory <cfg_basic-work_dir>`,
126+
and closes the session.
127+
128+
.. _profiler_usage_parse:
129+
130+
Parsing binary profile and generating profiling report
131+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132+
133+
.. _profiler_usage_parse_command:
134+
135+
After getting the platform profile in binary format, the next step is
136+
to parse it to get a human-readable profiling report. You can do this
137+
via Tarantool by using the following command
138+
(mind the hyphen ``-`` before the filename):
139+
140+
.. code-block:: console
141+
142+
$ tarantool -e 'require("sysprof")(arg)' - sysprof.bin > tmp
143+
$ curl -O https://raw.githubusercontent.com/brendangregg/FlameGraph/refs/heads/master/flamegraph.pl
144+
$ perl flamegraph.pl tmp > sysprof.svg
145+
146+
where ``sysprof.bin`` is the binary profile
147+
:ref:`generated earlier <profiler_usage_generate>` by ``tarantool test.lua``.
148+
(Warning: there is a slight behavior change here, the ``tarantool -e ...``
149+
command was slightly different in Tarantool versions prior to Tarantool 2.8.1.)
150+
The resulting SVG image contains a flame graph of the collected stacks and can be opened
151+
in any modern web browser for analysis.
152+
153+
As for investigating the Lua code with the help of profiling reports,
154+
it is always code-dependent, and there can be no one-hundred-percent definite
155+
recommendations in this regard. Nevertheless, you can see some of the things
156+
in the :ref:`Profiling report analysis example <profiler_analysis>` later.
157+
158+
.. _profiler_api:
159+
160+
The Lua API
161+
~~~~~~~~~~~
162+
163+
* ``misc.sysprof.start(opts)``
164+
* ``misc.sysprof.stop()``
165+
* ``misc.sysprof.report()``
166+
167+
The first two functions return a boolean ``res`` and ``err``, which is
168+
``nil`` on success and contains an error message on failure.
169+
170+
``misc.sysprof.report`` returns a Lua table containing the
171+
following counters:
172+
173+
.. code-block:: console
174+
175+
{
176+
"samples" = int,
177+
"INTERP" = int,
178+
"LFUNC" = int,
179+
"FFUNC" = int,
180+
"CFUNC" = int,
181+
"GC" = int,
182+
"EXIT" = int,
183+
"RECORD" = int,
184+
"OPT" = int,
185+
"ASM" = int,
186+
"TRACE" = int
187+
}
188+
189+
The ``opts`` argument of ``misc.sysprof.start`` can contain the
190+
following parameters:
191+
192+
.. code-block:: lua
193+
194+
{
195+
mode = 'D'/'L'/'C', -- 'D' = DEFAULT, 'L' = LEAF, 'C' = CALLGRAPH
196+
interval = 10, -- sampling interval in msec.
197+
path = '/path/to/file' -- location to store profile data.
198+
}
199+
200+
``mode`` must always be provided; ``interval`` and ``path`` are optional.
201+
The default interval is 10 msec, and the default path is ``sysprof.bin``.
202+
203+
The C API
204+
~~~~~~~~~
205+
206+
The platform profiler provides a low-level C interface:
207+
208+
.. code-block:: console
209+
210+
int luaM_sysprof_set_writer(sp_writer writer). Sets writer function for sysprof.
211+
212+
int luaM_sysprof_set_on_stop(sp_on_stop on_stop). Sets on stop callback for sysprof to clear resources.
213+
214+
int luaM_sysprof_set_backtracer(sp_backtracer backtracer). Sets the backtracing function. If the backtracer argument is NULL, the default backtracer is set.
215+
There is no need to call the configuration functions again if you start and stop the profiler several times in a single program. Also, it is not necessary to configure sysprof for the default mode; however, you MUST configure it for the other modes.
216+
217+
int luaM_sysprof_start(lua_State *L, const struct luam_Sysprof_Options *opt)
218+
219+
int luaM_sysprof_stop(lua_State *L)
220+
221+
int luaM_sysprof_report(struct luam_Sysprof_Counters *counters). Writes profiling counters for each vmstate.
222+
223+
All of the functions return 0 on success and an error code on failure.
224+
225+
The configuration C types are:
226+
227+
.. code-block:: c
228+
229+
/* Profiler configurations. */
230+
/*
231+
** Writer function for profile events. Must be async-safe, see also
232+
** `man 7 signal-safety`.
233+
** Should return amount of written bytes on success or zero in case of error.
234+
** Setting *data to NULL means end of profiling.
235+
** For details see <lj_wbuf.h>.
236+
*/
237+
238+
typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx);
239+
/*
240+
** Callback on profiler stopping. Required for correctly cleaning
241+
** at VM finalization when profiler is still running.
242+
** Returns zero on success.
243+
*/
244+
typedef int (*sp_on_stop)(void *ctx, uint8_t *buf);
245+
/*
246+
** Backtracing function for the host stack. Should call `frame_writer` on
247+
** each frame in the stack in the order from the stack top to the stack
248+
** bottom. The `frame_writer` function is implemented inside the sysprof
249+
** and will be passed to the `backtracer` function. If `frame_writer` returns
250+
** NULL, backtracing should be stopped. If `frame_writer` returns not NULL,
251+
** the backtracing should be continued if there are frames left.
252+
*/
253+
typedef void (*sp_backtracer)(void *(*frame_writer)(int frame_no, void *addr));
254+
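The ``sp_backtracer`` contract above can be illustrated with a minimal sketch. It is not the profiler's own backtracer: it assumes a platform that provides ``backtrace()`` in ``<execinfo.h>`` (glibc or macOS), and the names ``example_backtracer``, ``counting_frame_writer``, and ``MAX_HOST_FRAMES`` are hypothetical. Numbering frames from 1 is also an assumption of the sketch. Note that ``backtrace()`` is not on the formal async-signal-safe list, so a production backtracer may need a different unwinder.

```c
#include <execinfo.h> /* backtrace(): glibc/macOS extension, not POSIX */
#include <stddef.h>

/* Hypothetical depth limit chosen for this sketch. */
#define MAX_HOST_FRAMES 32

/*
** Walks the host stack from top to bottom and hands each return address
** to frame_writer, stopping as soon as frame_writer returns NULL.
*/
void example_backtracer(void *(*frame_writer)(int frame_no, void *addr))
{
  /* Static storage keeps the buffer off the signal handler's stack. */
  static void *frames[MAX_HOST_FRAMES];
  const int depth = backtrace(frames, MAX_HOST_FRAMES);
  int i;

  for (i = 0; i < depth; i++) {
    /* Frame numbering from 1 is an assumption of this sketch. */
    if (frame_writer(i + 1, frames[i]) == NULL)
      return;
  }
}

/*
** Stand-in for the frame_writer that sysprof passes in, so the sketch
** can be tried outside the profiler. It counts frames and asks the
** backtracer to stop once frames_limit frames have been seen.
*/
int frames_seen;
int frames_limit;

void *counting_frame_writer(int frame_no, void *addr)
{
  (void)frame_no;
  (void)addr;
  frames_seen++;
  /* NULL means "stop"; any non-NULL value means "continue". */
  return frames_seen < frames_limit ? (void *)&frames_seen : NULL;
}
```

With ``frames_limit`` set to 1, calling ``example_backtracer(counting_frame_writer)`` visits exactly one frame; with a large limit it visits every frame that ``backtrace()`` returned, up to ``MAX_HOST_FRAMES``.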
255+
Profiler options are the following:
256+
257+
.. code-block:: c
258+
259+
struct luam_Sysprof_Options {
260+
/* Profiling mode. */
261+
uint8_t mode;
262+
/* Sampling interval in msec. */
263+
uint64_t interval;
264+
/* Custom buffer to write data. */
265+
uint8_t *buf;
266+
/* The buffer's size. */
267+
size_t len;
268+
/* Context for the profile writer and final callback. */
269+
void *ctx;
270+
};
271+
272+
Profiling modes:
273+
274+
.. code-block:: c
275+
276+
/*
277+
** DEFAULT mode collects only data for luam_sysprof_counters, which is stored
278+
** in memory and can be collected with luaM_sysprof_report after profiler
279+
** stops.
280+
*/
281+
#define LUAM_SYSPROF_DEFAULT 0
282+
/*
283+
** LEAF mode = DEFAULT + streams samples with only top frames of host and
284+
guest stacks in format described in <lj_sysprof.h>
285+
*/
286+
#define LUAM_SYSPROF_LEAF 1
287+
/*
288+
** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host
289+
** and guest stacks in format described in <lj_sysprof.h>
290+
*/
291+
#define LUAM_SYSPROF_CALLGRAPH 2
292+
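As a rough illustration of how the options structure and the mode constants fit together, the sketch below fills ``struct luam_Sysprof_Options`` from the same mode letters that the Lua API accepts (``'D'``, ``'L'``, ``'C'``). The structure and constants are copied from the listings above so that the fragment compiles on its own; in a real program they would come from the LuaJIT headers. The helper name ``example_options_init`` is hypothetical and is not part of the profiler's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from the listings above to keep the sketch self-contained. */
#define LUAM_SYSPROF_DEFAULT   0
#define LUAM_SYSPROF_LEAF      1
#define LUAM_SYSPROF_CALLGRAPH 2

struct luam_Sysprof_Options {
  uint8_t mode;      /* Profiling mode. */
  uint64_t interval; /* Sampling interval in msec. */
  uint8_t *buf;      /* Custom buffer to write data. */
  size_t len;        /* The buffer's size. */
  void *ctx;         /* Context for the writer and the on-stop callback. */
};

/*
** Hypothetical helper: fills opt for the given mode letter.
** Returns 0 on success and -1 if the letter is not 'D', 'L', or 'C'.
*/
int example_options_init(struct luam_Sysprof_Options *opt, char mode_letter,
                         uint64_t interval_ms, uint8_t *buf, size_t len,
                         void *ctx)
{
  switch (mode_letter) {
  case 'D': opt->mode = LUAM_SYSPROF_DEFAULT; break;
  case 'L': opt->mode = LUAM_SYSPROF_LEAF; break;
  case 'C': opt->mode = LUAM_SYSPROF_CALLGRAPH; break;
  default: return -1;
  }
  opt->interval = interval_ms;
  opt->buf = buf;
  opt->len = len;
  opt->ctx = ctx;
  return 0;
}
```

A structure filled this way is what would be passed to ``luaM_sysprof_start``; the call itself is omitted because it requires linking against LuaJIT.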
293+
Counters structure for ``luaM_sysprof_report``:
294+
295+
.. code-block:: c
296+
297+
struct luam_Sysprof_Counters {
298+
uint64_t vmst_interp;
299+
uint64_t vmst_lfunc;
300+
uint64_t vmst_ffunc;
301+
uint64_t vmst_cfunc;
302+
uint64_t vmst_gc;
303+
uint64_t vmst_exit;
304+
uint64_t vmst_record;
305+
uint64_t vmst_opt;
306+
uint64_t vmst_asm;
307+
uint64_t vmst_trace;
308+
/*
309+
** XXX: Order of vmst counters is important: it should be the same as the
310+
** order of the vmstates.
311+
*/
312+
uint64_t samples;
313+
};
314+
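One way to read the counters is as shares of the total number of samples. The sketch below assumes a structure that has already been filled (for example, by ``luaM_sysprof_report``) and prints one line per VM state. The structure is copied from the listing above so that the fragment compiles on its own, and the helper names ``example_share_percent`` and ``example_print_counters`` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Copied from the listing above to keep the sketch self-contained. */
struct luam_Sysprof_Counters {
  uint64_t vmst_interp;
  uint64_t vmst_lfunc;
  uint64_t vmst_ffunc;
  uint64_t vmst_cfunc;
  uint64_t vmst_gc;
  uint64_t vmst_exit;
  uint64_t vmst_record;
  uint64_t vmst_opt;
  uint64_t vmst_asm;
  uint64_t vmst_trace;
  uint64_t samples;
};

/* Share of count in samples, in percent; 0 when nothing was sampled. */
double example_share_percent(uint64_t count, uint64_t samples)
{
  if (samples == 0)
    return 0.0;
  return 100.0 * (double)count / (double)samples;
}

/* Prints the name, raw count, and share of every VM state counter. */
void example_print_counters(const struct luam_Sysprof_Counters *c, FILE *out)
{
  const struct { const char *name; uint64_t count; } rows[] = {
    {"INTERP", c->vmst_interp}, {"LFUNC", c->vmst_lfunc},
    {"FFUNC", c->vmst_ffunc},   {"CFUNC", c->vmst_cfunc},
    {"GC", c->vmst_gc},         {"EXIT", c->vmst_exit},
    {"RECORD", c->vmst_record}, {"OPT", c->vmst_opt},
    {"ASM", c->vmst_asm},       {"TRACE", c->vmst_trace},
  };
  size_t i;

  for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
    fprintf(out, "%-6s %10llu %6.2f%%\n", rows[i].name,
            (unsigned long long)rows[i].count,
            example_share_percent(rows[i].count, c->samples));
  }
}
```

The explicit name table avoids relying on the in-memory order of the fields, which the comment in the structure marks as significant.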
315+
Caveats:
316+
317+
* Providing writers, backtracers, etc. in the Default mode is pointless, since
318+
it just collects counters.
319+
* There is NO default configuration for sysprof, so the ``luaM_sysprof_set_*`` configuration functions
320+
must be called before the first run of the sysprof. Mind the async-safety.
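
To make the async-safety requirement more concrete, here is a minimal sketch of a writer with the ``sp_writer`` signature described earlier. It relies only on ``write(2)``, which is listed as async-signal-safe in ``man 7 signal-safety``. The context structure ``example_writer_ctx`` and the function name ``example_fd_writer`` are hypothetical, and the exact buffer semantics expected by the profiler should be checked against ``<lj_wbuf.h>``.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical writer context: the descriptor of an already opened file. */
struct example_writer_ctx {
  int fd;
};

/*
** Writes len bytes from *data to the descriptor stored in ctx.
** Returns the number of bytes written on success. On error it returns
** zero and sets *data to NULL, which signals the end of profiling.
*/
size_t example_fd_writer(const void **data, size_t len, void *ctx)
{
  const int fd = ((struct example_writer_ctx *)ctx)->fd;
  const uint8_t *pos = *data;
  size_t total = 0;

  while (total < len) {
    const ssize_t n = write(fd, pos + total, len - total);
    if (n > 0) {
      total += (size_t)n; /* partial writes are continued */
      continue;
    }
    if (n < 0 && errno == EINTR)
      continue; /* interrupted before any byte was written: retry */
    *data = NULL; /* unrecoverable error: end of profiling */
    return 0;
  }
  return total;
}
```

The file is expected to be opened before profiling starts and closed in the on-stop callback, so that the writer itself never calls anything beyond ``write(2)``.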
