Commit 66a0f04

LuaJIT platform profiler documentation
Introduce a new document on LuaJIT platform profiler

* LuaJIT platform profiler is a new feature implemented in Tarantool 2.10.0.
  The document describes the profiler's behavior as of this and next Tarantool versions.
* The document is placed in the Tooling chapter.

Closes #2587
1 parent 6f21161 commit 66a0f04

2 files changed

+321
-0
lines changed

doc/tooling/index.rst

Lines changed: 1 addition & 0 deletions
@@ -11,4 +11,5 @@ to work with Tarantool.
1111
tcm/index
1212
interactive_console
1313
luajit_memprof
14+
luajit_sysprof
1415
luajit_getmetrics

doc/tooling/luajit_sysprof.rst

Lines changed: 320 additions & 0 deletions
@@ -0,0 +1,320 @@
1+
.. _luajit_sysprof:
2+
3+
LuaJIT platform profiler
4+
========================
5+
6+
The default profiling options for LuaJIT are not fine-grained enough to
7+
give a full picture of performance. For example, ``perf`` can only
8+
show the host stack, so all the Lua calls are seen as a single
9+
``pcall()``. Conversely, the ``jit.p`` module provided with LuaJIT
10+
cannot give any information about the host stack.
11+
12+
Starting from version :doc:`2.10.0 </release/2.10.0>`, Tarantool
13+
has a built-in module called ``misc.sysprof`` that implements a
14+
LuaJIT sampling profiler (which we will just call *the profiler*
15+
in this section). The profiler is able to capture both guest and
16+
host stacks simultaneously, along with virtual machine states, so
17+
it can show the whole picture.
18+
19+
The following profiling modes are available:
20+
21+
* **Default**: collects only virtual machine state counters.
22+
* **Leaf**: shows the last frame on the stack.
23+
* **Callgraph**: performs a complete stack dump.
24+
25+
The profiler comes with a default parser, which produces output in
26+
a format suitable for ``flamegraph.pl``.
27+
28+
.. contents::
29+
:local:
30+
:depth: 2
31+
32+
.. _profiler_usage:
33+
34+
Working with the profiler
35+
-------------------------
36+
37+
Usage of the profiler involves two steps:
38+
39+
1. :ref:`Collecting <profiler_usage_get>` a binary profile of
40+
stacks (hereafter, *binary sampling profile* or *binary profile*
41+
for short).
42+
2. :ref:`Parsing <profiler_usage_parse>` the collected binary
43+
profile to get a human-readable profiling report.
44+
45+
.. _profiler_usage_get:
46+
47+
Collecting binary profile
48+
~~~~~~~~~~~~~~~~~~~~~~~~~
49+
50+
To collect a binary profile for a particular part of the Lua and C code,
51+
you need to place this part between two ``misc.sysprof`` functions,
52+
namely, ``misc.sysprof.start()`` and ``misc.sysprof.stop()``, and
53+
then execute the code under Tarantool.
54+
55+
Below is a chunk of Lua code named ``test.lua`` to illustrate this.
56+
57+
.. _profiler_usage_example01:
58+
59+
.. code-block:: lua
60+
:linenos:
61+
62+
local function payload()
63+
  local function fib(n)
64+
    if n <= 1 then
65+
      return n
66+
    end
67+
    return fib(n - 1) + fib(n - 2)
68+
  end
69+
  return fib(32)
70+
end
71+
72+
payload()
73+
74+
local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})
75+
assert(res, err)
76+
77+
payload()
78+
79+
res, err = misc.sysprof.stop()
80+
assert(res, err)
81+
82+
The Lua code for starting the profiler -- as in line 13 in the
83+
``test.lua`` example above -- is:
84+
85+
.. code-block:: lua
86+
87+
local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})
88+
89+
where ``mode`` is a profiling mode, ``interval`` is a sampling interval in milliseconds,
90+
and ``sysprof.bin`` is the name of the binary file where
91+
profiling events are written.
92+
93+
If the operation fails, for example if it is not possible to open
94+
a file for writing or if the profiler is already running,
95+
``misc.sysprof.start()`` returns ``nil`` as the first result,
96+
an error-message string as the second result,
97+
and a system-dependent error code number as the third result.
98+
If the operation succeeds, ``misc.sysprof.start()`` returns ``true``.
99+
100+
The Lua code for stopping the profiler -- as in line 18 in the
101+
``test.lua`` example above -- is:
102+
103+
.. code-block:: lua
104+
105+
local res, err = misc.sysprof.stop()
106+
107+
If the operation fails, for example if there is an error when the
108+
file descriptor is being closed or if there is a failure during
109+
reporting, ``misc.sysprof.stop()`` returns ``nil`` as the first
110+
result, an error-message string as the second result,
111+
and a system-dependent error code number as the third result.
112+
If the operation succeeds, ``misc.sysprof.stop()`` returns ``true``.
113+
114+
.. _profiler_usage_generate:
115+
116+
To generate the file with the sampling profile in binary format
117+
(in the :ref:`test.lua code example above <profiler_usage_example01>`
118+
the file name is ``sysprof.bin``), execute the code under Tarantool:
119+
120+
.. code-block:: console
121+
122+
$ tarantool test.lua
123+
124+
Tarantool collects the profiling events in ``sysprof.bin``, puts
125+
the file in its :ref:`working directory <cfg_basic-work_dir>`,
126+
and closes the session.
127+
128+
.. _profiler_usage_parse:
129+
130+
Parsing binary profile and generating profiling report
131+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132+
133+
.. _profiler_usage_parse_command:
134+
135+
After getting the platform profile in binary format, the next step is
136+
to parse it to get a human-readable profiling report. You can do this
137+
via Tarantool by using the following command
138+
(mind the hyphen ``-`` before the filename):
139+
140+
.. code-block:: console
141+
142+
$ tarantool -e 'require("sysprof")(arg)' - sysprof.bin > tmp
143+
$ curl -O https://raw.githubusercontent.com/brendangregg/FlameGraph/refs/heads/master/flamegraph.pl
144+
$ perl flamegraph.pl tmp > sysprof.svg
145+
146+
where ``sysprof.bin`` is the binary profile
147+
:ref:`generated earlier <profiler_usage_generate>` by ``tarantool test.lua``.
148+
(Warning: there is a slight behavior change here, the ``tarantool -e ...``
149+
command was slightly different in Tarantool versions prior to Tarantool 2.8.1.)
150+
The resulting SVG image contains a flame graph of the collected stacks and can be opened
151+
in any modern web browser for analysis.
152+
153+
Investigating Lua code with the help of profiling reports
154+
is always code-dependent, so there are no universally applicable
155+
recommendations in this regard. Nevertheless, you can see some of the things to look for
156+
in the :ref:`Profiling report analysis example <profiler_analysis>` later.
157+
158+
.. _profiler_api:
159+
160+
The C API
161+
~~~~~~~~~
162+
163+
The platform profiler provides a low-level C interface:
164+
165+
.. code-block:: c
166+
167+
int luaM_sysprof_set_writer(sp_writer writer); /* Sets the writer function for sysprof. */
168+
169+
int luaM_sysprof_set_on_stop(sp_on_stop on_stop); /* Sets the on-stop callback that releases sysprof resources. */
170+
171+
int luaM_sysprof_set_backtracer(sp_backtracer backtracer); /* Sets the backtracing function. If backtracer is NULL, the default backtracer is set. */
172+
/* The configuration functions above need not be called again if the profiler is started and stopped several times in a single program. Configuration is not necessary for the default mode, but it is required for the other modes. */
173+
174+
int luaM_sysprof_start(lua_State *L, const struct luam_Sysprof_Options *opt);
175+
176+
int luaM_sysprof_stop(lua_State *L);
177+
178+
int luaM_sysprof_report(struct luam_Sysprof_Counters *counters); /* Writes profiling counters for each vmstate. */
179+
180+
All of the functions return 0 on success and an error code on failure.
181+
182+
The configuration C types are:
183+
184+
.. code-block:: c
185+
186+
/* Profiler configurations. */
187+
/*
188+
** Writer function for profile events. Must be async-safe, see also
189+
** `man 7 signal-safety`.
190+
** Should return amount of written bytes on success or zero in case of error.
191+
** Setting *data to NULL means end of profiling.
192+
** For details see <lj_wbuf.h>.
193+
*/
194+
195+
typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx);
196+
/*
197+
** Callback on profiler stopping. Required for correctly cleaning
198+
** at VM finalization when profiler is still running.
199+
** Returns zero on success.
200+
*/
201+
typedef int (*sp_on_stop)(void *ctx, uint8_t *buf);
202+
/*
203+
** Backtracing function for the host stack. Should call `frame_writer` on
204+
** each frame in the stack in the order from the stack top to the stack
205+
** bottom. The `frame_writer` function is implemented inside the sysprof
206+
** and will be passed to the `backtracer` function. If `frame_writer` returns
207+
** NULL, backtracing should be stopped. If `frame_writer` returns not NULL,
208+
** the backtracing should be continued if there are frames left.
209+
*/
210+
typedef void (*sp_backtracer)(void *(*frame_writer)(int frame_no, void *addr));
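
The callback contracts described in the comments above can be sketched as follows. This is only an illustration under stated assumptions, not the callbacks Tarantool installs by default: every ``example_*`` name is hypothetical, the typedefs are copied locally so that the sketch compiles on its own, and the backtracer relies on ``backtrace()`` from ``<execinfo.h>``, which is not guaranteed to be async-signal-safe and is used here only to show the calling protocol.

```c
#include <errno.h>
#include <execinfo.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Local copies of the typedefs listed above, so the sketch compiles alone. */
typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx);
typedef void (*sp_backtracer)(void *(*frame_writer)(int frame_no, void *addr));

/* Hypothetical context: only the descriptor the profile is streamed to. */
struct example_ctx {
  int fd;
};

/*
** Writer sketch: pushes `len` bytes from `*data` to ctx->fd with write(2).
** Returns the number of bytes written. On a failure other than EINTR it
** sets *data to NULL (end of profiling) and returns zero.
*/
static size_t example_writer(const void **data, size_t len, void *ctx)
{
  const struct example_ctx *c = ctx;
  const uint8_t *p = *data;
  size_t total = 0;

  while (total < len) {
    ssize_t n = write(c->fd, p + total, len - total);
    if (n < 0) {
      if (errno == EINTR)
        continue; /* Interrupted by a signal: retry the write. */
      *data = NULL;
      return 0;
    }
    total += (size_t)n;
  }
  return total;
}

#define EXAMPLE_MAX_FRAMES 64

/*
** Backtracer sketch: reports host frames from the stack top to the stack
** bottom and stops as soon as `frame_writer` returns NULL.
*/
static void example_backtracer(void *(*frame_writer)(int frame_no, void *addr))
{
  void *frames[EXAMPLE_MAX_FRAMES];
  int depth = backtrace(frames, EXAMPLE_MAX_FRAMES);
  int i;

  for (i = 0; i < depth; i++) {
    if (frame_writer(i, frames[i]) == NULL)
      break;
  }
}

/* Toy frame_writer used only to exercise the sketch: accepts two frames. */
static int example_seen;

static void *example_frame_writer(int frame_no, void *addr)
{
  (void)addr;
  example_seen = frame_no + 1;
  return example_seen < 2 ? (void *)&example_seen : NULL;
}
```

In a real setup the two callbacks would be registered with ``luaM_sysprof_set_writer`` and ``luaM_sysprof_set_backtracer``; the toy ``example_frame_writer`` stands in for the ``frame_writer`` that sysprof itself passes to the backtracer.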
211+
212+
Profiler options are the following:
213+
214+
.. code-block:: c
215+
216+
struct luam_Sysprof_Options {
217+
/* Profiling mode. */
218+
uint8_t mode;
219+
/* Sampling interval in msec. */
220+
uint64_t interval;
221+
/* Custom buffer to write data. */
222+
uint8_t *buf;
223+
/* The buffer's size. */
224+
size_t len;
225+
/* Context for the profile writer and final callback. */
226+
void *ctx;
227+
};
228+
229+
Profiling modes:
230+
231+
.. code-block:: c
232+
233+
/*
234+
** DEFAULT mode collects only data for luam_sysprof_counters, which is stored
235+
** in memory and can be collected with luaM_sysprof_report after profiler
236+
** stops.
237+
*/
238+
#define LUAM_SYSPROF_DEFAULT 0
239+
/*
240+
** LEAF mode = DEFAULT + streams samples with only top frames of host and
241+
** guest stacks in format described in <lj_sysprof.h>
242+
*/
243+
#define LUAM_SYSPROF_LEAF 1
244+
/*
245+
** CALLGRAPH mode = DEFAULT + streams samples with full callchains of host
246+
** and guest stacks in format described in <lj_sysprof.h>
247+
*/
248+
#define LUAM_SYSPROF_CALLGRAPH 2
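
To show how the options structure and the mode constants fit together, here is a minimal sketch that fills ``struct luam_Sysprof_Options`` from the single-letter modes used by the Lua API (``'D'``, ``'L'``, ``'C'``). The helper ``example_fill_options`` is hypothetical and not part of the profiler API; the structure and constants are copied locally so that the sketch is self-contained, and falling back to 10 msec for a zero interval is an assumption that mirrors the documented Lua default.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the definitions listed above. */
#define LUAM_SYSPROF_DEFAULT 0
#define LUAM_SYSPROF_LEAF 1
#define LUAM_SYSPROF_CALLGRAPH 2

struct luam_Sysprof_Options {
  uint8_t mode;      /* Profiling mode. */
  uint64_t interval; /* Sampling interval in msec. */
  uint8_t *buf;      /* Custom buffer to write data. */
  size_t len;        /* The buffer's size. */
  void *ctx;         /* Context for the profile writer and final callback. */
};

/*
** Hypothetical helper: fills `opt` for the given mode letter. A zero
** interval is replaced with 10 msec (assumption: same default as in the
** Lua API). Returns 0 on success and -1 for an unknown mode letter.
*/
static int example_fill_options(struct luam_Sysprof_Options *opt, char letter,
                                uint64_t interval_msec, uint8_t *buf,
                                size_t len, void *ctx)
{
  memset(opt, 0, sizeof(*opt));
  switch (letter) {
  case 'D': opt->mode = LUAM_SYSPROF_DEFAULT; break;
  case 'L': opt->mode = LUAM_SYSPROF_LEAF; break;
  case 'C': opt->mode = LUAM_SYSPROF_CALLGRAPH; break;
  default: return -1;
  }
  opt->interval = interval_msec ? interval_msec : 10;
  opt->buf = buf;
  opt->len = len;
  opt->ctx = ctx;
  return 0;
}
```

The filled structure is what would then be passed to ``luaM_sysprof_start``.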
249+
250+
Counters structure for ``luaM_sysprof_report``:
251+
252+
.. code-block:: c
253+
254+
struct luam_Sysprof_Counters {
255+
uint64_t vmst_interp;
256+
uint64_t vmst_lfunc;
257+
uint64_t vmst_ffunc;
258+
uint64_t vmst_cfunc;
259+
uint64_t vmst_gc;
260+
uint64_t vmst_exit;
261+
uint64_t vmst_record;
262+
uint64_t vmst_opt;
263+
uint64_t vmst_asm;
264+
uint64_t vmst_trace;
265+
/*
266+
** XXX: Order of vmst counters is important: it should be the same as the
267+
** order of the vmstates.
268+
*/
269+
uint64_t samples;
270+
};
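
As an illustration of how the counters might be consumed after ``luaM_sysprof_report`` fills them, the sketch below prints each VM state as a share of all samples. It is a hedged example only: ``example_share`` and ``example_print_counters`` are hypothetical helpers, the structure is copied locally, and treating ``samples`` as the total against which each counter is compared is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local copy of the counters structure listed above. */
struct luam_Sysprof_Counters {
  uint64_t vmst_interp;
  uint64_t vmst_lfunc;
  uint64_t vmst_ffunc;
  uint64_t vmst_cfunc;
  uint64_t vmst_gc;
  uint64_t vmst_exit;
  uint64_t vmst_record;
  uint64_t vmst_opt;
  uint64_t vmst_asm;
  uint64_t vmst_trace;
  uint64_t samples;
};

/* Percentage of `samples` spent in one state; 0 if nothing was sampled. */
static double example_share(uint64_t count, uint64_t samples)
{
  return samples == 0 ? 0.0 : 100.0 * (double)count / (double)samples;
}

/* Prints one line per VM state, using the names from the Lua report. */
static void example_print_counters(FILE *out,
                                   const struct luam_Sysprof_Counters *c)
{
  const struct { const char *name; uint64_t value; } rows[] = {
    {"INTERP", c->vmst_interp}, {"LFUNC", c->vmst_lfunc},
    {"FFUNC", c->vmst_ffunc},   {"CFUNC", c->vmst_cfunc},
    {"GC", c->vmst_gc},         {"EXIT", c->vmst_exit},
    {"RECORD", c->vmst_record}, {"OPT", c->vmst_opt},
    {"ASM", c->vmst_asm},       {"TRACE", c->vmst_trace},
  };
  size_t i;

  fprintf(out, "samples: %llu\n", (unsigned long long)c->samples);
  for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
    fprintf(out, "%-6s %6.2f%%\n", rows[i].name,
            example_share(rows[i].value, c->samples));
}
```

A large share in one state is only a starting point for further investigation, not a diagnosis by itself.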
271+
272+
Caveats:
273+
274+
* Providing writers, backtracers, and so on in the Default mode is pointless, since
275+
it only collects counters.
276+
* There is no default configuration for the other modes, so the configuration functions
277+
(``luaM_sysprof_set_writer`` and the like) must be called before the first run of sysprof in those modes. Mind the async-safety.
278+
279+
The Lua API
280+
~~~~~~~~~~~
281+
282+
* ``misc.sysprof.start(opts)``
283+
* ``misc.sysprof.stop()``
284+
* ``misc.sysprof.report()``
285+
286+
The first two functions return two values, ``res`` and ``err``: on success, ``res`` is ``true``
287+
and ``err`` is ``nil``; on failure, ``res`` is ``nil`` and ``err`` contains an error message.
288+
289+
``misc.sysprof.report`` returns a Lua table containing the
290+
following counters:
291+
292+
.. code-block:: console
293+
294+
{
295+
"samples" = int,
296+
"INTERP" = int,
297+
"LFUNC" = int,
298+
"FFUNC" = int,
299+
"CFUNC" = int,
300+
"GC" = int,
301+
"EXIT" = int,
302+
"RECORD" = int,
303+
"OPT" = int,
304+
"ASM" = int,
305+
"TRACE" = int
306+
}
307+
308+
The ``opts`` argument of ``misc.sysprof.start`` is a table that can contain the
309+
following fields:
310+
311+
.. code-block:: console
312+
313+
{
314+
mode = 'D'/'L'/'C', -- 'D' = DEFAULT, 'L' = LEAF, 'C' = CALLGRAPH
315+
interval = 10, -- sampling interval in msec.
316+
path = '/path/to/file' -- location to store profile data.
317+
}
318+
319+
``mode`` must always be provided; ``interval`` and ``path`` are optional.
320+
The default interval is 10 msec, and the default path is ``sysprof.bin``.
