Skip to content

Commit 558ed22

Browse files
liguriolenkis
andauthored
LuaJIT platform profiler documentation (#4733)
Introduce a new document on LuaJIT platform profiler * LuaJIT platform profiler is a new feature implemented in Tarantool 2.10.0. The document describes the profiler's behavior as of this and next Tarantool versions. * The document is placed in the Tooling chapter. Closes #2587 Co-authored-by: lenkis <[email protected]>
1 parent e6977b7 commit 558ed22

File tree

3 files changed

+475
-101
lines changed

3 files changed

+475
-101
lines changed

doc/tooling/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,4 +11,5 @@ to work with Tarantool.
1111
tcm/index
1212
interactive_console
1313
luajit_memprof
14+
luajit_sysprof
1415
luajit_getmetrics

doc/tooling/luajit_memprof.rst

Lines changed: 104 additions & 101 deletions
Original file line numberDiff line numberDiff line change
@@ -3,11 +3,13 @@
33
LuaJIT memory profiler
44
======================
55

6-
Starting from version :doc:`2.7.1 </release/2.7.1>`, Tarantool
6+
Since version :doc:`2.7.1 </release/2.7.1>`, Tarantool
77
has a built‑in module called ``misc.memprof`` that implements a LuaJIT memory
8-
profiler (which we will just call *the profiler* in this section). The profiler provides
9-
a memory allocation report that helps analyze Lua code and find the places
10-
that put the most pressure on the Lua garbage collector (GC).
8+
profiler (further in this section we call it *the profiler* for short).
9+
The profiler provides a memory allocation report that helps analyze Lua code
10+
and find the places that put the most pressure on the Lua garbage collector (GC).
11+
12+
Inside this section:
1113

1214
.. contents::
1315
:local:
@@ -18,7 +20,7 @@ that put the most pressure on the Lua garbage collector (GC).
1820
Working with the profiler
1921
-------------------------
2022

21-
Usage of the profiler involves two steps:
23+
The profiler usage involves two steps:
2224

2325
1. :ref:`Collecting <profiler_usage_get>` a binary profile of allocations,
2426
reallocations, and deallocations in memory related to Lua
@@ -28,13 +30,13 @@ Usage of the profiler involves two steps:
2830

2931
.. _profiler_usage_get:
3032

31-
Collecting binary profile
32-
~~~~~~~~~~~~~~~~~~~~~~~~~
33+
Collecting a binary profile
34+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
3335

3436
To collect a binary profile for a particular part of the Lua code,
3537
you need to place this part between two ``misc.memprof`` functions,
3638
namely, ``misc.memprof.start()`` and ``misc.memprof.stop()``, and then execute
37-
the code under Tarantool.
39+
the code in Tarantool.
3840

3941
Below is a chunk of Lua code named ``test.lua`` to illustrate this.
4042

@@ -43,26 +45,26 @@ Below is a chunk of Lua code named ``test.lua`` to illustrate this.
4345
.. code-block:: lua
4446
:linenos:
4547
46-
-- Prevent allocations on traces.
47-
jit.off()
48-
local str, err = misc.memprof.start("memprof_new.bin")
49-
-- Lua doesn't create a new frame to call string.rep, and all allocations
50-
-- are attributed not to the append() function but to the parent scope.
51-
local function append(str, rep)
52-
return string.rep(str, rep)
53-
end
48+
-- Prevent allocations on traces.
49+
jit.off()
50+
local str, err = misc.memprof.start("memprof_new.bin")
51+
-- Lua doesn't create a new frame to call string.rep, and all allocations
52+
-- are attributed not to the append() function but to the parent scope.
53+
local function append(str, rep)
54+
return string.rep(str, rep)
55+
end
5456
55-
local t = {}
56-
for i = 1, 1e4 do
57-
-- table.insert is the built-in function and all corresponding
58-
-- allocations are reported in the scope of the main chunk.
59-
table.insert(t,
60-
append('q', i)
61-
)
62-
end
63-
local stp, err = misc.memprof.stop()
57+
local t = {}
58+
for i = 1, 1e4 do
59+
-- table.insert is the built-in function and all corresponding
60+
-- allocations are reported in the scope of the main chunk.
61+
table.insert(t,
62+
append('q', i)
63+
)
64+
end
65+
local str, err = misc.memprof.stop()
6466
65-
The Lua code for starting the profiler -- as in line 3 in the test.lua example above -- is:
67+
The Lua code for starting the profiler -- as in line 3 in the ``test.lua`` example above -- is:
6668

6769
.. code-block:: lua
6870
@@ -75,27 +77,29 @@ for example if it is not possible to open a file for writing or if the profiler
7577
``misc.memprof.start()`` returns ``nil`` as the first result,
7678
an error-message string as the second result,
7779
and a system-dependent error code number as the third result.
80+
7881
If the operation succeeds, ``misc.memprof.start()`` returns ``true``.
7982

80-
The Lua code for stopping the profiler -- as in line 18 in the test.lua example above -- is:
83+
The Lua code for stopping the profiler -- as in line 18 in the ``test.lua`` example above -- is:
8184

8285
.. code-block:: lua
8386
84-
local stp, err = misc.memprof.stop()
87+
local str, err = misc.memprof.stop()
8588
8689
If the operation fails,
87-
for example if there is an error when the file descriptor is being closed
90+
for example if there is an error when the file descriptor is being closed
8891
or if there is a failure during reporting,
8992
``misc.memprof.stop()`` returns ``nil`` as the first result,
9093
an error-message string as the second result,
9194
and a system-dependent error code number as the third result.
95+
9296
If the operation succeeds, ``misc.memprof.stop()`` returns ``true``.
9397

9498
.. _profiler_usage_generate:
9599

96100
To generate the file with memory profile in binary format
97101
(in the :ref:`test.lua code example above <profiler_usage_example01>`
98-
the file name is ``memprof_new.bin``), execute the code under Tarantool:
102+
the file name is ``memprof_new.bin``), execute the code in Tarantool:
99103

100104
.. code-block:: console
101105
@@ -115,7 +119,7 @@ a profiling report:
115119
:ref:`note about jitoff <profiler_usage_internal_jitoff>` for more details.
116120

117121
* Lines 6-8: Tail call optimization doesn't create a new call frame, so all
118-
allocations inside the function called via the ``CALLT/CALLMT`` `bytecodes <http://wiki.luajit.org/Bytecode-2.0#calls-and-vararg-handling>`_
122+
allocations inside the function called via the ``CALLT/CALLMT`` `bytecodes <https://github.com/tarantool/tarantool/wiki/LuaJIT-Bytecodes>`_
119123
are attributed to the function's caller. See also the comments preceding these lines.
120124

121125
* Lines 14-16: Usually the information about allocations inside Lua built‑ins
@@ -127,8 +131,8 @@ a profiling report:
127131

128132
.. _profiler_usage_parse:
129133

130-
Parsing binary profile and generating profiling report
131-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
134+
Parsing a binary profile and generating a profiling report
135+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132136

133137
.. _profiler_usage_parse_command:
134138

@@ -143,8 +147,11 @@ via Tarantool by using the following command
143147
144148
where ``memprof_new.bin`` is the binary profile
145149
:ref:`generated earlier <profiler_usage_generate>` by ``tarantool test.lua``.
146-
(Warning: there is a slight behavior change here, the ``tarantool -e ...``
147-
command was slightly different in Tarantool versions prior to Tarantool 2.8.1.)
150+
151+
.. note::
152+
153+
There is a slight behavior change here: the ``tarantool -e ...`` command
154+
was slightly different in Tarantool versions prior to Tarantool 2.8.1.
148155

149156
Tarantool generates a profiling report and displays it on the console before closing
150157
the session:
@@ -187,7 +194,6 @@ the session:
187194
On macOS, a report will be different for the same chunk of code because
188195
Tarantool and LuaJIT are built with the GC64 mode enabled for macOS.
189196

190-
191197
Let's examine the report structure. A report has four sections:
192198

193199
* ALLOCATIONS
@@ -203,21 +209,20 @@ An event record has the following format:
203209
.. code-block:: text
204210
205211
@<filename>:<line_number>: <number_of_events> events +<allocated> bytes -<freed> bytes
206-
207212
213+
where:
208214

209-
* <filename>a name of the file containing Lua code.
210-
* <line_number>the line number where the event is detected.
211-
* <number_of_events>a number of events for this code line.
212-
* +<allocated> bytesamount of memory allocated during all the events on this line.
213-
* -<freed> bytesamount of memory freed during all the events on this line.
215+
* ``<filename>`` -— a name of the file containing Lua code.
216+
* ``<line_number>`` -— the line number where the event is detected.
217+
* ``<number_of_events>`` —- a number of events for this code line.
218+
* ``+<allocated> bytes`` —- amount of memory allocated during all the events on this line.
219+
* ``-<freed> bytes`` —- amount of memory freed during all the events on this line.
214220

215221
The ``Overrides`` label shows what allocation has been overridden.
216222

217223
See the :ref:`test.lua chunk above <profiler_usage_example01>`
218224
with the explanation in the comments for some examples.
219225

220-
221226
.. _profiler_usage_internal_jitoff:
222227

223228
The ``INTERNAL`` label indicates that this event is caused by internal LuaJIT
@@ -256,10 +261,10 @@ structures.
256261
As for investigating the Lua code with the help of profiling reports,
257262
it is always code-dependent and there can't be hundred per cent definite
258263
recommendations in this regard. Nevertheless, you can see some of the things
259-
in the :ref:`Profiling report analysis example <profiler_analysis>` later.
264+
in the :ref:`Profiling a report analysis example <profiler_analysis>` later.
260265

261266
Also, below is the :ref:`FAQ <profiler_faq>` section with the questions that
262-
most probably can arise while using profiler.
267+
most probably can arise while using the profiler.
263268

264269
.. _profiler_faq:
265270

@@ -317,44 +322,44 @@ console for a running instance.
317322
.. code-block:: lua
318323
:linenos:
319324
320-
local fiber = require "fiber"
321-
local log = require "log"
325+
local fiber = require "fiber"
326+
local log = require "log"
322327
323-
fiber.create(function()
324-
fiber.name("memprof")
328+
fiber.create(function()
329+
fiber.name("memprof")
325330
326-
collectgarbage() -- Collect all objects already dead
327-
log.warn("start of profile")
331+
collectgarbage() -- Collect all objects already dead
332+
log.warn("start of profile")
328333
329-
local st, err = misc.memprof.start(FILENAME)
330-
if not st then
331-
log.error("failed to start profiler: %s", err)
332-
end
334+
local st, err = misc.memprof.start(FILENAME)
335+
if not st then
336+
log.error("failed to start profiler: %s", err)
337+
end
333338
334-
fiber.sleep(TIME)
339+
fiber.sleep(TIME)
335340
336-
collectgarbage()
337-
st, err = misc.memprof.stop()
341+
collectgarbage()
342+
st, err = misc.memprof.stop()
338343
339-
if not st then
340-
log.error("profiler on stop error: %s", err)
341-
end
344+
if not st then
345+
log.error("profiler on stop error: %s", err)
346+
end
342347
343-
log.warn("end of profile")
344-
end)
348+
log.warn("end of profile")
349+
end)
345350
346-
where
351+
where:
347352

348-
* ``FILENAME``—the name of the binary file where profiling events are written
349-
* ``TIME``—duration of profiling, in seconds.
353+
* ``FILENAME``—the name of the binary file where profiling events are written
354+
* ``TIME``—duration of profiling, in seconds.
350355

351356
Also, you can directly call ``misc.memprof.start()`` and ``misc.memprof.stop()``
352357
from a console.
353358

354359
.. _profiler_analysis:
355360

356-
Profiling report analysis example
357-
---------------------------------
361+
Profiling a report analysis example
362+
-----------------------------------
358363

359364
In the example below, the following Lua code named ``format_concat.lua`` is
360365
investigated with the help of the memory profiler reports.
@@ -364,39 +369,39 @@ investigated with the help of the memory profiler reports.
364369
.. code-block:: lua
365370
:linenos:
366371
367-
-- Prevent allocations on new traces.
368-
jit.off()
372+
-- Prevent allocations on new traces.
373+
jit.off()
369374
370-
local function concat(a)
371-
local nstr = a.."a"
372-
return nstr
373-
end
375+
local function concat(a)
376+
local nstr = a.."a"
377+
return nstr
378+
end
374379
375-
local function format(a)
376-
local nstr = string.format("%sa", a)
377-
return nstr
378-
end
380+
local function format(a)
381+
local nstr = string.format("%sa", a)
382+
return nstr
383+
end
379384
380-
collectgarbage()
385+
collectgarbage()
381386
382-
local binfile = "/tmp/memprof_"..(arg[0]):match("([^/]*).lua")..".bin"
387+
local binfile = "/tmp/memprof_"..(arg[0]):match("([^/]*).lua")..".bin"
383388
384-
local st, err = misc.memprof.start(binfile)
385-
assert(st, err)
389+
local st, err = misc.memprof.start(binfile)
390+
assert(st, err)
386391
387-
-- Payload.
388-
for i = 1, 10000 do
389-
local f = format(i)
390-
local c = concat(i)
391-
end
392-
collectgarbage()
392+
-- Payload.
393+
for i = 1, 10000 do
394+
local f = format(i)
395+
local c = concat(i)
396+
end
397+
collectgarbage()
393398
394-
local st, err = misc.memprof.stop()
395-
assert(st, err)
399+
local st, err = misc.memprof.stop()
400+
assert(st, err)
396401
397-
os.exit()
402+
os.exit()
398403
399-
When you run this code :ref:`under Tarantool <profiler_usage_generate>` and
404+
When you run this code :ref:`in Tarantool <profiler_usage_generate>` and
400405
then :ref:`parse <profiler_usage_parse_command>` the binary memory profile
401406
in /tmp/memprof_format_concat.bin,
402407
you will get the following profiling report:
@@ -423,9 +428,9 @@ you will get the following profiling report:
423428
424429
Reasonable questions regarding the report can be:
425430

426-
* Why are there no allocations related to the ``concat()`` function?
427-
* Why is the number of allocations not a round number?
428-
* Why are there about 20K allocations instead of 10K?
431+
* Why are there no allocations related to the ``concat()`` function?
432+
* Why is the number of allocations not a round number?
433+
* Why are there about 20K allocations instead of 10K?
429434

430435
First of all, LuaJIT doesn't create a new string if the string with the same
431436
payload exists (see details on `lua-users.org/wiki <http://lua-users.org/wiki/ImmutableObjects>`_).
@@ -469,8 +474,8 @@ To profile only the ``concat()`` function, comment out line 23 (which is
469474
**Q**: But what will change if JIT compilation is enabled?
470475

471476
**A**: In the :ref:`code <profiler_usage_example03>`, comment out line 2 (which is
472-
``jit.off()``) and run
473-
the profiler . Now there are only 56 allocations in the report, and all other
477+
``jit.off()``) and run the profiler.
478+
Now there are only 56 allocations in the report, and all the other
474479
allocations are JIT-related (see also the related
475480
`dev issue <https://github.com/tarantool/tarantool/issues/5679>`_):
476481

@@ -494,8 +499,6 @@ allocations are JIT-related (see also the related
494499
@format_concat.lua:0 holds 640 bytes: 4 allocs, 0 frees
495500
INTERNAL holds 360 bytes: 2 allocs, 1 frees
496501
497-
498-
499502
This happens because a trace has been compiled after 56 iterations (the default
500503
value of the ``hotloop`` compiler parameter). Then, the
501504
JIT-compiler removed the unused variable ``c`` from the trace, and, therefore,
@@ -538,7 +541,7 @@ identifier is not yet compiled via LuaJIT. So, a trace can't be recorded and
538541
the compiler doesn't perform the corresponding optimizations.
539542

540543
If we change the ``format()`` function in lines 9-12 of the
541-
:ref:`Profiling report analysis example <profiler_usage_example03>`
544+
:ref:`Profiling a report analysis example <profiler_usage_example03>`
542545
in the following way
543546

544547
.. code-block:: lua
@@ -594,4 +597,4 @@ for example
594597
595598
$ tarantool -e 'require("memprof")(arg)' - --leak-only memprof_new.bin
596599
597-
When `--leak-only`` is used, only the HEAP SUMMARY section is displayed.
600+
When ``--leak-only`` is used, only the HEAP SUMMARY section is displayed.

0 commit comments

Comments
 (0)