Commit ca594fe

Update documentation and release notes for llvm-profgen COFF support (#84864)
This change:
- Updates the existing Clang User's Manual section on SPGO so that it describes how to use llvm-profgen to perform SPGO on Windows. This is new functionality implemented in #83972.
- Fixes a minor typo in the existing llvm-profgen invocation example.
- Adds an LLVM release note on this new functionality in llvm-profgen.
1 parent a51d13f commit ca594fe

2 files changed

+52
-13
lines changed

clang/docs/UsersManual.rst

Lines changed: 47 additions & 13 deletions
@@ -2441,20 +2441,39 @@ usual build cycle when using sample profilers for optimization:
24412441

24422442
1. Build the code with source line table information. You can use all the
24432443
usual build flags that you always build your application with. The only
2444-
requirement is that you add ``-gline-tables-only`` or ``-g`` to the
2445-
command line. This is important for the profiler to be able to map
2446-
instructions back to source line locations.
2444+
requirement is that DWARF debug info including source line information is
2445+
generated. This DWARF information is important for the profiler to be able
2446+
to map instructions back to source line locations.
2447+
2448+
On Linux, ``-g`` or just ``-gline-tables-only`` is sufficient:
24472449

24482450
.. code-block:: console
24492451
24502452
$ clang++ -O2 -gline-tables-only code.cc -o code
24512453
2454+
While MSVC-style targets default to CodeView debug information, DWARF debug
2455+
information is required to generate source-level LLVM profiles. Use
2456+
``-gdwarf`` to include DWARF debug information:
2457+
2458+
.. code-block:: console
2459+
2460+
$ clang-cl -O2 -gdwarf -gline-tables-only coff-profile.cpp -fuse-ld=lld -link -debug:dwarf
2461+
24522462
2. Run the executable under a sampling profiler. The specific profiler
24532463
you use does not really matter, as long as its output can be converted
2454-
into the format that the LLVM optimizer understands. Currently, there
2455-
exists a conversion tool for the Linux Perf profiler
2456-
(https://perf.wiki.kernel.org/), so these examples assume that you
2457-
are using Linux Perf to profile your code.
2464+
into the format that the LLVM optimizer understands.
2465+
2466+
Two such profilers are the Linux Perf profiler
2467+
(https://perf.wiki.kernel.org/) and Intel's Sampling Enabling Product (SEP),
2468+
available as part of `Intel VTune
2469+
<https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html>`_.
2470+
While Perf is Linux-specific, SEP can be used on Linux, Windows, and FreeBSD.
2471+
2472+
The LLVM tool ``llvm-profgen`` can convert output of either Perf or SEP. An
2473+
external project, `AutoFDO <https://github.com/google/autofdo>`_, also
2474+
provides a ``create_llvm_prof`` tool which supports Linux Perf output.
2475+
2476+
When using Perf:
24582477

24592478
.. code-block:: console
24602479
@@ -2465,11 +2484,19 @@ usual build cycle when using sample profilers for optimization:
24652484
it provides better call information, which improves the accuracy of
24662485
the profile data.
24672486

2468-
3. Convert the collected profile data to LLVM's sample profile format.
2469-
This is currently supported via the AutoFDO converter ``create_llvm_prof``.
2470-
It is available at https://github.com/google/autofdo. Once built and
2471-
installed, you can convert the ``perf.data`` file to LLVM using
2472-
the command:
2487+
When using SEP:
2488+
2489+
.. code-block:: console
2490+
2491+
$ sep -start -out code.tb7 -ec BR_INST_RETIRED.NEAR_TAKEN:precise=yes:pdir -lbr no_filter:usr -perf-script brstack -app ./code
2492+
2493+
This produces a ``code.perf.data.script`` output which can be used with
2494+
``llvm-profgen``'s ``--perfscript`` input option.
2495+
2496+
3. Convert the collected profile data to LLVM's sample profile format. This is
2497+
currently supported via the `AutoFDO <https://github.com/google/autofdo>`_
2498+
converter ``create_llvm_prof``. Once built and installed, you can convert
2499+
the ``perf.data`` file to LLVM using the command:
24732500

24742501
.. code-block:: console
24752502
@@ -2485,7 +2512,14 @@ usual build cycle when using sample profilers for optimization:
24852512

24862513
.. code-block:: console
24872514
2488-
$ llvm-profgen --binary=./code --output=code.prof--perfdata=perf.data
2515+
$ llvm-profgen --binary=./code --output=code.prof --perfdata=perf.data
2516+
2517+
When using SEP the output is in the textual format corresponding to
2518+
``llvm-profgen --perfscript``. For example:
2519+
2520+
.. code-block:: console
2521+
2522+
$ llvm-profgen --binary=./code --output=code.prof --perfscript=code.perf.data.script
24892523
24902524
24912525
4. Build the code again using the collected profile. This step feeds
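
As an aside on step 3 above, the choice between ``--perfdata`` (binary Perf output) and ``--perfscript`` (the textual output that SEP produces) can be sketched as a small shell helper. This is only an illustration: the function name ``profgen_cmd`` and the rule of keying off a ``.script`` file suffix are assumptions made here, not part of this commit or of ``llvm-profgen`` itself. The helper prints the command rather than running it, so it can be inspected first.

```shell
# Hypothetical helper (not part of LLVM): print the llvm-profgen command
# that matches the kind of profile input.
#   $1 = profiled binary, $2 = profile input, $3 = output profile
profgen_cmd() {
  pg_binary=$1
  pg_input=$2
  pg_output=$3
  # Assumption: textual perf-script input (e.g. SEP's code.perf.data.script)
  # ends in ".script"; anything else is treated as binary perf.data.
  case "$pg_input" in
    *.script) pg_flag=--perfscript ;;
    *)        pg_flag=--perfdata ;;
  esac
  printf 'llvm-profgen --binary=%s --output=%s %s=%s\n' \
    "$pg_binary" "$pg_output" "$pg_flag" "$pg_input"
}
```

For example, ``profgen_cmd ./code code.perf.data.script code.prof`` prints the SEP-style invocation shown in the diff, and ``profgen_cmd ./code perf.data code.prof`` prints the Perf one.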

llvm/docs/ReleaseNotes.rst

Lines changed: 5 additions & 0 deletions
@@ -181,6 +181,11 @@ Changes to the LLVM tools
181181
for ELF input to skip the specified symbols when executing other options
182182
that can change a symbol's name, binding or visibility.
183183

184+
* llvm-profgen now supports COFF+DWARF binaries. This enables Sample-based PGO
185+
on Windows using Intel VTune's SEP. For details on usage, see the `end-user
186+
documentation for SPGO
187+
<https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers>`_.
188+
184189
Changes to LLDB
185190
---------------------------------
186191
