Commit 4dc108d

Update documentation and release notes for llvm-profgen COFF support
This change:

- Updates the existing Clang User's Manual section on SPGO so that it describes how to use llvm-profgen to perform SPGO on Windows. This is new functionality implemented in llvm#83972.
- Fixes a minor typo in the existing llvm-profgen invocation example.
- Adds an LLVM release note on this new functionality in llvm-profgen.
1 parent 9a9aa41 commit 4dc108d

2 files changed: +44 -8 lines changed


clang/docs/UsersManual.rst

Lines changed: 39 additions & 8 deletions
@@ -2410,20 +2410,35 @@ usual build cycle when using sample profilers for optimization:
 
 1. Build the code with source line table information. You can use all the
    usual build flags that you always build your application with. The only
-   requirement is that you add ``-gline-tables-only`` or ``-g`` to the
-   command line. This is important for the profiler to be able to map
-   instructions back to source line locations.
+   requirement is that DWARF debug info including source line information is
+   generated. This DWARF information is important for the profiler to be able
+   to map instructions back to source line locations.
+
+   On Linux, ``-g`` or just ``-gline-tables-only`` is sufficient:
 
    .. code-block:: console
 
      $ clang++ -O2 -gline-tables-only code.cc -o code
 
+   It is also possible to include DWARF in Windows binaries:
+
+   .. code-block:: console
+
+     $ clang-cl -O2 -gdwarf -gline-tables-only coff-profile.cpp -fuse-ld=lld -link -debug:dwarf
+
 2. Run the executable under a sampling profiler. The specific profiler
    you use does not really matter, as long as its output can be converted
-   into the format that the LLVM optimizer understands. Currently, there
-   exists a conversion tool for the Linux Perf profiler
-   (https://perf.wiki.kernel.org/), so these examples assume that you
-   are using Linux Perf to profile your code.
+   into the format that the LLVM optimizer understands.
+
+   Two such profilers are the Linux Perf profiler
+   (https://perf.wiki.kernel.org/) and Intel's Sampling Enabling Product (SEP),
+   available as part of `Intel VTune
+   <https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html>`_.
+
+   The LLVM tool ``llvm-profgen`` can convert output of either Perf or SEP. An
+   external tool, AutoFDO, also supports Linux Perf output.
+
+   When using Perf:
 
    .. code-block:: console
 
@@ -2434,6 +2449,15 @@ usual build cycle when using sample profilers for optimization:
    it provides better call information, which improves the accuracy of
    the profile data.
 
+   When using SEP:
+
+   .. code-block:: console
+
+     $ sep -start -ec BR_INST_RETIRED.NEAR_TAKEN:precise=yes:pdir -lbr no_filter:usr -perf-script ip,brstack -app ./code
+
+   This produces a ``perf.data.script`` output which can be used with
+   ``llvm-profgen``'s ``--perfscript`` input option.
+
 3. Convert the collected profile data to LLVM's sample profile format.
    This is currently supported via the AutoFDO converter ``create_llvm_prof``.
    It is available at https://github.com/google/autofdo. Once built and
@@ -2454,7 +2478,14 @@ usual build cycle when using sample profilers for optimization:
 
    .. code-block:: console
 
-     $ llvm-profgen --binary=./code --output=code.prof--perfdata=perf.data
+     $ llvm-profgen --binary=./code --output=code.prof --perfdata=perf.data
+
+   When using SEP, the output is in the textual format corresponding to
+   ``llvm-profgen --perfscript``. For example:
+
+   .. code-block:: console
+
+     $ llvm-profgen --binary=./code --output=code.prof --perfscript=perf.data.script
 
 
 4. Build the code again using the collected profile. This step feeds
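
The two ``llvm-profgen`` invocations in this file's diff differ only in the input option: ``--perfdata`` for the binary ``perf.data`` written by Linux Perf, and ``--perfscript`` for the textual ``perf.data.script`` produced by SEP. The following is a minimal sketch of a wrapper that picks the option from the profile's file name. The helper name ``profgen_cmd`` and the ``.script`` suffix convention are assumptions made for illustration and are not part of the commit; the function only prints the command line and does not run ``llvm-profgen``.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): print the llvm-profgen
# command line for a given binary, profile file, and output file.
# Only the --binary, --output, --perfdata and --perfscript options are
# taken from the documentation diff; the suffix check is an assumption.
profgen_cmd() {
    binary=$1
    profile=$2
    output=$3
    case "$profile" in
        *.script) input_opt="--perfscript" ;;  # textual script, e.g. from SEP
        *)        input_opt="--perfdata" ;;    # binary perf.data from Linux Perf
    esac
    echo "llvm-profgen --binary=$binary --output=$output $input_opt=$profile"
}

# Print the command for each kind of profile.
profgen_cmd ./code perf.data code.prof
profgen_cmd ./code perf.data.script code.prof
```

Printing the command instead of executing it keeps the sketch usable on machines without an LLVM toolchain; piping the output to ``sh`` would be one way to run it once the command has been checked.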

llvm/docs/ReleaseNotes.rst

Lines changed: 5 additions & 0 deletions
@@ -157,6 +157,11 @@ Changes to the LLVM tools
   ``--set-symbols-visibility`` options for ELF input to change the
   visibility of symbols.
 
+* llvm-profgen now supports COFF+DWARF binaries. This enables Sample-based PGO
+  on Windows using Intel VTune's SEP. For details on usage, see the `end-user
+  documentation for SPGO
+  <https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers>`_.
+
 Changes to LLDB
 ---------------------------------
 
