Commit 4a17e86

[LinkerWrapper] Add an overriding option for debugging (#91984)
Summary: One of the downsides of the linker wrapper is that it made debugging more difficult. It is very powerful in that it can resolve a lot of input matching and library handling that could not be done before. However, the old method allowed users to simply copy-paste the script files to modify the output and test it. This patch attempts to make it easier to debug changes by letting the user override all the linker inputs. That is, we provide a user-created binary that is treated like the final output of the device link step. The intended use-case is for using `-save-temps` to get some IR, then modifying the IR and sticking it back in to see if it exhibits the old failures.
1 parent e417e61 commit 4a17e86

4 files changed: 92 additions, 0 deletions

clang/docs/ClangLinkerWrapper.rst

Lines changed: 38 additions & 0 deletions
@@ -46,6 +46,8 @@ only for the linker wrapper will be forwarded to the wrapped linker job.
     -l <libname>           Search for library <libname>
     --opt-level=<O0, O1, O2, or O3>
                            Optimization level for LTO
+    --override-image=<kind=file>
+                           Uses the provided file as if it were the output of the device link step
     -o <path>              Path to file to write output
     --pass-remarks-analysis=<value>
                            Pass remarks for LTO
@@ -87,6 +89,42 @@ other. Generally, this requires that the target triple and architecture match.
 An exception is made when the architecture is listed as ``generic``, which will
 cause it to be linked with any other device code with the same target triple.
 
+Debugging
+=========
+
+The linker wrapper performs a lot of steps internally, such as input matching,
+symbol resolution, and image registration. This makes it difficult to debug in
+some scenarios. The behavior of the linker-wrapper is controlled mostly through
+metadata, described in `clang documentation
+<https://clang.llvm.org/docs/OffloadingDesign.html>`_. Intermediate output can
+be obtained from the linker-wrapper using the ``--save-temps`` flag. These files
+can then be modified.
+
+.. code-block:: sh
+
+  $> clang openmp.c -fopenmp --offload-arch=gfx90a -c
+  $> clang openmp.o -fopenmp --offload-arch=gfx90a -Wl,--save-temps
+  $> # Modify temp files.
+  $> llvm-objcopy --update-section=.llvm.offloading=out.bc openmp.o
+
+Doing this will allow you to override one of the input files by replacing its
+embedded offloading metadata with a user-modified version. However, this will be
+more difficult when there are multiple input files. For a very large hammer, the
+``--override-image=<kind>=<file>`` flag can be used.
+
+In the following example, we use ``--save-temps`` to obtain the LLVM-IR just
+before running the backend. We then modify it to test altered behavior and
+compile it to a binary. This can then be passed to the linker-wrapper, which will
+ignore all embedded metadata and use the provided image as if it were the
+result of the device linking phase.
+
+.. code-block:: sh
+
+  $> clang openmp.c -fopenmp --offload-arch=gfx90a -Wl,--save-temps
+  $> # Modify temp files.
+  $> clang --target=amdgcn-amd-amdhsa -mcpu=gfx90a -nogpulib out.bc -o a.out
+  $> clang openmp.c -fopenmp --offload-arch=gfx90a -Wl,--override-image=openmp=a.out
+
 Example
 =======
 
clang/test/Driver/linker-wrapper.c

Lines changed: 7 additions & 0 deletions
@@ -226,3 +226,10 @@ __attribute__((visibility("protected"), used)) int x;
 // RELOCATABLE-LINK-CUDA: fatbinary{{.*}} -64 --create {{.*}}.fatbin --image=profile=sm_89,file={{.*}}.img
 // RELOCATABLE-LINK-CUDA: /usr/bin/ld.lld{{.*}}-r
 // RELOCATABLE-LINK-CUDA: llvm-objcopy{{.*}}a.out --remove-section .llvm.offloading
+
+// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
+// RUN:   --linker-path=/usr/bin/ld --override-image=openmp=%t.o %t.o -o a.out 2>&1 \
+// RUN: | FileCheck %s --check-prefix=OVERRIDE
+// OVERRIDE-NOT: clang
+// OVERRIDE: /usr/bin/ld

clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Lines changed: 43 additions & 0 deletions
@@ -1149,6 +1149,39 @@ DerivedArgList getLinkerArgs(ArrayRef<OffloadFile> Input,
   return DAL;
 }
 
+Error handleOverrideImages(
+    const InputArgList &Args,
+    DenseMap<OffloadKind, SmallVector<OffloadingImage>> &Images) {
+  for (StringRef Arg : Args.getAllArgValues(OPT_override_image)) {
+    OffloadKind Kind = getOffloadKind(Arg.split("=").first);
+    StringRef Filename = Arg.split("=").second;
+
+    ErrorOr<std::unique_ptr<MemoryBuffer>> BufferOrErr =
+        MemoryBuffer::getFileOrSTDIN(Filename);
+    if (std::error_code EC = BufferOrErr.getError())
+      return createFileError(Filename, EC);
+
+    Expected<std::unique_ptr<ObjectFile>> ElfOrErr =
+        ObjectFile::createELFObjectFile(**BufferOrErr,
+                                        /*InitContent=*/false);
+    if (!ElfOrErr)
+      return ElfOrErr.takeError();
+    ObjectFile &Elf = **ElfOrErr;
+
+    OffloadingImage TheImage{};
+    TheImage.TheImageKind = IMG_Object;
+    TheImage.TheOffloadKind = Kind;
+    TheImage.StringData["triple"] =
+        Args.MakeArgString(Elf.makeTriple().getTriple());
+    if (std::optional<StringRef> CPU = Elf.tryGetCPUName())
+      TheImage.StringData["arch"] = Args.MakeArgString(*CPU);
+    TheImage.Image = std::move(*BufferOrErr);
+
+    Images[Kind].emplace_back(std::move(TheImage));
+  }
+  return Error::success();
+}
+
 /// Transforms all the extracted offloading input files into an image that can
 /// be registered by the runtime.
 Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
@@ -1158,6 +1191,12 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
 
   std::mutex ImageMtx;
   DenseMap<OffloadKind, SmallVector<OffloadingImage>> Images;
+
+  // Initialize the images with any overriding inputs.
+  if (Args.hasArg(OPT_override_image))
+    if (Error Err = handleOverrideImages(Args, Images))
+      return std::move(Err);
+
   auto Err = parallelForEachError(LinkerInputFiles, [&](auto &Input) -> Error {
     llvm::TimeTraceScope TimeScope("Link device input");
 
@@ -1439,6 +1478,10 @@ Expected<SmallVector<SmallVector<OffloadFile>>>
 getDeviceInput(const ArgList &Args) {
   llvm::TimeTraceScope TimeScope("ExtractDeviceCode");
 
+  // Skip all the input if the user is overriding the output.
+  if (Args.hasArg(OPT_override_image))
+    return SmallVector<SmallVector<OffloadFile>>();
+
   StringRef Root = Args.getLastArgValue(OPT_sysroot_EQ);
   SmallVector<StringRef> LibraryPaths;
   for (const opt::Arg *Arg : Args.filtered(OPT_library_path, OPT_libpath))
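
The `handleOverrideImages` function in this file splits each `--override-image` value at the first `=`, in the same way as `StringRef::split`. The following is a minimal standalone sketch of that splitting rule using only the standard library. `splitFirst`, `OverrideSpec`, and `parseOverrideImage` are hypothetical names for illustration and are not part of the patch. The emptiness check is an illustrative addition.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Stand-in for llvm::StringRef::split (assumption: same semantics). Cuts at
// the FIRST separator; when the separator is absent, the whole input is the
// first half and the second half is empty.
inline std::pair<std::string_view, std::string_view>
splitFirst(std::string_view S, char Sep) {
  const std::size_t Pos = S.find(Sep);
  if (Pos == std::string_view::npos)
    return {S, std::string_view()};
  return {S.substr(0, Pos), S.substr(Pos + 1)};
}

// Hypothetical result type for illustration; not part of the patch.
struct OverrideSpec {
  std::string Kind; // e.g. "openmp"
  std::string File; // e.g. "a.out"
};

// Hypothetical helper: parse "<kind>=<file>" and reject values that are
// missing either half. This check is illustrative only.
inline std::optional<OverrideSpec> parseOverrideImage(std::string_view Value) {
  auto [Kind, File] = splitFirst(Value, '=');
  if (Kind.empty() || File.empty())
    return std::nullopt;
  return OverrideSpec{std::string(Kind), std::string(File)};
}
```

Under this sketch, `parseOverrideImage("openmp=a.out")` yields the kind `openmp` and the file `a.out`. A value that contains a second `=` keeps everything after the first one as part of the file name.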

clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td

Lines changed: 4 additions & 0 deletions
@@ -74,6 +74,10 @@ def wrapper_jobs : Joined<["--"], "wrapper-jobs=">,
   Flags<[WrapperOnlyOption]>, MetaVarName<"<number>">,
   HelpText<"Sets the number of parallel jobs to use for device linking">;
 
+def override_image : Joined<["--"], "override-image=">,
+  Flags<[WrapperOnlyOption]>, MetaVarName<"<kind=file>">,
+  HelpText<"Uses the provided file as if it were the output of the device link step">;
+
 // Flags passed to the device linker.
 def arch_EQ : Joined<["--"], "arch=">,
   Flags<[DeviceOnlyOption, HelpHidden]>, MetaVarName<"<arch>">,
