Commit af5cc55

[libc] Add loader option to force serial execution of GPU region
Summary: The loader is used as a test utility to run traditionally CPU-based unit tests on the GPU. This causes problems with something like `llvm-lit`, because the GPU runtimes have a nasty habit of either running out of resources or hanging when they are overloaded. To combat this, I added this option to force each process to perform the GPU part serially. This is done right now with a simple exclusive file lock (`flock`) on the executing file.

I was originally thinking about using more complex IPC to allow N processes to share execution, but that seemed overly complicated given the large number of failure modes it introduces. File locks are nice here because if the process crashes or is killed, the lock is released automatically (at least on Linux). This is in contrast to something like POSIX shared memory, which sticks around until it is unlinked: if someone sent `SIGKILL` to the program, it would never get cleaned up, and other processes might wait on a mutex that is never released.

Restricting this to one process isn't ideal, given that the runtime can likely handle at least a *few* separate processes, but this was easy and it works, so we might as well start here. This will hopefully unblock me on running the `libcxx` tests, as those ran with so much parallelism that spurious failures were very common.
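The claim that a killed process drops its lock can be checked with a small experiment. The following is a minimal sketch, assuming Linux `flock` semantics; `lock_is_held`, `kill_lock_holder`, and `KillResult` are hypothetical names used only for illustration and do not appear in the loader. A child process takes the lock and blocks, the parent confirms that the lock is held, sends `SIGKILL`, and then probes the lock again.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <fcntl.h>    // open, O_RDONLY
#include <sys/file.h> // flock, LOCK_EX, LOCK_NB
#include <sys/wait.h> // waitpid
#include <unistd.h>   // fork, pipe, read, write, close, pause

// Hypothetical probe: true if some other open file description currently
// holds an flock on `path`. A failed open is reported as "not held".
bool lock_is_held(const char *path) {
  const int fd = open(path, O_RDONLY);
  if (fd == -1)
    return false;
  const bool held = flock(fd, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK;
  close(fd); // also drops the lock if the probe happened to acquire it
  return held;
}

struct KillResult {
  bool held_while_alive; // lock state while the child was still running
  bool held_after_kill;  // lock state after the child received SIGKILL
};

// Hypothetical experiment: a child locks `path` and sleeps forever, then the
// parent kills it without giving it any chance to clean up.
KillResult kill_lock_holder(const char *path) {
  int ready[2];
  const int pipe_rc = pipe(ready);
  assert(pipe_rc == 0 && "pipe failed");
  (void)pipe_rc;

  const pid_t child = fork();
  assert(child != -1 && "fork failed");
  if (child == 0) {
    close(ready[0]);
    const int fd = open(path, O_RDONLY);
    if (fd == -1 || flock(fd, LOCK_EX) == -1)
      _exit(1);
    const char byte = 1;
    if (write(ready[1], &byte, 1) != 1) // tell the parent the lock is held
      _exit(1);
    for (;;)
      pause(); // never unlocks; only SIGKILL ends this process
  }

  close(ready[1]);
  char byte = 0;
  const bool child_ready = read(ready[0], &byte, 1) == 1;
  close(ready[0]);

  KillResult result;
  result.held_while_alive = child_ready && lock_is_held(path);
  kill(child, SIGKILL);
  waitpid(child, nullptr, 0); // wait until the kernel has reaped the child
  result.held_after_kill = lock_is_held(path);
  return result;
}
```

A lock stored in a POSIX shared-memory object would have no such owner to clean up after, which is the failure mode the summary describes.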
1 parent 97f723b commit af5cc55

1 file changed: +25 -0 lines changed
libc/utils/gpu/loader/Main.cpp

Lines changed: 25 additions & 0 deletions
@@ -20,9 +20,12 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/WithColor.h"
 
+#include <cerrno>
 #include <cstdio>
 #include <cstdlib>
+#include <cstring>
 #include <string>
+#include <sys/file.h>
 
 using namespace llvm;
 
@@ -62,6 +65,12 @@ static cl::opt<bool>
     cl::desc("Output resource usage of launched kernels"),
     cl::init(false), cl::cat(loader_category));
 
+static cl::opt<bool>
+    no_parallelism("no-parallelism",
+                   cl::desc("Allows only a single process to use the GPU at a "
+                            "time. Useful to suppress out-of-resource errors"),
+                   cl::init(false), cl::cat(loader_category));
+
 static cl::opt<std::string> file(cl::Positional, cl::Required,
                                  cl::desc("<gpu executable>"),
                                  cl::cat(loader_category));
@@ -98,12 +107,28 @@ int main(int argc, const char **argv, const char **envp) {
   llvm::transform(args, std::back_inserter(new_argv),
                   [](const std::string &arg) { return arg.c_str(); });
 
+  // Claim a file lock on the executable so only a single process can enter this
+  // region if requested. This prevents the loader from spurious failures.
+  int fd = -1;
+  if (no_parallelism) {
+    fd = open(argv[0], O_RDONLY);
+    if (flock(fd, LOCK_EX) == -1)
+      report_error(createStringError("Failed to lock '%s': %s", argv[0],
+                                     strerror(errno)));
+  }
+
   // Drop the loader from the program arguments.
   LaunchParameters params{threads_x, threads_y, threads_z,
                           blocks_x, blocks_y, blocks_z};
   int ret = load(new_argv.size(), new_argv.data(), envp,
                  const_cast<char *>(image.getBufferStart()),
                  image.getBufferSize(), params, print_resource_usage);
 
+  if (no_parallelism) {
+    if (flock(fd, LOCK_UN) == -1)
+      report_error(createStringError("Failed to unlock '%s': %s", argv[0],
+                                     strerror(errno)));
+  }
+
   return ret;
 }
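
The lock and unlock steps in the last hunk could also be wrapped in a scope guard. The following is a minimal sketch rather than the loader's code: `ScopedFileLock` and `is_locked_elsewhere` are hypothetical names, and the sketch reports errors with exceptions where the loader uses `report_error`. It differs from the hunk in three ways that are assumptions about what a hardened variant might want: it includes `<fcntl.h>` explicitly for `open` and `O_RDONLY`, it checks the result of `open` before calling `flock`, and it closes the descriptor when the guard goes out of scope.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>    // open, O_RDONLY
#include <stdexcept>
#include <string>
#include <sys/file.h> // flock, LOCK_EX, LOCK_NB, LOCK_UN
#include <unistd.h>   // close

// Hypothetical RAII guard (not part of the loader): holds an exclusive
// flock on `path` for as long as the object is alive.
class ScopedFileLock {
public:
  explicit ScopedFileLock(const std::string &path)
      : fd_(open(path.c_str(), O_RDONLY)) {
    if (fd_ == -1) {
      const int err = errno; // save errno before anything can overwrite it
      throw std::runtime_error("Failed to open '" + path +
                               "': " + std::strerror(err));
    }
    // Blocks until no other open file description holds the lock.
    if (flock(fd_, LOCK_EX) == -1) {
      const int err = errno;
      close(fd_);
      throw std::runtime_error("Failed to lock '" + path +
                               "': " + std::strerror(err));
    }
  }

  ~ScopedFileLock() {
    assert(fd_ != -1);
    flock(fd_, LOCK_UN); // explicit unlock; close() alone would also release
    close(fd_);
  }

  ScopedFileLock(const ScopedFileLock &) = delete;
  ScopedFileLock &operator=(const ScopedFileLock &) = delete;

private:
  int fd_;
};

// Hypothetical probe: true if a different open file description currently
// holds the lock on `path`. A failed open is reported as "not locked".
bool is_locked_elsewhere(const std::string &path) {
  const int fd = open(path.c_str(), O_RDONLY);
  if (fd == -1)
    return false;
  const bool busy = flock(fd, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK;
  close(fd); // also drops the lock if the probe acquired it
  return busy;
}
```

With a guard like this, the serialized region would be the scope in which the `ScopedFileLock` object lives, so an early return or exception between locking and unlocking could not leave the descriptor open.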
