[Libomptarget] Fix JIT on the NVPTX target by calling ptxas manually #77801
Summary:
Recently a patch added an assertion in the GlobalHandler to indicate when the image is not an ELF. This began to fire whenever NVPTX JIT was used, because the JIT pass outputs a PTX file instead of an ELF. The `cuModuleLoad` method consumes the `.s` file internally and compiles it to a cubin; however, this is too late, because we perform several checks on the ELF directly for the presence of certain symbols and to read some necessary constants. This results in inconsistent behaviour.
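For illustration only, a minimal sketch of the kind of check involved: the handler expects the image to begin with the ELF magic bytes, while the NVPTX JIT output is plain PTX text, so any such check fails before `cuModuleLoad` ever sees the image. The helper name and signature here are hypothetical, not the actual GlobalHandler code.

```cpp
#include <cstring>

// Hypothetical helper: returns true if the image begins with the ELF magic
// (0x7f 'E' 'L' 'F'). JIT-produced PTX is plain text (".version", ".target",
// ...), so a check like this rejects it and the new assertion fires.
static bool isELFImage(const char *Image, size_t Size) {
  static const unsigned char Magic[4] = {0x7f, 'E', 'L', 'F'};
  return Size >= sizeof(Magic) && std::memcmp(Image, Magic, sizeof(Magic)) == 0;
}
```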
To address this, this patch simply calls `ptxas` manually, similar to how `lld` is called for the AMDGPU JIT pass. This is inevitably going to be slower than simply passing the PTX to the CUDA routine, due to the overhead of file IO and a fork call, but it is necessary for correctness.
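A rough sketch of this approach, not the actual plugin code: write the PTX to a temporary file, invoke `ptxas` as an external process, and read the resulting cubin back into memory. The fixed paths, the `sm_70` architecture, and the use of `std::system` are placeholders; the real implementation would use LLVM's temporary-file and process utilities, as the AMDGPU plugin does for `lld`.

```cpp
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>

// Sketch: compile a PTX string to a cubin by shelling out to ptxas.
// Returns the cubin bytes, or an empty string on failure.
static std::string compilePTXWithPtxas(const std::string &PTX) {
  const std::string PTXPath = "/tmp/omp-jit.ptx";    // placeholder temp file
  const std::string CubinPath = "/tmp/omp-jit.cubin";

  std::ofstream(PTXPath) << PTX;                     // write the JIT output

  // Invoke ptxas; -arch must match the device the image will run on.
  std::string Cmd = "ptxas -arch=sm_70 -o " + CubinPath + " " + PTXPath;
  if (std::system(Cmd.c_str()) != 0)
    return {};

  // Read the cubin back; this is now a real ELF the plugin can inspect.
  std::ifstream In(CubinPath, std::ios::binary);
  std::ostringstream SS;
  SS << In.rdbuf();
  return SS.str();
}
```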
CUDA provides an API for compiling PTX manually. However, it only started showing up in CUDA 11.1 and is only provided "officially" as a static library. The `libnvidia-ptxjitcompiler.so` shipped next to the CUDA driver has the same symbols and could likely be used as a replacement. That would be the faster solution, but given that it is not documented it may have some issues.
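For comparison, a hedged sketch of what the in-process route might look like, assuming the `nvPTXCompiler` interface that ships with CUDA 11.1 and later (and that `libnvidia-ptxjitcompiler.so` appears to expose). This is not part of the patch, and the `sm_70` option is a placeholder.

```cpp
#include <nvPTXCompiler.h> // static libnvptxcompiler, CUDA >= 11.1
#include <string>
#include <vector>

// Sketch: compile PTX to a cubin in-process, avoiding the file IO and fork
// of the ptxas route. Error handling is reduced to returning an empty buffer.
static std::vector<char> compilePTXInProcess(const std::string &PTX) {
  nvPTXCompilerHandle Compiler = nullptr;
  if (nvPTXCompilerCreate(&Compiler, PTX.size(), PTX.c_str()) !=
      NVPTXCOMPILE_SUCCESS)
    return {};

  // Options mirror ptxas flags; the GPU name must match the target device.
  const char *Options[] = {"--gpu-name=sm_70"};
  if (nvPTXCompilerCompile(Compiler, 1, Options) != NVPTXCOMPILE_SUCCESS) {
    nvPTXCompilerDestroy(&Compiler);
    return {};
  }

  size_t Size = 0;
  nvPTXCompilerGetCompiledProgramSize(Compiler, &Size);
  std::vector<char> Cubin(Size);
  nvPTXCompilerGetCompiledProgram(Compiler, Cubin.data());
  nvPTXCompilerDestroy(&Compiler);
  return Cubin;
}
```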