
[mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. #65398

Merged: 2 commits, Sep 7, 2023

fabianmcg
Contributor

Currently, the NVPTX tool compilation path only calls `ptxas`, so the GPU running the binary must exactly match the target arch; otherwise the runtime throws an arch-mismatch error.

This patch adds a call to `fatbinary`, creating a fat binary that contains both the cubin object and the PTX code, which allows the driver to JIT the PTX at runtime if there's an arch mismatch.

This patch is needed to start migrating the Integration Tests; without it they would fail at runtime because of the architecture mismatch.
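As a rough illustration of the two-step flow described above, the sketch below only builds the argument lists for a `ptxas` call followed by a `fatbinary` call. The function names, file names, and the `smNumber` helper are hypothetical and not taken from the patch; the `-arch`, `-o`, `-64`, `--create`, and `--image3=kind=...,sm=...,file=...` spellings are assumed to match the CUDA toolkit in use and should be checked against its documentation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: turns a chip name such as "sm_70" into "70",
// the form assumed here for fatbinary's `sm=` field.
std::string smNumber(const std::string &chip) {
  assert(chip.rfind("sm_", 0) == 0 && "expected a chip name like sm_70");
  return chip.substr(3);
}

// Step 1: assemble the PTX into a cubin for one specific architecture.
std::vector<std::string> ptxasCommand(const std::string &chip,
                                      const std::string &ptxFile,
                                      const std::string &cubinFile) {
  return {"ptxas", "-arch", chip, ptxFile, "-o", cubinFile};
}

// Step 2: bundle the cubin (kind=elf) and the original PTX (kind=ptx)
// into one fat binary, so the driver can fall back to JIT-compiling the
// PTX when the cubin's architecture does not match the GPU.
std::vector<std::string> fatbinaryCommand(const std::string &chip,
                                          const std::string &ptxFile,
                                          const std::string &cubinFile,
                                          const std::string &fatbinFile) {
  const std::string sm = smNumber(chip);
  return {"fatbinary", "-64", "--create", fatbinFile,
          "--image3=kind=elf,sm=" + sm + ",file=" + cubinFile,
          "--image3=kind=ptx,sm=" + sm + ",file=" + ptxFile};
}
```

The point of the second step is that the PTX image travels alongside the cubin; a cubin-only binary has nothing for the driver to JIT from.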

@fabianmcg fabianmcg requested a review from a team as a code owner September 5, 2023 18:28
@fabianmcg fabianmcg requested a review from joker-eph September 5, 2023 18:44
@joker-eph joker-eph removed the request for review from a team September 5, 2023 19:30
@joker-eph
Collaborator

Is this something that should be under the control of the target attribute with an option there? The user may want to include the PTX or not...

@@ -184,7 +184,7 @@ class NVPTXSerializer : public SerializeGPUModuleBase {
   // 1. The toolkit path in `targetOptions`.
   // 2. In the system PATH.
   // 3. The path from `getCUDAToolkitPath()`.
-  std::optional<std::string> findPtxas() const;
+  std::optional<std::string> findTool(StringRef tool);
Collaborator

Please document the tool arg

@fabianmcg
Contributor Author

> Is this something that should be under the control of the target attribute with an option there? The user may want to include the PTX or not...

I'm not sure it belongs in the target attribute, since we tried to remove those kinds of flags from it. I'm thinking of something more like a compilation option passed through TargetOptions, with the fatbin path as the default, since users will likely expect that behavior.
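One way to picture that suggestion is an options struct whose output kind defaults to the fat binary. The sketch below is purely illustrative: `BinaryKindSketch`, `TargetOptionsSketch`, and `shouldRunFatbinary` are hypothetical names and do not reflect MLIR's actual `TargetOptions` class.

```cpp
#include <string>

// Hypothetical: which artifact the serializer should produce.
enum class BinaryKindSketch { CubinOnly, Fatbin };

// Hypothetical stand-in for compilation options passed to the serializer.
struct TargetOptionsSketch {
  std::string toolkitPath;
  // Fatbin is the default, so the PTX is embedded unless the user opts out.
  BinaryKindSketch binaryKind = BinaryKindSketch::Fatbin;
};

// The fatbinary step runs only when a fat binary was requested.
bool shouldRunFatbinary(const TargetOptionsSketch &options) {
  return options.binaryKind == BinaryKindSketch::Fatbin;
}
```

With this shape, a user who does not want the PTX included sets `binaryKind` to `CubinOnly`, and the target attribute itself stays free of such flags.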

@fabianmcg fabianmcg requested a review from a team as a code owner September 6, 2023 14:22
@github-actions github-actions bot added the mlir:core (MLIR Core Infrastructure) and mlir:gpu labels Sep 6, 2023
@fabianmcg fabianmcg requested review from joker-eph and removed request for a team September 6, 2023 14:24
@fabianmcg fabianmcg merged commit c16adb0 into llvm:main Sep 7, 2023
@fabianmcg fabianmcg deleted the nvptx-fatbin branch September 7, 2023 12:47
avillega pushed a commit to avillega/llvm-project that referenced this pull request Sep 11, 2023
…65398)

Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Sep 14, 2023
Local branch amd-gfx 0d8d006 Fix build warning in SIInsertWaterfall
Remote branch main c16adb0 [mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. (llvm#65398)