[mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. #65398
Currently, the NVPTX tool compilation path only calls `ptxas`; thus, the GPU running the binary must be an exact match of the arch of the target, or else the runtime throws an error due to the arch mismatch. This patch adds a call to `fatbinary`, creating a fat binary with the cubin object and the PTX code, allowing the driver to JIT the PTX at runtime if there's an arch mismatch.
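For illustration, the two tool invocations described above can be sketched as plain argument lists. This is a minimal sketch, not the code in this patch: the helper names `ptxasArgs` and `fatbinaryArgs` are hypothetical, and the `fatbinary` flag spellings (`-64`, `--create`, `--image3=kind=...`) follow what `nvcc --dryrun` prints in recent CUDA toolkits, which is an assumption that may not hold for every toolkit version.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: argument list for assembling PTX into a cubin that
// targets exactly one architecture (e.g. chip = "sm_70").
std::vector<std::string> ptxasArgs(const std::string &chip,
                                   const std::string &ptxFile,
                                   const std::string &cubinFile) {
  return {"ptxas", "-arch", chip, ptxFile, "-o", cubinFile};
}

// Hypothetical helper: argument list for bundling the cubin and the PTX into
// one fat binary. The flag spellings are an assumption (see the lead-in).
std::vector<std::string> fatbinaryArgs(const std::string &chip,
                                       const std::string &cubinFile,
                                       const std::string &ptxFile,
                                       const std::string &fatbinFile) {
  // "sm_70" -> "70"; a chip string without '_' is used unchanged.
  const std::string sm = chip.substr(chip.find('_') + 1);
  return {"fatbinary",
          "-64",
          "--create",
          fatbinFile,
          // Native code, usable only on a matching architecture.
          "--image3=kind=elf,sm=" + sm + ",file=" + cubinFile,
          // PTX fallback that the driver can JIT on a mismatching GPU.
          "--image3=kind=ptx,sm=" + sm + ",file=" + ptxFile};
}
```

With both images in the container, the driver can use the cubin when the GPU matches and fall back to JIT-compiling the PTX otherwise.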
Is this something that should be under the control of the target attribute with an option there? The user may want to include the PTX or not...
@@ -184,7 +184,7 @@ class NVPTXSerializer : public SerializeGPUModuleBase {
   // 1. The toolkit path in `targetOptions`.
   // 2. In the system PATH.
   // 3. The path from `getCUDAToolkitPath()`.
-  std::optional<std::string> findPtxas() const;
+  std::optional<std::string> findTool(StringRef tool);
Please document the `tool` arg.
I'm not sure if it should be in the target attribute, as we tried to remove those kinds of flags from it. However, I'm thinking something more like a compilation option passed through
Local branch amd-gfx 0d8d006 Fix build warning in SIInsertWaterfall Remote branch main c16adb0 [mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. (llvm#65398)
This patch is needed to start migrating the Integration Tests; otherwise, they will fail with a runtime error due to the architecture mismatch.