
[MLIR][NVVM] Add dumpISA and dumpMachineISA Flags #116199


Closed

Conversation

grypp
Member

@grypp grypp commented Nov 14, 2024

Currently, dumping the virtual and machine-level ISA is only possible in debug builds, which makes it unavailable in release builds. The ability to inspect these ISAs is valuable even with release builds and production compilers; `nvcc`, for instance, provides similar functionality and is a useful reference.

This PR introduces `dumpISA` and `dumpMachineISA` flags to the `GpuModuleToBinaryPass`.

It also adds `dump-ptx` and `dump-sass` flags to the `GPUToNVVMPipelineOptions`.
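
As a usage sketch, assuming a hypothetical input file `kernel.mlir` (the flag spellings follow the pipeline options added here and mirror the updated integration tests):

```sh
# Lower through the NVVM pipeline and dump the generated PTX and the
# disassembled SASS to the error output; both flags default to false.
mlir-opt kernel.mlir \
  -gpu-lower-to-nvvm-pipeline="dump-ptx=true dump-sass=true" 2>&1
```
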
@llvmbot
Member

llvmbot commented Nov 14, 2024

@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir

Author: Guray Ozen (grypp)

Changes

Currently, dumping the virtual and machine-level ISA is only possible in debug builds, which makes it unavailable in release builds. The ability to inspect these ISAs is valuable even with release builds and production compilers; `nvcc`, for instance, provides similar functionality and is a useful reference.

This PR introduces `dumpISA` and `dumpMachineISA` flags to the `GpuModuleToBinaryPass`.

It also adds `dump-ptx` and `dump-sass` flags to the `GPUToNVVMPipelineOptions`.


Full diff: https://github.com/llvm/llvm-project/pull/116199.diff

11 Files Affected:

  • (modified) mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h (+16-2)
  • (modified) mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h (+6)
  • (modified) mlir/include/mlir/Dialect/GPU/Transforms/Passes.td (+9-1)
  • (modified) mlir/include/mlir/Target/LLVM/NVVM/Utils.h (+1-1)
  • (modified) mlir/lib/Dialect/GPU/IR/GPUDialect.cpp (+12-4)
  • (modified) mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp (+2)
  • (modified) mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp (+1-1)
  • (modified) mlir/lib/Target/LLVM/NVVM/Target.cpp (+14-14)
  • (modified) mlir/lib/Target/LLVM/ROCDL/Target.cpp (+3-4)
  • (modified) mlir/test/Integration/GPU/CUDA/dump-ptx.mlir (+1-1)
  • (modified) mlir/test/Integration/GPU/CUDA/dump-sass.mlir (+1-1)
diff --git a/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h b/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
index 6d7cb5ca7a7f81..88111aed2a0a6b 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
+++ b/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
@@ -52,7 +52,8 @@ class TargetOptions {
       StringRef toolkitPath = {}, ArrayRef<std::string> linkFiles = {},
       StringRef cmdOptions = {},
       CompilationTarget compilationTarget = getDefaultCompilationTarget(),
-      function_ref<SymbolTable *()> getSymbolTableCallback = {});
+      function_ref<SymbolTable *()> getSymbolTableCallback = {},
+      bool dumpISA = false, bool dumpMachineISA = false);
 
   /// Returns the typeID.
   TypeID getTypeID() const;
@@ -66,6 +67,12 @@ class TargetOptions {
   /// Returns the command line options.
   StringRef getCmdOptions() const;
 
+  /// Returns whether the generated ISA should be dumped.
+  bool getDumpISA() const;
+
+  /// Returns whether the machine-level ISA should be dumped.
+  bool getDumpMachineISA() const;
+
   /// Returns a tokenization of the command line options.
   std::pair<llvm::BumpPtrAllocator, SmallVector<const char *>>
   tokenizeCmdOptions() const;
@@ -90,7 +97,8 @@ class TargetOptions {
       TypeID typeID, StringRef toolkitPath = {},
       ArrayRef<std::string> linkFiles = {}, StringRef cmdOptions = {},
       CompilationTarget compilationTarget = getDefaultCompilationTarget(),
-      function_ref<SymbolTable *()> getSymbolTableCallback = {});
+      function_ref<SymbolTable *()> getSymbolTableCallback = {},
+      bool dumpISA = false, bool dumpMachineISA = false);
 
   /// Path to the target toolkit.
   std::string toolkitPath;
@@ -102,6 +110,12 @@ class TargetOptions {
   /// process.
   std::string cmdOptions;
 
+  /// An optional flag to dump generated ISA.
+  bool dumpISA = false;
+
+  /// An optional flag to dump generated and disassembled machine ISA.
+  bool dumpMachineISA = false;
+
   /// Compilation process target format.
   CompilationTarget compilationTarget;
 
diff --git a/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h b/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
index caa0901bb49434..485eac6bad5b89 100644
--- a/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
+++ b/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
@@ -53,6 +53,12 @@ struct GPUToNVVMPipelineOptions
           "Whether to use the bareptr calling convention on the host (warning "
           "this should be false until the GPU layering is fixed)"),
       llvm::cl::init(false)};
+  PassOptions::Option<bool> dumpPtx{
+      *this, "dump-ptx", llvm::cl::desc("Dumps PTX code to the error output"),
+      llvm::cl::init(false)};
+  PassOptions::Option<bool> dumpSass{
+      *this, "dump-sass", llvm::cl::desc("Dumps SASS code to the error output"),
+      llvm::cl::init(false)};
 };
 
 //===----------------------------------------------------------------------===//
diff --git a/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td b/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
index 4a9ddafdd177d2..f7cd9dd8c15bfb 100644
--- a/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
@@ -95,7 +95,15 @@ def GpuModuleToBinaryPass
     Option<"cmdOptions", "opts", "std::string", [{""}],
            "Command line options to pass to the tools.">,
     Option<"compilationTarget", "format", "std::string", [{"fatbin"}],
-           "The target representation of the compilation process.">
+           "The target representation of the compilation process.">,           
+    Option<"dumpISA", "dump-isa", "bool",
+           /*default=*/"false",
+           "Dumps generated ISA to the error output.">,
+    Option<"dumpMachineISA", "dump-machine-isa", "bool",
+           /*default=*/"false",
+           "Dumps the generated machine-level ISA to the error output. "
+           "If the generated ISA is virtual, it instead dumps the"
+           "machine-level equivalent.">
   ];
 }
 
diff --git a/mlir/include/mlir/Target/LLVM/NVVM/Utils.h b/mlir/include/mlir/Target/LLVM/NVVM/Utils.h
index 65ae8a6bdb4ada..6926acaa3e337a 100644
--- a/mlir/include/mlir/Target/LLVM/NVVM/Utils.h
+++ b/mlir/include/mlir/Target/LLVM/NVVM/Utils.h
@@ -54,7 +54,7 @@ class SerializeGPUModuleBase : public LLVM::ModuleToObject {
   LogicalResult appendStandardLibs();
 
   /// Loads the bitcode files in `fileList`.
-  virtual std::optional<SmallVector<std::unique_ptr<llvm::Module>>>
+  std::optional<SmallVector<std::unique_ptr<llvm::Module>>>
   loadBitcodeFiles(llvm::Module &module) override;
 
 protected:
diff --git a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
index 956877497d9338..3407cb9ec3cbad 100644
--- a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+++ b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
@@ -2302,18 +2302,26 @@ KernelMetadataAttr KernelTableAttr::lookup(StringAttr key) const {
 TargetOptions::TargetOptions(
     StringRef toolkitPath, ArrayRef<std::string> linkFiles,
     StringRef cmdOptions, CompilationTarget compilationTarget,
-    function_ref<SymbolTable *()> getSymbolTableCallback)
+    function_ref<SymbolTable *()> getSymbolTableCallback, bool dumpISA,
+    bool dumpMachineISA)
     : TargetOptions(TypeID::get<TargetOptions>(), toolkitPath, linkFiles,
-                    cmdOptions, compilationTarget, getSymbolTableCallback) {}
+                    cmdOptions, compilationTarget, getSymbolTableCallback,
+                    dumpISA, dumpMachineISA) {}
 
 TargetOptions::TargetOptions(
     TypeID typeID, StringRef toolkitPath, ArrayRef<std::string> linkFiles,
     StringRef cmdOptions, CompilationTarget compilationTarget,
-    function_ref<SymbolTable *()> getSymbolTableCallback)
+    function_ref<SymbolTable *()> getSymbolTableCallback, bool dumpISA,
+    bool dumpMachineISA)
     : toolkitPath(toolkitPath.str()), linkFiles(linkFiles),
-      cmdOptions(cmdOptions.str()), compilationTarget(compilationTarget),
+      cmdOptions(cmdOptions.str()), dumpISA(dumpISA),
+      dumpMachineISA(dumpMachineISA), compilationTarget(compilationTarget),
       getSymbolTableCallback(getSymbolTableCallback), typeID(typeID) {}
 
+bool TargetOptions::getDumpISA() const { return dumpISA; }
+
+bool TargetOptions::getDumpMachineISA() const { return dumpMachineISA; }
+
 TypeID TargetOptions::getTypeID() const { return typeID; }
 
 StringRef TargetOptions::getToolkitPath() const { return toolkitPath; }
diff --git a/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp b/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
index fb440756e0c1d5..8d32be95f57696 100644
--- a/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
+++ b/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
@@ -95,6 +95,8 @@ void buildHostPostPipeline(OpPassManager &pm,
 
   GpuModuleToBinaryPassOptions gpuModuleToBinaryPassOptions;
   gpuModuleToBinaryPassOptions.compilationTarget = options.cubinFormat;
+  gpuModuleToBinaryPassOptions.dumpISA = options.dumpPtx;
+  gpuModuleToBinaryPassOptions.dumpMachineISA = options.dumpSass;
   pm.addPass(createGpuModuleToBinaryPass(gpuModuleToBinaryPassOptions));
   pm.addPass(createConvertMathToLLVMPass());
   pm.addPass(createCanonicalizerPass());
diff --git a/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp b/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
index 86a3b4780e88ce..a9538407888cb3 100644
--- a/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
@@ -70,7 +70,7 @@ void GpuModuleToBinaryPass::runOnOperation() {
   };
 
   TargetOptions targetOptions(toolkitPath, linkFiles, cmdOptions, *targetFormat,
-                              lazyTableBuilder);
+                              lazyTableBuilder, dumpISA, dumpMachineISA);
   if (failed(transformGpuModulesToBinaries(
           getOperation(), OffloadingLLVMTranslationAttrInterface(nullptr),
           targetOptions)))
diff --git a/mlir/lib/Target/LLVM/NVVM/Target.cpp b/mlir/lib/Target/LLVM/NVVM/Target.cpp
index 69602af8563aa0..b5aff6be272ffd 100644
--- a/mlir/lib/Target/LLVM/NVVM/Target.cpp
+++ b/mlir/lib/Target/LLVM/NVVM/Target.cpp
@@ -292,8 +292,8 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
     return std::nullopt;
   TmpFile cubinFile;
   if (createFatbin) {
-    Twine cubinFilename = ptxFile->first + ".cubin";
-    cubinFile = TmpFile(cubinFilename.str(), llvm::FileRemover(cubinFilename));
+    std::string cubinFilename = (ptxFile->first + ".cubin").str();
+    cubinFile = TmpFile(cubinFilename, llvm::FileRemover(cubinFilename));
   } else {
     cubinFile.first = binaryFile->first;
   }
@@ -402,8 +402,8 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
                                 /*MemoryLimit=*/0,
                                 /*ErrMsg=*/&message))
     return emitLogError("`ptxas`");
-#define DEBUG_TYPE "dump-sass"
-  LLVM_DEBUG({
+
+  if (targetOptions.getDumpMachineISA()) {
     std::optional<std::string> nvdisasm = findTool("nvdisasm");
     SmallVector<StringRef> nvdisasmArgs(
         {StringRef("nvdisasm"), StringRef(cubinFile.first)});
@@ -417,11 +417,10 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
     llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> logBuffer =
         llvm::MemoryBuffer::getFile(logFile->first);
     if (logBuffer && !(*logBuffer)->getBuffer().empty()) {
-      llvm::dbgs() << "Output:\n" << (*logBuffer)->getBuffer() << "\n";
-      llvm::dbgs().flush();
+      llvm::errs() << "Output:\n" << (*logBuffer)->getBuffer() << "\n";
+      llvm::errs().flush();
     }
-  });
-#undef DEBUG_TYPE
+  }
 
   // Invoke `fatbin`.
   message.clear();
@@ -572,12 +571,13 @@ NVPTXSerializer::moduleToObject(llvm::Module &llvmModule) {
     getOperation().emitError() << "Failed translating the module to ISA.";
     return std::nullopt;
   }
-#define DEBUG_TYPE "serialize-to-isa"
-  LLVM_DEBUG({
-    llvm::dbgs() << "PTX for module: " << getOperation().getNameAttr() << "\n";
-    llvm::dbgs() << *serializedISA << "\n";
-    llvm::dbgs().flush();
-  });
+  if (targetOptions.getDumpISA()) {
+    llvm::errs() << "// Generated PTX for module: "
+                 << getOperation().getNameAttr() << "\n";
+    llvm::errs() << *serializedISA << "\n";
+    llvm::errs().flush();
+  }
+
 #undef DEBUG_TYPE
 
   // Return PTX if the compilation target is `assembly`.
diff --git a/mlir/lib/Target/LLVM/ROCDL/Target.cpp b/mlir/lib/Target/LLVM/ROCDL/Target.cpp
index 227b45133b57e3..6761479e88da61 100644
--- a/mlir/lib/Target/LLVM/ROCDL/Target.cpp
+++ b/mlir/lib/Target/LLVM/ROCDL/Target.cpp
@@ -430,13 +430,12 @@ std::optional<SmallVector<char, 0>> SerializeGPUModuleBase::moduleToObjectImpl(
     getOperation().emitError() << "failed translating the module to ISA";
     return std::nullopt;
   }
-#define DEBUG_TYPE "serialize-to-isa"
-  LLVM_DEBUG({
+  if (targetOptions.getDumpISA()) {
     llvm::dbgs() << "ISA for module: "
                  << cast<gpu::GPUModuleOp>(getOperation()).getNameAttr() << "\n"
                  << *serializedISA << "\n";
-  });
-#undef DEBUG_TYPE
+  }
+
   // Return ISA assembly code if the compilation target is assembly.
   if (targetOptions.getCompilationTarget() == gpu::CompilationTarget::Assembly)
     return SmallVector<char, 0>(serializedISA->begin(), serializedISA->end());
diff --git a/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir b/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
index 0cc5d8645bb364..c511596d49c27a 100644
--- a/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
+++ b/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
@@ -1,5 +1,5 @@
 // RUN: mlir-opt %s \
-// RUN:  | mlir-opt -gpu-lower-to-nvvm-pipeline -debug-only=serialize-to-isa \
+// RUN:  | mlir-opt -gpu-lower-to-nvvm-pipeline="dump-ptx=true" \
 // RUN:  2>&1 | FileCheck %s
 
 // CHECK: Generated by LLVM NVPTX Back-End
diff --git a/mlir/test/Integration/GPU/CUDA/dump-sass.mlir b/mlir/test/Integration/GPU/CUDA/dump-sass.mlir
index d32f5efc29d58e..2b204644c90c14 100644
--- a/mlir/test/Integration/GPU/CUDA/dump-sass.mlir
+++ b/mlir/test/Integration/GPU/CUDA/dump-sass.mlir
@@ -1,5 +1,5 @@
 // RUN: mlir-opt %s \
-// RUN:  | mlir-opt -gpu-lower-to-nvvm-pipeline -debug-only=dump-sass \
+// RUN:  | mlir-opt -gpu-lower-to-nvvm-pipeline="dump-sass=true" \
 // RUN:  2>&1 | FileCheck %s
 
 // CHECK: MOV
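
For completeness, a minimal sketch of requesting the same dumps directly on the serialization pass, using the `dump-isa` / `dump-machine-isa` options defined in `Passes.td` above; the input file name is hypothetical and the `gpu-module-to-binary` pass name is assumed from the existing registration:

```sh
# Serialize already-lowered GPU modules to a fatbin and dump the virtual
# ISA (PTX) as well as the disassembled machine-level ISA (SASS).
mlir-opt lowered.mlir \
  --gpu-module-to-binary="format=fatbin dump-isa=true dump-machine-isa=true" 2>&1
```
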

Collaborator

@joker-eph joker-eph left a comment


Currently, dumping virtual and machine-level ISA is restricted to debug builds, making it unavailable in release builds. However, the ability to view these ISAs can be valuable even in release builds or with production compilers.

I don't buy this argument: debug output remains debug output. You're just arbitrarily interested in some debug output instead of others, and that does not seem well plumbed to me here.

See #116007 for a more principled approach that offers the actual features in a more structured way.

@grypp
Member Author

grypp commented Nov 14, 2024

I don't think the generated code is debug output.

See #116007 for a more principled approach that offers the actual features in a more structured way.

I am totally fine with that approach. I can close my PR :) I wasn't aware of that PR.

@grypp grypp closed this Nov 14, 2024