[MLIR][NVVM] Add dumpISA and dumpMachineISA Flags #116199
Currently, dumping virtual and machine-level ISA is restricted to debug builds, making it unavailable in release builds. However, the ability to view these ISAs can be valuable even in release builds or with production compilers. For instance, `nvcc` provides similar functionality, making it a useful reference. This PR introduces `dumpISA` and `dumpMachineISA` flags to the `GpuModuleToBinaryPass`. Additionally, it adds `dump-ptx` and `dump-sass` flags to the `GPUToNVVMPipelineOptions`.
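To make the intent concrete, here is a minimal standalone sketch of a dump that is gated by a runtime option instead of a debug-only macro. It uses only the standard library; `DumpOptionsSketch` and `dumpIfRequested` are hypothetical names invented for illustration and are not part of MLIR.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>  // std::ostringstream, handy for capturing the dump in a test
#include <string>

// Hypothetical stand-in for the two booleans this PR adds to TargetOptions.
struct DumpOptionsSketch {
  bool dumpISA = false;         // print the generated (virtual) ISA
  bool dumpMachineISA = false;  // print the disassembled machine ISA
};

// The dump is always compiled in and is controlled by a runtime flag, so it
// behaves the same in debug and release builds. Output goes to the error
// stream by default, as the option descriptions in the PR state.
inline void dumpIfRequested(const DumpOptionsSketch &opts,
                            const std::string &moduleName,
                            const std::string &isa,
                            std::ostream &err = std::cerr) {
  if (!opts.dumpISA)
    return;
  err << "// Generated PTX for module: " << moduleName << "\n"
      << isa << "\n";
  err.flush();
}
```

With `dumpISA` left at its default of `false`, the function writes nothing, which matches the default value of the options in the diff below.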
@llvm/pr-subscribers-mlir-llvm @llvm/pr-subscribers-mlir
Author: Guray Ozen (grypp)
Full diff: https://github.com/llvm/llvm-project/pull/116199.diff
11 Files Affected:
diff --git a/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h b/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
index 6d7cb5ca7a7f81..88111aed2a0a6b 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
+++ b/mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
@@ -52,7 +52,8 @@ class TargetOptions {
StringRef toolkitPath = {}, ArrayRef<std::string> linkFiles = {},
StringRef cmdOptions = {},
CompilationTarget compilationTarget = getDefaultCompilationTarget(),
- function_ref<SymbolTable *()> getSymbolTableCallback = {});
+ function_ref<SymbolTable *()> getSymbolTableCallback = {},
+ bool dumpISA = false, bool dumpMachineISA = false);
/// Returns the typeID.
TypeID getTypeID() const;
@@ -66,6 +67,12 @@ class TargetOptions {
/// Returns the command line options.
StringRef getCmdOptions() const;
+ /// Returns the dump-isa command line options.
+ bool getDumpISA() const;
+
+ /// Returns the dump-machine-isa command line options.
+ bool getDumpMachineISA() const;
+
/// Returns a tokenization of the command line options.
std::pair<llvm::BumpPtrAllocator, SmallVector<const char *>>
tokenizeCmdOptions() const;
@@ -90,7 +97,8 @@ class TargetOptions {
TypeID typeID, StringRef toolkitPath = {},
ArrayRef<std::string> linkFiles = {}, StringRef cmdOptions = {},
CompilationTarget compilationTarget = getDefaultCompilationTarget(),
- function_ref<SymbolTable *()> getSymbolTableCallback = {});
+ function_ref<SymbolTable *()> getSymbolTableCallback = {},
+ bool dumpISA = false, bool dumpMachineISA = false);
/// Path to the target toolkit.
std::string toolkitPath;
@@ -102,6 +110,12 @@ class TargetOptions {
/// process.
std::string cmdOptions;
+ /// An optional flag to dump generated ISA.
+ bool dumpISA = false;
+
+ /// An optional flag to dump generated and disassembled machine ISA.
+ bool dumpMachineISA = false;
+
/// Compilation process target format.
CompilationTarget compilationTarget;
diff --git a/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h b/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
index caa0901bb49434..485eac6bad5b89 100644
--- a/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
+++ b/mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
@@ -53,6 +53,12 @@ struct GPUToNVVMPipelineOptions
"Whether to use the bareptr calling convention on the host (warning "
"this should be false until the GPU layering is fixed)"),
llvm::cl::init(false)};
+ PassOptions::Option<bool> dumpPtx{
+ *this, "dump-ptx", llvm::cl::desc("Dumps PTX code to the error output"),
+ llvm::cl::init(false)};
+ PassOptions::Option<bool> dumpSass{
+ *this, "dump-sass", llvm::cl::desc("Dumps SASS code to the error output"),
+ llvm::cl::init(false)};
};
//===----------------------------------------------------------------------===//
diff --git a/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td b/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
index 4a9ddafdd177d2..f7cd9dd8c15bfb 100644
--- a/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
@@ -95,7 +95,15 @@ def GpuModuleToBinaryPass
Option<"cmdOptions", "opts", "std::string", [{""}],
"Command line options to pass to the tools.">,
Option<"compilationTarget", "format", "std::string", [{"fatbin"}],
- "The target representation of the compilation process.">
+ "The target representation of the compilation process.">,
+ Option<"dumpISA", "dump-isa", "bool",
+ /*default=*/"false",
+ "Dumps generated ISA to the error output.">,
+ Option<"dumpMachineISA", "dump-machine-isa", "bool",
+ /*default=*/"false",
+ "Dumps the generated machine-level ISA to the error output. "
+ "If the generated ISA is virtual, it instead dumps the "
+ "machine-level equivalent.">
];
}
diff --git a/mlir/include/mlir/Target/LLVM/NVVM/Utils.h b/mlir/include/mlir/Target/LLVM/NVVM/Utils.h
index 65ae8a6bdb4ada..6926acaa3e337a 100644
--- a/mlir/include/mlir/Target/LLVM/NVVM/Utils.h
+++ b/mlir/include/mlir/Target/LLVM/NVVM/Utils.h
@@ -54,7 +54,7 @@ class SerializeGPUModuleBase : public LLVM::ModuleToObject {
LogicalResult appendStandardLibs();
/// Loads the bitcode files in `fileList`.
- virtual std::optional<SmallVector<std::unique_ptr<llvm::Module>>>
+ std::optional<SmallVector<std::unique_ptr<llvm::Module>>>
loadBitcodeFiles(llvm::Module &module) override;
protected:
diff --git a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
index 956877497d9338..3407cb9ec3cbad 100644
--- a/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+++ b/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
@@ -2302,18 +2302,26 @@ KernelMetadataAttr KernelTableAttr::lookup(StringAttr key) const {
TargetOptions::TargetOptions(
StringRef toolkitPath, ArrayRef<std::string> linkFiles,
StringRef cmdOptions, CompilationTarget compilationTarget,
- function_ref<SymbolTable *()> getSymbolTableCallback)
+ function_ref<SymbolTable *()> getSymbolTableCallback, bool dumpISA,
+ bool dumpMachineISA)
: TargetOptions(TypeID::get<TargetOptions>(), toolkitPath, linkFiles,
- cmdOptions, compilationTarget, getSymbolTableCallback) {}
+ cmdOptions, compilationTarget, getSymbolTableCallback,
+ dumpISA, dumpMachineISA) {}
TargetOptions::TargetOptions(
TypeID typeID, StringRef toolkitPath, ArrayRef<std::string> linkFiles,
StringRef cmdOptions, CompilationTarget compilationTarget,
- function_ref<SymbolTable *()> getSymbolTableCallback)
+ function_ref<SymbolTable *()> getSymbolTableCallback, bool dumpISA,
+ bool dumpMachineISA)
: toolkitPath(toolkitPath.str()), linkFiles(linkFiles),
- cmdOptions(cmdOptions.str()), compilationTarget(compilationTarget),
+ cmdOptions(cmdOptions.str()), dumpISA(dumpISA),
+ dumpMachineISA(dumpMachineISA), compilationTarget(compilationTarget),
getSymbolTableCallback(getSymbolTableCallback), typeID(typeID) {}
+bool TargetOptions::getDumpISA() const { return dumpISA; }
+
+bool TargetOptions::getDumpMachineISA() const { return dumpMachineISA; }
+
TypeID TargetOptions::getTypeID() const { return typeID; }
StringRef TargetOptions::getToolkitPath() const { return toolkitPath; }
diff --git a/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp b/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
index fb440756e0c1d5..8d32be95f57696 100644
--- a/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
+++ b/mlir/lib/Dialect/GPU/Pipelines/GPUToNVVMPipeline.cpp
@@ -95,6 +95,8 @@ void buildHostPostPipeline(OpPassManager &pm,
GpuModuleToBinaryPassOptions gpuModuleToBinaryPassOptions;
gpuModuleToBinaryPassOptions.compilationTarget = options.cubinFormat;
+ gpuModuleToBinaryPassOptions.dumpISA = options.dumpPtx;
+ gpuModuleToBinaryPassOptions.dumpMachineISA = options.dumpSass;
pm.addPass(createGpuModuleToBinaryPass(gpuModuleToBinaryPassOptions));
pm.addPass(createConvertMathToLLVMPass());
pm.addPass(createCanonicalizerPass());
diff --git a/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp b/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
index 86a3b4780e88ce..a9538407888cb3 100644
--- a/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
+++ b/mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
@@ -70,7 +70,7 @@ void GpuModuleToBinaryPass::runOnOperation() {
};
TargetOptions targetOptions(toolkitPath, linkFiles, cmdOptions, *targetFormat,
- lazyTableBuilder);
+ lazyTableBuilder, dumpISA, dumpMachineISA);
if (failed(transformGpuModulesToBinaries(
getOperation(), OffloadingLLVMTranslationAttrInterface(nullptr),
targetOptions)))
diff --git a/mlir/lib/Target/LLVM/NVVM/Target.cpp b/mlir/lib/Target/LLVM/NVVM/Target.cpp
index 69602af8563aa0..b5aff6be272ffd 100644
--- a/mlir/lib/Target/LLVM/NVVM/Target.cpp
+++ b/mlir/lib/Target/LLVM/NVVM/Target.cpp
@@ -292,8 +292,8 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
return std::nullopt;
TmpFile cubinFile;
if (createFatbin) {
- Twine cubinFilename = ptxFile->first + ".cubin";
- cubinFile = TmpFile(cubinFilename.str(), llvm::FileRemover(cubinFilename));
+ std::string cubinFilename = (ptxFile->first + ".cubin").str();
+ cubinFile = TmpFile(cubinFilename, llvm::FileRemover(cubinFilename));
} else {
cubinFile.first = binaryFile->first;
}
@@ -402,8 +402,8 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
/*MemoryLimit=*/0,
/*ErrMsg=*/&message))
return emitLogError("`ptxas`");
-#define DEBUG_TYPE "dump-sass"
- LLVM_DEBUG({
+
+ if (targetOptions.getDumpMachineISA()) {
std::optional<std::string> nvdisasm = findTool("nvdisasm");
SmallVector<StringRef> nvdisasmArgs(
{StringRef("nvdisasm"), StringRef(cubinFile.first)});
@@ -417,11 +417,10 @@ NVPTXSerializer::compileToBinary(const std::string &ptxCode) {
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> logBuffer =
llvm::MemoryBuffer::getFile(logFile->first);
if (logBuffer && !(*logBuffer)->getBuffer().empty()) {
- llvm::dbgs() << "Output:\n" << (*logBuffer)->getBuffer() << "\n";
- llvm::dbgs().flush();
+ llvm::errs() << "Output:\n" << (*logBuffer)->getBuffer() << "\n";
+ llvm::errs().flush();
}
- });
-#undef DEBUG_TYPE
+ }
// Invoke `fatbin`.
message.clear();
@@ -572,12 +571,13 @@ NVPTXSerializer::moduleToObject(llvm::Module &llvmModule) {
getOperation().emitError() << "Failed translating the module to ISA.";
return std::nullopt;
}
-#define DEBUG_TYPE "serialize-to-isa"
- LLVM_DEBUG({
- llvm::dbgs() << "PTX for module: " << getOperation().getNameAttr() << "\n";
- llvm::dbgs() << *serializedISA << "\n";
- llvm::dbgs().flush();
- });
+ if (targetOptions.getDumpISA()) {
+ llvm::errs() << "// Generated PTX for module: "
+ << getOperation().getNameAttr() << "\n";
+ llvm::errs() << *serializedISA << "\n";
+ llvm::errs().flush();
+ }
+
#undef DEBUG_TYPE
// Return PTX if the compilation target is `assembly`.
diff --git a/mlir/lib/Target/LLVM/ROCDL/Target.cpp b/mlir/lib/Target/LLVM/ROCDL/Target.cpp
index 227b45133b57e3..6761479e88da61 100644
--- a/mlir/lib/Target/LLVM/ROCDL/Target.cpp
+++ b/mlir/lib/Target/LLVM/ROCDL/Target.cpp
@@ -430,13 +430,12 @@ std::optional<SmallVector<char, 0>> SerializeGPUModuleBase::moduleToObjectImpl(
getOperation().emitError() << "failed translating the module to ISA";
return std::nullopt;
}
-#define DEBUG_TYPE "serialize-to-isa"
- LLVM_DEBUG({
+ if (targetOptions.getDumpISA()) {
llvm::dbgs() << "ISA for module: "
<< cast<gpu::GPUModuleOp>(getOperation()).getNameAttr() << "\n"
<< *serializedISA << "\n";
- });
-#undef DEBUG_TYPE
+ }
+
// Return ISA assembly code if the compilation target is assembly.
if (targetOptions.getCompilationTarget() == gpu::CompilationTarget::Assembly)
return SmallVector<char, 0>(serializedISA->begin(), serializedISA->end());
diff --git a/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir b/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
index 0cc5d8645bb364..c511596d49c27a 100644
--- a/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
+++ b/mlir/test/Integration/GPU/CUDA/dump-ptx.mlir
@@ -1,5 +1,5 @@
// RUN: mlir-opt %s \
-// RUN: | mlir-opt -gpu-lower-to-nvvm-pipeline -debug-only=serialize-to-isa \
+// RUN: | mlir-opt -gpu-lower-to-nvvm-pipeline="dump-ptx=true" \
// RUN: 2>&1 | FileCheck %s
// CHECK: Generated by LLVM NVPTX Back-End
diff --git a/mlir/test/Integration/GPU/CUDA/dump-sass.mlir b/mlir/test/Integration/GPU/CUDA/dump-sass.mlir
index d32f5efc29d58e..2b204644c90c14 100644
--- a/mlir/test/Integration/GPU/CUDA/dump-sass.mlir
+++ b/mlir/test/Integration/GPU/CUDA/dump-sass.mlir
@@ -1,5 +1,5 @@
// RUN: mlir-opt %s \
-// RUN: | mlir-opt -gpu-lower-to-nvvm-pipeline -debug-only=dump-sass \
+// RUN: | mlir-opt -gpu-lower-to-nvvm-pipeline="dump-sass=true" \
// RUN: 2>&1 | FileCheck %s
// CHECK: MOV
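
The plumbing in the diff above, where the NVVM-specific pipeline options `dump-ptx` and `dump-sass` are forwarded to the target-neutral pass options `dump-isa` and `dump-machine-isa`, can be modeled in isolation. The sketch below is only an illustration: the two structs and `forwardDumpOptions` are hypothetical names and do not exist in MLIR.

```cpp
// Hypothetical mirror of the two options added to GPUToNVVMPipelineOptions.
struct PipelineOptionsSketch {
  bool dumpPtx = false;   // corresponds to "dump-ptx"
  bool dumpSass = false;  // corresponds to "dump-sass"
};

// Hypothetical mirror of the two options added to GpuModuleToBinaryPass.
struct BinaryPassOptionsSketch {
  bool dumpISA = false;         // corresponds to "dump-isa"
  bool dumpMachineISA = false;  // corresponds to "dump-machine-isa"
};

// Translate the NVVM-specific names into the target-neutral ones, in the
// same way buildHostPostPipeline assigns the fields in the diff.
inline BinaryPassOptionsSketch
forwardDumpOptions(const PipelineOptionsSketch &pipeline) {
  BinaryPassOptionsSketch pass;
  pass.dumpISA = pipeline.dumpPtx;
  pass.dumpMachineISA = pipeline.dumpSass;
  return pass;
}
```

Keeping the pass-level names generic lets other targets reuse the same flags; the ROCDL change in this diff reads `getDumpISA()` in that way.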
> Currently, dumping virtual and machine-level ISA is restricted to debug builds, making it unavailable in release builds. However, the ability to view these ISAs can be valuable even in release builds or with production compilers.

I don't buy this argument: debug output remains debug output. You're just arbitrarily interested in some debug output rather than others, and that does not seem well plumbed to me here.
See #116007 for a more principled approach that offers the actual features in a more structured way.
I don't think generated code is a debug output.
I am totally fine with that approach. I can close my PR :) I wasn't aware of this PR.