Skip to content

[AMDGPU] Do not print kernel-resource-usage information on non-kernels #99720

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 22, 2024

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Jul 19, 2024

Summary:
This pass is used to get helpful information about the kernel resources
without needing to insepct the binary. However, it currently prints on
every function. These values will always be zero, so it's just spam on
the terminal, at best an indication that a function wasn't internalized
/ optimized out. This patch makes it only print for kernels to make it
more useful in practice.

Summary:
This pass is used to get helpful information about the kernel resources
without needing to insepct the binary. However, it currently prints on
every function. These values will always be zero, so it's just spam on
the terminal, at best an indication that a function wasn't internalized
/ optimized out. This patch makes it only print for kernels to make it
more useful in practice.
@llvmbot
Copy link
Member

llvmbot commented Jul 19, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
This pass is used to get helpful information about the kernel resources
without needing to insepct the binary. However, it currently prints on
every function. These values will always be zero, so it's just spam on
the terminal, at best an indication that a function wasn't internalized
/ optimized out. This patch makes it only print for kernels to make it
more useful in practice.


Full diff: https://github.com/llvm/llvm-project/pull/99720.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp (+4)
  • (modified) llvm/test/CodeGen/AMDGPU/resource-optimization-remarks.ll (+2-19)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 3154dc6fe433d..be338962e858b 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -1487,6 +1487,10 @@ void AMDGPUAsmPrinter::emitResourceUsageRemarks(
   if (!Ctx.getDiagHandlerPtr()->isAnalysisRemarkEnabled(Name))
     return;
 
+  // Currently non-kernel functions have no resources to emit.
+  if (MF.getFunction().getCallingConv() != CallingConv::AMDGPU_KERNEL)
+    return;
+
   auto EmitResourceUsageRemark = [&](StringRef RemarkName,
                                      StringRef RemarkLabel, auto Argument) {
     // Add an indent for every line besides the line with the kernel name. This
diff --git a/llvm/test/CodeGen/AMDGPU/resource-optimization-remarks.ll b/llvm/test/CodeGen/AMDGPU/resource-optimization-remarks.ll
index a640ac985ade4..7da89c335157a 100644
--- a/llvm/test/CodeGen/AMDGPU/resource-optimization-remarks.ll
+++ b/llvm/test/CodeGen/AMDGPU/resource-optimization-remarks.ll
@@ -113,16 +113,7 @@ define amdgpu_kernel void @test_kernel() !dbg !3 {
   ret void
 }
 
-; STDERR: remark: foo.cl:42:0: Function Name: test_func
-; STDERR-NEXT: remark: foo.cl:42:0:     SGPRs: 0
-; STDERR-NEXT: remark: foo.cl:42:0:     VGPRs: 0
-; STDERR-NEXT: remark: foo.cl:42:0:     AGPRs: 0
-; STDERR-NEXT: remark: foo.cl:42:0:     ScratchSize [bytes/lane]: 0
-; STDERR-NEXT: remark: foo.cl:42:0:     Dynamic Stack: False
-; STDERR-NEXT: remark: foo.cl:42:0:     Occupancy [waves/SIMD]: 0
-; STDERR-NEXT: remark: foo.cl:42:0:     SGPRs Spill: 0
-; STDERR-NEXT: remark: foo.cl:42:0:     VGPRs Spill: 0
-; STDERR-NOT: LDS Size
+; STDERR-NOT: remark: foo.cl:42:0: Function Name: test_func
 define void @test_func() !dbg !6 {
   call void asm sideeffect "; clobber v17", "~{v17}"()
   call void asm sideeffect "; clobber s11", "~{s11}"()
@@ -144,15 +135,7 @@ define amdgpu_kernel void @empty_kernel() !dbg !7 {
   ret void
 }
 
-; STDERR: remark: foo.cl:52:0: Function Name: empty_func
-; STDERR-NEXT: remark: foo.cl:52:0:     SGPRs: 0
-; STDERR-NEXT: remark: foo.cl:52:0:     VGPRs: 0
-; STDERR-NEXT: remark: foo.cl:52:0:     AGPRs: 0
-; STDERR-NEXT: remark: foo.cl:52:0:     ScratchSize [bytes/lane]: 0
-; STDERR-NEXT: remark: foo.cl:52:0:     Dynamic Stack: False
-; STDERR-NEXT: remark: foo.cl:52:0:     Occupancy [waves/SIMD]: 0
-; STDERR-NEXT: remark: foo.cl:52:0:     SGPRs Spill: 0
-; STDERR-NEXT: remark: foo.cl:52:0:     VGPRs Spill: 0
+; STDERR-NOT: remark: foo.cl:52:0: Function Name: empty_func
 define void @empty_func() !dbg !8 {
   ret void
 }

@jhuber6 jhuber6 requested review from shiltian and Pierre-vh July 19, 2024 23:53
@@ -1487,6 +1487,10 @@ void AMDGPUAsmPrinter::emitResourceUsageRemarks(
if (!Ctx.getDiagHandlerPtr()->isAnalysisRemarkEnabled(Name))
return;

// Currently non-kernel functions have no resources to emit.
if (MF.getFunction().getCallingConv() != CallingConv::AMDGPU_KERNEL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ignores graphics. Should use isEntryFunctionCC instead

; STDERR-NEXT: remark: foo.cl:52:0: Occupancy [waves/SIMD]: 0
; STDERR-NEXT: remark: foo.cl:52:0: SGPRs Spill: 0
; STDERR-NEXT: remark: foo.cl:52:0: VGPRs Spill: 0
; STDERR-NOT: remark: foo.cl:52:0: Function Name: empty_func
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not checks should be as simple as possible. Just -NOT: remark. Or add -implicit-check-not=remark

@jhuber6 jhuber6 merged commit f39bd0a into llvm:main Jul 22, 2024
7 checks passed
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…els (#99720)

Summary:
This pass is used to get helpful information about the kernel resources
without needing to insepct the binary. However, it currently prints on
every function. These values will always be zero, so it's just spam on
the terminal, at best an indication that a function wasn't internalized
/ optimized out. This patch makes it only print for kernels to make it
more useful in practice.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251371
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants