Skip to content

[AMDGPU] Update hasUnwantedEffectsWhenEXECEmpty #97982

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 17, 2024

Conversation

perlfu
Copy link
Contributor

@perlfu perlfu commented Jul 8, 2024

Add barriers and s_wait_event to hasUnwantedEffectsWhenEXECEmpty.
Add a comment documenting the current expected use of the function.

@perlfu
Copy link
Contributor Author

perlfu commented Jul 8, 2024

Currently includes #97676 as this builts on top of it.

@llvmbot
Copy link
Member

llvmbot commented Jul 8, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Carl Ritson (perlfu)

Changes

Add barriers and s_wait_event to hasUnwantedEffectsWhenEXECEmpty.
Add a comment documenting the current expected use of the function.


Patch is 25.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97982.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+13-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+31)
  • (modified) llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp (+1-1)
  • (added) llvm/test/CodeGen/AMDGPU/insert-skips-gfx10.mir (+216)
  • (added) llvm/test/CodeGen/AMDGPU/insert-skips-gfx12.mir (+610)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index cc1b9ac0c9ecda..a2cb3834643227 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -4118,6 +4118,13 @@ bool SIInstrInfo::modifiesModeRegister(const MachineInstr &MI) {
 }
 
 bool SIInstrInfo::hasUnwantedEffectsWhenEXECEmpty(const MachineInstr &MI) const {
+  // This function is used to determine if an instruction can be safely
+  // executed under EXECZ without hardware error, indeterminate results,
+  // and/or visible effects on future vector execution or outside the shader.
+  // Note: as of 2024 the only use of this is SIPreEmitPeephole where it is
+  // used in removing branches over short EXECZ sequences.
+  // As such it embeds certain assumptions which may not apply in every case
+  // of EXECZ execution.
   unsigned Opcode = MI.getOpcode();
 
   if (MI.mayStore() && isSMRD(MI))
@@ -4136,12 +4143,17 @@ bool SIInstrInfo::hasUnwantedEffectsWhenEXECEmpty(const MachineInstr &MI) const
   if (Opcode == AMDGPU::S_SENDMSG || Opcode == AMDGPU::S_SENDMSGHALT ||
       isEXP(Opcode) ||
       Opcode == AMDGPU::DS_ORDERED_COUNT || Opcode == AMDGPU::S_TRAP ||
-      Opcode == AMDGPU::DS_GWS_INIT || Opcode == AMDGPU::DS_GWS_BARRIER)
+      Opcode == AMDGPU::DS_GWS_INIT || Opcode == AMDGPU::DS_GWS_BARRIER ||
+      Opcode == AMDGPU::S_WAIT_EVENT)
     return true;
 
   if (MI.isCall() || MI.isInlineAsm())
     return true; // conservative assumption
 
+  // Assume that barrier interactions are only intended with active lanes.
+  if (isBarrierRelated(Opcode))
+    return true;
+
   // A mode change is a scalar operation that influences vector instructions.
   if (modifiesModeRegister(MI))
     return true;
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index 1e2b687854c77a..bee24b3a7a91b3 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -936,6 +936,14 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
            Opcode == AMDGPU::S_BARRIER_SIGNAL_ISFIRST_IMM;
   }
 
+  bool isBarrierRelated(unsigned Opcode) const {
+    return isBarrierStart(Opcode) || Opcode == AMDGPU::S_BARRIER_WAIT ||
+           Opcode == AMDGPU::S_BARRIER_INIT_M0 ||
+           Opcode == AMDGPU::S_BARRIER_INIT_IMM ||
+           Opcode == AMDGPU::S_BARRIER_JOIN_IMM ||
+           Opcode == AMDGPU::S_BARRIER_LEAVE;
+  }
+
   static bool doesNotReadTiedSource(const MachineInstr &MI) {
     return MI.getDesc().TSFlags & SIInstrFlags::TiedSourceNotRead;
   }
@@ -967,6 +975,29 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
     }
   }
 
+  bool isWaitcnt(unsigned Opcode) const {
+    switch (getNonSoftWaitcntOpcode(Opcode)) {
+    case AMDGPU::S_WAITCNT:
+    case AMDGPU::S_WAITCNT_VSCNT:
+    case AMDGPU::S_WAITCNT_VMCNT:
+    case AMDGPU::S_WAITCNT_EXPCNT:
+    case AMDGPU::S_WAITCNT_LGKMCNT:
+    case AMDGPU::S_WAIT_LOADCNT:
+    case AMDGPU::S_WAIT_LOADCNT_DSCNT:
+    case AMDGPU::S_WAIT_STORECNT:
+    case AMDGPU::S_WAIT_STORECNT_DSCNT:
+    case AMDGPU::S_WAIT_SAMPLECNT:
+    case AMDGPU::S_WAIT_BVHCNT:
+    case AMDGPU::S_WAIT_EXPCNT:
+    case AMDGPU::S_WAIT_DSCNT:
+    case AMDGPU::S_WAIT_KMCNT:
+    case AMDGPU::S_WAIT_IDLE:
+      return true;
+    default:
+      return false;
+    }
+  }
+
   bool isVGPRCopy(const MachineInstr &MI) const {
     assert(isCopyInstr(MI));
     Register Dest = MI.getOperand(0).getReg();
diff --git a/llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp b/llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
index 875bccb208c846..1334029544f999 100644
--- a/llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
@@ -328,7 +328,7 @@ bool SIPreEmitPeephole::mustRetainExeczBranch(
 
       // These instructions are potentially expensive even if EXEC = 0.
       if (TII->isSMRD(MI) || TII->isVMEM(MI) || TII->isFLAT(MI) ||
-          TII->isDS(MI) || MI.getOpcode() == AMDGPU::S_WAITCNT)
+          TII->isDS(MI) || TII->isWaitcnt(MI.getOpcode()))
         return true;
 
       ++NumInstr;
diff --git a/llvm/test/CodeGen/AMDGPU/insert-skips-gfx10.mir b/llvm/test/CodeGen/AMDGPU/insert-skips-gfx10.mir
new file mode 100644
index 00000000000000..b4ed3cafbacb5f
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/insert-skips-gfx10.mir
@@ -0,0 +1,216 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1030 -run-pass si-pre-emit-peephole -amdgpu-skip-threshold=10 -verify-machineinstrs  %s -o - | FileCheck %s
+
+---
+name: skip_waitcnt_vscnt
+body: |
+  ; CHECK-LABEL: name: skip_waitcnt_vscnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAITCNT_VSCNT $sgpr_null, 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAITCNT_VSCNT $sgpr_null, 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_waitcnt_expcnt
+body: |
+  ; CHECK-LABEL: name: skip_waitcnt_expcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAITCNT_EXPCNT $sgpr_null, 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAITCNT_EXPCNT $sgpr_null, 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_waitcnt_vmcnt
+body: |
+  ; CHECK-LABEL: name: skip_waitcnt_vmcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAITCNT_VMCNT $sgpr_null, 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAITCNT_VMCNT $sgpr_null, 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_waitcnt_lgkmcnt
+body: |
+  ; CHECK-LABEL: name: skip_waitcnt_lgkmcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAITCNT_LGKMCNT $sgpr_null, 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAITCNT_LGKMCNT $sgpr_null, 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_idle
+body: |
+  ; CHECK-LABEL: name: skip_wait_idle
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_IDLE
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_IDLE
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_bvh
+body: |
+  ; CHECK-LABEL: name: skip_bvh
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   $vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14 = IMPLICIT_DEF
+  ; CHECK-NEXT:   $sgpr0_sgpr1_sgpr2_sgpr3 = IMPLICIT_DEF
+  ; CHECK-NEXT:   $vgpr0_vgpr1_vgpr2_vgpr3 = IMAGE_BVH_INTERSECT_RAY_sa_gfx11 $vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14, renamable $sgpr0_sgpr1_sgpr2_sgpr3, 0, implicit $exec :: (dereferenceable load (s128), addrspace 7)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    $vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14 = IMPLICIT_DEF
+    $sgpr0_sgpr1_sgpr2_sgpr3 = IMPLICIT_DEF
+    $vgpr0_vgpr1_vgpr2_vgpr3 = IMAGE_BVH_INTERSECT_RAY_sa_gfx11 $vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14, renamable $sgpr0_sgpr1_sgpr2_sgpr3, 0, implicit $exec :: (dereferenceable load (s128), addrspace 7)
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_barrier
+body: |
+  ; CHECK-LABEL: name: skip_barrier
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_BARRIER
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_BARRIER
+
+  bb.2:
+    S_ENDPGM 0
+...
diff --git a/llvm/test/CodeGen/AMDGPU/insert-skips-gfx12.mir b/llvm/test/CodeGen/AMDGPU/insert-skips-gfx12.mir
new file mode 100644
index 00000000000000..2d092974ac566f
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/insert-skips-gfx12.mir
@@ -0,0 +1,610 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -run-pass si-pre-emit-peephole -amdgpu-skip-threshold=10 -verify-machineinstrs  %s -o - | FileCheck %s
+
+---
+name: skip_wait_loadcnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_loadcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_LOADCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_LOADCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_loadcnt_dscnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_loadcnt_dscnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_LOADCNT_DSCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_LOADCNT_DSCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_storecnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_storecnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_STORECNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_STORECNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_storecnt_dscnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_storecnt_dscnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_STORECNT_DSCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_STORECNT_DSCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_samplecnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_samplecnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_SAMPLECNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_SAMPLECNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_bvhcnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_bvhcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_BVHCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_BVHCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_expcnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_expcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_EXPCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_EXPCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_dscnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_dscnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_DSCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_DSCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_kmcnt
+body: |
+  ; CHECK-LABEL: name: skip_wait_kmcnt
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_KMCNT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_KMCNT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_idle
+body: |
+  ; CHECK-LABEL: name: skip_wait_idle
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_IDLE
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_IDLE
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_wait_event
+body: |
+  ; CHECK-LABEL: name: skip_wait_event
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_WAIT_EVENT 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_WAIT_EVENT 0
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_barrier_signal_imm
+body: |
+  ; CHECK-LABEL: name: skip_barrier_signal_imm
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_BARRIER_SIGNAL_IMM -1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+
+  bb.1:
+    successors: %bb.2
+    V_NOP_e32 implicit $exec
+    S_BARRIER_SIGNAL_IMM -1
+
+  bb.2:
+    S_ENDPGM 0
+...
+
+---
+name: skip_barrier_signal_isfirst_imm
+body: |
+  ; CHECK-LABEL: name: skip_barrier_signal_isfirst_imm
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   V_NOP_e32 implicit $exec
+  ; CHECK-NEXT:   S_BARRIER_SIGNAL_ISFIRST_IMM -1, implicit-def $scc
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors...
[truncated]

Copy link

github-actions bot commented Jul 8, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff a77d3ea310c61cf59c1146895b2d51fe014eb0a9 4965fd6d8034fc3680c805aa0127132677146d88 --extensions h,cpp -- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h
View the diff from clang-format here.
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index 1712dfe8d4..cc71e4ff3b 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -941,8 +941,7 @@ public:
            Opcode == AMDGPU::S_BARRIER_INIT_M0 ||
            Opcode == AMDGPU::S_BARRIER_INIT_IMM ||
            Opcode == AMDGPU::S_BARRIER_JOIN_IMM ||
-           Opcode == AMDGPU::S_BARRIER_LEAVE ||
-           Opcode == AMDGPU::DS_GWS_INIT ||
+           Opcode == AMDGPU::S_BARRIER_LEAVE || Opcode == AMDGPU::DS_GWS_INIT ||
            Opcode == AMDGPU::DS_GWS_BARRIER;
   }
 

@perlfu perlfu force-pushed the amdgpu-skip-barrier branch from 7710319 to 08fb1b2 Compare July 8, 2024 03:08
Copy link
Contributor

@jayfoad jayfoad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable, but I don't really feel super confident about whether this could break anything or fix anything. Does it have any effect on Vulkan CTS?

@perlfu
Copy link
Contributor Author

perlfu commented Jul 9, 2024

Seems reasonable, but I don't really feel super confident about whether this could break anything or fix anything. Does it have any effect on Vulkan CTS?

This change passes Vulkan CTS.

Let me clarify why I think this is "correct".
Current codegen can generate branches over barriers, these will be always be retained if we run with -O0.
SIPreEmitPeephole can remove these if the branch distances is short and the function modified in this change returns false.
This leads to the situation where we might not skip a barrier for arbitrary reasons.
In practice, if all invocations in the workgroup are running the same shader/kernel in non-cyclic control flow then unskipping the barrier will not make any difference.
However if we have some kind cyclic barrier invocation with divergence, or interaction of different kernels on a barrier, then we should respect the branches over barrier operations, because we do not know any better.

We might of course decide that certain barriers should always be executed.
However I think this is for the control flow lowering to decide.
If we want this then we simply need to not introduce a branch over the barrier in the first place.

Add barriers and s_wait_event to hasUnwantedEffectsWhenEXECEmpty.
Add a comment documenting the current expected use of the function.
@perlfu perlfu force-pushed the amdgpu-skip-barrier branch from 08fb1b2 to 1b2695f Compare July 9, 2024 09:08
perlfu added 2 commits July 14, 2024 08:57
- Rename isBarrierRelated -> isBarrier
- Add DS_GWS instructions to isBarrier
- Move comment to header
@perlfu perlfu merged commit befd44b into llvm:main Jul 17, 2024
4 of 7 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 17, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux running on sanitizer-buildbot8 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/51/builds/1437

Here is the relevant piece of the build log for the reference:

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:60: note: linux feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:36: note: lsan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:48: note: msan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:60: note: linux feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:36: note: lsan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:48: note: msan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:60: note: linux feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 2605 of 5489 tests, 48 workers --
Testing:  0.
FAIL: LeakSanitizer-Standalone-aarch64 :: TestCases/use_registers.cpp (175 of 2605)
******************** TEST 'LeakSanitizer-Standalone-aarch64 :: TestCases/use_registers.cpp' FAILED ********************
Exit Code: 23

Command Output (stdout):
--
Test alloc: 0x51a000000000 

=================================================================
==1057114==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1337 byte(s) in 1 object(s) allocated from:
    #0 0xbc8532b254c8 in malloc /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/lib/lsan/lsan_interceptors.cpp:75:3
    #1 0xbc8532b28350 in registers_thread_func /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp:16:13
    #2 0xbc8532b27990 in void* ThreadStartFunc<false>(void*) /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/lib/lsan/lsan_interceptors.cpp:434:18
    #3 0xe72a74da5978  (/lib/aarch64-linux-gnu/libc.so.6+0x85978) (BuildId: 3119d7bcf68a34e21742cd08095272de601d373f)
    #4 0xe72a74e0ba48  (/lib/aarch64-linux-gnu/libc.so.6+0xeba48) (BuildId: 3119d7bcf68a34e21742cd08095272de601d373f)

Objects leaked above:
0x51a000000000 (1337 bytes)

SUMMARY: LeakSanitizer: 1337 byte(s) leaked in 1 allocation(s).

--
Command Output (stderr):
--
RUN: at line 2: /b/sanitizer-aarch64-linux/build/build_default/./bin/clang  --driver-mode=g++ -O0   -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -gline-tables-only -fsanitize=leak -I/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/../ -pthread /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp -o /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp
+ /b/sanitizer-aarch64-linux/build/build_default/./bin/clang --driver-mode=g++ -O0 -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -gline-tables-only -fsanitize=leak -I/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/../ -pthread /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp -o /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp
RUN: at line 3: env LSAN_OPTIONS=:detect_leaks=1:"report_objects=1:use_stacks=0:use_registers=0" not  /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp 2>&1 | FileCheck /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp
+ env LSAN_OPTIONS=:detect_leaks=1:report_objects=1:use_stacks=0:use_registers=0 not /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp
+ FileCheck /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp
RUN: at line 4: env LSAN_OPTIONS=:detect_leaks=1:"report_objects=1:use_stacks=0:use_registers=1"  /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp 2>&1
+ env LSAN_OPTIONS=:detect_leaks=1:report_objects=1:use_stacks=0:use_registers=1 /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 

Step 11 (test compiler-rt debug) failure: test compiler-rt debug (failure)
...
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:60: note: linux feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:36: note: lsan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:48: note: msan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:60: note: linux feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:36: note: lsan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:48: note: msan feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/fuzzer/lit.cfg.py:60: note: linux feature available
llvm-lit: /b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 2605 of 5489 tests, 48 workers --
Testing:  0.
FAIL: LeakSanitizer-Standalone-aarch64 :: TestCases/use_registers.cpp (175 of 2605)
******************** TEST 'LeakSanitizer-Standalone-aarch64 :: TestCases/use_registers.cpp' FAILED ********************
Exit Code: 23

Command Output (stdout):
--
Test alloc: 0x51a000000000 

=================================================================
==1057114==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1337 byte(s) in 1 object(s) allocated from:
    #0 0xbc8532b254c8 in malloc /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/lib/lsan/lsan_interceptors.cpp:75:3
    #1 0xbc8532b28350 in registers_thread_func /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp:16:13
    #2 0xbc8532b27990 in void* ThreadStartFunc<false>(void*) /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/lib/lsan/lsan_interceptors.cpp:434:18
    #3 0xe72a74da5978  (/lib/aarch64-linux-gnu/libc.so.6+0x85978) (BuildId: 3119d7bcf68a34e21742cd08095272de601d373f)
    #4 0xe72a74e0ba48  (/lib/aarch64-linux-gnu/libc.so.6+0xeba48) (BuildId: 3119d7bcf68a34e21742cd08095272de601d373f)

Objects leaked above:
0x51a000000000 (1337 bytes)

SUMMARY: LeakSanitizer: 1337 byte(s) leaked in 1 allocation(s).

--
Command Output (stderr):
--
RUN: at line 2: /b/sanitizer-aarch64-linux/build/build_default/./bin/clang  --driver-mode=g++ -O0   -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -gline-tables-only -fsanitize=leak -I/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/../ -pthread /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp -o /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp
+ /b/sanitizer-aarch64-linux/build/build_default/./bin/clang --driver-mode=g++ -O0 -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -gline-tables-only -fsanitize=leak -I/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/../ -pthread /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp -o /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp
RUN: at line 3: env LSAN_OPTIONS=:detect_leaks=1:"report_objects=1:use_stacks=0:use_registers=0" not  /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp 2>&1 | FileCheck /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp
+ env LSAN_OPTIONS=:detect_leaks=1:report_objects=1:use_stacks=0:use_registers=0 not /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp
+ FileCheck /b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/use_registers.cpp
RUN: at line 4: env LSAN_OPTIONS=:detect_leaks=1:"report_objects=1:use_stacks=0:use_registers=1"  /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp 2>&1
+ env LSAN_OPTIONS=:detect_leaks=1:report_objects=1:use_stacks=0:use_registers=1 /b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/lsan/AARCH64LsanConfig/TestCases/Output/use_registers.cpp.tmp

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..


@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 17, 2024

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/265

Here is the relevant piece of the build log for the reference:

Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/39/86' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-9728-39-86.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=86 GTEST_SHARD_INDEX=39 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Add barriers and s_wait_event to hasUnwantedEffectsWhenEXECEmpty.
Add a comment documenting the current expected use of the function.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250953
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants