Skip to content

[GlobalISel] Combine G_ZEXT of undef -> 0 #113764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

tschuett
Copy link

Pull out of #113616

Legality checks for CombinerHelper::replaceInstWithConstant.

Pull out of llvm#113616

Legality checks for CombinerHelper::replaceInstWithConstant.
@llvmbot
Copy link
Member

llvmbot commented Oct 26, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

Pull out of #113616

Legality checks for CombinerHelper::replaceInstWithConstant.


Full diff: https://github.com/llvm/llvm-project/pull/113764.diff

6 Files Affected:

  • (modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+1-1)
  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+5-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir (+16)
  • (modified) llvm/test/CodeGen/AArch64/extract-vector-elt.ll (+4-11)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/combine-amdgpu-cvt-f32-ubyte.mir (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/shrink-add-sub-constant.ll (+1-6)
diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td
index ead4149fc11068..f2c0cf18d68a97 100644
--- a/llvm/include/llvm/Target/GlobalISel/Combine.td
+++ b/llvm/include/llvm/Target/GlobalISel/Combine.td
@@ -412,7 +412,7 @@ def binop_right_undef_to_undef: GICombineRule<
 
 def unary_undef_to_zero: GICombineRule<
   (defs root:$root),
-  (match (wip_match_opcode G_ABS):$root,
+  (match (wip_match_opcode G_ABS, G_ZEXT):$root,
          [{ return Helper.matchOperandIsUndef(*${root}, 1); }]),
   (apply [{ Helper.replaceInstWithConstant(*${root}, 0); }])>;
 
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index b7ddf9f479ef8e..d27d1336a05602 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -2916,8 +2916,11 @@ void CombinerHelper::replaceInstWithFConstant(MachineInstr &MI, double C) {
 
 void CombinerHelper::replaceInstWithConstant(MachineInstr &MI, int64_t C) {
   assert(MI.getNumDefs() == 1 && "Expected only one def?");
-  Builder.buildConstant(MI.getOperand(0), C);
-  MI.eraseFromParent();
+  LLT DstTy = MRI.getType(MI.getOperand(0).getReg());
+  if (isConstantLegalOrBeforeLegalizer(DstTy)) {
+    Builder.buildConstant(MI.getOperand(0), C);
+    MI.eraseFromParent();
+  }
 }
 
 void CombinerHelper::replaceInstWithConstant(MachineInstr &MI, APInt C) {
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
index b045deebc56e03..e603e48760c463 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
@@ -217,3 +217,19 @@ body:             |
     %large:_(<2 x s64>) = G_ANYEXT %bv(<2 x s32>)
     $q0 = COPY %large(<2 x s64>)
     $d0 = COPY %bv(<2 x s32>)
+...
+---
+name:            test_combine_zext_undef
+legalized: true
+body:             |
+  bb.1:
+    ; CHECK-LABEL: name: test_combine_zext_undef
+    ; CHECK: %undef:_(<2 x s32>) = G_IMPLICIT_DEF
+    ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+    ; CHECK-NEXT: %large:_(<2 x s64>) = G_BUILD_VECTOR [[C]](s64), [[C]](s64)
+    ; CHECK-NEXT: $q0 = COPY %large(<2 x s64>)
+    ; CHECK-NEXT: $d0 = COPY %undef(<2 x s32>)
+    %undef:_(<2 x s32>) = G_IMPLICIT_DEF
+    %large:_(<2 x s64>) = G_ZEXT %undef(<2 x s32>)
+    $q0 = COPY %large(<2 x s64>)
+    $d0 = COPY %undef(<2 x s32>)
diff --git a/llvm/test/CodeGen/AArch64/extract-vector-elt.ll b/llvm/test/CodeGen/AArch64/extract-vector-elt.ll
index 0481d997d24faf..d5b7c63a80053a 100644
--- a/llvm/test/CodeGen/AArch64/extract-vector-elt.ll
+++ b/llvm/test/CodeGen/AArch64/extract-vector-elt.ll
@@ -8,17 +8,10 @@
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for extract_v4i32_vector_extract_const
 
 define i64 @extract_v2i64_undef_index(<2 x i64> %a, i32 %c) {
-; CHECK-SD-LABEL: extract_v2i64_undef_index:
-; CHECK-SD:       // %bb.0: // %entry
-; CHECK-SD-NEXT:    fmov x0, d0
-; CHECK-SD-NEXT:    ret
-;
-; CHECK-GI-LABEL: extract_v2i64_undef_index:
-; CHECK-GI:       // %bb.0: // %entry
-; CHECK-GI-NEXT:    str q0, [sp, #-16]!
-; CHECK-GI-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-GI-NEXT:    ldr x0, [sp], #16
-; CHECK-GI-NEXT:    ret
+; CHECK-LABEL: extract_v2i64_undef_index:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    fmov x0, d0
+; CHECK-NEXT:    ret
 entry:
   %d = extractelement <2 x i64> %a, i32 undef
   ret i64 %d
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-amdgpu-cvt-f32-ubyte.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-amdgpu-cvt-f32-ubyte.mir
index 7893bfa1d38f08..9b39afd32ac378 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-amdgpu-cvt-f32-ubyte.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/combine-amdgpu-cvt-f32-ubyte.mir
@@ -261,8 +261,7 @@ body:             |
     ; CHECK-LABEL: name: cvt_f32_ubyte0_zext_lshr_16
     ; CHECK: liveins: $vgpr0
     ; CHECK-NEXT: {{  $}}
-    ; CHECK-NEXT: %shift:_(s16) = G_IMPLICIT_DEF
-    ; CHECK-NEXT: %zext:_(s32) = G_ZEXT %shift(s16)
+    ; CHECK-NEXT: %zext:_(s32) = G_CONSTANT i32 0
     ; CHECK-NEXT: %result:_(s32) = G_AMDGPU_CVT_F32_UBYTE0 %zext
     ; CHECK-NEXT: $vgpr0 = COPY %result(s32)
     %arg:_(s32) = COPY $vgpr0
@@ -284,8 +283,7 @@ body:             |
     ; CHECK-LABEL: name: cvt_f32_ubyte0_zext_lshr_24
     ; CHECK: liveins: $vgpr0
     ; CHECK-NEXT: {{  $}}
-    ; CHECK-NEXT: %shift:_(s16) = G_IMPLICIT_DEF
-    ; CHECK-NEXT: %zext:_(s32) = G_ZEXT %shift(s16)
+    ; CHECK-NEXT: %zext:_(s32) = G_CONSTANT i32 0
     ; CHECK-NEXT: %result:_(s32) = G_AMDGPU_CVT_F32_UBYTE0 %zext
     ; CHECK-NEXT: $vgpr0 = COPY %result(s32)
     %arg:_(s32) = COPY $vgpr0
diff --git a/llvm/test/CodeGen/AMDGPU/shrink-add-sub-constant.ll b/llvm/test/CodeGen/AMDGPU/shrink-add-sub-constant.ll
index 6ac04d8bc42bba..253377ff47fe74 100644
--- a/llvm/test/CodeGen/AMDGPU/shrink-add-sub-constant.ll
+++ b/llvm/test/CodeGen/AMDGPU/shrink-add-sub-constant.ll
@@ -3962,14 +3962,12 @@ define amdgpu_kernel void @v_test_v2i16_x_add_undef_neg32(ptr addrspace(1) %out,
 ; VI-GISEL-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
 ; VI-GISEL-NEXT:    flat_load_dword v3, v[0:1]
 ; VI-GISEL-NEXT:    v_mov_b32_e32 v0, s0
+; VI-GISEL-NEXT:    v_mov_b32_e32 v1, s1
 ; VI-GISEL-NEXT:    v_add_u32_e32 v0, vcc, v0, v2
 ; VI-GISEL-NEXT:    v_not_b32_e32 v2, 31
-; VI-GISEL-NEXT:    v_mov_b32_e32 v1, s1
-; VI-GISEL-NEXT:    s_and_b32 s0, 0xffff, s0
 ; VI-GISEL-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
 ; VI-GISEL-NEXT:    s_waitcnt vmcnt(0)
 ; VI-GISEL-NEXT:    v_add_u16_sdwa v2, v3, v2 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
-; VI-GISEL-NEXT:    v_or_b32_e32 v2, s0, v2
 ; VI-GISEL-NEXT:    flat_store_dword v[0:1], v2
 ; VI-GISEL-NEXT:    s_endpgm
 ;
@@ -4079,15 +4077,12 @@ define amdgpu_kernel void @v_test_v2i16_x_add_neg32_undef(ptr addrspace(1) %out,
 ; VI-GISEL-NEXT:    v_add_u32_e32 v0, vcc, v0, v2
 ; VI-GISEL-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
 ; VI-GISEL-NEXT:    flat_load_dword v3, v[0:1]
-; VI-GISEL-NEXT:    s_and_b32 s2, 0xffff, s0
 ; VI-GISEL-NEXT:    v_mov_b32_e32 v0, s0
 ; VI-GISEL-NEXT:    v_mov_b32_e32 v1, s1
 ; VI-GISEL-NEXT:    v_add_u32_e32 v0, vcc, v0, v2
-; VI-GISEL-NEXT:    s_lshl_b32 s0, s2, 16
 ; VI-GISEL-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
 ; VI-GISEL-NEXT:    s_waitcnt vmcnt(0)
 ; VI-GISEL-NEXT:    v_add_u16_e32 v2, 0xffe0, v3
-; VI-GISEL-NEXT:    v_or_b32_e32 v2, s0, v2
 ; VI-GISEL-NEXT:    flat_store_dword v[0:1], v2
 ; VI-GISEL-NEXT:    s_endpgm
 ;

Builder.buildConstant(MI.getOperand(0), C);
MI.eraseFromParent();
LLT DstTy = MRI.getType(MI.getOperand(0).getReg());
if (isConstantLegalOrBeforeLegalizer(DstTy)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'd expect this to be checked in the match rule

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't. See above.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just because a G_ZEXT is sitting on a register doesn't imply that a G_CONSTANT is legal on that register. GlobalISel is for me for weird hardware.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't, but should it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current code, we can build illegal MIR after the legalizer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to move away from the generic matchers Helper.matchOperandIsUndef to specific matchers for unary_undef_to_zero, i.e., Helper.matchUndefToZero.

But I really really hate to have a function that builds something and I have to search around if and where the legality check is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to move away from the generic matchers Helper.matchOperandIsUndef to specific matchers for unary_undef_to_zero, i.e., Helper.matchUndefToZero.

Why not just check in the matcher C++ code? The matcher already knows what the intended output looks like, it just defers the actual mutation to the applicator.

But I really really hate to have a function that builds something and I have to search around if and where the legality check is.

If I understand right you're basically saying you don't like the combiner infrastructure as it was designed, because that's how it's supposed to work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need for Helper.matchOperandIsUndef any longer. You can directly match a G_IMPLICIT_DEF input in tablegen. The match predicate code would then only need to check the legality of the new G_IMPLICIT_DEF or G_CONSTANT.

The legality check would always be in the matcher, there shouldn't be any searching to do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer my solution. The legality check and the build are close together and easy to reason about.

Firstly, Helper.matchOperandIsUndef does not know whether we are combining into undef or constants. How can it reason about legality of uncertainty? Secondly, legality checks and build are spread wide afar, which makes it hard to reason about correctness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the utility of matchOperandIsUndef is limited and should not be used, or supplemented with a legality check

@tschuett
Copy link
Author

There is a bug in the repo. We do Builder.buildConstant(MI.getOperand(0), C); without checking for legality. This PR fixes the bug in the correct way, which is also the long-term correct solution. It just puts a legality check on top. For readers, it is easy to understand, there is a legality check before the build.

Exploding PRs based on misunderstanding is a poor sport.

@tschuett
Copy link
Author

There is a bug in the repo. We do Builder.buildConstant(MI.getOperand(0), C); without checking for legality. This PR fixes the bug in the correct way, which is also the long-term correct solution. It just puts a legality check on top. For readers, it is easy to understand, there is a legality check before the build.

Exploding PRs based on misunderstanding is a poor sport.

You are ignoring my comment. No surprise.

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaceInstWithConstant reads as a must-succeed operation. The match function should have the legality information. In an ideal future, the tablegen apply could automatically check the legality

Builder.buildConstant(MI.getOperand(0), C);
MI.eraseFromParent();
LLT DstTy = MRI.getType(MI.getOperand(0).getReg());
if (isConstantLegalOrBeforeLegalizer(DstTy)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the utility of matchOperandIsUndef is limited and should not be used, or supplemented with a legality check

@tschuett
Copy link
Author

tschuett commented Nov 5, 2024

I stated several times before if we remove the legality check from the replaceInstWithConstant function, then we have to verify that on all paths to the function the correct legality checks were made. Today the combiner is spread over several files. This is error-prone.

@aemerson
Copy link
Contributor

aemerson commented Nov 5, 2024

Exploding PRs based on misunderstanding is a poor sport.

Believe me when I say none of us enjoy this any more than you do. Us having to argue for why a function called replaceInstWithConstant should do as it says it will do is frankly tiring. We understand your points but our opinion is that it's the wrong choice in terms of good engineering practices.

You are ignoring my comment. No surprise.

I don't know who exactly this was directed to but I don't believe anyone was ignoring anything. If you continue to make these sorts of comments I will completely stop engaging in any reviews from you.

@aemerson
Copy link
Contributor

aemerson commented Nov 5, 2024

And FWIW, there are other significant GlobalISel contributors who also agree that this is not the right approach.

@tschuett
Copy link
Author

tschuett commented Nov 5, 2024

This patch fixes a bug in the obvious place. If you prefer an error-prone system, patches welcome.

@topperc
Copy link
Collaborator

topperc commented Nov 5, 2024

If the match function indicates that a change is being made, but the change is never made, won't that cause the combiner to iterate repeatedly until it hits the MaxIterations limit?

@topperc
Copy link
Collaborator

topperc commented Nov 5, 2024

If the match function indicates that a change is being made, but the change is never made, won't that cause the combiner to iterate repeatedly until it hits the MaxIterations limit?

As an experiment I made replaceInstWithConstant not make any change and increased the MaxIteration limit to 5. As I expected the combiner ran 5 times.

@aemerson
Copy link
Contributor

aemerson commented Nov 5, 2024

If the match function indicates that a change is being made, but the change is never made, won't that cause the combiner to iterate repeatedly until it hits the MaxIterations limit?

As an experiment I made replaceInstWithConstant not make any change and increased the MaxIteration limit to 5. As I expected the combiner ran 5 times.

This pretty much settles it IMO. We have a case where even in the existing codebase there is an assumption that apply functions mutate MIR. So the bug-proneness is actually the opposite of what's being claimed here.

tschuett added a commit to tschuett/llvm-project that referenced this pull request Nov 23, 2024
Alternative for llvm#113764

It builds on a minimalistic approach with the legality check in match
and a blind apply. The precise patterns are used for better
compile-time and modularity. It also moves the pattern check into
combiner. While unary_undef_to_zero and propagate_undef_any_op rely on
custom C++ code for pattern matching.

Is there a limit on the number of patterns?

G_ANYEXT of undef -> undef
G_SEXT of undef -> 0
G_ZEXT of undef -> 0

The combine is not a member of the post legalizer combiner for AArch64.

Test:
llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
@tschuett tschuett closed this Nov 25, 2024
tschuett added a commit that referenced this pull request Dec 3, 2024
Alternative for #113764

It builds on a minimalistic approach with the legality check in match
and a blind apply. The precise patterns are used for better compile-time
and modularity. It also moves the pattern check into combiner. While
unary_undef_to_zero and propagate_undef_any_op rely on custom C++ code
for pattern matching.

Is there a limit on the number of patterns?

G_ANYEXT of undef -> undef
G_SEXT of undef -> 0
G_ZEXT of undef -> 0

The combine is not a member of the post legalizer combiner for AArch64.

Test:
llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
@arsenm
Copy link
Contributor

arsenm commented Dec 4, 2024

If the match function indicates that a change is being made, but the change is never made, won't that cause the combiner to iterate repeatedly until it hits the MaxIterations limit?

As an experiment I made replaceInstWithConstant not make any change and increased the MaxIteration limit to 5. As I expected the combiner ran 5 times.

So we still need this, but the legality check should be moved into the matcher? Shouldn't just abandon this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants