Skip to content

[GlobalISel] Add identity fold for fadd -0.0 #73296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 25, 2023

Conversation

davemgreen
Copy link
Collaborator

-0.0 acts as the identity element for fadd. This doesn't try to add 0.0 too, which would require nsz fast math flags.

-0.0 acts as the identity element for fadd. This doesn't try to add 0.0 too,
which would require nsz fast math flags.
@llvmbot
Copy link
Member

llvmbot commented Nov 24, 2023

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

-0.0 acts as the identity element for fadd. This doesn't try to add 0.0 too, which would require nsz fast math flags.


Full diff: https://github.com/llvm/llvm-project/pull/73296.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+9-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir (+80)
diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td
index 76b83cc5df073ae..9a84ab80157f35f 100644
--- a/llvm/include/llvm/Target/GlobalISel/Combine.td
+++ b/llvm/include/llvm/Target/GlobalISel/Combine.td
@@ -473,6 +473,13 @@ def right_identity_zero: GICombineRule<
   (apply (GIReplaceReg $dst, $lhs))
 >;
 
+def right_identity_neg_zero_fp: GICombineRule<
+  (defs root:$dst),
+  (match (G_FADD $dst, $x, $y):$root,
+    [{ return Helper.matchConstantFPOp(${y}, -0.0); }]),
+  (apply (GIReplaceReg $dst, $x))
+>;
+
 // Fold x op 1 -> x
 def right_identity_one_int: GICombineRule<
   (defs root:$dst),
@@ -1250,7 +1257,8 @@ def identity_combines : GICombineGroup<[select_same_val, right_identity_zero,
                                         add_sub_reg, buildvector_identity_fold,
                                         trunc_buildvector_fold,
                                         trunc_lshr_buildvector_fold,
-                                        bitcast_bitcast_fold, fptrunc_fpext_fold]>;
+                                        bitcast_bitcast_fold, fptrunc_fpext_fold,
+                                        right_identity_neg_zero_fp]>;
 
 def const_combines : GICombineGroup<[constant_fold_fp_ops, const_ptradd_to_i2p,
                                      overlapping_and, mulo_by_2, mulo_by_0,
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir
index 9a81a5fc176ab8d..fad3655da9d0132 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir
@@ -127,3 +127,83 @@ body:             |
     %3:_(<4 x s16>) = G_ADD %1, %2
     $x0 = COPY %3
 ...
+---
+name:            fadd_by_zero
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $d0
+    ; CHECK-LABEL: name: fadd_by_zero
+    ; CHECK: liveins: $d0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $d0
+    ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_FCONSTANT double 0.000000e+00
+    ; CHECK-NEXT: [[FADD:%[0-9]+]]:_(s64) = G_FADD [[COPY]], [[C]]
+    ; CHECK-NEXT: $d0 = COPY [[FADD]](s64)
+    %0:_(s64) = COPY $d0
+    %1:_(s64) = G_FCONSTANT double 0.000000e+00
+    %2:_(s64) = G_FADD %0, %1(s64)
+    $d0 = COPY %2(s64)
+...
+---
+name:            fadd_vector_by_zero
+alignment:       4
+tracksRegLiveness: true
+frameInfo:
+  maxAlignment:    1
+machineFunctionInfo: {}
+body:             |
+  bb.0:
+    liveins: $q0
+    ; CHECK-LABEL: name: fadd_vector_by_zero
+    ; CHECK: liveins: $q0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<4 x s32>) = COPY $q0
+    ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_FCONSTANT float 0.000000e+00
+    ; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[C]](s32), [[C]](s32), [[C]](s32), [[C]](s32)
+    ; CHECK-NEXT: [[FADD:%[0-9]+]]:_(<4 x s32>) = G_FADD [[COPY]], [[BUILD_VECTOR]]
+    ; CHECK-NEXT: $q0 = COPY [[FADD]](<4 x s32>)
+    %0:_(<4 x s32>) = COPY $q0
+    %1:_(s32) = G_FCONSTANT float 0.0
+    %2:_(<4 x s32>) = G_BUILD_VECTOR %1(s32), %1(s32), %1(s32), %1(s32)
+    %3:_(<4 x s32>) = G_FADD %0, %2(<4 x s32>)
+    $q0 = COPY %3(<4 x s32>)
+...
+
+---
+name:            fadd_by_neg_zero
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $d0
+    ; CHECK-LABEL: name: fadd_by_neg_zero
+    ; CHECK: liveins: $d0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $d0
+    ; CHECK-NEXT: $d0 = COPY [[COPY]](s64)
+    %0:_(s64) = COPY $d0
+    %1:_(s64) = G_FCONSTANT double -0.000000e+00
+    %2:_(s64) = G_FADD %0, %1(s64)
+    $d0 = COPY %2(s64)
+...
+---
+name:            fadd_vector_by_neg_zero
+alignment:       4
+tracksRegLiveness: true
+frameInfo:
+  maxAlignment:    1
+machineFunctionInfo: {}
+body:             |
+  bb.0:
+    liveins: $q0
+    ; CHECK-LABEL: name: fadd_vector_by_neg_zero
+    ; CHECK: liveins: $q0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<4 x s32>) = COPY $q0
+    ; CHECK-NEXT: $q0 = COPY [[COPY]](<4 x s32>)
+    %0:_(<4 x s32>) = COPY $q0
+    %1:_(s32) = G_FCONSTANT float -0.0
+    %2:_(<4 x s32>) = G_BUILD_VECTOR %1(s32), %1(s32), %1(s32), %1(s32)
+    %3:_(<4 x s32>) = G_FADD %0, %2(<4 x s32>)
+    $q0 = COPY %3(<4 x s32>)
+...

@davemgreen
Copy link
Collaborator Author

Cheers

@davemgreen davemgreen merged commit 9cee94b into llvm:main Nov 25, 2023
@davemgreen davemgreen deleted the gh-gi-faddnegzero branch November 25, 2023 08:35
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Nov 30, 2023
Local branch amd-gfx fe664d7 Merged main:024718313b52 into amd-gfx:ff07c6f2ec34
Remote branch main 9cee94b [GlobalISel] Add identity fold for fadd -0.0 (llvm#73296)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants