Skip to content

[SystemZ] Fix bitwidth problem in FindReplicatedImm() (NFC). #115383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 11, 2024

Conversation

JonPsson1
Copy link
Contributor

A test case emerged with an i32 truncating store of an i64 constant operand, where the i64 constant did not fit in 32 bits, which caused FindReplicatedImm() to crash.

As SystemZVectorConstantInfo() does not really care about the bitwidth of its initial APInt, it seems simple enough to just not recreate it with a limited bitwidth (corresponding to MemVT or Splat VT), but rather just pass the APInt of the ConstantSDNode directly instead.

@llvmbot
Copy link
Member

llvmbot commented Nov 7, 2024

@llvm/pr-subscribers-backend-systemz

Author: Jonas Paulsson (JonPsson1)

Changes

A test case emerged with an i32 truncating store of an i64 constant operand, where the i64 constant did not fit in 32 bits, which caused FindReplicatedImm() to crash.

As SystemZVectorConstantInfo() does not really care about the bitwidth of its initial APInt, it seems simple enough to just not recreate it with a limited bitwidth (corresponding to MemVT or Splat VT), but rather just pass the APInt of the ConstantSDNode directly instead.


Full diff: https://github.com/llvm/llvm-project/pull/115383.diff

2 Files Affected:

  • (modified) llvm/lib/Target/SystemZ/SystemZISelLowering.cpp (+4-4)
  • (added) llvm/test/CodeGen/SystemZ/dag-combine-07.ll (+26)
diff --git a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
index 3999b54de81b65..ed170bae6aec30 100644
--- a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -7218,12 +7218,12 @@ SDValue SystemZTargetLowering::combineSTORE(
 
     // Find a replicated immediate and return it if found in Word and its
     // type in WordVT.
-    auto FindReplicatedImm = [&](ConstantSDNode *C, unsigned TotBytes) {
+    auto FindReplicatedImm = [&](ConstantSDNode *C) {
       // Some constants are better handled with a scalar store.
       if (C->getAPIntValue().getBitWidth() > 64 || C->isAllOnes() ||
           isInt<16>(C->getSExtValue()) || MemVT.getStoreSize() <= 2)
         return;
-      SystemZVectorConstantInfo VCI(APInt(TotBytes * 8, C->getZExtValue()));
+      SystemZVectorConstantInfo VCI(C->getAPIntValue());
       if (VCI.isVectorConstantLegal(Subtarget) &&
           VCI.Opcode == SystemZISD::REPLICATE) {
         Word = DAG.getConstant(VCI.OpVals[0], SDLoc(SN), MVT::i32);
@@ -7261,12 +7261,12 @@ SDValue SystemZTargetLowering::combineSTORE(
         DAG.isSplatValue(Op1, true/*AllowUndefs*/)) {
       SDValue SplatVal = Op1->getOperand(0);
       if (auto *C = dyn_cast<ConstantSDNode>(SplatVal))
-        FindReplicatedImm(C, SplatVal.getValueType().getStoreSize());
+        FindReplicatedImm(C);
       else
         FindReplicatedReg(SplatVal);
     } else {
       if (auto *C = dyn_cast<ConstantSDNode>(Op1))
-        FindReplicatedImm(C, MemVT.getStoreSize());
+        FindReplicatedImm(C);
       else
         FindReplicatedReg(Op1);
     }
diff --git a/llvm/test/CodeGen/SystemZ/dag-combine-07.ll b/llvm/test/CodeGen/SystemZ/dag-combine-07.ll
new file mode 100644
index 00000000000000..11359009c2899b
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/dag-combine-07.ll
@@ -0,0 +1,26 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=z16 | FileCheck %s
+;
+; Test that SystemZTargetLowering::combineSTORE() does not crash on a
+; truncated immediate.
+
+@G1 = external global i64, align 8
+@G2 = external global i64, align 8
+
+define void @func_5(ptr %Dst) {
+; CHECK-LABEL: func_5:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    lgrl %r1, G2@GOT
+; CHECK-NEXT:    llihl %r0, 50
+; CHECK-NEXT:    oill %r0, 2
+; CHECK-NEXT:    stg %r0, 0(%r1)
+; CHECK-NEXT:    lgrl %r1, G1@GOT
+; CHECK-NEXT:    stg %r0, 0(%r1)
+; CHECK-NEXT:    st %r0, 0(%r2)
+; CHECK-NEXT:    br %r14
+  store i64 214748364802, ptr @G2, align 8
+  store i64 214748364802, ptr @G1, align 8
+  %1 = load i32, ptr getelementptr inbounds (i8, ptr @G1, i64 4), align 4
+  store i32 %1, ptr %Dst, align 4
+  ret void
+}

@uweigand
Copy link
Member

As SystemZVectorConstantInfo() does not really care about the bitwidth of its initial APInt

I don't think this is quite true. It does use the initial bitwidth to decide whether the value can be generated as a splat in some cases. For example, it would appear that the value 0x01010101 can be created via VREPIB if the initial bitwidth is 32 (because then it is a splat), but not if the initial bitwidth is 64 (then it is not).

@JonPsson1
Copy link
Contributor Author

Ouch... ofc, that is needed for the SplatBits.

Patch updated to instead truncate the APInt if the store is truncating. This should work also for the case of a BuildVector being truncated to less than the element width.

This is NFC on benchmarks, so not sure if it is worth taking the care to have a handling there. Alternatively FindReplicatedImm() could just bail if the immediate does not fit..?

Val = Val.trunc(TotBytes * 8);
}

SystemZVectorConstantInfo VCI(Val);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still uses the size of Val rather than TotBytes, so it still pessimizes some cases (where the original Val.getExtValue() does fit, like in my example of a 64->32 truncating store of 0x01010101, which should use VREPI).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, was just about to change that.

I guess in the non-truncating case the original APInt should be of the matching bitwidth, but it doesn't hurt to be explicit and use TotBytes * 8.

And in the truncating case, it would be truncated where also the Bitwidth of the new APInt should be reduced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and yes, if there is a truncating store with a value that fits, this also makes a difference.

Copy link
Member

@uweigand uweigand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@JonPsson1 JonPsson1 merged commit 77ddcf7 into llvm:main Nov 11, 2024
5 of 7 checks passed
@JonPsson1 JonPsson1 deleted the VecConst branch November 11, 2024 21:16
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
A test case emerged with an i32 truncating store of an i64 constant
operand, where the i64 constant did not fit in 32 bits, which caused
FindReplicatedImm() to crash.

Make sure to truncate the APInt in these cases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants