Skip to content

[InstCombine] Reducing rust i128::midpoint instructions #99614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

wizardengineer
Copy link
Contributor

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Jul 19, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Julius Alexandre (medievalghoul)

Changes

Issue: rust-lang/rust#124790

This seems to only affect u128 data types

https://alive2.llvm.org/ce/z/QmFkhy
https://rust.godbolt.org/z/T7eKP3Tvo


Full diff: https://github.com/llvm/llvm-project/pull/99614.diff

1 Files Affected:

  • (added) llvm/test/Transforms/InstCombine/xor_lshr_and_i128.ll (+142)
diff --git a/llvm/test/Transforms/InstCombine/xor_lshr_and_i128.ll b/llvm/test/Transforms/InstCombine/xor_lshr_and_i128.ll
new file mode 100644
index 0000000000000..a219a945722d7
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/xor_lshr_and_i128.ll
@@ -0,0 +1,142 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define noundef i128 @xor_lshr_and(i128 noundef %x, i128 noundef %y) unnamed_addr #0 {
+; check-label: define noundef i128 @xor_lshr_and(
+; check-same: i128 noundef [[x:%.*]], i128 noundef [[y:%.*]]) unnamed_addr {
+; check-next:  [[start:.*:]]
+; check-next:    [[xor:%.*]] = xor i128 [[y]], [[x]]
+; check-next:    [[lshr:%.*]] = lshr i128 [[xor]], 1
+; check-next:    [[and:%.*]] = and i128 [[y]], [[x]]
+; check-next:    [[add:%.*]] = add i128 [[lshr]], [[and]]
+; check-next:    ret i128 [[add]]
+;
+; CHECK-LABEL: define noundef i128 @xor_lshr_and(
+; CHECK-SAME: i128 noundef [[X:%.*]], i128 noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i128 [[XOR]], 1
+; CHECK-NEXT:    [[AND:%.*]] = and i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[ADD:%.*]] = add i128 [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret i128 [[ADD]]
+;
+start:
+  %xor = xor i128 %y, %x
+  %lshr = lshr i128 %xor, 1
+  %and = and i128 %y, %x
+  %add = add i128 %lshr, %and
+  ret i128 %add
+}
+
+define noundef i128 @xor_lshr_and_commuted1(i128 noundef %x, i128 noundef %y) unnamed_addr #0 {
+; CHECK-LABEL: define noundef i128 @xor_lshr_and_commuted1(
+; CHECK-SAME: i128 noundef [[X:%.*]], i128 noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[AND:%.*]] = and i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i128 [[XOR]], 1
+; CHECK-NEXT:    [[ADD:%.*]] = add i128 [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret i128 [[ADD]]
+;
+start:
+  %and = and i128 %y, %x
+  %xor = xor i128 %y, %x
+  %lshr = lshr i128 %xor, 1
+  %add = add i128 %lshr, %and
+  ret i128 %add
+}
+
+define noundef i128 @xor_lshr_and_commuted2(i128 noundef %x, i128 noundef %y) unnamed_addr #0 {
+; CHECK-LABEL: define noundef i128 @xor_lshr_and_commuted2(
+; CHECK-SAME: i128 noundef [[X:%.*]], i128 noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[AND:%.*]] = and i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i128 [[XOR]], 1
+; CHECK-NEXT:    [[ADD:%.*]] = add i128 [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret i128 [[ADD]]
+;
+start:
+  %and = and i128 %y, %x
+  %xor = xor i128 %y, %x
+  %lshr = lshr i128 %xor, 1
+  %add = add i128 %lshr, %and
+  ret i128 %add
+}
+
+declare void @use(i8)
+
+define noundef i128 @xor_lshr_and_multi_use(i128 noundef %x, i128 noundef %y) unnamed_addr #0 {
+; CHECK-LABEL: define noundef i128 @xor_lshr_and_multi_use(
+; CHECK-SAME: i128 noundef [[X:%.*]], i128 noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i128 [[Y]], [[X]]
+; CHECK-NEXT:    call void @use(i128 [[XOR]])
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i128 [[XOR]], 1
+; CHECK-NEXT:    call void @use(i128 [[LSHR]])
+; CHECK-NEXT:    [[AND:%.*]] = and i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[ADD:%.*]] = add i128 [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret i128 [[ADD]]
+;
+start:
+  %xor = xor i128 %y, %x
+  call void @use(i128 %xor)
+  %lshr = lshr i128 %xor, 1
+  call void @use(i128 %lshr)
+  %and = and i128 %y, %x
+  %add = add i128 %lshr, %and
+  ret i128 %add
+}
+
+define noundef i128 @xor_lshr_and_negative(i128 noundef %x, i128 noundef %y) unnamed_addr #0 {
+; CHECK-LABEL: define noundef i128 @xor_lshr_and_negative(
+; CHECK-SAME: i128 noundef [[X:%.*]], i128 noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i128 [[X]], -1
+; CHECK-NEXT:    [[AND:%.*]] = and i128 [[Y]], [[X]]
+; CHECK-NEXT:    [[ADD:%.*]] = add i128 [[AND]], [[XOR]]
+; CHECK-NEXT:    ret i128 [[ADD]]
+;
+start:
+  %xor = xor i128 %x, -1
+  %and = and i128 %y, %x
+  %add = add i128 %xor, %and
+  ret i128 %add
+}
+
+define noundef i32 @xor_lshr_and_negative2(i32 noundef %x, i32 noundef %y) unnamed_addr #0 {
+; CHECK-LABEL: define noundef i32 @xor_lshr_and_negative2(
+; CHECK-SAME: i32 noundef [[X:%.*]], i32 noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i32 [[Y]], [[X]]
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[XOR]], 1
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[Y]], [[X]]
+; CHECK-NEXT:    [[ADD:%.*]] = add i32 [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+start:
+  %xor = xor i32 %y, %x
+  %lshr = lshr i32 %xor, 1
+  %and = and i32 %y, %x
+  %add = add i32 %lshr, %and
+  ret i32 %add
+}
+
+define noundef <2 x i128> @xor_lshr_and_vec(<2 x i128> noundef %x, <2 x i128> noundef %y) unnamed_addr #0 {
+; CHECK-LABEL: define noundef <2 x i128> @xor_lshr_and_vec(
+; CHECK-SAME: <2 x i128> noundef [[X:%.*]], <2 x i128> noundef [[Y:%.*]]) unnamed_addr {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor <2 x i128> [[Y]], [[X]]
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr <2 x i128> [[XOR]], <i128 1, i128 1>
+; CHECK-NEXT:    [[AND:%.*]] = and <2 x i128> [[Y]], [[X]]
+; CHECK-NEXT:    [[ADD:%.*]] = add <2 x i128> [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret <2 x i128> [[ADD]]
+;
+start:
+  %xor = xor <2 x i128> %y, %x
+  %lshr = lshr <2 x i128> %xor, <i128 1, i128 1>
+  %and = and <2 x i128> %y, %x
+  %add = add <2 x i128> %lshr, %and
+  ret <2 x i128> %add
+}
+

@dtcxzyw
Copy link
Member

dtcxzyw commented Jul 19, 2024

Please read the guideline https://llvm.org/docs/InstCombineContributorGuide.html :)

@wizardengineer wizardengineer requested a review from nikic as a code owner July 19, 2024 08:58
Copy link

github-actions bot commented Jul 19, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 401d7bcabc0affc38a6e52ffad782eedf0a5ec5a be6de10dbb4a9809d0e008de7dccd0cd4beb4614 --extensions cpp -- llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index 3020e81e91..28f408db36 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -1483,7 +1483,7 @@ static Instruction *foldBoxMultiply(BinaryOperator &I) {
   return nullptr;
 }
 
-// Relating to the i128::midpoint in rust nightly 
+// Relating to the i128::midpoint in rust nightly
 // we're finding this expression: ((Y ^ X) >> 1 + (Y & X))
 // with the i128 data type and reducing the amount of asm instructions
 // generated.
@@ -1494,29 +1494,32 @@ static Instruction *foldMidpointExpression(BinaryOperator &I) {
     return nullptr;
 
   if (!match(&I, m_Add(m_LShr(m_Xor(m_Value(X), m_Value(Y)), m_ConstantInt()),
-                      m_c_And(m_Deferred(X), m_Deferred(Y)))))
+                       m_c_And(m_Deferred(X), m_Deferred(Y)))))
     return nullptr;
 
-  IRBuilder<> Builder(&I); 
+  IRBuilder<> Builder(&I);
   Module *Mod = I.getModule();
 
   // Create the call llvm.uadd.with.overflow.i128
-  Function *UAddWithOverflow = Intrinsic::getDeclaration(Mod, Intrinsic::uadd_with_overflow, 
-      Type::getInt128Ty(Mod->getContext()));
+  Function *UAddWithOverflow = Intrinsic::getDeclaration(
+      Mod, Intrinsic::uadd_with_overflow, Type::getInt128Ty(Mod->getContext()));
   CallInst *UAddCall = Builder.CreateCall(UAddWithOverflow, {Y, X});
-  UAddCall->setTailCall();  // Mark the call as a tail call
-  
-  // Extract the sum and the 
+  UAddCall->setTailCall(); // Mark the call as a tail call
+
+  // Extract the sum and the
   Value *Sum = Builder.CreateExtractValue(UAddCall, 0);
   Value *Overflow = Builder.CreateExtractValue(UAddCall, 1);
 
   // Create the right shift for the sum element of
   // overflow
-  Value *LShrVal = Builder.CreateLShr(Sum, ConstantInt::get(Type::getInt128Ty(I.getContext()), 1));
+  Value *LShrVal = Builder.CreateLShr(
+      Sum, ConstantInt::get(Type::getInt128Ty(I.getContext()), 1));
 
   // Create the select instruction
-  Value *SelectVal = Builder.CreateSelect(Overflow,
-      ConstantInt::get(Type::getInt128Ty(I.getContext()), APInt(128, 1).shl(127)), 
+  Value *SelectVal = Builder.CreateSelect(
+      Overflow,
+      ConstantInt::get(Type::getInt128Ty(I.getContext()),
+                       APInt(128, 1).shl(127)),
       ConstantInt::get(Type::getInt128Ty(I.getContext()), 0));
 
   Instruction *OrVal = BinaryOperator::CreateOr(LShrVal, SelectVal);

@wizardengineer wizardengineer requested a review from dtcxzyw July 19, 2024 22:03
@wizardengineer
Copy link
Contributor Author

Hey @dtcxzyw, not to be a bother but I was wondering was there anything else i needed to change? :)

@dtcxzyw
Copy link
Member

dtcxzyw commented Jul 20, 2024

I am not sure whether it is appropriate to do this transform in InstCombine. It would be better to do this in SDAG.
cc @RKSimon @topperc

@wizardengineer
Copy link
Contributor Author

I am not sure whether it is appropriate to do this transform in InstCombine. It would be better to do this in SDAG.

Ah, I see. Just curious, why would I put this in SDAG rather than InstCombine?

@RKSimon
Copy link
Collaborator

RKSimon commented Jul 20, 2024

This is just a avgflooru node, which DAG can already recognise and handle.

Optimized lowered selection DAG: %bb.0 'src:start'
SelectionDAG has 22 nodes:
  t0: ch,glue = EntryToken
    t20: i64 = extract_element t27, Constant:i64<0>
  t23: ch,glue = CopyToReg t0, Register:i64 $rax, t20
    t18: i64 = extract_element t27, Constant:i64<1>
  t25: ch,glue = CopyToReg t23, Register:i64 $rdx, t18, t23:1
      t6: i64,ch = CopyFromReg t0, Register:i64 %2
      t8: i64,ch = CopyFromReg t0, Register:i64 %3
    t10: i128 = build_pair t6, t8
      t2: i64,ch = CopyFromReg t0, Register:i64 %0
      t4: i64,ch = CopyFromReg t0, Register:i64 %1
    t9: i128 = build_pair t2, t4
  t27: i128 = avgflooru t10, t9
  t26: ch = X86ISD::RET_GLUE t25, TargetConstant:i32<0>, Register:i64 $rax, Register:i64 $rdx, t25:1

What we're missing is better legalization handling in DAGTypeLegalizer::ExpandIntRes_AVG - your code expansion should work pretty well in there instead of InstCombine, where we shouldn't be making assumptions about type legality.

@wizardengineer
Copy link
Contributor Author

I understand, thank you for taking the time to explain it to me. I'll close this PR

@RKSimon
Copy link
Collaborator

RKSimon commented Jul 22, 2024

@medievalghoul Are you happy to work on a DAG equivalent?

@wizardengineer
Copy link
Contributor Author

@RKSimon I'd be more than happy to work on a DAG equivalent, I just started today.

@wizardengineer wizardengineer deleted the 07-19-_instcombine_reducing_rust_i128_midpoint_instructions branch July 28, 2024 23:37
topperc pushed a commit that referenced this pull request Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants