[DAGCombine] Add all users of the instruction recursively into worklist when an instruction is simplified #91772
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-x86
Author: Shengchen Kan (KanRobert)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/91772.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 4589d201d6203..796264394c046 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -205,8 +205,10 @@ namespace {
/// When an instruction is simplified, add all users of the instruction to
/// the work lists because they might get more simplified now.
void AddUsersToWorklist(SDNode *N) {
- for (SDNode *Node : N->uses())
+ for (SDNode *Node : N->uses()) {
AddToWorklist(Node);
+ AddUsersToWorklist(Node);
+ }
}
/// Convenient shorthand to add a node and all of its user to the worklist.
diff --git a/llvm/test/CodeGen/X86/addcarry.ll b/llvm/test/CodeGen/X86/addcarry.ll
index f8d32fc2d2925..3895d3a51b366 100644
--- a/llvm/test/CodeGen/X86/addcarry.ll
+++ b/llvm/test/CodeGen/X86/addcarry.ll
@@ -317,21 +317,13 @@ define %S @readd(ptr nocapture readonly %this, %S %arg.b) nounwind {
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movq %rdi, %rax
; CHECK-NEXT: addq (%rsi), %rdx
-; CHECK-NEXT: movq 8(%rsi), %rdi
-; CHECK-NEXT: adcq $0, %rdi
-; CHECK-NEXT: setb %r10b
-; CHECK-NEXT: movzbl %r10b, %r10d
-; CHECK-NEXT: addq %rcx, %rdi
-; CHECK-NEXT: adcq 16(%rsi), %r10
-; CHECK-NEXT: setb %cl
-; CHECK-NEXT: movzbl %cl, %ecx
-; CHECK-NEXT: addq %r8, %r10
-; CHECK-NEXT: adcq 24(%rsi), %rcx
-; CHECK-NEXT: addq %r9, %rcx
-; CHECK-NEXT: movq %rdx, (%rax)
-; CHECK-NEXT: movq %rdi, 8(%rax)
-; CHECK-NEXT: movq %r10, 16(%rax)
-; CHECK-NEXT: movq %rcx, 24(%rax)
+; CHECK-NEXT: adcq 8(%rsi), %rcx
+; CHECK-NEXT: adcq 16(%rsi), %r8
+; CHECK-NEXT: adcq 24(%rsi), %r9
+; CHECK-NEXT: movq %rdx, (%rdi)
+; CHECK-NEXT: movq %rcx, 8(%rdi)
+; CHECK-NEXT: movq %r8, 16(%rdi)
+; CHECK-NEXT: movq %r9, 24(%rdi)
; CHECK-NEXT: retq
entry:
%0 = extractvalue %S %arg.b, 0
@@ -422,14 +414,9 @@ define i128 @addcarry_to_subcarry(i64 %a, i64 %b) nounwind {
; CHECK: # %bb.0:
; CHECK-NEXT: movq %rdi, %rax
; CHECK-NEXT: cmpq %rsi, %rdi
-; CHECK-NEXT: notq %rsi
+; CHECK-NEXT: sbbq %rsi, %rax
; CHECK-NEXT: setae %cl
-; CHECK-NEXT: addb $-1, %cl
-; CHECK-NEXT: adcq $0, %rax
-; CHECK-NEXT: setb %cl
; CHECK-NEXT: movzbl %cl, %edx
-; CHECK-NEXT: addq %rsi, %rax
-; CHECK-NEXT: adcq $0, %rdx
; CHECK-NEXT: retq
%notb = xor i64 %b, -1
%notb128 = zext i64 %notb to i128
This affects many test cases. I updated only one, to show that the change can improve the generated code. However, the change will increase compile time.
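To make the compile-time concern concrete, here is a minimal sketch on a toy graph, not on LLVM's `SDNode`. It assumes a worklist that de-duplicates its entries while the recursion itself is not memoized, which is the shape of the change in the diff. All names (`ToyDAG`, `ToyWorklist`, `makeDiamondChain`) are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Toy DAG: Users[i] lists the nodes that use node i. Not LLVM's SDNode.
using ToyDAG = std::vector<std::vector<std::size_t>>;

// Chain of diamonds: join node 3*i has users 3*i+1 and 3*i+2, and both of
// those are used by the next join node 3*i+3.
ToyDAG makeDiamondChain(std::size_t Levels) {
  ToyDAG Users(3 * Levels + 1);
  for (std::size_t I = 0; I < Levels; ++I) {
    Users[3 * I] = {3 * I + 1, 3 * I + 2};
    Users[3 * I + 1] = {3 * I + 3};
    Users[3 * I + 2] = {3 * I + 3};
  }
  return Users;
}

struct ToyWorklist {
  const ToyDAG &Users;
  std::set<std::size_t> Queued; // de-duplicated pending nodes
  std::uint64_t Calls = 0;      // number of helper invocations

  explicit ToyWorklist(const ToyDAG &G) : Users(G) {}

  // Shape of the helper before the patch: queue direct users only.
  void addUsers(std::size_t N) {
    assert(N < Users.size());
    ++Calls;
    for (std::size_t U : Users[N])
      Queued.insert(U);
  }

  // Shape of the helper after the patch: queue each user, then recurse.
  // The recursion is not memoized, so every path through the DAG is walked.
  void addUsersRecursively(std::size_t N) {
    assert(N < Users.size());
    ++Calls;
    for (std::size_t U : Users[N]) {
      Queued.insert(U);
      addUsersRecursively(U);
    }
  }
};
```

On `makeDiamondChain(10)` (31 nodes), `addUsers(0)` runs once and queues 2 nodes, while `addUsersRecursively(0)` queues the other 30 nodes but runs 4093 times, because the call count roughly doubles with every diamond. Whether real DAGs hit this pattern often is not something this sketch can show.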
Maybe we can add a flag.
This is trying to achieve the same thing as the topological DAG patches (https://github.com/RKSimon/llvm-project/tree/perf/topological-dag / #77475). Those patches result in a reduction in compile time (https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=RKSimon). They have the same problem as this PR: massive test churn, including a large number of DAG combines that need fixing because they haven't always had to take into account that their operands might already have been combined further. I'm not sure how best to split this work, tbh.
Question: what's the motivation for https://github.com/RKSimon/llvm-project/tree/perf/topological-dag / #77475? To reduce compile time?
Not primarily, at least. The motivation for that change is the same as the motivation for this pull request, just implemented properly. If we visit nodes in the correct order, then there is no need to requeue them recursively.
(I tried to measure the compile-time impact of this patch, but it's not even possible, because this basically makes the time to compile stage2 clang infinite.)
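As a rough illustration of "visiting nodes in the correct order", the sketch below computes an operands-first order on a toy graph with Kahn's algorithm, so that each node is handled once, after everything it depends on. This is an assumption about what such an order looks like in general, not a description of how the topological-dag branch is implemented; `ToyDAG` and `operandsFirstOrder` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Toy DAG: Users[i] lists the nodes that use node i. Not LLVM's SDNode.
using ToyDAG = std::vector<std::vector<std::size_t>>;

// Kahn's algorithm: returns the node indices so that every node appears
// after all of its operands. Each node is emitted exactly once.
std::vector<std::size_t> operandsFirstOrder(const ToyDAG &Users) {
  // Count, for every node, how many operand edges point at it.
  std::vector<std::size_t> PendingOperands(Users.size(), 0);
  for (const auto &NodeUsers : Users)
    for (std::size_t U : NodeUsers) {
      assert(U < Users.size());
      ++PendingOperands[U];
    }

  // Nodes without operands can be visited immediately.
  std::queue<std::size_t> Ready;
  for (std::size_t N = 0; N < Users.size(); ++N)
    if (PendingOperands[N] == 0)
      Ready.push(N);

  std::vector<std::size_t> Order;
  while (!Ready.empty()) {
    std::size_t N = Ready.front();
    Ready.pop();
    Order.push_back(N);
    // A user becomes ready once its last operand has been visited.
    for (std::size_t U : Users[N])
      if (--PendingOperands[U] == 0)
        Ready.push(U);
  }
  return Order;
}
```

With this order, a node is only reached after its operands have been processed, so simplifying an operand does not require pushing the whole transitive set of users back onto a worklist.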