[llvm][dfa-jump-threading] Allow DFAJumpThreading with optsize #83049

ilovepi · 2024-02-26T19:16:04Z

Though DFAJumpThreadingPass typically introduces size increases, targets
optimizing for size may want to enable the pass for certain TUs and
workloads. This change adds options to override the default behavior,
and enable the DFAJumpThreadingPass to run, even when compiling for
size.

To experiment with DFAJumpThreading and with minsize,
pass -enable-dfa-jump-optsize to the backend.

Created using spr 1.3.4

optimizing for size Pull Request: llvm#83049

Created using spr 1.3.4

ilovepi · 2024-02-26T22:29:01Z

llvm/lib/Passes/PassBuilderPipelines.cpp

+static cl::opt<bool>
+    DFAJumpThreadingOptSize("dfa-jump-thread-optsize",
+                           cl::desc("Enable DFA jump threading when optimizing for size"),
+                           cl::init(false), cl::Hidden);
+


Its rather unfortunate that we need 2 flags to enable a single behavior ... Is there a common place we can put this flag and use it here and in DFAJumpThreading.cpp?

ilovepi · 2024-02-26T22:31:51Z

cc: @petrhosek @PiJoules since I think they're interested.

aeubanks · 2024-02-26T22:43:35Z

Can we have a more principled approach to determine when to allow this pass to run than a cl::opt? For example, change the CostThreshold in the pass depending on whether or not the function has optsize.

ilovepi · 2024-02-26T23:00:10Z

Can we have a more principled approach to determine when to allow this pass to run than a cl::opt? For example, change the CostThreshold in the pass depending on whether or not the function has optsize.

Sure. I can take a look at doing it that way instead. it will probably be much cleaner.

nikic · 2024-02-27T06:08:58Z

This pass is not part of the default pipeline -- do you enable this pass downstream, or do you want to add options for all uses of minsize, even if there is no evidence of usefulness?

ilovepi · 2024-02-27T06:56:36Z

This pass is not part of the default pipeline -- do you enable this pass downstream, or do you want to add options for all uses of minsize, even if there is no evidence of usefulness?

We’d like to see if this can be useful to some of our size constrained users in the embedded space. Particularly because it has been proposed to turn this on in the default pipeline. https://discourse.llvm.org/t/rfc-enable-dfa-jumpthreading-pass-by-default/77231. Currently we don’t have evidence that this is beneficial, but for these targets we also don’t have a way to vet that with the current implementation.

So to be concrete, I’d like the ability to enable this pass somehow even if compiling for size. If we can control this behavior with a threshold, then mechanically I don’t see much difference between that an the flags I’ve added in this patch. But right now we don’t have a way to evaluate if this may be useful at any threshold, and I think it’s useful if we can turn this knob in some way without changing the default behavior.

aeubanks · 2024-02-27T16:45:25Z

also, we should remove pipeline checks for "optsize" and move those checks into the passes themselves

…ilder Created using spr 1.3.4

ilovepi · 2024-02-27T21:26:31Z

also, we should remove pipeline checks for "optsize" and move those checks into the passes themselves

done in #83188

Created using spr 1.3.4

llvmbot · 2024-02-28T18:53:49Z

@llvm/pr-subscribers-llvm-transforms

Author: Paul Kirth (ilovepi)

Changes

Though DFAJumpThreadingPass typically introduces size increases, targets
optimizing for size may want to enable the pass for certain TUs and
workloads. This change adds options to override the default behavior,
and enable the DFAJumpThreadingPass to run, even when compiling for
size.

Full diff: https://github.com/llvm/llvm-project/pull/83049.diff

3 Files Affected:

(modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+5)
(modified) llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp (+6-1)
(modified) llvm/test/Transforms/DFAJumpThreading/negative.ll (+49)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index cbbbec0ccc8c4d..3b2d4523ef7e8e 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -222,6 +222,11 @@ static cl::opt<bool>
     EnableDFAJumpThreading("enable-dfa-jump-thread",
                            cl::desc("Enable DFA jump threading"),
                            cl::init(false), cl::Hidden);
+static cl::opt<bool>
+    DFAJumpThreadingOptSize("dfa-jump-thread-optsize",
+                           cl::desc("Enable DFA jump threading when optimizing for size"),
+                           cl::init(false), cl::Hidden);
+
 
 // TODO: turn on and remove flag
 static cl::opt<bool> EnablePGOForceFunctionAttrs(
diff --git a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
index 85d4065286e41f..ce5e823e531e0f 100644
--- a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
+++ b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
@@ -110,6 +110,11 @@ static cl::opt<unsigned>
                   cl::desc("Maximum cost accepted for the transformation"),
                   cl::Hidden, cl::init(50));
 
+static cl::opt<bool> DFAJumpThreadIgnoreOptSize(
+    "dfa-jump-ignore-optsize",
+    cl::desc("Enable dfa jump threading, even when optimizing for size"),
+    cl::Hidden, cl::init(false));
+
 namespace {
 
 class SelectInstToUnfold {
@@ -1244,7 +1249,7 @@ struct TransformDFA {
 bool DFAJumpThreading::run(Function &F) {
   LLVM_DEBUG(dbgs() << "\nDFA Jump threading: " << F.getName() << "\n");
 
-  if (F.hasOptSize()) {
+  if (!DFAJumpThreadIgnoreOptSize && F.hasOptSize()) {
     LLVM_DEBUG(dbgs() << "Skipping due to the 'minsize' attribute\n");
     return false;
   }
diff --git a/llvm/test/Transforms/DFAJumpThreading/negative.ll b/llvm/test/Transforms/DFAJumpThreading/negative.ll
index a9642814276992..c822b6613f7334 100644
--- a/llvm/test/Transforms/DFAJumpThreading/negative.ll
+++ b/llvm/test/Transforms/DFAJumpThreading/negative.ll
@@ -1,6 +1,7 @@
 ; RUN: opt -passes=dfa-jump-threading -dfa-cost-threshold=25 -pass-remarks-missed='dfa-jump-threading' -pass-remarks-output=%t -disable-output %s
 ; RUN: FileCheck --input-file %t --check-prefix=REMARK %s
 ; RUN: opt -S -passes=dfa-jump-threading %s | FileCheck %s
+; RUN: opt -S -passes=dfa-jump-threading -dfa-jump-ignore-optsize %s | FileCheck %s --check-prefix=IGNORESIZE
 
 ; This negative test case checks that the optimization doesn't trigger
 ; when the code size cost is too high.
@@ -186,6 +187,54 @@ define i32 @negative5(i32 %num) minsize {
 ; CHECK:       for.end:
 ; CHECK-NEXT:    ret i32 0
 ;
+; IGNORESIZE-LABEL: define i32 @negative5(
+; IGNORESIZE-SAME: i32 [[NUM:%.*]]) #[[ATTR0:[0-9]+]] {
+; IGNORESIZE-NEXT:  entry:
+; IGNORESIZE-NEXT:    br label [[FOR_BODY:%.*]]
+; IGNORESIZE:       for.body:
+; IGNORESIZE-NEXT:    [[COUNT:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
+; IGNORESIZE-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ poison, [[FOR_INC]] ]
+; IGNORESIZE-NEXT:    switch i32 [[STATE]], label [[FOR_INC_JT1:%.*]] [
+; IGNORESIZE-NEXT:    i32 1, label [[CASE1:%.*]]
+; IGNORESIZE-NEXT:    i32 2, label [[CASE2:%.*]]
+; IGNORESIZE-NEXT:    ]
+; IGNORESIZE:       for.body.jt2:
+; IGNORESIZE-NEXT:    [[COUNT_JT2:%.*]] = phi i32 [ [[INC_JT2:%.*]], [[FOR_INC_JT2:%.*]] ]
+; IGNORESIZE-NEXT:    [[STATE_JT2:%.*]] = phi i32 [ [[STATE_NEXT_JT2:%.*]], [[FOR_INC_JT2]] ]
+; IGNORESIZE-NEXT:    br label [[CASE2]]
+; IGNORESIZE:       for.body.jt1:
+; IGNORESIZE-NEXT:    [[COUNT_JT1:%.*]] = phi i32 [ [[INC_JT1:%.*]], [[FOR_INC_JT1]] ]
+; IGNORESIZE-NEXT:    [[STATE_JT1:%.*]] = phi i32 [ [[STATE_NEXT_JT1:%.*]], [[FOR_INC_JT1]] ]
+; IGNORESIZE-NEXT:    br label [[CASE1]]
+; IGNORESIZE:       case1:
+; IGNORESIZE-NEXT:    [[COUNT2:%.*]] = phi i32 [ [[COUNT_JT1]], [[FOR_BODY_JT1:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; IGNORESIZE-NEXT:    br label [[FOR_INC_JT2]]
+; IGNORESIZE:       case2:
+; IGNORESIZE-NEXT:    [[COUNT1:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; IGNORESIZE-NEXT:    [[CMP:%.*]] = icmp eq i32 [[COUNT1]], 50
+; IGNORESIZE-NEXT:    br i1 [[CMP]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE:%.*]]
+; IGNORESIZE:       si.unfold.false:
+; IGNORESIZE-NEXT:    br label [[FOR_INC_JT2]]
+; IGNORESIZE:       for.inc:
+; IGNORESIZE-NEXT:    [[INC]] = add nsw i32 undef, 1
+; IGNORESIZE-NEXT:    [[CMP_EXIT:%.*]] = icmp slt i32 [[INC]], [[NUM]]
+; IGNORESIZE-NEXT:    br i1 [[CMP_EXIT]], label [[FOR_BODY]], label [[FOR_END:%.*]]
+; IGNORESIZE:       for.inc.jt2:
+; IGNORESIZE-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT1]], [[SI_UNFOLD_FALSE]] ], [ [[COUNT2]], [[CASE1]] ]
+; IGNORESIZE-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ 2, [[SI_UNFOLD_FALSE]] ]
+; IGNORESIZE-NEXT:    [[INC_JT2]] = add nsw i32 [[COUNT4]], 1
+; IGNORESIZE-NEXT:    [[CMP_EXIT_JT2:%.*]] = icmp slt i32 [[INC_JT2]], [[NUM]]
+; IGNORESIZE-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2]], label [[FOR_END]]
+; IGNORESIZE:       for.inc.jt1:
+; IGNORESIZE-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT1]], [[CASE2]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; IGNORESIZE-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[CASE2]] ], [ 1, [[FOR_BODY]] ]
+; IGNORESIZE-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT3]], 1
+; IGNORESIZE-NEXT:    [[CMP_EXIT_JT1:%.*]] = icmp slt i32 [[INC_JT1]], [[NUM]]
+; IGNORESIZE-NEXT:    br i1 [[CMP_EXIT_JT1]], label [[FOR_BODY_JT1]], label [[FOR_END]]
+; IGNORESIZE:       for.end:
+; IGNORESIZE-NEXT:    ret i32 0
+;
+
 entry:
   br label %for.body

ilovepi · 2024-02-28T21:36:37Z

Can we have a more principled approach to determine when to allow this pass to run than a cl::opt? For example, change the CostThreshold in the pass depending on whether or not the function has optsize.

So, I've experimented with this, and I'm not sure this is the right approach to take right away. I think its premature to suppose that at -Oz we would ever want this pass on by default. I don't think its a good idea to try and guess what the threshold should be. Likely, that is something specific targets would need to tune on their own via -dfa-cost-threshold=. I don't see a better way to allow us to experiment with this, than adding this flag, which is simple(allow the pass w/ optsize), and we can still use -dfa-cost-threshold= to determine what we should choose.

I'd like to avoid running afoul of this pass being on by default in the future for size optimized targets when we're still in the experimental stage of evaluating this passes usefulness when combined with size optimizations. I agree that if we believe this is a good default in the future, we should adjust the logic to modify the threshold for optsize, but right now I don't have confidence that is the correct way forward.

I'm looking to see if there is a way to use -enable-dfa-jump-threading to control this behavior, instead of introducing a new cl::opt.

Created using spr 1.3.4

ilovepi · 2024-02-28T22:34:19Z

So, if DFAJumpThreading is ever on by default, then I think its pretty simple to repurpose -enable-dfa-jump-thread to enable the behavior in the pass, and we can drop -enable-dfa-jump-optsize @aeubanks, does that sound like a better plan?

aeubanks · 2024-03-05T23:10:25Z

I don't think its a good idea to try and guess what the threshold should be. Likely, that is something specific targets would need to tune on their own via -dfa-cost-threshold=.
The inliner has a target-independent threshold and it's much more important than this pass, so I don't see why this pass can't do the same thing.

As for experimenting, I was hoping that you could just experiment locally with local changes to LLVM. Is it much easier to experiment with a flag in LLVM release? If so then this is fine.

This pass is already experimental (given it's not enabled by default) so I'm not too worried about regressing it in terms of binary size.

nikic · 2024-03-06T08:36:58Z

As for experimenting, I was hoping that you could just experiment locally with local changes to LLVM. Is it much easier to experiment with a flag in LLVM release? If so then this is fine.

Agree that experimentation should be done with local changes to LLVM.

Options like these tend to get added and never removed, even if they have outlived their usefulness. I could kind of see the rationale for adding it to LoopRotate (which is well-known to be problematic in this regard, and it's plausible to assume that more than one organization would have interest in testing the option), but I wouldn't want to encourage this as a general pattern for exploration of minsize optimization changes.

ilovepi · 2024-03-19T20:26:48Z

Well, since I'm not going to have time to work on this in the near future, and it will be quite a while before I can roll out an experimental toolchain to partners, I'm going to close this for now. Its easy enough to re-open when we have better feeling for its value, but I do still think its good to have a way to opt into a transform irrespective of the optlevel.

[𝘀𝗽𝗿] initial version

f3009e8

Created using spr 1.3.4

ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Feb 26, 2024

[llvm][dfa-jump-threading] Add option to allow DFAJumpThreading when

2abc273

optimizing for size Pull Request: llvm#83049

Refactor option names, and update test

fc2e672

Created using spr 1.3.4

ilovepi changed the title ~~[llvm][dfa-jump-threading] Add option to allow DFAJumpThreading when~~ [llvm][dfa-jump-threading] Allow DFAJumpThreading with optsize Feb 26, 2024

Make variable static + rename

ffaa39b

Created using spr 1.3.4

ilovepi commented Feb 26, 2024

View reviewed changes

ilovepi requested review from nikic, aeubanks and fhahn February 26, 2024 22:30

Revise flag handling after removing the optsize check from PipelineBu…

f2b0c0d

…ilder Created using spr 1.3.4

Fix missing match in test, due to a deleted line

72ecbcc

Created using spr 1.3.4

ilovepi marked this pull request as ready for review February 28, 2024 18:53

llvmbot added the llvm:transforms label Feb 28, 2024

ilovepi mentioned this pull request Feb 28, 2024

[llvm] Remove pipeline checks for optsize for DFAJumpThreadingPass #83318

Merged

Update option name for clarity

6049b21

Created using spr 1.3.4

ilovepi closed this Mar 19, 2024

ilovepi deleted the users/ilovepi/spr/llvmdfa-jump-threading-add-option-to-allow-dfajumpthreading-when branch March 19, 2024 20:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[llvm][dfa-jump-threading] Allow DFAJumpThreading with optsize #83049

[llvm][dfa-jump-threading] Allow DFAJumpThreading with optsize #83049

Uh oh!

ilovepi commented Feb 26, 2024 •

edited

Loading

Uh oh!

ilovepi Feb 26, 2024

Uh oh!

ilovepi commented Feb 26, 2024

Uh oh!

aeubanks commented Feb 26, 2024

Uh oh!

ilovepi commented Feb 26, 2024

Uh oh!

nikic commented Feb 27, 2024

Uh oh!

ilovepi commented Feb 27, 2024 •

edited

Loading

Uh oh!

aeubanks commented Feb 27, 2024

Uh oh!

ilovepi commented Feb 27, 2024

Uh oh!

llvmbot commented Feb 28, 2024

Uh oh!

ilovepi commented Feb 28, 2024

Uh oh!

ilovepi commented Feb 28, 2024

Uh oh!

aeubanks commented Mar 5, 2024

Uh oh!

nikic commented Mar 6, 2024

Uh oh!

ilovepi commented Mar 19, 2024

Uh oh!

Uh oh!

[llvm][dfa-jump-threading] Allow DFAJumpThreading with optsize #83049

[llvm][dfa-jump-threading] Allow DFAJumpThreading with optsize #83049

Uh oh!

Conversation

ilovepi commented Feb 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ilovepi Feb 26, 2024

Choose a reason for hiding this comment

Uh oh!

ilovepi commented Feb 26, 2024

Uh oh!

aeubanks commented Feb 26, 2024

Uh oh!

ilovepi commented Feb 26, 2024

Uh oh!

nikic commented Feb 27, 2024

Uh oh!

ilovepi commented Feb 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aeubanks commented Feb 27, 2024

Uh oh!

ilovepi commented Feb 27, 2024

Uh oh!

llvmbot commented Feb 28, 2024

Uh oh!

ilovepi commented Feb 28, 2024

Uh oh!

ilovepi commented Feb 28, 2024

Uh oh!

aeubanks commented Mar 5, 2024

Uh oh!

nikic commented Mar 6, 2024

Uh oh!

ilovepi commented Mar 19, 2024

Uh oh!

Uh oh!

ilovepi commented Feb 26, 2024 •

edited

Loading

ilovepi commented Feb 27, 2024 •

edited

Loading