[LLVM][AArch64] Correctly lower funnel shifts by constants. #140058

paulwalker-arm · 2025-05-15T13:33:05Z

Prevent LowerFunnelShift from creating an invalid ISD::FSHR when lowering "ISD::FSHL X, Y, 0". Such inputs are rare because it's a NOP that DAGCombiner will optimise away. However, we should not rely on this and so this PR mirrors the same optimisation.

Ensure LowerFunnelShift normalises constant shift amounts because isel rules expect them to be in the range [0, src bit length).

NOTE: To simplify testing, this PR also adds a command line option to disable the DAG combiner (-combiner-disabled).
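To make the shift-by-zero problem concrete, here is a minimal standalone model of 64-bit funnel shifts. It is a sketch and not code from this PR: `fshl64` and `fshr64` are hypothetical helper names, and the only assumption is that the shift amount is interpreted modulo the bit width. The checks show that rewriting a left funnel shift by `c` as a right funnel shift by `64 - c` holds for `c` in [1, 64) but not for `c == 0`.

```cpp
#include <cstdint>

// Hypothetical reference model: concatenate hi:lo, shift left by (amt % 64),
// and return the high 64 bits.
constexpr uint64_t fshl64(uint64_t hi, uint64_t lo, uint64_t amt) {
  unsigned s = static_cast<unsigned>(amt % 64);
  return s == 0 ? hi : (hi << s) | (lo >> (64 - s));
}

// Hypothetical reference model: concatenate hi:lo, shift right by (amt % 64),
// and return the low 64 bits.
constexpr uint64_t fshr64(uint64_t hi, uint64_t lo, uint64_t amt) {
  unsigned s = static_cast<unsigned>(amt % 64);
  return s == 0 ? lo : (hi << (64 - s)) | (lo >> s);
}

constexpr uint64_t kX = 0x1122334455667788ULL;
constexpr uint64_t kY = 0x99AABBCCDDEEFF00ULL;

// For a non-zero amount, fshl by c matches fshr by 64 - c.
static_assert(fshl64(kX, kY, 8) == fshr64(kX, kY, 64 - 8), "valid rewrite");

// For a zero amount the rewrite produces fshr by 64, which wraps to 0 and
// selects the other operand.
static_assert(fshl64(kX, kY, 0) == kX, "fshl by 0 returns the first operand");
static_assert(fshr64(kX, kY, 64 - 0) == kY, "fshr by 64 returns the second operand");
```

The last two assertions are the mismatch this PR guards against: the naive rewrite would replace a value equal to `X` with one equal to `Y`.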

llvmbot · 2025-05-15T13:33:37Z

@llvm/pr-subscribers-backend-aarch64

Author: Paul Walker (paulwalker-arm)

Changes

Prevent LowerFunnelShift from creating an invalid ISD::FSHR when lowering "ISD::FSHL X, Y, 0". Such inputs are rare because it's a NOP that DAGCombiner will optimise away. However, we should not rely on this and so this PR mirrors the same optimisation.

NOTE: To simplify testing, this PR also adds a command line option to disable the DAG combiner (-combiner-disabled).

Full diff: https://github.com/llvm/llvm-project/pull/140058.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+6-1)
(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+6)

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index d6e288a59b2ee..2b752498f64a1 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -149,6 +149,10 @@ static cl::opt<bool> EnableShrinkLoadReplaceStoreWithStore(
     cl::desc("DAG combiner enable load/<replace bytes>/store with "
              "a narrower store"));
 
+static cl::opt<bool> DisableCombines("combiner-disabled", cl::Hidden,
+                                     cl::init(false),
+                                     cl::desc("Disable the DAG combiner"));
+
 namespace {
 
   class DAGCombiner {
@@ -248,7 +252,8 @@ namespace {
           STI(D.getSubtarget().getSelectionDAGInfo()), OptLevel(OL),
           BatchAA(BatchAA) {
       ForCodeSize = DAG.shouldOptForSize();
-      DisableGenericCombines = STI && STI->disableGenericCombines(OptLevel);
+      DisableGenericCombines =
+          DisableCombines || (STI && STI->disableGenericCombines(OptLevel));
 
       MaximumLegalStoreInBits = 0;
       // We use the minimum store size here, since that's all we can guarantee
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index fb7f7d6f7537d..7206a619cb767 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7266,12 +7266,18 @@ static SDValue LowerFunnelShift(SDValue Op, SelectionDAG &DAG) {
     MVT VT = Op.getSimpleValueType();
 
     if (Op.getOpcode() == ISD::FSHL) {
+      if (ShiftNo->isZero())
+        return Op.getOperand(0);
+
       unsigned int NewShiftNo =
           VT.getFixedSizeInBits() - ShiftNo->getZExtValue();
       return DAG.getNode(
           ISD::FSHR, DL, VT, Op.getOperand(0), Op.getOperand(1),
           DAG.getConstant(NewShiftNo, DL, Shifts.getValueType()));
     } else if (Op.getOpcode() == ISD::FSHR) {
+      if (ShiftNo->isZero())
+        return Op.getOperand(1);
+
       return Op;
     }
   }

llvmbot · 2025-05-15T13:33:38Z

@llvm/pr-subscribers-llvm-selectiondag

jayfoad · 2025-05-15T13:53:49Z

This doesn't look right. fshl and fshr interpret the shift amount as modulo the bitwidth of op0 (or op1), so there is no bug here unless you are worried about non-power-of-two bitwidths.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

paulwalker-arm · 2025-05-15T15:46:22Z

Updated to normalise the shift amounts for both fshl and fshr, with new tests that trigger isel failures prior to this PR.

I did try marking ISD::FSHL as needing expansion, but that triggered other asserts and worse code generation so I'd rather get the bug fix out of the way and revisit whether the custom lowering code can be removed at a later date.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

jayfoad

LGTM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

RKSimon

getNode() already handles zero shift amounts for shifts and rotates - why not just add it for fshl/fshr as well?

paulwalker-arm · 2025-05-15T16:41:53Z

> getNode() already handles zero shift amounts for shifts and rotates - why not just add it for fshl/fshr as well?

I'll give it a go and see what drops out.

paulwalker-arm · 2025-05-15T17:34:36Z

@RKSimon - If getNode() performs this canonicalisation, can I be sure there's no way for the lowering code to see fshl(x,y,0)? I'm wondering if the lowering could end up generating the "zero" shift amount without there being a new ISD::FSHL construction to canonicalise it.

If that can happen then the custom lowering code must still handle it?

RKSimon · 2025-05-15T17:57:00Z

If it wasn't constant and a later combine manages to fold the node to a constant then it might not be regenerated/combined and technically you could see it at lowering - DAG combines still aren't properly topologically ordered :(

But I'm not entirely clear what you're trying to handle here - does arm64 only support FSHR nodes with constant shift amounts? What happens with zero / modulo amounts with that instruction that is so bad?

davemgreen · 2025-05-15T18:29:32Z

This is #139866. The sequence of events is: an i128 shl by a load -> i64 shl_parts by the load -> i64 shl_parts by 0 (the load is optimized to a 0, via RAUW I would guess) -> (there is no combine for shl_parts by 0, although the node does get visited) -> lowered to i64 fshl by 0 via custom/expandShiftParts -> the fshl lowering that goes wrong.
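As a rough illustration of why a 128-bit shift ends up as a funnel shift, the sketch below models the small-amount case of a two-part left shift. It is an assumption-laden simplification, not the actual expandShiftParts code: `U128`, `funnelShiftLeft` and `shl128Small` are hypothetical names, and only shift amounts in [0, 64) are handled.

```cpp
#include <cstdint>

// Hypothetical two-part representation of a 128-bit value.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// Hypothetical model of a 64-bit left funnel shift with modulo semantics.
constexpr uint64_t funnelShiftLeft(uint64_t hi, uint64_t lo, unsigned amt) {
  unsigned s = amt % 64;
  return s == 0 ? hi : (hi << s) | (lo >> (64 - s));
}

// Shift a 128-bit value left by amt, valid only for amt in [0, 64).
// The high half is a funnel shift of both halves; the low half is a plain shift.
constexpr U128 shl128Small(U128 v, unsigned amt) {
  return U128{funnelShiftLeft(v.hi, v.lo, amt), v.lo << amt};
}

// Shifting by 1 moves the top bit of the low half into the high half.
static_assert(shl128Small(U128{1, 0x8000000000000000ULL}, 1).hi == 3, "");
static_assert(shl128Small(U128{1, 0x8000000000000000ULL}, 1).lo == 0, "");

// Shifting by 0 must leave the high half untouched.
static_assert(shl128Small(U128{1, 0x8000000000000000ULL}, 0).hi == 1, "");
```

Once the shift amount folds to the constant 0, the high half becomes a funnel shift by 0, which is the node that reached the AArch64 lowering in the sequence described above.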

paulwalker-arm · 2025-05-15T19:50:09Z

> But I'm not entirely clear what you're trying to handle here - does arm64 only support FSHR nodes with constant shift amounts? What happens with zero / modulo amounts with that instruction that is so bad?

Continuing on from what David said above, it's not that FSHR does not like zero, but that the custom lowering code tries to rewrite all FSHLs with constant shift amounts as FSHRs because isel patterns only exist for the latter. The problem is that the transformation is invalid when the shift amount modulo the bit width is 0. Whilst fixing that I spotted that the AArch64 FSHR isel rules only support shift amounts in [0, BitWidth), so I figured I'd fix that as well so that we can be sure all legitimate shift amounts can be lowered to something we can isel.
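The decision logic described here can be sketched without any LLVM types. The following is a hypothetical standalone model, not the merged patch: `Opcode`, `Lowered` and `lowerConstFunnelShift` are names invented for illustration. It reduces the constant amount modulo the bit width, forwards an operand when the reduced amount is zero, and otherwise produces a right funnel shift whose amount is in range.

```cpp
#include <cstdint>

enum class Opcode { FSHL, FSHR };

// Hypothetical description of the lowering result.
struct Lowered {
  enum Kind { Operand0, Operand1, FshrNode } kind;
  unsigned amount; // only meaningful for FshrNode; always in [1, bits)
};

// Model of lowering a funnel shift with a constant amount to FSHR only.
constexpr Lowered lowerConstFunnelShift(Opcode opc, uint64_t amt, unsigned bits) {
  // Normalise first so that out-of-range constants behave like in-range ones.
  unsigned s = static_cast<unsigned>(amt % bits);
  if (s == 0) {
    // A zero shift is a no-op: fshl yields operand 0, fshr yields operand 1.
    return opc == Opcode::FSHL ? Lowered{Lowered::Operand0, 0}
                               : Lowered{Lowered::Operand1, 0};
  }
  // A non-zero fshl by s equals an fshr by (bits - s), which stays in [1, bits).
  if (opc == Opcode::FSHL)
    s = bits - s;
  return Lowered{Lowered::FshrNode, s};
}

static_assert(lowerConstFunnelShift(Opcode::FSHL, 0, 64).kind == Lowered::Operand0, "");
static_assert(lowerConstFunnelShift(Opcode::FSHR, 64, 64).kind == Lowered::Operand1, "");
static_assert(lowerConstFunnelShift(Opcode::FSHL, 1, 64).amount == 63, "");
static_assert(lowerConstFunnelShift(Opcode::FSHR, 65, 64).amount == 1, "");
```

Doing the modulo reduction before the zero check means an amount equal to the bit width is handled the same way as an amount of zero, which covers both issues mentioned in this comment.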

[LLVM][DAGCombiner] Add command line option to disable the combiner.

8dd0065

llvmbot added the backend:AArch64 and llvm:SelectionDAG labels May 15, 2025

paulwalker-arm requested review from arsenm, davemgreen and efriedma-quic May 15, 2025 13:34

paulwalker-arm force-pushed the fshl-fix branch from 87fad5c to 3906037 Compare May 15, 2025 13:35

paulwalker-arm mentioned this pull request May 15, 2025

AArch64 ISel miscompilation from weird Rust MIR #139866

Closed

jayfoad reviewed May 15, 2025

paulwalker-arm force-pushed the fshl-fix branch from 3906037 to 2bfdff8 Compare May 15, 2025 15:37

jayfoad reviewed May 15, 2025

Ensure half element shifts are lowered correctly.

fbad1e5

jayfoad approved these changes May 15, 2025

Remove else.

eba339d

RKSimon reviewed May 15, 2025

paulwalker-arm changed the title ~~[LLVM][AArch64] Correctly lower funnel shifts by zero.~~ [LLVM][AArch64] Correctly lower funnel shifts by constants. May 20, 2025

paulwalker-arm merged commit 5dfaf84 into llvm:main May 20, 2025
11 checks passed

paulwalker-arm deleted the fshl-fix branch May 20, 2025 10:15
