[SCEV] Fix incorrect NUW inference #70521

DaniilSuchkov · 2023-10-27T23:19:16Z

This patch fixes a miscompile in LSR caused by incorrect inference of NUW flag for AddRec: we shouldn't infer no-wrap flags based on a comparison which doesn't fully control the loop exit.

llvmbot · 2023-10-27T23:20:17Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Daniil Suchkov (DaniilSuchkov)

Changes

This patch fixes a miscompile in LSR caused by incorrect inference of NUW flag for AddRec: we shouldn't infer no-wrap flags based on a comparison which doesn't fully control the loop exit.

Full diff: https://github.com/llvm/llvm-project/pull/70521.diff

2 Files Affected:

(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+5)
(modified) llvm/test/Transforms/LoopStrengthReduce/scev-incorrect-nuw-inference.ll (+13-18)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 2368003177e741c..26a32e17f7f9bd8 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -12637,6 +12637,11 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
       const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(ZExt->getOperand());
       if (AR && AR->getLoop() == L && AR->isAffine()) {
         auto canProveNUW = [&]() {
+          // We can use the comparison to infer no-wrap flags only it fully
+          // controls the loop exit.
+          if (!ControlsOnlyExit)
+            return false;
+
           if (!isLoopInvariant(RHS, L))
             return false;
 
diff --git a/llvm/test/Transforms/LoopStrengthReduce/scev-incorrect-nuw-inference.ll b/llvm/test/Transforms/LoopStrengthReduce/scev-incorrect-nuw-inference.ll
index 3a9cb160e88d7bc..1b37b4cd2a2d8b9 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/scev-incorrect-nuw-inference.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/scev-incorrect-nuw-inference.ll
@@ -5,38 +5,33 @@
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128-ni:1-p2:32:8:8:32-ni:2"
 target triple = "x86_64-unknown-linux-gnu"
 
-; FIXME: the returned value should be equal to
-;     zext (trunk (%phi-1) to i16) to i64
-;   or simply
+; The returned value should be equal to
 ;     zext (%phi-1) to i64
-; which means it should be equal to 1209. Currently, due to a bug in SCEV, it's
-; over 65534.
+; or simply 1209.
 define noundef i64 @test() {
 ; CHECK-LABEL: define noundef i64 @test() {
 ; CHECK-NEXT:  bb2:
 ; CHECK-NEXT:    br label [[BB3:%.*]]
 ; CHECK:       bb3:
-; CHECK-NEXT:    [[LSR_IV:%.*]] = phi i64 [ [[LSR_IV_NEXT:%.*]], [[BB10:%.*]] ], [ 0, [[BB2:%.*]] ]
-; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[LSR_IV]], 65535
-; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[LSR_IV]], 65407
-; CHECK-NEXT:    [[ICMP5:%.*]] = icmp ult i64 [[TMP1]], -256
-; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[TMP0]] to i32
-; CHECK-NEXT:    [[ICMP6:%.*]] = icmp ult i32 [[TMP2]], 128
+; CHECK-NEXT:    [[LSR_IV:%.*]] = phi i32 [ [[LSR_IV_NEXT:%.*]], [[BB10:%.*]] ], [ -1, [[BB2:%.*]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[LSR_IV]], 65536
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[TMP0]], 65535
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i32 [[AND]] to i64
+; CHECK-NEXT:    [[ADD4:%.*]] = add nsw i64 [[ZEXT]], -128
+; CHECK-NEXT:    [[ICMP5:%.*]] = icmp ult i64 [[ADD4]], -256
+; CHECK-NEXT:    [[ICMP6:%.*]] = icmp ult i32 [[AND]], 128
 ; CHECK-NEXT:    [[OR:%.*]] = or i1 [[ICMP5]], [[ICMP6]]
 ; CHECK-NEXT:    br i1 [[OR]], label [[BB10]], label [[BB7:%.*]]
 ; CHECK:       bb7:
-; CHECK-NEXT:    [[TMP1:%.*]] = trunc i64 [[LSR_IV]] to i32
+; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[LSR_IV]], 1
 ; CHECK-NEXT:    call void @foo(i32 [[TMP1]])
 ; CHECK-NEXT:    unreachable
 ; CHECK:       bb10:
-; CHECK-NEXT:    [[LSR_IV_NEXT]] = add nuw nsw i64 [[LSR_IV]], 1
-; CHECK-NEXT:    [[TMP2:%.*]] = add i64 [[LSR_IV_NEXT]], -1
-; CHECK-NEXT:    [[TMP:%.*]] = trunc i64 [[TMP2]] to i32
-; CHECK-NEXT:    [[ICMP12:%.*]] = icmp ult i32 [[TMP]], 1210
+; CHECK-NEXT:    [[LSR_IV_NEXT]] = add nsw i32 [[LSR_IV]], 1
+; CHECK-NEXT:    [[ICMP12:%.*]] = icmp ult i32 [[LSR_IV_NEXT]], 1210
 ; CHECK-NEXT:    br i1 [[ICMP12]], label [[BB3]], label [[BB13:%.*]]
 ; CHECK:       bb13:
-; CHECK-NEXT:    [[TMP3:%.*]] = add i64 [[LSR_IV_NEXT]], 65534
-; CHECK-NEXT:    ret i64 [[TMP3]]
+; CHECK-NEXT:    ret i64 [[ZEXT]]
 ;
 bb2:
   br label %bb3

nikic · 2023-10-30T14:13:03Z

Is the problem here that the nuw flag is just entirely incorrect, or that it is correct at that use, but we set it on the addrec in general? If the latter, would it be possible to just not set the addrec flag? That is, make the code if (AR->hasNoUnsignedWrap() || canProveNUW()) and drop the addrec flag manipulation code?

DaniilSuchkov · 2023-10-30T16:38:10Z

Is the problem here that the nuw flag is just entirely incorrect, or that it is correct at that use, but we set it on the addrec in general?

It's entirely incorrect. In the test showing the miscompile have an exit condition that is an OR of two icmps with one being basically an inverse of the other, i.e. it's if (!(cond || !cond)) break;. Currently we infer NUW based on one of the icmps: we assume that if this comparison is false, we'll leave the loop and thus the AddRec won't reach overflow, but that assumption is just wrong.
Also, this exit check dominates the latch, so (in most cases?) whatever is correct for that check, should be as well correct for that AddRec in general.

nikic

LGTM

llvm/lib/Analysis/ScalarEvolution.cpp

Co-authored-by: Nikita Popov <[email protected]>

[SCEV] Fix incorrect NUW inference

ebdff99

This patch fixes a miscompile in LSR caused by incorrect inference of NUW flag for AddRec: we shouldn't infer no-wrap flags based on a comparison which doesn't fully control the loop exit.

DaniilSuchkov requested review from preames and aleks-tmb October 27, 2023 23:19

DaniilSuchkov self-assigned this Oct 27, 2023

DaniilSuchkov requested a review from nikic as a code owner October 27, 2023 23:19

llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Oct 27, 2023

nikic approved these changes Oct 30, 2023

View reviewed changes

llvm/lib/Analysis/ScalarEvolution.cpp Outdated Show resolved Hide resolved

Update llvm/lib/Analysis/ScalarEvolution.cpp

c95933e

Co-authored-by: Nikita Popov <[email protected]>

DaniilSuchkov merged commit 1344b65 into llvm:main Oct 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SCEV] Fix incorrect NUW inference #70521

[SCEV] Fix incorrect NUW inference #70521

Uh oh!

DaniilSuchkov commented Oct 27, 2023

Uh oh!

llvmbot commented Oct 27, 2023 •

edited

Loading

Uh oh!

nikic commented Oct 30, 2023

Uh oh!

DaniilSuchkov commented Oct 30, 2023

Uh oh!

nikic left a comment

Uh oh!

Uh oh!

Uh oh!

[SCEV] Fix incorrect NUW inference #70521

[SCEV] Fix incorrect NUW inference #70521

Uh oh!

Conversation

DaniilSuchkov commented Oct 27, 2023

Uh oh!

llvmbot commented Oct 27, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikic commented Oct 30, 2023

Uh oh!

DaniilSuchkov commented Oct 30, 2023

Uh oh!

nikic left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Oct 27, 2023 •

edited

Loading