Skip to content

[mlir][dataflow] Fix for integer range analysis propagation bug #93199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 28, 2024

Conversation

sabauma
Copy link
Contributor

@sabauma sabauma commented May 23, 2024

Integer range analysis will not update the range of an operation when any of the inferred input lattices are uninitialized. In the current behavior, all lattice values for non integer types are uninitialized.

For operations like arith.cmpf

%3 = arith.cmpf ugt, %arg0, %arg1 : f32

that will result in the range of the output also being uninitialized, and so on for any consumer of the arith.cmpf result. When control-flow ops are involved, the lack of propagation results in incorrect ranges, as the back edges for loop carried values are not properly joined with the definitions from the body region.

For example, an scf.while loop whose body region produces a value that is in a dataflow relationship with some floating-point values through an arith.cmpf operation:

func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) {
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index

  %3 = arith.cmpf ugt, %arg0, %arg1 : f32

  %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) {
    %2 = arith.cmpi ult, %arg2, %c4 : index
    scf.condition(%2) %arg2, %arg3 : index, index
  } do {
  ^bb0(%arg2: index, %arg3: index):
    %4 = arith.select %3, %arg3, %arg3 : index
    %5 = arith.addi %arg2, %c1 : index
    scf.yield %5, %4 : index, index
  }

  return %1#0, %1#1 : index, index
}

The existing behavior results in the control condition %2 being optimized to true, turning the while loop into an infinite loop. The update to %arg2 through the body region is never factored into the range calculation, as the ranges for the body ops all test as uninitialized.

This change causes all values initialized with setToEntryState to be set to some initialized range, even if the values are not integers.

@sabauma sabauma requested review from Mogball and ubfx May 23, 2024 14:38
@sabauma sabauma self-assigned this May 23, 2024
@llvmbot
Copy link
Member

llvmbot commented May 23, 2024

@llvm/pr-subscribers-mlir-index
@llvm/pr-subscribers-mlir-gpu
@llvm/pr-subscribers-mlir-arith

@llvm/pr-subscribers-mlir

Author: Spenser Bauman (sabauma)

Changes

Integer range analysis will not update the range of an operation when any of the inferred input lattices are uninitialized. In the current behavior, all lattice values for non integer types are uninitialized.

For operations like arith.cmpf

%3 = arith.cmpf ugt, %arg0, %arg1 : f32

that will result in the range of the output also being uninitialized, and so on for any consumer of the arith.cmpf result. When control-flow ops are involved, the lack of propagation results in incorrect ranges, as the back edges for loop carried values are not properly joined with the definitions from the body region.

For example, an scf.while loop whose body region produces a value that is in a dataflow relationship with some floating-point values through an arith.cmpf operation:

func.func @<!-- -->test_bad_range(%arg0: f32, %arg1: f32) -&gt; (index, index) {
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index

  %3 = arith.cmpf ugt, %arg0, %arg1 : f32

  %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -&gt; (index, index) {
    %2 = arith.cmpi ult, %arg2, %c4 : index
    scf.condition(%2) %arg2, %arg3 : index, index
  } do {
  ^bb0(%arg2: index, %arg3: index):
    %4 = arith.select %3, %arg3, %arg3 : index
    %5 = arith.addi %arg2, %c1 : index
    scf.yield %5, %4 : index, index
  }

  return %1#<!-- -->0, %1#<!-- -->1 : index, index
}

The existing behavior results in the control condition %2 being optimized to true, turning the while loop into an infinite loop. The update to %arg2 through the body region is never factored into the range calculation, as the ranges for the body ops all test as uninitialized.

This change causes all values initialized with setToEntryState to be set to some initialized range, even if the values are not integers.


Full diff: https://github.com/llvm/llvm-project/pull/93199.diff

2 Files Affected:

  • (modified) mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp (-2)
  • (modified) mlir/test/Dialect/Arith/int-range-interface.mlir (+18)
diff --git a/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp b/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
index a82c30717e275..b69b2e0416209 100644
--- a/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
+++ b/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
@@ -38,8 +38,6 @@ using namespace mlir::dataflow;
 
 IntegerValueRange IntegerValueRange::getMaxRange(Value value) {
   unsigned width = ConstantIntRanges::getStorageBitwidth(value.getType());
-  if (width == 0)
-    return {};
   APInt umin = APInt::getMinValue(width);
   APInt umax = APInt::getMaxValue(width);
   APInt smin = width != 0 ? APInt::getSignedMinValue(width) : umin;
diff --git a/mlir/test/Dialect/Arith/int-range-interface.mlir b/mlir/test/Dialect/Arith/int-range-interface.mlir
index 5b538197a0c11..fdeb8a2e6c935 100644
--- a/mlir/test/Dialect/Arith/int-range-interface.mlir
+++ b/mlir/test/Dialect/Arith/int-range-interface.mlir
@@ -899,3 +899,21 @@ func.func @test_shl_i8_nowrap() -> i8 {
   %2 = test.reflect_bounds %1 : i8
   return %2: i8
 }
+
+/// A test case to ensure that the ranges for unsupported ops are initialized
+/// properly to maxRange, rather than left uninitialized.
+/// In this test case, the previous behavior would leave the ranges for %a and
+/// %b uninitialized, resulting in arith.cmpf's range not being updated, even
+/// though it has an integer valued result.
+
+// CHECK-LABEL: func @test_cmpf_propagates
+// CHECK: test.reflect_bounds {smax = 2 : index, smin = 1 : index, umax = 2 : index, umin = 1 : index}
+func.func @test_cmpf_propagates(%a: f32, %b: f32) -> index {
+  %c1 = arith.constant 1 : index
+  %c2 = arith.constant 2 : index
+
+  %0 = arith.cmpf ueq, %a, %b : f32
+  %1 = arith.select %0, %c1, %c2 : index
+  %2 = test.reflect_bounds %1 : index
+  func.return %2 : index
+}

@ubfx
Copy link
Member

ubfx commented May 23, 2024

I agree that the control flow issue you have laid out is a problem. But to me it seems like assigning "fake" ranges to non-Integer values might lead to other problems down the line.

Maybe we should think about this check? Why don't we do setAllToEntryStates() on the Operation's results when any of the operands are unitialized?
https://github.com/llvm/llvm-project/blob/main/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp#L76

@krzysz00
Copy link
Contributor

Re that check, I think the point was to prevent having to create fake constant ranges for non-integer inputs.

I think the right solution would be to change the interface from ArrayRef<ConstantIntRanges> to something like ArrayRef<optional<ConstantIntRanges>>

Or we could fake things by, for example, declaring non-integers to be zero-width, for instance.

@sabauma
Copy link
Contributor Author

sabauma commented May 23, 2024

Re that check, I think the point was to prevent having to create fake constant ranges for non-integer inputs.

@krzysz00, yes That check was added explicitly to prevent a crash relating to faking the integer ranges. That problem seems to no longer occur.

Maybe we should think about this check? Why don't we do setAllToEntryStates() on the Operation's results when any of the operands are unitialized?

@ubfx I agree my proposed solution is kind of odd. As @krzysz00 mentioned in his comment, I suspect removing that check is the right approach, but it would require changes to the InferIntRangeInterface. I would be happy to take a stab at that, but it would be a more invasive change.

@ubfx
Copy link
Member

ubfx commented May 23, 2024

Ok, then I'm +1 on changing the interface

Copy link

github-actions bot commented May 23, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@sabauma sabauma force-pushed the integer-range-analysis-fix branch from 2f5a403 to 8713785 Compare May 23, 2024 23:03
Copy link
Contributor

@krzysz00 krzysz00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to LGTM this so late at night, but this seems like a reasonable fix.

(Having an "intrusive" None value is the other option we could think of here ,but I can't see a way to do that to the range type, so ... yeah, optional is what it'll probably be)

This might need an announcement somewhere?

@sabauma
Copy link
Contributor Author

sabauma commented May 24, 2024

(Having an "intrusive" None value is the other option we could think of here ,but I can't see a way to do that to the range type, so ... yeah, optional is what it'll probably be)

@krzysz00 Another option I was considering was to rework the interfaces to take IntegerRangeValue. I think that might clarify the API of the interfaces a little.

@sabauma sabauma force-pushed the integer-range-analysis-fix branch 3 times, most recently from f672570 to 5e23a7f Compare May 24, 2024 16:28
Copy link
Contributor

@krzysz00 krzysz00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to keep the discussion going, I remember keeping ConstantIntRanges and the lattices separate back in the day so as to make the integer range inference usable for cases other than that one dataflow analysis. But that's not a super hard goal and so this might be a reasonable solution.

Re std::function vs function_ref, I don't have a strong opinion there - all I can say is that the intent was to pass in a transient pointer to a callback. Seems reasonable to make that change.

But also, wrt the interface and backwards-compatibily, how about

  1. Provide a new method, possibly with the same name (or if not, something like inferResultRangesFromOptional) that takes ArrayRef<std::optional<ConstantIntRanges>> instead of ArrayRef<ConstantIntRanges>
  2. Implement the optional version in terms of the non-optional version by default: that is, if someone doesn't implement inferResultRanges(ArrayRef<std::optional<ConstantIntRanges>>, SetResultFn), it'll loop through the inputs and, if they're all not none, call inferResultRanges() on the unwrapped values. This gives everyone an upgrade path.

@sabauma
Copy link
Contributor Author

sabauma commented May 25, 2024

@krzysz00 I've applied the suggested changes. Extending the interface resulted in a much cleaner design and I was able to revert most of the surgery done to the dialects using the InferIntRangeInterface.

Provide a new method, possibly with the same name (or if not, something like inferResultRangesFromOptional) that takes ArrayRef<std::optional> instead of ArrayRef

I was not able to use the same name overloaded on different input types. This seems to be a limitation of the interface system.

Spenser Bauman and others added 4 commits May 25, 2024 15:21
Integer range analysis will not update the range of an operation when
any of the inferred input lattices are uninitialized. In the current
behavior, all lattice values for non integer types are uninitialized.

For operations like arith.cmpf

```mlir
%3 = arith.cmpf ugt, %arg0, %arg1 : f32
```

that will result in the range of the output also being uninitialized,
and so on for any consumer of the arith.cmpf result. When control-flow
ops are involved, the lack of propagation results in incorrect ranges,
as the back edges for loop carried values are not properly joined with
the definitions from the body region.

For example, an scf.while loop whose body region produces a value that
is in a dataflow relationship with some floating-point values through
an arith.cmpf operation:

```mlir
func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) {
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index

  %3 = arith.cmpf ugt, %arg0, %arg1 : f32

  %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) {
    %2 = arith.cmpi ult, %arg2, %c4 : index
    scf.condition(%2) %arg2, %arg3 : index, index
  } do {
  ^bb0(%arg2: index, %arg3: index):
    %4 = arith.select %3, %arg3, %arg3 : index
    %5 = arith.addi %arg2, %c1 : index
    scf.yield %5, %4 : index, index
  }

  return %1#0, %1#1 : index, index
}
```

The existing behavior results in the control condition %2 being
optimized to true, turning the while loop into an infinite loop. The
update to %arg2 through the body region is never factored into the range
calculation, as the ranges for the body ops all test as uninitialized.

This change causes all values initialized with setToEntryState to
be set to some initialized range, even if the values are not integers.
Modify the integer range analysis interfaces to handle uninitialized
values by allowing the inferred input ranges to be optional.
IntegerValueRange already exists and encodes the extact information that
we want to represent with OptionalIntRange. This makes the APIs clearer
than passing an std::optional everywhere.
This new method allows downstream implementers to easily opt into the
old behavior while providing an easy way to transition to the more
powerful interface methods.
@sabauma sabauma force-pushed the integer-range-analysis-fix branch from 10ad623 to ef2c812 Compare May 25, 2024 19:58
@sabauma sabauma requested a review from krzysz00 May 28, 2024 12:04
Copy link
Contributor

@krzysz00 krzysz00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad we finally managed to find a good approach here! Thanks for putting up with the endless rounds of revisions.

Approved!

@sabauma sabauma merged commit 6aeea70 into llvm:main May 28, 2024
7 checks passed
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…#93199)

Integer range analysis will not update the range of an operation when
any of the inferred input lattices are uninitialized. In the current
behavior, all lattice values for non integer types are uninitialized.

For operations like arith.cmpf

```mlir
%3 = arith.cmpf ugt, %arg0, %arg1 : f32
```

that will result in the range of the output also being uninitialized,
and so on for any consumer of the arith.cmpf result. When control-flow
ops are involved, the lack of propagation results in incorrect ranges,
as the back edges for loop carried values are not properly joined with
the definitions from the body region.

For example, an scf.while loop whose body region produces a value that
is in a dataflow relationship with some floating-point values through an
arith.cmpf operation:

```mlir
func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) {
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index

  %3 = arith.cmpf ugt, %arg0, %arg1 : f32

  %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) {
    %2 = arith.cmpi ult, %arg2, %c4 : index
    scf.condition(%2) %arg2, %arg3 : index, index
  } do {
  ^bb0(%arg2: index, %arg3: index):
    %4 = arith.select %3, %arg3, %arg3 : index
    %5 = arith.addi %arg2, %c1 : index
    scf.yield %5, %4 : index, index
  }

  return %1#0, %1#1 : index, index
}
```

The existing behavior results in the control condition %2 being
optimized to true, turning the while loop into an infinite loop. The
update to %arg2 through the body region is never factored into the range
calculation, as the ranges for the body ops all test as uninitialized.

This change causes all values initialized with setToEntryState to be set
to some initialized range, even if the values are not integers.

---------

Co-authored-by: Spenser Bauman <sabauma@fastmail>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants