-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[mlir][dataflow] Fix for integer range analysis propagation bug #93199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir-index @llvm/pr-subscribers-mlir Author: Spenser Bauman (sabauma) ChangesInteger range analysis will not update the range of an operation when any of the inferred input lattices are uninitialized. In the current behavior, all lattice values for non integer types are uninitialized. For operations like arith.cmpf %3 = arith.cmpf ugt, %arg0, %arg1 : f32 that will result in the range of the output also being uninitialized, and so on for any consumer of the arith.cmpf result. When control-flow ops are involved, the lack of propagation results in incorrect ranges, as the back edges for loop carried values are not properly joined with the definitions from the body region. For example, an scf.while loop whose body region produces a value that is in a dataflow relationship with some floating-point values through an arith.cmpf operation: func.func @<!-- -->test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) {
%c4 = arith.constant 4 : index
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%3 = arith.cmpf ugt, %arg0, %arg1 : f32
%1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) {
%2 = arith.cmpi ult, %arg2, %c4 : index
scf.condition(%2) %arg2, %arg3 : index, index
} do {
^bb0(%arg2: index, %arg3: index):
%4 = arith.select %3, %arg3, %arg3 : index
%5 = arith.addi %arg2, %c1 : index
scf.yield %5, %4 : index, index
}
return %1#<!-- -->0, %1#<!-- -->1 : index, index
} The existing behavior results in the control condition %2 being optimized to true, turning the while loop into an infinite loop. The update to %arg2 through the body region is never factored into the range calculation, as the ranges for the body ops all test as uninitialized. This change causes all values initialized with setToEntryState to be set to some initialized range, even if the values are not integers. Full diff: https://github.com/llvm/llvm-project/pull/93199.diff 2 Files Affected:
diff --git a/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp b/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
index a82c30717e275..b69b2e0416209 100644
--- a/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
+++ b/mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
@@ -38,8 +38,6 @@ using namespace mlir::dataflow;
IntegerValueRange IntegerValueRange::getMaxRange(Value value) {
unsigned width = ConstantIntRanges::getStorageBitwidth(value.getType());
- if (width == 0)
- return {};
APInt umin = APInt::getMinValue(width);
APInt umax = APInt::getMaxValue(width);
APInt smin = width != 0 ? APInt::getSignedMinValue(width) : umin;
diff --git a/mlir/test/Dialect/Arith/int-range-interface.mlir b/mlir/test/Dialect/Arith/int-range-interface.mlir
index 5b538197a0c11..fdeb8a2e6c935 100644
--- a/mlir/test/Dialect/Arith/int-range-interface.mlir
+++ b/mlir/test/Dialect/Arith/int-range-interface.mlir
@@ -899,3 +899,21 @@ func.func @test_shl_i8_nowrap() -> i8 {
%2 = test.reflect_bounds %1 : i8
return %2: i8
}
+
+/// A test case to ensure that the ranges for unsupported ops are initialized
+/// properly to maxRange, rather than left uninitialized.
+/// In this test case, the previous behavior would leave the ranges for %a and
+/// %b uninitialized, resulting in arith.cmpf's range not being updated, even
+/// though it has an integer valued result.
+
+// CHECK-LABEL: func @test_cmpf_propagates
+// CHECK: test.reflect_bounds {smax = 2 : index, smin = 1 : index, umax = 2 : index, umin = 1 : index}
+func.func @test_cmpf_propagates(%a: f32, %b: f32) -> index {
+ %c1 = arith.constant 1 : index
+ %c2 = arith.constant 2 : index
+
+ %0 = arith.cmpf ueq, %a, %b : f32
+ %1 = arith.select %0, %c1, %c2 : index
+ %2 = test.reflect_bounds %1 : index
+ func.return %2 : index
+}
|
I agree that the control flow issue you have laid out is a problem. But to me it seems like assigning "fake" ranges to non-Integer values might lead to other problems down the line. Maybe we should think about this check? Why don't we do |
Re that check, I think the point was to prevent having to create fake constant ranges for non-integer inputs. I think the right solution would be to change the interface from Or we could fake things by, for example, declaring non-integers to be zero-width, for instance. |
@krzysz00, yes That check was added explicitly to prevent a crash relating to faking the integer ranges. That problem seems to no longer occur.
@ubfx I agree my proposed solution is kind of odd. As @krzysz00 mentioned in his comment, I suspect removing that check is the right approach, but it would require changes to the |
Ok, then I'm +1 on changing the interface |
✅ With the latest revision this PR passed the C/C++ code formatter. |
2f5a403
to
8713785
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not going to LGTM this so late at night, but this seems like a reasonable fix.
(Having an "intrusive" None value is the other option we could think of here ,but I can't see a way to do that to the range type, so ... yeah, optional is what it'll probably be)
This might need an announcement somewhere?
@krzysz00 Another option I was considering was to rework the interfaces to take |
f672570
to
5e23a7f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to keep the discussion going, I remember keeping ConstantIntRanges
and the lattices separate back in the day so as to make the integer range inference usable for cases other than that one dataflow analysis. But that's not a super hard goal and so this might be a reasonable solution.
Re std::function vs function_ref, I don't have a strong opinion there - all I can say is that the intent was to pass in a transient pointer to a callback. Seems reasonable to make that change.
But also, wrt the interface and backwards-compatibily, how about
- Provide a new method, possibly with the same name (or if not, something like
inferResultRangesFromOptional
) that takesArrayRef<std::optional<ConstantIntRanges>>
instead ofArrayRef<ConstantIntRanges>
- Implement the optional version in terms of the non-optional version by default: that is, if someone doesn't implement
inferResultRanges(ArrayRef<std::optional<ConstantIntRanges>>, SetResultFn)
, it'll loop through the inputs and, if they're all notnone
, callinferResultRanges()
on the unwrapped values. This gives everyone an upgrade path.
@krzysz00 I've applied the suggested changes. Extending the interface resulted in a much cleaner design and I was able to revert most of the surgery done to the dialects using the
I was not able to use the same name overloaded on different input types. This seems to be a limitation of the interface system. |
Integer range analysis will not update the range of an operation when any of the inferred input lattices are uninitialized. In the current behavior, all lattice values for non integer types are uninitialized. For operations like arith.cmpf ```mlir %3 = arith.cmpf ugt, %arg0, %arg1 : f32 ``` that will result in the range of the output also being uninitialized, and so on for any consumer of the arith.cmpf result. When control-flow ops are involved, the lack of propagation results in incorrect ranges, as the back edges for loop carried values are not properly joined with the definitions from the body region. For example, an scf.while loop whose body region produces a value that is in a dataflow relationship with some floating-point values through an arith.cmpf operation: ```mlir func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) { %c4 = arith.constant 4 : index %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index %3 = arith.cmpf ugt, %arg0, %arg1 : f32 %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) { %2 = arith.cmpi ult, %arg2, %c4 : index scf.condition(%2) %arg2, %arg3 : index, index } do { ^bb0(%arg2: index, %arg3: index): %4 = arith.select %3, %arg3, %arg3 : index %5 = arith.addi %arg2, %c1 : index scf.yield %5, %4 : index, index } return %1#0, %1#1 : index, index } ``` The existing behavior results in the control condition %2 being optimized to true, turning the while loop into an infinite loop. The update to %arg2 through the body region is never factored into the range calculation, as the ranges for the body ops all test as uninitialized. This change causes all values initialized with setToEntryState to be set to some initialized range, even if the values are not integers.
Modify the integer range analysis interfaces to handle uninitialized values by allowing the inferred input ranges to be optional.
IntegerValueRange already exists and encodes the extact information that we want to represent with OptionalIntRange. This makes the APIs clearer than passing an std::optional everywhere.
This new method allows downstream implementers to easily opt into the old behavior while providing an easy way to transition to the more powerful interface methods.
10ad623
to
ef2c812
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad we finally managed to find a good approach here! Thanks for putting up with the endless rounds of revisions.
Approved!
…#93199) Integer range analysis will not update the range of an operation when any of the inferred input lattices are uninitialized. In the current behavior, all lattice values for non integer types are uninitialized. For operations like arith.cmpf ```mlir %3 = arith.cmpf ugt, %arg0, %arg1 : f32 ``` that will result in the range of the output also being uninitialized, and so on for any consumer of the arith.cmpf result. When control-flow ops are involved, the lack of propagation results in incorrect ranges, as the back edges for loop carried values are not properly joined with the definitions from the body region. For example, an scf.while loop whose body region produces a value that is in a dataflow relationship with some floating-point values through an arith.cmpf operation: ```mlir func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) { %c4 = arith.constant 4 : index %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index %3 = arith.cmpf ugt, %arg0, %arg1 : f32 %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) { %2 = arith.cmpi ult, %arg2, %c4 : index scf.condition(%2) %arg2, %arg3 : index, index } do { ^bb0(%arg2: index, %arg3: index): %4 = arith.select %3, %arg3, %arg3 : index %5 = arith.addi %arg2, %c1 : index scf.yield %5, %4 : index, index } return %1#0, %1#1 : index, index } ``` The existing behavior results in the control condition %2 being optimized to true, turning the while loop into an infinite loop. The update to %arg2 through the body region is never factored into the range calculation, as the ranges for the body ops all test as uninitialized. This change causes all values initialized with setToEntryState to be set to some initialized range, even if the values are not integers. --------- Co-authored-by: Spenser Bauman <sabauma@fastmail>
Integer range analysis will not update the range of an operation when any of the inferred input lattices are uninitialized. In the current behavior, all lattice values for non integer types are uninitialized.
For operations like arith.cmpf
that will result in the range of the output also being uninitialized, and so on for any consumer of the arith.cmpf result. When control-flow ops are involved, the lack of propagation results in incorrect ranges, as the back edges for loop carried values are not properly joined with the definitions from the body region.
For example, an scf.while loop whose body region produces a value that is in a dataflow relationship with some floating-point values through an arith.cmpf operation:
The existing behavior results in the control condition %2 being optimized to true, turning the while loop into an infinite loop. The update to %arg2 through the body region is never factored into the range calculation, as the ranges for the body ops all test as uninitialized.
This change causes all values initialized with setToEntryState to be set to some initialized range, even if the values are not integers.