[MLIR][Bufferization] Retire enforce-aliasing-invariants #130929

Merged

Conversation

@chelini (Contributor) commented Mar 12, 2025

Why? This option can lead to incorrect IR if used in isolation. For example, consider the IR below:

```mlir
func.func @loop_with_aliasing(%arg0: tensor<5xf32>, %arg1: index, %arg2: index) -> tensor<5xf32> {
  %c1 = arith.constant 1 : index
  %cst = arith.constant 1.000000e+00 : f32
  %0 = tensor.empty() : tensor<5xf32>
  %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<5xf32>) -> tensor<5xf32>
  // The BufferizableOpInterface says that %2 must either alias with %arg0 or
  // be a newly allocated buffer.
  %2 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (tensor<5xf32>) {
    scf.yield %1 : tensor<5xf32>
  }
  %cst_0 = arith.constant 1.000000e+00 : f32
  %inserted = tensor.insert %cst_0 into %1[%c1] : tensor<5xf32>
  return %2 : tensor<5xf32>
}
```

If we bufferize with `enforce-aliasing-invariants=false`, we get:

```mlir
func.func @loop_with_aliasing(%arg0: memref<5xf32, strided<[?], offset: ?>>, %arg1: index, %arg2: index) -> memref<5xf32, strided<[?], offset: ?>> {
  %c1 = arith.constant 1 : index
  %cst = arith.constant 1.000000e+00 : f32
  %alloc = memref.alloc() {alignment = 64 : i64} : memref<5xf32>
  linalg.fill ins(%cst : f32) outs(%alloc : memref<5xf32>)
  %0 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (memref<5xf32, strided<[?], offset: ?>>) {
    %cast = memref.cast %alloc : memref<5xf32> to memref<5xf32, strided<[?], offset: ?>>
    scf.yield %cast : memref<5xf32, strided<[?], offset: ?>>
  }
  %cst_0 = arith.constant 1.000000e+00 : f32
  memref.store %cst_0, %alloc[%c1] : memref<5xf32>
  return %0 : memref<5xf32, strided<[?], offset: ?>>
}
```

This IR is incorrect: the loop yields `%alloc` directly, so the `memref.store` after the loop also mutates the loop result, even though in tensor semantics the `tensor.insert` into `%1` must not affect `%2`.
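For contrast, a minimal sketch of what the bufferized loop looks like with the invariant enforced (this is not the literal pass output; the exact IR depends on the pass version, and buffer deallocation is left to a later pass). The body copies `%alloc` into a fresh buffer before yielding, so the yielded value no longer aliases `%alloc`:

```mlir
%0 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (memref<5xf32, strided<[?], offset: ?>>) {
  // Copy into a fresh buffer so the yielded value does not alias %alloc.
  %alloc_0 = memref.alloc() : memref<5xf32>
  memref.copy %alloc, %alloc_0 : memref<5xf32> to memref<5xf32>
  %cast = memref.cast %alloc_0 : memref<5xf32> to memref<5xf32, strided<[?], offset: ?>>
  scf.yield %cast : memref<5xf32, strided<[?], offset: ?>>
}
```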

I am using this option. What do I need to do now?

If you are using this option in isolation, you are possibly generating incorrect IR, so you need to revisit your bufferization strategy. If you are using it together with `copyBeforeWrite`, you simply need to stop setting the `enforceAliasingInvariants` option.
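For example, a minimal C++ migration sketch for a pipeline that configures One-Shot Bufferize programmatically (the function name `makeBufferizationOptions` is hypothetical; only the two option fields are taken from the upstream `BufferizationOptions`):

```cpp
#include "mlir/Dialect/Bufferization/Transforms/OneShotAnalysis.h"

using namespace mlir::bufferization;

// Hypothetical helper that builds the options for a downstream pipeline.
OneShotBufferizationOptions makeBufferizationOptions() {
  OneShotBufferizationOptions options;
  // Before this PR, downstream pipelines typically set both flags together:
  //   options.enforceAliasingInvariants = false;  // field removed by this PR
  //   options.copyBeforeWrite = true;
  // After this PR, copyBeforeWrite alone gives the same behavior:
  options.copyBeforeWrite = true;
  return options;
}
```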

Co-authored-by: Matthias Springer <[email protected]>
@llvmbot (Member) commented Mar 12, 2025

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-bufferization

@llvm/pr-subscribers-mlir-scf

Author: lorenzo chelini (chelini)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/130929.diff

2 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h (-10)
  • (modified) mlir/lib/Dialect/SCF/Transforms/BufferizableOpInterfaceImpl.cpp (+2-4)
```diff
diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
index ea7af3018bda8..ada9539e87121 100644
--- a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
@@ -315,16 +315,6 @@ struct BufferizationOptions {
   // outside of the parallel region will be given a new buffer.
   bool checkParallelRegions = true;
 
-  /// Certain ops have aliasing OpOperand/OpResult invariants (e.g., scf.for).
-  /// If this flag is set to `false`, those invariants are no longer enforced
-  /// with buffer copies.
-  ///
-  /// Note: Deactivating this flag can lead to incorrect bufferization results
-  /// when used incorrectly. This flag is useful with
-  /// `AlwaysCopyAnalysisState` which bufferizes all writing tensor
-  /// OpOperands out-of-place.
-  bool enforceAliasingInvariants = true;
-
   /// This function controls buffer types on function signatures. Sets
   /// `functionArgTypeConverterFn` and `inferFunctionResultLayout` accordingly.
   ///
diff --git a/mlir/lib/Dialect/SCF/Transforms/BufferizableOpInterfaceImpl.cpp b/mlir/lib/Dialect/SCF/Transforms/BufferizableOpInterfaceImpl.cpp
index e9d7dc1b847c6..74e36796d498e 100644
--- a/mlir/lib/Dialect/SCF/Transforms/BufferizableOpInterfaceImpl.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/BufferizableOpInterfaceImpl.cpp
@@ -652,8 +652,7 @@ struct ForOpInterface
     if (failed(bufferizableOp.resolveTensorOpOperandConflicts(rewriter, state)))
       return failure();
 
-    if (!state.getOptions().enforceAliasingInvariants ||
-        state.getOptions().copyBeforeWrite)
+    if (state.getOptions().copyBeforeWrite)
       return success();
 
     // According to the `getAliasing...` implementations, a bufferized OpResult
@@ -894,8 +893,7 @@ struct WhileOpInterface
     if (failed(bufferizableOp.resolveTensorOpOperandConflicts(rewriter, state)))
       return failure();
 
-    if (!state.getOptions().enforceAliasingInvariants ||
-        state.getOptions().copyBeforeWrite)
+    if (state.getOptions().copyBeforeWrite)
       return success();
 
     // According to the `getAliasing...` implementations, a bufferized OpResult
```

@chelini (Contributor, Author) commented Mar 12, 2025

@MaheshRavishankar and @lialan, can you check whether this change breaks the IREE pipeline?

@joker-eph (Collaborator)

LG, but @matthias-springer is the expert, I'll let him approve.

@MaheshRavishankar (Contributor) left a comment

The comment here is:

Note: Deactivating this flag can lead to incorrect bufferization results
when used incorrectly. This flag is useful with
`AlwaysCopyAnalysisState` which bufferizes all writing tensor
OpOperands out-of-place.

It is not clear what "used incorrectly" means. The existing default is for this to be true. I feel like this might uncover other bugs in bufferization that aren't being caught by upstream tests. There is definitely a correctness issue here, but this might be too big a hammer. Do we want to do a PSA beforehand? This can be a breaking change and I am afraid about the coverage upstream.

Also, what will happen if `copyBeforeWrite` is set to true?

@matthias-springer (Member) commented Mar 13, 2025

`enforce-aliasing-invariants=true` is the default case. `enforce-aliasing-invariants=false` can cause incorrect bufferization.

`enforce-aliasing-invariants` was added before `copy-before-write`. When I added `copy-before-write`, I should have removed `enforce-aliasing-invariants`; there is no reason to have both. If you look at the diff, you see that `enforce-aliasing-invariants` and `copy-before-write` are checked in the same place. (The expectation is that `copy-before-write` <=> `!enforce-aliasing-invariants`.)

IREE is probably the only project that uses `enforce-aliasing-invariants=false` (I doubt that anybody else even knows about this flag), and you use it together with `copy-before-write=true`, so removing this flag should be a no-op and should resolve the issue behind one of the reverts that you are carrying in your most recent LLVM integrate.
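For reference, the surviving behavior can also be requested directly on the pass from the command line; a hedged sketch (option spelling per the upstream `one-shot-bufferize` pass, worth verifying against your LLVM revision):

```
mlir-opt input.mlir --one-shot-bufferize="copy-before-write=true"
```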

@matthias-springer (Member)

I posted a PSA here: https://discourse.llvm.org/t/psa-removing-enforcealiasinginvariants-from-one-shot-bufferize/85178
Let's give it a few days, so people have a chance to respond.

@chelini (Contributor, Author) commented Mar 13, 2025

> It is not clear what "used incorrectly" means. [...] Do we want to do a PSA beforehand? This can be a breaking change and I am afraid about the coverage upstream. Also, what will happen if `copyBeforeWrite` is set to true?

I also ran the IREE unit tests and didn't encounter any failures. I agree with Matthias: the option is always used in tandem with copy-before-write, so we can safely remove it. Can you please run the IREE pipeline on this commit?

MaheshRavishankar added a commit to MaheshRavishankar/iree that referenced this pull request Mar 13, 2025
Signed-off-by: MaheshRavishankar <[email protected]>
@MaheshRavishankar (Contributor)

Running this change with IREE, I am seeing a huge regression on our benchmarks... I need to triage this more. Can I get some time before landing this, or is there a way to land it while using an option to keep the existing behavior?

@MaheshRavishankar (Contributor)

Actually, that was just a CI issue. Seems fine for IREE. I'll unblock, but you guys decide when it is safe to land.

@MaheshRavishankar dismissed their stale review March 15, 2025 17:06

Issue resolved

@matthias-springer merged commit 57dc713 into llvm:main Mar 18, 2025
15 checks passed
Labels
mlir:bufferization, mlir:scf, mlir