[mlir][tensor] Remove assertion in ExpandShapeOp::build #91361

Merged · 1 commit merged into llvm:main on May 7, 2024

Conversation

@bjacob (Contributor) commented May 7, 2024

Unblocking a downstream integrate where an expected-to-fail test was expecting this to be a runtime verifier error, not a compiler crash: llvm/torch-mlir#3279.

@llvmbot (Member) commented May 7, 2024

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-tensor

Author: Benoit Jacob (bjacob)

Full diff: https://github.com/llvm/llvm-project/pull/91361.diff

1 file affected:

  • (modified) mlir/lib/Dialect/Tensor/IR/TensorOps.cpp (+6-4)
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index 4c65045084dc..7a13f7a7d135 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -1676,10 +1676,12 @@ void ExpandShapeOp::build(OpBuilder &builder, OperationState &result,
   auto tensorResultTy = cast<RankedTensorType>(resultType);
   FailureOr<SmallVector<OpFoldResult>> outputShape = inferOutputShape(
       builder, result.location, tensorResultTy, reassociation, inputShape);
-  // Failure of this assertion usually indicates presence of multiple
-  // dynamic dimensions in the same reassociation group.
-  assert(succeeded(outputShape) && "unable to infer output shape");
-  build(builder, result, tensorResultTy, src, reassociation, *outputShape);
+  SmallVector<OpFoldResult> outputShapeOrEmpty;
+  if (succeeded(outputShape)) {
+    outputShapeOrEmpty = *outputShape;
+  }
+  build(builder, result, tensorResultTy, src, reassociation,
+        outputShapeOrEmpty);
 }
 
 SmallVector<AffineMap, 4> CollapseShapeOp::getReassociationMaps() {

@hanhanW changed the title from "Remove assertion in ExpandShapeOp::build" to "[mlir][tensor] Remove assertion in ExpandShapeOp::build" on May 7, 2024
@jpienaar (Member) left a comment

Would it be possible to add a test?

@bjacob (Contributor, Author) commented May 7, 2024

As a follow-up, please :-) I need to unblock the downstreams.

Also, the test would ideally check for a helpful error message, which runs into a problem I had while writing this PR: I don't know whether it's safe to call emitOpError before the end of build(). I didn't want to take any chances while unblocking the integrates.

So +1 to a follow-up that figures out how to generate a helpful error and tests for it.
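
For reference, a minimal sketch of one way such a follow-up could report the failure from inside build(): the op does not exist yet at that point, so emitOpError is unavailable, but a diagnostic can be attached to the location already carried by OperationState. The names below match the snippet from TensorOps.cpp in the diff above; the error message text is hypothetical.

  // Hypothetical follow-up sketch, not part of this PR: report a diagnostic
  // at result.location instead of asserting. mlir::emitError only needs a
  // Location, so it is safe to call before the op has been created.
  FailureOr<SmallVector<OpFoldResult>> outputShape = inferOutputShape(
      builder, result.location, tensorResultTy, reassociation, inputShape);
  if (failed(outputShape)) {
    mlir::emitError(result.location)
        << "tensor.expand_shape: unable to infer output shape; reassociation "
           "groups with more than one dynamic dim need an explicit output_shape";
  }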

@hanhanW (Contributor) commented May 7, 2024

I think we can capture the invalid op in the op verifier. I'd suggest that we update the code below as well, so it also looks at expand_shape shape inference. Then we can add an invalid-op test to https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Tensor/invalid.mlir. What do you think?

static LogicalResult verifyTensorReshapeOp(TensorReshapeOp op,
                                           RankedTensorType expandedType,
                                           RankedTensorType collapsedType) {
  if (failed(
          verifyReshapeLikeTypes(op, expandedType, collapsedType, isExpansion)))
    return failure();

  auto maps = op.getReassociationMaps();
  RankedTensorType expectedType =
      CollapseShapeOp::inferCollapsedType(expandedType, maps);
  if (!isSameTypeWithoutEncoding(collapsedType, expectedType))
    return op.emitOpError("expected collapsed type to be ")
           << expectedType << ", but got " << collapsedType;
  return success();
}

LogicalResult ExpandShapeOp::verify() {
  auto srcType = getSrcType();
  auto resultType = getResultType();

  if ((int64_t)getStaticOutputShape().size() != resultType.getRank())
    return emitOpError("expected number of static shape dims to be equal to "
                       "the output rank (")
           << resultType.getRank() << ") but found "
           << getStaticOutputShape().size() << " inputs instead";

  if ((int64_t)getOutputShape().size() !=
      llvm::count(getStaticOutputShape(), ShapedType::kDynamic))
    return emitOpError("mismatch in dynamic dims in output_shape and "
                       "static_output_shape: static_output_shape has ")
           << llvm::count(getStaticOutputShape(), ShapedType::kDynamic)
           << " dynamic dims while output_shape has " << getOutputShape().size()
           << " values";

  return verifyTensorReshapeOp(*this, resultType, srcType);
}

@bjacob (Contributor, Author) commented May 7, 2024

Oh... OK, I can give that a try.

@bjacob (Contributor, Author) commented May 7, 2024

Actually, there is another problem: this condition in the C++ builder code is impossible to exercise from MLIR source in the tensor dialect, as omitting output_shape is a parsing error. It can only be exercised from C++. The bad caller here is a C++ caller, in TosaToTensor.cpp. So if we added an MLIR test, it would have to be in the TOSA dialect.

@hanhanW (Contributor) commented May 7, 2024

Sorry that my comment was not clear. It is true that it is hard to test C++ builder code. My point is that the build methods and the verify method should be aligned in some sense. It is okay for build to generate an invalid op, but the verifier should signal it. In the case we've seen in the torch-mlir repo, it is generating something like:

tensor.expand_shape %arg0 [[0, 1]] output_shape [%sz0, %sz1] : tensor<?xf32> into tensor<?x?xf32>

So I'd suggest adding the invalid op to invalid.mlir and signaling an error from the verifier. I'm concerned that an invalid op could be generated silently and never caught. With the test, at least we inform users that such an invalid op was generated.

@Shukla-Gaurav (Contributor) commented May 7, 2024

(Quoting @hanhanW's comment above.)

Actually, this IR is not invalid anymore (#90040). This indicates that we are calling the wrong build method (which is why the assertion fired). We need to raise some sort of signal in place of the assertion (maybe in a follow-up PR).
Also, the tosa.reshape lowering needs to be fixed (to call the right build method by passing output_shape) for such cases.
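
For illustration only, a rough sketch (not from this PR) of what such a caller-side fix could look like: pass the per-dimension output sizes explicitly so the ExpandShapeOp::build overload that takes an output shape is used and the inference guarded by the removed assertion never runs. loc, rewriter, resultType, src, sz0, and sz1 are hypothetical names assumed to be in scope in the lowering pattern.

  // Hypothetical caller-side sketch: expanding tensor<?xf32> into
  // tensor<?x?xf32> with one reassociation group and two dynamic sizes.
  SmallVector<ReassociationIndices> reassociation = {{0, 1}};
  // OpFoldResult accepts Values, so dynamic sizes can be passed directly.
  SmallVector<OpFoldResult> outputShape = {sz0, sz1};
  auto expanded = rewriter.create<tensor::ExpandShapeOp>(
      loc, resultType, src, reassociation, outputShape);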

@bjacob (Contributor, Author) commented May 7, 2024

Yeah, IR like this would go to the ExpandShapeOp::build overload that takes an outputShape.

To exercise the error condition we are talking about here, which is in the ExpandShapeOp::build overload NOT taking an outputShape, I think the only way to hit it is from C++. I don't see it being hit from any kind of tensor-dialect MLIR source, since there output_shape is mandatory in the syntax, and so it would of course call the C++ overload with the outputShape parameter.

@hanhanW (Contributor) left a comment

I see... thanks for filling me in on the context.

@bjacob merged commit 62bed56 into llvm:main on May 7, 2024