[mlir][vector] Relax the requirements on broadcast dims #99341

Merged: 3 commits, Oct 4, 2024
24 changes: 12 additions & 12 deletions mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
@@ -1290,12 +1290,12 @@ def Vector_TransferReadOp :
specifies if the transfer is guaranteed to be within the source bounds. If
set to "false", accesses (including the starting point) may run
out-of-bounds along the respective vector dimension as the index increases.
Non-vector and broadcast dimensions *must* always be in-bounds. The
`in_bounds` array length has to be equal to the vector rank. This attribute
has a default value: `false` (i.e. "out-of-bounds"). When skipped in the
textual IR, the default value is assumed. Similarly, the OP printer will
omit this attribute when all dimensions are out-of-bounds (i.e. the default
value is used).
Non-vector dimensions *must* always be in-bounds. The `in_bounds` array
length has to be equal to the vector rank. This attribute has a default
value: `false` (i.e. "out-of-bounds"). When skipped in the textual IR, the
default value is assumed. Similarly, the OP printer will omit this
attribute when all dimensions are out-of-bounds (i.e. the default value is
used).

A `vector.transfer_read` can be lowered to a simple load if all dimensions
are specified to be within bounds and no `mask` was specified.
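
Illustrative example (not part of this diff; names and shapes are assumed): with the relaxed rule, the broadcast result in the permutation map below no longer has to be marked in-bounds, so the following is now valid.

// Result 0 of the permutation map is the constant 0, i.e. a broadcast
// dimension; marking it "false" (or omitting in_bounds entirely) is accepted.
%f0 = arith.constant 0.0 : f32
%v = vector.transfer_read %A[%i, %j], %f0
    {in_bounds = [false, true],
     permutation_map = affine_map<(d0, d1) -> (0, d1)>}
    : memref<?x?xf32>, vector<4x8xf32>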
@@ -1535,12 +1535,12 @@ def Vector_TransferWriteOp :
specifies if the transfer is guaranteed to be within the source bounds. If
set to "false", accesses (including the starting point) may run
out-of-bounds along the respective vector dimension as the index increases.
Non-vector and broadcast dimensions *must* always be in-bounds. The
`in_bounds` array length has to be equal to the vector rank. This attribute
has a default value: `false` (i.e. "out-of-bounds"). When skipped in the
textual IR, the default value is assumed. Similarly, the OP printer will
omit this attribute when all dimensions are out-of-bounds (i.e. the default
value is used).
Non-vector dimensions *must* always be in-bounds. The `in_bounds` array
length has to be equal to the vector rank. This attribute has a default
value: `false` (i.e. "out-of-bounds"). When skipped in the textual IR, the
default value is assumed. Similarly, the OP printer will omit this
attribute when all dimensions are out-of-bounds (i.e. the default value is
used).

A `vector.transfer_write` can be lowered to a simple store if all
dimensions are specified to be within bounds and no `mask` was specified.
7 changes: 2 additions & 5 deletions mlir/include/mlir/Interfaces/VectorInterfaces.td
@@ -234,12 +234,9 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
return constExpr && constExpr.getValue() == 0;
}

/// Return "true" if the vector transfer dimension `dim` is in-bounds. Also
/// return "true" if the dimension is a broadcast dimension. Return "false"
/// otherwise.
/// Return "true" if the vector transfer dimension `dim` is in-bounds.
/// Return "false" otherwise.
bool isDimInBounds(unsigned dim) {
if ($_op.isBroadcastDim(dim))
return true;
auto inBounds = $_op.getInBounds();
return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();
}
13 changes: 1 addition & 12 deletions mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -1223,19 +1223,8 @@ static Operation *vectorizeAffineLoad(AffineLoadOp loadOp,
LLVM_DEBUG(dbgs() << "\n[early-vect]+++++ permutationMap: ");
LLVM_DEBUG(permutationMap.print(dbgs()));

// Make sure that the in_bounds attribute corresponding to a broadcast dim
// is set to `true` - that's required by the xfer Op.
// FIXME: We're not veryfying whether the corresponding access is in bounds.
// TODO: Use masking instead.
SmallVector<unsigned> broadcastedDims = permutationMap.getBroadcastDims();
SmallVector<bool> inBounds(vectorType.getRank(), false);

for (auto idx : broadcastedDims)
inBounds[idx] = true;

auto transfer = state.builder.create<vector::TransferReadOp>(
loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap,
inBounds);
loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap);

// Register replacement for future uses in the scope.
state.registerOpVectorReplacement(loadOp, transfer);
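
Illustrative IR (not part of this diff; operands assumed): the affine super-vectorizer now emits the broadcast read without forcing in_bounds, matching the updated vectorize_1d.mlir checks further down.

// Scalar load broadcast into a 1-D vector; in_bounds now simply takes its
// default value (false) for the broadcast dimension.
%a = vector.transfer_read %A[%c0, %c0], %cst
    {permutation_map = affine_map<(d0, d1) -> (0)>}
    : memref<?x?xf32>, vector<128xf32>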
11 changes: 1 addition & 10 deletions mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1380,17 +1380,8 @@ vectorizeAsLinalgGeneric(RewriterBase &rewriter, VectorizationState &state,

SmallVector<Value> indices(linalgOp.getShape(opOperand).size(), zero);

// Make sure that the in_bounds attribute corresponding to a broadcast dim
// is `true`
SmallVector<unsigned> broadcastedDims = readMap.getBroadcastDims();
SmallVector<bool> inBounds(readType.getRank(), false);

for (auto idx : broadcastedDims)
inBounds[idx] = true;

Operation *read = rewriter.create<vector::TransferReadOp>(
loc, readType, opOperand->get(), indices, readMap,
ArrayRef<bool>(inBounds));
loc, readType, opOperand->get(), indices, readMap);
read = state.maskOperation(rewriter, read, linalgOp, indexingMap);
Value readValue = read->getResult(0);

37 changes: 27 additions & 10 deletions mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -3947,10 +3947,6 @@ verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,
"as permutation_map results: ")
<< AffineMapAttr::get(permutationMap)
<< " vs inBounds of size: " << inBounds.size();
for (unsigned int i = 0, e = permutationMap.getNumResults(); i < e; ++i)
if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
!llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
return op->emitOpError("requires broadcast dimensions to be in-bounds");

return success();
}
@@ -4138,22 +4134,43 @@ static LogicalResult foldTransferInBoundsAttribute(TransferOp op) {
bool changed = false;
SmallVector<bool, 4> newInBounds;
newInBounds.reserve(op.getTransferRank());
// Idxs of non-bcast dims - used when analysing bcast dims.
SmallVector<unsigned> nonBcastDims;

// 1. Process non-broadcast dims
for (unsigned i = 0; i < op.getTransferRank(); ++i) {
// Already marked as in-bounds, nothing to see here.
// 1.1. Already marked as in-bounds, nothing to see here.
if (op.isDimInBounds(i)) {
newInBounds.push_back(true);
continue;
}
// Currently out-of-bounds, check whether we can statically determine it is
// inBounds.
// 1.2. Currently out-of-bounds, check whether we can statically determine
// it is inBounds.
bool inBounds = false;
auto dimExpr = dyn_cast<AffineDimExpr>(permutationMap.getResult(i));
assert(dimExpr && "Broadcast dims must be in-bounds");
auto inBounds =
isInBounds(op, /*resultIdx=*/i, /*indicesIdx=*/dimExpr.getPosition());
if (dimExpr) {
inBounds = isInBounds(op, /*resultIdx=*/i,
/*indicesIdx=*/dimExpr.getPosition());
nonBcastDims.push_back(i);
}

newInBounds.push_back(inBounds);
// We commit the pattern if it is "more inbounds".
changed |= inBounds;
}

// 2. Handle broadcast dims
// If all non-broadcast dims are "in bounds", then all bcast dims should be
// "in bounds" as well.
bool allNonBcastDimsInBounds = llvm::all_of(
nonBcastDims, [&newInBounds](unsigned idx) { return newInBounds[idx]; });
if (allNonBcastDimsInBounds) {
for (size_t idx : permutationMap.getBroadcastDims()) {
changed |= !newInBounds[idx];
newInBounds[idx] = true;
}
}

if (!changed)
return failure();
// OpBuilder is only used as a helper to build an I64ArrayAttr.
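
Illustrative folding example (not part of this diff; shapes assumed): once every non-broadcast dimension is statically proven in-bounds, the pattern can now flip the broadcast dimension as well.

// Input: no in_bounds attribute, i.e. all dimensions default to out-of-bounds.
%v = vector.transfer_read %A[%c0, %c0], %f0
    {permutation_map = affine_map<(d0, d1) -> (0, d1)>}
    : memref<16x16xf32>, vector<4x8xf32>
// Result 1 maps to d1 and is statically in-bounds (0 + 8 <= 16), so the
// broadcast result 0 is now marked in-bounds too: in_bounds = [true, true].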
6 changes: 3 additions & 3 deletions mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
@@ -133,7 +133,7 @@ func.func @materialize_read(%M: index, %N: index, %O: index, %P: index) {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 5 {
%f = vector.transfer_read %A[%i0, %i1, %i2, %i3], %f0 {in_bounds = [false, true, false], permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, 0, d0)>} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
%f = vector.transfer_read %A[%i0, %i1, %i2, %i3], %f0 {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, 0, d0)>} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
// Add a dummy use to prevent dead code elimination from removing
// transfer read ops.
"dummy_use"(%f) : (vector<5x4x3xf32>) -> ()
@@ -507,7 +507,7 @@ func.func @transfer_read_with_tensor(%arg: tensor<f32>) -> vector<1xf32> {
// CHECK-NEXT: %[[RESULT:.*]] = vector.broadcast %[[EXTRACTED]] : f32 to vector<1xf32>
// CHECK-NEXT: return %[[RESULT]] : vector<1xf32>
%f0 = arith.constant 0.0 : f32
%0 = vector.transfer_read %arg[], %f0 {in_bounds = [true], permutation_map = affine_map<()->(0)>} :
%0 = vector.transfer_read %arg[], %f0 {permutation_map = affine_map<()->(0)>} :
tensor<f32>, vector<1xf32>
return %0: vector<1xf32>
}
@@ -746,7 +746,7 @@ func.func @cannot_lower_transfer_read_with_leading_scalable(%arg0: memref<?x4xf3
func.func @does_not_crash_on_unpack_one_dim(%subview: memref<1x1x1x1xi32>, %mask: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0 = arith.constant 0 : index
%c0_i32 = arith.constant 0 : i32
%3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {in_bounds = [false, true, true, false], permutation_map = #map1}
%3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {permutation_map = #map1}
: memref<1x1x1x1xi32>, vector<1x1x1x1xi32>
return %3 : vector<1x1x1x1xi32>
}
6 changes: 3 additions & 3 deletions mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
@@ -22,7 +22,7 @@ func.func @vec1d_1(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%[[C0]])
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%[[C0]])
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true], permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i0 = 0 to %M { // vectorized due to scalar -> vector
%a0 = affine.load %A[%c0, %c0] : memref<?x?xf32>
}
@@ -425,7 +425,7 @@ func.func @vec_rejected_8(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK: %{{.*}} = arith.constant 0.0{{.*}}: f32
// CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true], permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
// CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %{{.*}} in DFS post-order prevents vectorizing %{{.*}}
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = affine.load %A[%c0, %c0] : memref<?x?xf32>
@@ -459,7 +459,7 @@ func.func @vec_rejected_9(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true], permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %i18 in DFS post-order prevents vectorizing %{{.*}}
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = affine.load %A[%c0, %c0] : memref<?x?xf32>
4 changes: 2 additions & 2 deletions mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
@@ -123,8 +123,8 @@ func.func @vectorize_matmul(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg
// VECT: affine.for %[[I2:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[M]]) step 4 {
// VECT-NEXT: affine.for %[[I3:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[N]]) step 8 {
// VECT-NEXT: affine.for %[[I4:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[K]]) {
// VECT: %[[A:.*]] = vector.transfer_read %{{.*}}[%[[I4]], %[[I3]]], %{{.*}} {in_bounds = [true, false], permutation_map = #[[$map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT: %[[B:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I4]]], %{{.*}} {in_bounds = [false, true], permutation_map = #[[$map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT: %[[A:.*]] = vector.transfer_read %{{.*}}[%[[I4]], %[[I3]]], %{{.*}} {permutation_map = #[[$map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT: %[[B:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I4]]], %{{.*}} {permutation_map = #[[$map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[C:.*]] = arith.mulf %[[B]], %[[A]] : vector<4x8xf32>
// VECT: %[[D:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I3]]], %{{.*}} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[E:.*]] = arith.addf %[[D]], %[[C]] : vector<4x8xf32>
@@ -141,7 +141,7 @@ func.func @affine_map_with_expr_2(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24
// CHECK-NEXT: %[[S1:.*]] = affine.apply #[[$MAP_ID4]](%[[ARG3]], %[[ARG4]], %[[I0]])
// CHECK-NEXT: %[[S2:.*]] = affine.apply #[[$MAP_ID5]](%[[ARG3]], %[[ARG4]], %[[I0]])
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.000000e+00 : f32
// CHECK-NEXT: %[[S3:.*]] = vector.transfer_read %[[ARG0]][%[[S0]], %[[S1]], %[[S2]]], %[[CST]] {in_bounds = [true], permutation_map = #[[$MAP_ID6]]} : memref<8x12x16xf32>, vector<8xf32>
// CHECK-NEXT: %[[S3:.*]] = vector.transfer_read %[[ARG0]][%[[S0]], %[[S1]], %[[S2]]], %[[CST]] {permutation_map = #[[$MAP_ID6]]} : memref<8x12x16xf32>, vector<8xf32>
// CHECK-NEXT: vector.transfer_write %[[S3]], %[[ARG1]][%[[ARG3]], %[[ARG4]], %[[ARG5]]] : vector<8xf32>, memref<8x24x48xf32>
// CHECK-NEXT: }
// CHECK-NEXT: }
2 changes: 1 addition & 1 deletion mlir/test/Dialect/Linalg/hoisting.mlir
@@ -200,7 +200,7 @@ func.func @hoist_vector_transfer_pairs_in_affine_loops(%memref0: memref<64x64xi3
affine.for %arg3 = 0 to 64 {
affine.for %arg4 = 0 to 64 step 16 {
affine.for %arg5 = 0 to 64 {
%0 = vector.transfer_read %memref0[%arg3, %arg5], %c0_i32 {in_bounds = [true], permutation_map = affine_map<(d0, d1) -> (0)>} : memref<64x64xi32>, vector<16xi32>
%0 = vector.transfer_read %memref0[%arg3, %arg5], %c0_i32 {permutation_map = affine_map<(d0, d1) -> (0)>} : memref<64x64xi32>, vector<16xi32>
%1 = vector.transfer_read %memref1[%arg5, %arg4], %c0_i32 : memref<64x64xi32>, vector<16xi32>
%2 = vector.transfer_read %memref2[%arg3, %arg4], %c0_i32 : memref<64x64xi32>, vector<16xi32>
%3 = arith.muli %0, %1 : vector<16xi32>
2 changes: 1 addition & 1 deletion mlir/test/Dialect/Linalg/vectorization.mlir
@@ -130,7 +130,7 @@ func.func @vectorize_dynamic_1d_broadcast(%arg0: tensor<?xf32>,
// CHECK-LABEL: @vectorize_dynamic_1d_broadcast
// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {in_bounds = {{.*}}, permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
28 changes: 18 additions & 10 deletions mlir/test/Dialect/Vector/invalid.mlir
@@ -454,6 +454,15 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
%0 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(1)>} : memref<?x?xf32>, vector<128xf32>
}
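
For contrast (illustrative, not part of this diff): each result of a projected permutation map must be a single dim or the constant 0, so a broadcast result is accepted while the constant 1 used above is rejected.

// Map shapes rejected by the same check:
//   affine_map<(d0, d1) -> (1)>        non-zero constant (as in the test above)
//   affine_map<(d0, d1) -> (d0 + d1)>  compound expression
// A broadcast (constant 0) result is fine:
%ok = vector.transfer_read %arg0[%c3, %c3], %cst
    {permutation_map = affine_map<(d0, d1) -> (0)>}
    : memref<?x?xf32>, vector<128xf32>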

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
Expand Down Expand Up @@ -505,16 +514,6 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<2x3xf32>
// expected-error@+1 {{requires broadcast dimensions to be in-bounds}}
%0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {in_bounds = [false, true], permutation_map = affine_map<(d0, d1)->(0, d1)>} : memref<?x?xvector<2x3xf32>>, vector<1x1x2x3xf32>
}

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.0 : f32
@@ -618,6 +617,15 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {

// -----

func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(1)>} : vector<128xf32>, memref<?x?xf32>
}

// -----

func.func @test_vector.transfer_write(%arg0: memref<?x?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<3 x 7 x f32>
2 changes: 1 addition & 1 deletion mlir/test/Dialect/Vector/ops.mlir
@@ -70,7 +70,7 @@ func.func @vector_transfer_ops(%arg0: memref<?x?xf32>,
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}}, %{{.*}} : memref<?x?xf32>, vector<5xf32>
%8 = vector.transfer_read %arg0[%c3, %c3], %f0, %m : memref<?x?xf32>, vector<5xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]], %[[C3]]], %{{.*}}, %{{.*}} : memref<?x?x?xf32>, vector<5x4x8xf32>
%9 = vector.transfer_read %arg4[%c3, %c3, %c3], %f0, %m2 {in_bounds = [false, false, true], permutation_map = affine_map<(d0, d1, d2)->(d1, d0, 0)>} : memref<?x?x?xf32>, vector<5x4x8xf32>
%9 = vector.transfer_read %arg4[%c3, %c3, %c3], %f0, %m2 {permutation_map = affine_map<(d0, d1, d2)->(d1, d0, 0)>} : memref<?x?x?xf32>, vector<5x4x8xf32>

// CHECK: vector.transfer_write
vector.transfer_write %0, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0)>} : vector<128xf32>, memref<?x?xf32>
@@ -327,13 +327,11 @@ func.func @masked_permutation_xfer_read_fixed_width(
%c0 = arith.constant 0 : index
%3 = vector.mask %mask {
vector.transfer_read %dest[%c0, %c0], %cst {
in_bounds = [false, true, false],
permutation_map = affine_map<(d0, d1) -> (d1, 0, d0)>
} : tensor<?x1xf32>, vector<1x4x4xf32>
} : vector<4x1xi1> -> vector<1x4x4xf32>

"test.some_use"(%3) : (vector<1x4x4xf32>) -> ()

return
}

Expand Down