Commit 96ef623

[AArch64] Cast predicate operand of SVE gather loads/scatter stores to the parameter type of the intrinsic (NFC) (llvm#71289)
When emitting LLVM IR for gather loads/scatter stores, the predicate parameter is cast to a type that depends on the loaded or stored type, respectively. That is correct for operations where we have a predicate per lane, but it is not correct for quadword loads and stores (`LD1Q`, `ST1Q`), where the predicate is per 128-bit chunk, independent of the ACLE intrinsic type. This can be handled uniformly by casting to the corresponding parameter type of the intrinsic. The intrinsic itself should be defined in a way that enforces relations between parameter types.
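
The difference between the two rules can be illustrated with a small standalone model. This is only a sketch, not code from clang: `ScalableVecTy`, `IntrinsicDecl`, `predTyFromData`, `predTyFromIntrinsic` and the two sample declarations are hypothetical names, and the quadword predicate type `<vscale x 1 x i1>` is an assumption based on the "per 128-bit chunk" wording above.

```cpp
#include <cassert>

// Toy stand-in for an LLVM scalable vector type <vscale x MinElts x iEltBits>.
struct ScalableVecTy {
  unsigned MinElts;
  unsigned EltBits;
};

// Old rule: derive the predicate type from the data type, one i1 per data lane.
constexpr ScalableVecTy predTyFromData(ScalableVecTy DataTy) {
  return {DataTy.MinElts, 1};
}

// Toy intrinsic declaration; only the predicate parameter type is modelled.
struct IntrinsicDecl {
  ScalableVecTy PredParamTy;
};

// New rule: use whatever predicate type the intrinsic itself declares.
constexpr ScalableVecTy predTyFromIntrinsic(IntrinsicDecl F) {
  return F.PredParamTy;
}

// Hypothetical sample data: 64-bit elements, i.e. <vscale x 2 x i64>.
constexpr ScalableVecTy NxV2I64{2, 64};
// Ordinary gather load: predicate per lane, <vscale x 2 x i1>.
constexpr IntrinsicDecl Ld1GatherI64{{2, 1}};
// Quadword gather load (assumption): predicate per 128-bit chunk,
// <vscale x 1 x i1>, regardless of the element type.
constexpr IntrinsicDecl Ld1QGatherI64{{1, 1}};

void checkExamples() {
  // For per-lane operations both rules agree, so the change is NFC there.
  assert(predTyFromData(NxV2I64).MinElts ==
         predTyFromIntrinsic(Ld1GatherI64).MinElts);
  // For quadword operations only the intrinsic-derived type is right.
  assert(predTyFromData(NxV2I64).MinElts !=
         predTyFromIntrinsic(Ld1QGatherI64).MinElts);
}
```

Under these assumptions, asking the intrinsic for its parameter type covers both cases without the call site needing to know which kind of operation it is emitting.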
1 parent dc5bdcb commit 96ef623

1 file changed: +15 -9 lines changed

clang/lib/CodeGen/CGBuiltin.cpp

Lines changed: 15 additions & 9 deletions
@@ -9482,13 +9482,6 @@ Value *CodeGenFunction::EmitSVEGatherLoad(const SVETypeFlags &TypeFlags,
   auto *OverloadedTy =
       llvm::ScalableVectorType::get(SVEBuiltinMemEltTy(TypeFlags), ResultTy);
 
-  // At the ACLE level there's only one predicate type, svbool_t, which is
-  // mapped to <n x 16 x i1>. However, this might be incompatible with the
-  // actual type being loaded. For example, when loading doubles (i64) the
-  // predicated should be <n x 2 x i1> instead. At the IR level the type of
-  // the predicate and the data being loaded must match. Cast accordingly.
-  Ops[0] = EmitSVEPredicateCast(Ops[0], OverloadedTy);
-
   Function *F = nullptr;
   if (Ops[1]->getType()->isVectorTy())
     // This is the "vector base, scalar offset" case. In order to uniquely
@@ -9502,6 +9495,16 @@ Value *CodeGenFunction::EmitSVEGatherLoad(const SVETypeFlags &TypeFlags,
     // intrinsic.
     F = CGM.getIntrinsic(IntID, OverloadedTy);
 
+  // At the ACLE level there's only one predicate type, svbool_t, which is
+  // mapped to <n x 16 x i1>. However, this might be incompatible with the
+  // actual type being loaded. For example, when loading doubles (i64) the
+  // predicate should be <n x 2 x i1> instead. At the IR level the type of
+  // the predicate and the data being loaded must match. Cast to the type
+  // expected by the intrinsic. The intrinsic itself should be defined in
+  // a way than enforces relations between parameter types.
+  Ops[0] = EmitSVEPredicateCast(
+      Ops[0], cast<llvm::ScalableVectorType>(F->getArg(0)->getType()));
+
   // Pass 0 when the offset is missing. This can only be applied when using
   // the "vector base" addressing mode for which ACLE allows no offset. The
   // corresponding LLVM IR always requires an offset.
@@ -9566,8 +9569,11 @@ Value *CodeGenFunction::EmitSVEScatterStore(const SVETypeFlags &TypeFlags,
   // mapped to <n x 16 x i1>. However, this might be incompatible with the
   // actual type being stored. For example, when storing doubles (i64) the
   // predicated should be <n x 2 x i1> instead. At the IR level the type of
-  // the predicate and the data being stored must match. Cast accordingly.
-  Ops[1] = EmitSVEPredicateCast(Ops[1], OverloadedTy);
+  // the predicate and the data being stored must match. Cast to the type
+  // expected by the intrinsic. The intrinsic itself should be defined in
+  // a way that enforces relations between parameter types.
+  Ops[1] = EmitSVEPredicateCast(
+      Ops[1], cast<llvm::ScalableVectorType>(F->getArg(1)->getType()));
 
   // For "vector base, scalar index" scale the index so that it becomes a
   // scalar offset.
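
The comments in the diff describe narrowing an `svbool_t` predicate (`<n x 16 x i1>`) to the lane count of the data, for example `<n x 2 x i1>` for doubles. As a rough illustration of what that narrowing means for one 128-bit granule, the sketch below assumes the usual SVE layout of one predicate bit per byte, with the lowest bit of each element's group being the significant one. `narrowPredicate` is a hypothetical helper, not the implementation of `EmitSVEPredicateCast`.

```cpp
#include <cassert>

// Models one 128-bit granule of an svbool_t-style predicate: 16 bits, one per
// byte of the vector. Returns a NumLanes-bit mask with one bit per lane, taken
// from the lowest predicate bit of each lane's byte group.
// NumLanes is assumed to be one of 1, 2, 4, 8 or 16.
unsigned narrowPredicate(unsigned SvBool, unsigned NumLanes) {
  const unsigned BytesPerLane = 16 / NumLanes;
  unsigned Out = 0;
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (SvBool & (1u << (Lane * BytesPerLane)))
      Out |= 1u << Lane;
  return Out;
}
```

With `NumLanes == 2` (64-bit elements) the helper reads bits 0 and 8 of the granule. With `NumLanes == 1`, the per-128-bit-chunk case the commit message describes for `LD1Q`/`ST1Q`, it reads bit 0 only.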
