Skip to content

[CIR] Upstream insert op for VectorType #139146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 12, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions clang/include/clang/CIR/Dialect/IR/CIROps.td
Original file line number Diff line number Diff line change
Expand Up @@ -1969,6 +1969,42 @@ def VecCreateOp : CIR_Op<"vec.create", [Pure]> {
let hasVerifier = 1;
}

//===----------------------------------------------------------------------===//
// VecInsertOp
//===----------------------------------------------------------------------===//

def VecInsertOp : CIR_Op<"vec.insert", [Pure,
TypesMatchWith<"argument type matches vector element type", "vec", "value",
"cast<VectorType>($_self).getElementType()">,
AllTypesMatch<["result", "vec"]>]> {

let summary = "Insert one element into a vector object";
let description = [{
The `cir.vec.insert` operation produces a new vector by replacing
the element of the input vector at `index` with `value`.

```mlir
%value = cir.const #cir.int<5> : !s32i
%index = cir.const #cir.int<2> : !s32i
%vec_tmp = cir.load %0 : !cir.ptr<!cir.vector<4 x !s32i>>, !cir.vector<4 x !s32i>
%new_vec = cir.vec.insert %value, %vec_tmp[%index : !s32i] : !cir.vector<4 x !s32i>
```
}];

let arguments = (ins
CIR_VectorType:$vec,
AnyType:$value,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnyType seems wrong here, in the absence of a verifier. The value has to match the element type of the vector, right? I'm not sure if there's a way to declare that here that will enforce it. If not, we should have a verifier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is enforced through TypesMatchWith<"argument type matches vector element type", "vec", "value", "cast<VectorType>($_self).getElementType()"> trait. So no need for verifier.

CIR_AnyFundamentalIntType:$index
);

let results = (outs CIR_VectorType:$result);

let assemblyFormat = [{
$value `,` $vec `[` $index `:` type($index) `]` attr-dict `:`
qualified(type($vec))
}];
}

//===----------------------------------------------------------------------===//
// VecExtractOp
//===----------------------------------------------------------------------===//
Expand Down
40 changes: 31 additions & 9 deletions clang/lib/CIR/CodeGen/CIRGenExpr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -205,6 +205,17 @@ Address CIRGenFunction::emitPointerWithAlignment(const Expr *expr,
void CIRGenFunction::emitStoreThroughLValue(RValue src, LValue dst,
bool isInit) {
if (!dst.isSimple()) {
if (dst.isVectorElt()) {
// Read/modify/write the vector, inserting the new element
const mlir::Location loc = dst.getVectorPointer().getLoc();
const mlir::Value vector =
builder.createLoad(loc, dst.getVectorAddress().getPointer());
const mlir::Value newVector = builder.create<cir::VecInsertOp>(
loc, vector, src.getScalarVal(), dst.getVectorIdx());
builder.createStore(loc, newVector, dst.getVectorAddress().getPointer());
return;
}

cgm.errorNYI(dst.getPointer().getLoc(),
"emitStoreThroughLValue: non-simple lvalue");
return;
Expand Down Expand Up @@ -418,6 +429,13 @@ RValue CIRGenFunction::emitLoadOfLValue(LValue lv, SourceLocation loc) {
if (lv.isSimple())
return RValue::get(emitLoadOfScalar(lv, loc));

if (lv.isVectorElt()) {
const mlir::Value load =
builder.createLoad(getLoc(loc), lv.getVectorAddress().getPointer());
return RValue::get(builder.create<cir::VecExtractOp>(getLoc(loc), load,
lv.getVectorIdx()));
}

cgm.errorNYI(loc, "emitLoadOfLValue");
return RValue::get(nullptr);
}
Expand Down Expand Up @@ -638,12 +656,6 @@ static Address emitArraySubscriptPtr(CIRGenFunction &cgf,

LValue
CIRGenFunction::emitArraySubscriptExpr(const clang::ArraySubscriptExpr *e) {
if (e->getBase()->getType()->isVectorType() &&
!isa<ExtVectorElementExpr>(e->getBase())) {
cgm.errorNYI(e->getSourceRange(), "emitArraySubscriptExpr: VectorType");
return LValue::makeAddr(Address::invalid(), e->getType(), LValueBaseInfo());
}

if (isa<ExtVectorElementExpr>(e->getBase())) {
cgm.errorNYI(e->getSourceRange(),
"emitArraySubscriptExpr: ExtVectorElementExpr");
Expand All @@ -666,18 +678,28 @@ CIRGenFunction::emitArraySubscriptExpr(const clang::ArraySubscriptExpr *e) {
assert((e->getIdx() == e->getLHS() || e->getIdx() == e->getRHS()) &&
"index was neither LHS nor RHS");

auto emitIdxAfterBase = [&]() -> mlir::Value {
auto emitIdxAfterBase = [&](bool promote) -> mlir::Value {
const mlir::Value idx = emitScalarExpr(e->getIdx());

// Extend or truncate the index type to 32 or 64-bits.
auto ptrTy = mlir::dyn_cast<cir::PointerType>(idx.getType());
if (ptrTy && mlir::isa<cir::IntType>(ptrTy.getPointee()))
if (promote && ptrTy && ptrTy.isPtrTo<cir::IntType>())
cgm.errorNYI(e->getSourceRange(),
"emitArraySubscriptExpr: index type cast");
return idx;
};

const mlir::Value idx = emitIdxAfterBase();
// If the base is a vector type, then we are forming a vector element
// with this subscript.
if (e->getBase()->getType()->isVectorType() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You dropped a comment from the incubator here. Can you bring it in?

!isa<ExtVectorElementExpr>(e->getBase())) {
const mlir::Value idx = emitIdxAfterBase(/*promote=*/false);
const LValue lhs = emitLValue(e->getBase());
return LValue::makeVectorElt(lhs.getAddress(), idx, e->getBase()->getType(),
lhs.getBaseInfo());
}

const mlir::Value idx = emitIdxAfterBase(/*promote=*/true);
if (const Expr *array = getSimpleArrayDecayOperand(e->getBase())) {
LValue arrayLV;
if (const auto *ase = dyn_cast<ArraySubscriptExpr>(array))
Expand Down
27 changes: 27 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenValue.h
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,7 @@ class LValue {
// this is the alignment of the whole vector)
unsigned alignment;
mlir::Value v;
mlir::Value vectorIdx; // Index for vector subscript
mlir::Type elementType;
LValueBaseInfo baseInfo;

Expand All @@ -135,6 +136,7 @@ class LValue {

public:
bool isSimple() const { return lvType == Simple; }
bool isVectorElt() const { return lvType == VectorElt; }
bool isBitField() const { return lvType == BitField; }

// TODO: Add support for volatile
Expand Down Expand Up @@ -175,6 +177,31 @@ class LValue {
r.initialize(t, t.getQualifiers(), address.getAlignment(), baseInfo);
return r;
}

Address getVectorAddress() const {
return Address(getVectorPointer(), elementType, getAlignment());
}

mlir::Value getVectorPointer() const {
assert(isVectorElt());
return v;
}

mlir::Value getVectorIdx() const {
assert(isVectorElt());
return vectorIdx;
}

static LValue makeVectorElt(Address vecAddress, mlir::Value index,
clang::QualType t, LValueBaseInfo baseInfo) {
LValue r;
r.lvType = VectorElt;
r.v = vecAddress.getPointer();
r.elementType = vecAddress.getElementType();
r.vectorIdx = index;
r.initialize(t, t.getQualifiers(), vecAddress.getAlignment(), baseInfo);
return r;
}
};

/// An aggregate value slot.
Expand Down
11 changes: 10 additions & 1 deletion clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1601,7 +1601,8 @@ void ConvertCIRToLLVMPass::runOnOperation() {
CIRToLLVMTrapOpLowering,
CIRToLLVMUnaryOpLowering,
CIRToLLVMVecCreateOpLowering,
CIRToLLVMVecExtractOpLowering
CIRToLLVMVecExtractOpLowering,
CIRToLLVMVecInsertOpLowering
// clang-format on
>(converter, patterns.getContext());

Expand Down Expand Up @@ -1718,6 +1719,14 @@ mlir::LogicalResult CIRToLLVMVecExtractOpLowering::matchAndRewrite(
return mlir::success();
}

mlir::LogicalResult CIRToLLVMVecInsertOpLowering::matchAndRewrite(
cir::VecInsertOp op, OpAdaptor adaptor,
mlir::ConversionPatternRewriter &rewriter) const {
rewriter.replaceOpWithNewOp<mlir::LLVM::InsertElementOp>(
op, adaptor.getVec(), adaptor.getValue(), adaptor.getIndex());
return mlir::success();
}

std::unique_ptr<mlir::Pass> createConvertCIRToLLVMPass() {
return std::make_unique<ConvertCIRToLLVMPass>();
}
Expand Down
10 changes: 10 additions & 0 deletions clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.h
Original file line number Diff line number Diff line change
Expand Up @@ -313,6 +313,16 @@ class CIRToLLVMVecExtractOpLowering
mlir::ConversionPatternRewriter &) const override;
};

class CIRToLLVMVecInsertOpLowering
: public mlir::OpConversionPattern<cir::VecInsertOp> {
public:
using mlir::OpConversionPattern<cir::VecInsertOp>::OpConversionPattern;

mlir::LogicalResult
matchAndRewrite(cir::VecInsertOp op, OpAdaptor,
mlir::ConversionPatternRewriter &) const override;
};

} // namespace direct
} // namespace cir

Expand Down
123 changes: 123 additions & 0 deletions clang/test/CIR/CodeGen/vector-ext.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -213,3 +213,126 @@ void foo4() {
// OGCG: %[[TMP2:.*]] = load i32, ptr %[[IDX]], align 4
// OGCG: %[[ELE:.*]] = extractelement <4 x i32> %[[TMP1]], i32 %[[TMP2]]
// OGCG: store i32 %[[ELE]], ptr %[[INIT]], align 4

void foo5() {
vi4 a = { 1, 2, 3, 4 };

a[2] = 5;
}

// CIR: %[[VEC:.*]] = cir.alloca !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>, ["a", init]
// CIR: %[[CONST_1:.*]] = cir.const #cir.int<1> : !s32i
// CIR: %[[CONST_2:.*]] = cir.const #cir.int<2> : !s32i
// CIR: %[[CONST_3:.*]] = cir.const #cir.int<3> : !s32i
// CIR: %[[CONST_4:.*]] = cir.const #cir.int<4> : !s32i
// CIR: %[[VEC_VAL:.*]] = cir.vec.create(%[[CONST_1]], %[[CONST_2]], %[[CONST_3]], %[[CONST_4]] :
// CIR-SAME: !s32i, !s32i, !s32i, !s32i) : !cir.vector<4 x !s32i>
// CIR: cir.store %[[VEC_VAL]], %[[VEC]] : !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>
// CIR: %[[CONST_VAL:.*]] = cir.const #cir.int<5> : !s32i
// CIR: %[[CONST_IDX:.*]] = cir.const #cir.int<2> : !s32i
// CIR: %[[TMP:.*]] = cir.load %[[VEC]] : !cir.ptr<!cir.vector<4 x !s32i>>, !cir.vector<4 x !s32i>
// CIR: %[[NEW_VEC:.*]] = cir.vec.insert %[[CONST_VAL]], %[[TMP]][%[[CONST_IDX]] : !s32i] : !cir.vector<4 x !s32i>
// CIR: cir.store %[[NEW_VEC]], %[[VEC]] : !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>

// LLVM: %[[VEC:.*]] = alloca <4 x i32>, i64 1, align 16
// LLVM: store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, ptr %[[VEC]], align 16
// LLVM: %[[TMP:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// LLVM: %[[NEW_VEC:.*]] = insertelement <4 x i32> %[[TMP]], i32 5, i32 2
// LLVM: store <4 x i32> %[[NEW_VEC]], ptr %[[VEC]], align 16

// OGCG: %[[VEC:.*]] = alloca <4 x i32>, align 16
// OGCG: store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, ptr %[[VEC]], align 16
// OGCG: %[[TMP:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// OGCG: %[[NEW_VEC:.*]] = insertelement <4 x i32> %[[TMP]], i32 5, i32 2
// OGCG: store <4 x i32> %[[NEW_VEC]], ptr %[[VEC]], align 16

void foo6() {
vi4 a = { 1, 2, 3, 4 };
int idx = 2;
int value = 5;
a[idx] = value;
}

// CIR: %[[VEC:.*]] = cir.alloca !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>, ["a", init]
// CIR: %[[IDX:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["idx", init]
// CIR: %[[VAL:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["value", init]
// CIR: %[[CONST_1:.*]] = cir.const #cir.int<1> : !s32i
// CIR: %[[CONST_2:.*]] = cir.const #cir.int<2> : !s32i
// CIR: %[[CONST_3:.*]] = cir.const #cir.int<3> : !s32i
// CIR: %[[CONST_4:.*]] = cir.const #cir.int<4> : !s32i
// CIR: %[[VEC_VAL:.*]] = cir.vec.create(%[[CONST_1]], %[[CONST_2]], %[[CONST_3]], %[[CONST_4]] :
// CIR-SAME: !s32i, !s32i, !s32i, !s32i) : !cir.vector<4 x !s32i>
// CIR: cir.store %[[VEC_VAL]], %[[VEC]] : !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>
// CIR: %[[CONST_IDX:.*]] = cir.const #cir.int<2> : !s32i
// CIR: cir.store %[[CONST_IDX]], %[[IDX]] : !s32i, !cir.ptr<!s32i>
// CIR: %[[CONST_VAL:.*]] = cir.const #cir.int<5> : !s32i
// CIR: cir.store %[[CONST_VAL]], %[[VAL]] : !s32i, !cir.ptr<!s32i>
// CIR: %[[TMP1:.*]] = cir.load %[[VAL]] : !cir.ptr<!s32i>, !s32i
// CIR: %[[TMP2:.*]] = cir.load %[[IDX]] : !cir.ptr<!s32i>, !s32i
// CIR: %[[TMP3:.*]] = cir.load %0 : !cir.ptr<!cir.vector<4 x !s32i>>, !cir.vector<4 x !s32i>
// CIR: %[[NEW_VEC:.*]] = cir.vec.insert %[[TMP1]], %[[TMP3]][%[[TMP2]] : !s32i] : !cir.vector<4 x !s32i>
// CIR: cir.store %[[NEW_VEC]], %[[VEC]] : !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>

// LLVM: %[[VEC:.*]] = alloca <4 x i32>, i64 1, align 16
// LLVM: %[[IDX:.*]] = alloca i32, i64 1, align 4
// LLVM: %[[VAL:.*]] = alloca i32, i64 1, align 4
// LLVM: store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, ptr %1, align 16
// LLVM: store i32 2, ptr %[[IDX]], align 4
// LLVM: store i32 5, ptr %[[VAL]], align 4
// LLVM: %[[TMP1:.*]] = load i32, ptr %[[VAL]], align 4
// LLVM: %[[TMP2:.*]] = load i32, ptr %[[IDX]], align 4
// LLVM: %[[TMP3:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// LLVM: %[[NEW_VEC:.*]] = insertelement <4 x i32> %[[TMP3]], i32 %[[TMP1]], i32 %[[TMP2]]
// LLVM: store <4 x i32> %[[NEW_VEC]], ptr %[[VEC]], align 16

// OGCG: %[[VEC:.*]] = alloca <4 x i32>, align 16
// OGCG: %[[IDX:.*]] = alloca i32, align 4
// OGCG: %[[VAL:.*]] = alloca i32, align 4
// OGCG: store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, ptr %[[VEC]], align 16
// OGCG: store i32 2, ptr %[[IDX]], align 4
// OGCG: store i32 5, ptr %[[VAL]], align 4
// OGCG: %[[TMP1:.*]] = load i32, ptr %[[VAL]], align 4
// OGCG: %[[TMP2:.*]] = load i32, ptr %[[IDX]], align 4
// OGCG: %[[TMP3:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// OGCG: %[[NEW_VEC:.*]] = insertelement <4 x i32> %[[TMP3]], i32 %[[TMP1]], i32 %[[TMP2]]
// OGCG: store <4 x i32> %[[NEW_VEC]], ptr %[[VEC]], align 16

void foo7() {
vi4 a = {1, 2, 3, 4};
a[2] += 5;
}

// CIR: %[[VEC:.*]] = cir.alloca !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>, ["a", init]
// CIR: %[[CONST_1:.*]] = cir.const #cir.int<1> : !s32i
// CIR: %[[CONST_2:.*]] = cir.const #cir.int<2> : !s32i
// CIR: %[[CONST_3:.*]] = cir.const #cir.int<3> : !s32i
// CIR: %[[CONST_4:.*]] = cir.const #cir.int<4> : !s32i
// CIR: %[[VEC_VAL:.*]] = cir.vec.create(%[[CONST_1]], %[[CONST_2]], %[[CONST_3]], %[[CONST_4]] :
// CIR-SAME: !s32i, !s32i, !s32i, !s32i) : !cir.vector<4 x !s32i>
// CIR: cir.store %[[VEC_VAL]], %[[VEC]] : !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>
// CIR: %[[CONST_VAL:.*]] = cir.const #cir.int<5> : !s32i
// CIR: %[[CONST_IDX:.*]] = cir.const #cir.int<2> : !s32i
// CIR: %[[TMP:.*]] = cir.load %[[VEC]] : !cir.ptr<!cir.vector<4 x !s32i>>, !cir.vector<4 x !s32i>
// CIR: %[[ELE:.*]] = cir.vec.extract %[[TMP]][%[[CONST_IDX]] : !s32i] : !cir.vector<4 x !s32i>
// CIR: %[[RES:.*]] = cir.binop(add, %[[ELE]], %[[CONST_VAL]]) nsw : !s32i
// CIR: %[[TMP2:.*]] = cir.load %[[VEC]] : !cir.ptr<!cir.vector<4 x !s32i>>, !cir.vector<4 x !s32i>
// CIR: %[[NEW_VEC:.*]] = cir.vec.insert %[[RES]], %[[TMP2]][%[[CONST_IDX]] : !s32i] : !cir.vector<4 x !s32i>
// CIR: cir.store %[[NEW_VEC]], %[[VEC]] : !cir.vector<4 x !s32i>, !cir.ptr<!cir.vector<4 x !s32i>>

// LLVM: %[[VEC:.*]] = alloca <4 x i32>, i64 1, align 16
// LLVM: store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, ptr %[[VEC]], align 16
// LLVM: %[[TMP:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// LLVM: %[[ELE:.*]] = extractelement <4 x i32> %[[TMP]], i32 2
// LLVM: %[[RES:.*]] = add nsw i32 %[[ELE]], 5
// LLVM: %[[TMP2:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// LLVM: %[[NEW_VEC:.*]] = insertelement <4 x i32> %[[TMP2]], i32 %[[RES]], i32 2
// LLVM: store <4 x i32> %[[NEW_VEC]], ptr %[[VEC]], align 16

// OGCG: %[[VEC:.*]] = alloca <4 x i32>, align 16
// OGCG: store <4 x i32> <i32 1, i32 2, i32 3, i32 4>, ptr %[[VEC]], align 16
// OGCG: %[[TMP:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// OGCG: %[[ELE:.*]] = extractelement <4 x i32> %[[TMP]], i32 2
// OGCG: %[[RES:.*]] = add nsw i32 %[[ELE]], 5
// OGCG: %[[TMP2:.*]] = load <4 x i32>, ptr %[[VEC]], align 16
// OGCG: %[[NEW_VEC:.*]] = insertelement <4 x i32> %[[TMP2]], i32 %[[RES]], i32 2
// OGCG: store <4 x i32> %[[NEW_VEC]], ptr %[[VEC]], align 16
Loading