Commit 4321c6a

[ARM,MVE] Support immediate vbicq,vorrq,vmvnq intrinsics.
Summary:
Immediate vmvnq is code-generated as a simple vector constant in IR, and left to the backend to recognize that it can be created with an MVE VMVN instruction. The predicated version is represented as a select between the input and the same constant, and I've added a Tablegen isel rule to turn that into a predicated VMVN. (That should be better than the previous VMVN + VPSEL: it's the same number of instructions but now it can fold into an adjacent VPT block.)

The unpredicated forms of VBIC and VORR are done by enabling the same isel lowering as for NEON, recognizing appropriate immediates and rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I then instruction-select into the right MVE instructions (now that I've also reworked those instructions to use the same MC operand encoding). In order to do that, I had to promote the Tablegen SDNode instance `NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and similarly for `NEONvbicImm`.

The predicated forms of VBIC and VORR are represented as a vector select between the original input vector and the output of the unpredicated operation. The main convenience of this is that it still lets me use the existing isel lowering for VBICIMM/VORRIMM, and not have to write another copy of the operand encoding translation code.

This intrinsic family is the first to use the `imm_simd` system I put into the MveEmitter tablegen backend. So, naturally, it showed up a bug or two (emitting bogus range checks and the like). Fixed those, and added a full set of tests for the permissible immediates in the existing Sema test.

Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching because lowering started turning its input into a VBICIMM. Now it recognizes the VBICIMM instead.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D72934
1 parent 772e493 commit 4321c6a

12 files changed: +966 -62 lines changed

clang/include/clang/Basic/arm_mve.td

Lines changed: 22 additions & 0 deletions
@@ -116,6 +116,28 @@ def vmulqf: Intrinsic<Vector, (args Vector:$a, Vector:$b), (fmul $a, $b)>,
                 NameOverride<"vmulq">;
 }
 
+let params = !listconcat(T.Int16, T.Int32) in {
+  let pnt = PNT_None in {
+    def vmvnq_n: Intrinsic<Vector, (args imm_simd_vmvn:$imm),
+                           (not (splat (Scalar $imm)))>;
+  }
+  defm vmvnq: IntrinsicMX<Vector, (args imm_simd_vmvn:$imm, Predicate:$pred),
+                          (select $pred, (not (splat (Scalar $imm))), $inactive),
+                          1, "_n", PNT_NType, PNT_None>;
+  let pnt = PNT_NType in {
+    def vbicq_n: Intrinsic<Vector, (args Vector:$v, imm_simd_restrictive:$imm),
+                           (and $v, (not (splat (Scalar $imm))))>;
+    def vorrq_n: Intrinsic<Vector, (args Vector:$v, imm_simd_restrictive:$imm),
+                           (or $v, (splat (Scalar $imm)))>;
+  }
+  def vbicq_m_n: Intrinsic<
+    Vector, (args Vector:$v, imm_simd_restrictive:$imm, Predicate:$pred),
+    (select $pred, (and $v, (not (splat (Scalar $imm)))), $v)>;
+  def vorrq_m_n: Intrinsic<
+    Vector, (args Vector:$v, imm_simd_restrictive:$imm, Predicate:$pred),
+    (select $pred, (or $v, (splat (Scalar $imm))), $v)>;
+}
+
 // The bitcasting below is not overcomplicating the IR because while
 // Vector and UVector may be different vector types at the C level i.e.
 // vectors of same size signed/unsigned ints. Once they're lowered
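
The codegen dags added in this hunk can be read lane by lane. The following is a minimal host-side sketch of that reading for 16-bit lanes, assuming a plain `std::array` in place of a real MVE vector and one `bool` per lane in place of the predicate. The names `Vec16`, `Pred`, `select_lanes` and the `model_*` functions are hypothetical and exist only for this illustration; they are not part of Clang or `arm_mve.h`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for an 8 x 16-bit MVE vector and its predicate.
using Vec16 = std::array<std::uint16_t, 8>;
using Pred = std::array<bool, 8>;

// (splat (Scalar $imm)): broadcast the immediate to every lane.
Vec16 splat(std::uint16_t imm) {
  Vec16 r;
  r.fill(imm);
  return r;
}

// (select $pred, active, inactive): per-lane choice between two vectors.
Vec16 select_lanes(const Pred &p, const Vec16 &active, const Vec16 &inactive) {
  Vec16 r;
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = p[i] ? active[i] : inactive[i];
  return r;
}

// vmvnq_n: (not (splat imm))
Vec16 model_vmvnq_n(std::uint16_t imm) {
  return splat(static_cast<std::uint16_t>(~imm));
}

// vbicq_n: (and $v, (not (splat imm)))
Vec16 model_vbicq_n(Vec16 v, std::uint16_t imm) {
  for (auto &lane : v)
    lane &= static_cast<std::uint16_t>(~imm);
  return v;
}

// vorrq_n: (or $v, (splat imm))
Vec16 model_vorrq_n(Vec16 v, std::uint16_t imm) {
  for (auto &lane : v)
    lane |= imm;
  return v;
}

// vmvnq_m_n: inactive lanes come from the extra $inactive argument.
Vec16 model_vmvnq_m_n(const Vec16 &inactive, std::uint16_t imm, const Pred &p) {
  return select_lanes(p, model_vmvnq_n(imm), inactive);
}

// vbicq_m_n / vorrq_m_n: inactive lanes keep the value of the input $v.
Vec16 model_vbicq_m_n(const Vec16 &v, std::uint16_t imm, const Pred &p) {
  return select_lanes(p, model_vbicq_n(v, imm), v);
}

Vec16 model_vorrq_m_n(const Vec16 &v, std::uint16_t imm, const Pred &p) {
  return select_lanes(p, model_vorrq_n(v, imm), v);
}
```

In this model the predicated VBIC and VORR forms need no separate operand: the select falls back to `$v` itself, which matches the commit message's description of them as a select between the original input and the unpredicated result.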

clang/include/clang/Basic/arm_mve_defs.td

Lines changed: 16 additions & 8 deletions
@@ -319,6 +319,7 @@ class IB_EltBit<int base_, Type type_ = Scalar> : ImmediateBounds {
   int base = base_;
   Type type = type_;
 }
+def IB_ExtraArg_LaneSize;
 
 // -----------------------------------------------------------------------------
 // End-user definitions for immediate arguments.
@@ -327,11 +328,13 @@ class IB_EltBit<int base_, Type type_ = Scalar> : ImmediateBounds {
 // intrinsics like vmvnq or vorrq. imm_simd_restrictive has to be an 8-bit
 // value shifted left by a whole number of bytes; imm_simd_vmvn can also be of
 // the form 0xXXFF for some byte value XX.
-def imm_simd_restrictive : Immediate<u32, IB_UEltValue> {
+def imm_simd_restrictive : Immediate<Scalar, IB_UEltValue> {
   let extra = "ShiftedByte";
+  let extraarg = "!lanesize";
 }
-def imm_simd_vmvn : Immediate<u32, IB_UEltValue> {
+def imm_simd_vmvn : Immediate<Scalar, IB_UEltValue> {
   let extra = "ShiftedByteOrXXFF";
+  let extraarg = "!lanesize";
 }
 
 // imm_1toN can take any value from 1 to N inclusive, where N is the number of
@@ -457,26 +460,31 @@ class NameOverride<string basename_> {
 
 // A wrapper to define both _m and _x versions of a predicated
 // intrinsic.
+//
+// We provide optional parameters to override the polymorphic name
+// types separately for the _m and _x variants, because sometimes they
+// polymorph differently (typically because the type of the inactive
+// parameter can be used as a disambiguator if it's present).
 multiclass IntrinsicMX<Type rettype, dag arguments, dag cg,
                        int wantXVariant = 1,
                        string nameSuffix = "",
+                       PolymorphicNameType pnt_m = PNT_Type,
                        PolymorphicNameType pnt_x = PNT_Type> {
   // The _m variant takes an initial parameter called $inactive, which
   // provides the input value of the output register, i.e. all the
   // inactive lanes in the predicated operation take their values from
   // this.
   def "_m" # nameSuffix:
-    Intrinsic<rettype, !con((args rettype:$inactive), arguments), cg>;
+    Intrinsic<rettype, !con((args rettype:$inactive), arguments), cg> {
+    let pnt = pnt_m;
+  }
 
   foreach unusedVar = !if(!eq(wantXVariant, 1), [1], []<int>) in {
     // The _x variant leaves off that parameter, and simply uses an
     // undef value of the same type.
+
     def "_x" # nameSuffix:
-      Intrinsic<rettype, arguments, (seq (undef rettype):$inactive, cg)> {
-        // Allow overriding of the polymorphic name type, because
-        // sometimes the _m and _x variants polymorph differently
-        // (typically because the type of the inactive parameter can be
-        // used as a disambiguator if it's present).
+      Intrinsic<rettype, arguments, (seq (undef rettype):$inactive, cg)> {
       let pnt = pnt_x;
     }
   }

clang/include/clang/Sema/Sema.h

Lines changed: 4 additions & 2 deletions
@@ -11670,8 +11670,10 @@ class Sema final {
   bool SemaBuiltinConstantArgMultiple(CallExpr *TheCall, int ArgNum,
                                       unsigned Multiple);
   bool SemaBuiltinConstantArgPower2(CallExpr *TheCall, int ArgNum);
-  bool SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum);
-  bool SemaBuiltinConstantArgShiftedByteOrXXFF(CallExpr *TheCall, int ArgNum);
+  bool SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum,
+                                         unsigned ArgBits);
+  bool SemaBuiltinConstantArgShiftedByteOrXXFF(CallExpr *TheCall, int ArgNum,
+                                               unsigned ArgBits);
   bool SemaBuiltinARMSpecialReg(unsigned BuiltinID, CallExpr *TheCall,
                                 int ArgNum, unsigned ExpectedFieldNum,
                                 bool AllowName);

clang/lib/Sema/SemaChecking.cpp

Lines changed: 12 additions & 2 deletions
@@ -5592,7 +5592,8 @@ static bool IsShiftedByte(llvm::APSInt Value) {
 /// SemaBuiltinConstantArgShiftedByte - Check if argument ArgNum of TheCall is
 /// a constant expression representing an arbitrary byte value shifted left by
 /// a multiple of 8 bits.
-bool Sema::SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum) {
+bool Sema::SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum,
+                                             unsigned ArgBits) {
   llvm::APSInt Result;
 
   // We can't check the value of a dependent argument.
@@ -5604,6 +5605,10 @@ bool Sema::SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum) {
   if (SemaBuiltinConstantArg(TheCall, ArgNum, Result))
     return true;
 
+  // Truncate to the given size.
+  Result = Result.getLoBits(ArgBits);
+  Result.setIsUnsigned(true);
+
   if (IsShiftedByte(Result))
     return false;
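
The effect of truncating to `ArgBits` before the shifted-byte test can be illustrated with ordinary integers. This is a rough standalone sketch, not the Sema code: `is_shifted_byte`, `lo_bits` and `accepts_shifted_byte` are hypothetical names, 64-bit integers stand in for `llvm::APSInt`, and the body of `is_shifted_byte` is an assumption based on the doc comment ("an arbitrary byte value shifted left by a multiple of 8 bits"), since the real `IsShiftedByte` is not shown in this diff.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of a "shifted byte": an 8-bit value shifted left by a whole
// number of bytes. Strips zero low bytes until the value fits in one byte.
bool is_shifted_byte(std::uint64_t value) {
  while (value >= 0x100) {
    if ((value & 0xFF) != 0)
      return false;  // a nonzero byte below the top one: more than one byte set
    value >>= 8;
  }
  return true;
}

// Keep only the low `bits` bits and treat the result as unsigned, mirroring
// the getLoBits + setIsUnsigned(true) pair added in the hunk.
std::uint64_t lo_bits(std::int64_t value, unsigned bits) {
  const std::uint64_t u = static_cast<std::uint64_t>(value);
  return bits >= 64 ? u : (u & ((std::uint64_t{1} << bits) - 1));
}

// Hypothetical combined check: truncate to the lane size, then test the form.
bool accepts_shifted_byte(std::int64_t imm, unsigned lane_bits) {
  return is_shifted_byte(lo_bits(imm, lane_bits));
}
```

Under this model a sign-extended constant such as -256 truncates to 0xFF00 at a 16-bit lane size and passes, while at 32 bits it becomes 0xFFFFFF00 and fails, so the result of the check depends on the lane size passed in.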

@@ -5617,7 +5622,8 @@ bool Sema::SemaBuiltinConstantArgShiftedByte(CallExpr *TheCall, int ArgNum) {
 /// 0x00FF, 0x01FF, ..., 0xFFFF). This strange range check is needed for some
 /// Arm MVE intrinsics.
 bool Sema::SemaBuiltinConstantArgShiftedByteOrXXFF(CallExpr *TheCall,
-                                                   int ArgNum) {
+                                                   int ArgNum,
+                                                   unsigned ArgBits) {
   llvm::APSInt Result;
 
   // We can't check the value of a dependent argument.
@@ -5629,6 +5635,10 @@ bool Sema::SemaBuiltinConstantArgShiftedByteOrXXFF(CallExpr *TheCall,
   if (SemaBuiltinConstantArg(TheCall, ArgNum, Result))
     return true;
 
+  // Truncate to the given size.
+  Result = Result.getLoBits(ArgBits);
+  Result.setIsUnsigned(true);
+
   // Check to see if it's in either of the required forms.
   if (IsShiftedByte(Result) ||
       (Result > 0 && Result < 0x10000 && (Result & 0xFF) == 0xFF))
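
The two accepted forms in the condition above can be tried out on plain integers. This is a small sketch under stated assumptions: the value is taken to be already truncated to the lane size and unsigned, `is_shifted_byte` is a hypothetical stand-in for `IsShiftedByte` (whose body is not part of this diff), and `is_shifted_byte_or_xxff` is an illustrative name rather than a Clang function. Only the `0xXXFF` clause is copied from the hunk.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of a "shifted byte": an 8-bit value shifted left by a whole
// number of bytes (stand-in for the IsShiftedByte helper).
bool is_shifted_byte(std::uint64_t value) {
  while (value >= 0x100) {
    if ((value & 0xFF) != 0)
      return false;
    value >>= 8;
  }
  return true;
}

// Accept either a shifted byte or a 16-bit value of the form 0xXXFF.
// The second clause follows the condition shown in the hunk.
bool is_shifted_byte_or_xxff(std::uint64_t value) {
  return is_shifted_byte(value) ||
         (value > 0 && value < 0x10000 && (value & 0xFF) == 0xFF);
}
```

For instance, 0x12FF and 0xFFFF are accepted only through the second clause, 0x1200 only through the first, and 0x1234 through neither.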
