Commit b74cc98

Add support for getelementptr nusw and nuw
1 parent ac7c482 commit b74cc98

22 files changed, with 333 additions and 38 deletions.

llvm/docs/LangRef.rst

Lines changed: 39 additions & 18 deletions
@@ -11184,6 +11184,8 @@ Syntax:
 
       <result> = getelementptr <ty>, ptr <ptrval>{, <ty> <idx>}*
       <result> = getelementptr inbounds <ty>, ptr <ptrval>{, <ty> <idx>}*
+      <result> = getelementptr nusw <ty>, ptr <ptrval>{, <ty> <idx>}*
+      <result> = getelementptr nuw <ty>, ptr <ptrval>{, <ty> <idx>}*
       <result> = getelementptr inrange(S,E) <ty>, ptr <ptrval>{, <ty> <idx>}*
       <result> = getelementptr <ty>, <N x ptr> <ptrval>, <vector index type> <idx>
 
@@ -11299,27 +11301,46 @@ memory though, even if it happens to point into allocated storage. See the
 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
 information.
 
-If the ``inbounds`` keyword is present, the result value of a
-``getelementptr`` with any non-zero indices is a
-:ref:`poison value <poisonvalues>` if one of the following rules is violated:
-
-* The base pointer has an *in bounds* address of an allocated object, which
+The ``getelementptr`` instruction may have a number of attributes that impose
+additional rules. If any of the rules are violated, the result value is a
+:ref:`poison value <poisonvalues>`. In cases where the base is a vector of
+pointers, the attributes apply to each computation element-wise.
+
+For ``nusw`` (no unsigned signed wrap):
+
+* If the type of an index is larger than the pointer index type, the
+  truncation to the pointer index type preserves the signed value
+  (``trunc nsw``).
+* The multiplication of an index by the type size does not wrap the pointer
+  index type in a signed sense (``mul nsw``).
+* The successive addition of each offset (without adding the base address)
+  does not wrap the pointer index type in a signed sense (``add nsw``).
+* The successive addition of the current address, truncated to the index type
+  and interpreted as an unsigned number, and each offset, interpreted as
+  a signed number, does not wrap the index type.
+
+For ``nuw`` (no unsigned wrap):
+
+* If the type of an index is larger than the pointer index type, the
+  truncation to the pointer index type preserves the unsigned value
+  (``trunc nuw``).
+* The multiplication of an index by the type size does not wrap the pointer
+  index type in an unsigned sense (``mul nuw``).
+* The successive addition of each offset (without adding the base address)
+  does not wrap the pointer index type in an unsigned sense (``add nuw``).
+* The successive addition of the current address, truncated to the index type
+  and interpreted as an unsigned number, and each offset, also interpreted as
+  an unsigned number, does not wrap the index type (``add nuw``).
+
+For ``inbounds`` all rules of the ``nusw`` attribute apply. Additionally,
+if the ``getelementptr`` has any non-zero indices, the following rules apply:
+
+* The base pointer has an *in bounds* address of an allocated object, which
   means that it points into an allocated object, or to its end. Note that the
   object does not have to be live anymore; being in-bounds of a deallocated
   object is sufficient.
-* If the type of an index is larger than the pointer index type, the
-  truncation to the pointer index type preserves the signed value.
-* The multiplication of an index by the type size does not wrap the pointer
-  index type in a signed sense (``nsw``).
-* The successive addition of each offset (without adding the base address) does
-  not wrap the pointer index type in a signed sense (``nsw``).
-* The successive addition of the current address, interpreted as an unsigned
-  number, and each offset, interpreted as a signed number, does not wrap the
-  unsigned address space and remains *in bounds* of the allocated object.
-  As a corollary, if the added offset is non-negative, the addition does not
-  wrap in an unsigned sense (``nuw``).
-* In cases where the base is a vector of pointers, the ``inbounds`` keyword
-  applies to each of the computations element-wise.
+* During the successive addition of offsets to the address, the resulting
+  pointer must remain *in bounds* of the allocated object at each step.
 
 Note that ``getelementptr`` with all-zero indices is always considered to be
 ``inbounds``, even if the base pointer does not point to an allocated object.
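
The ``nusw`` and ``nuw`` rules in the LangRef hunk above can be read as a sequence of range checks. The following is a minimal C++ sketch of that reading, not LLVM code: it assumes a 32-bit pointer index type, models every index operand as an ``i64``, and uses the hypothetical names ``GepStep``, ``gepNusw`` and ``gepNuw``. A ``std::nullopt`` result stands in for a poison value. It needs C++17 for ``std::optional``.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical model, not LLVM API: one GEP index and the size it scales by.
struct GepStep {
  std::int64_t index;  // index operand, modelled as i64 (wider than the index type)
  std::uint32_t size;  // size in bytes of the type being indexed
};

// Assumption: the pointer index type is 32 bits wide.
constexpr std::int64_t kI32Min = std::numeric_limits<std::int32_t>::min();
constexpr std::int64_t kI32Max = std::numeric_limits<std::int32_t>::max();
constexpr std::uint64_t kU32Max = std::numeric_limits<std::uint32_t>::max();

inline bool fitsSigned(std::int64_t v) { return v >= kI32Min && v <= kI32Max; }

// nusw: offsets are signed, the address is unsigned.
std::optional<std::uint32_t> gepNusw(std::uint32_t base,
                                     const std::vector<GepStep>& steps) {
  std::int64_t offset = 0;    // running sum of offsets, without the base
  std::int64_t addr = base;   // running address
  for (const GepStep& s : steps) {
    if (!fitsSigned(s.index)) return std::nullopt;   // trunc nsw
    const std::int64_t scaled = s.index * static_cast<std::int64_t>(s.size);
    if (!fitsSigned(scaled)) return std::nullopt;    // mul nsw
    offset += scaled;
    if (!fitsSigned(offset)) return std::nullopt;    // add nsw
    addr += scaled;                                  // unsigned address + signed offset
    if (addr < 0 || static_cast<std::uint64_t>(addr) > kU32Max)
      return std::nullopt;
  }
  return static_cast<std::uint32_t>(addr);
}

// nuw: indices, offsets and the address are all read as unsigned.
std::optional<std::uint32_t> gepNuw(std::uint32_t base,
                                    const std::vector<GepStep>& steps) {
  std::uint64_t offset = 0;
  std::uint64_t addr = base;
  for (const GepStep& s : steps) {
    // A negative i64 index is a huge value when read as unsigned.
    if (s.index < 0 || static_cast<std::uint64_t>(s.index) > kU32Max)
      return std::nullopt;                           // trunc nuw
    const std::uint64_t scaled = static_cast<std::uint64_t>(s.index) * s.size;
    if (scaled > kU32Max) return std::nullopt;       // mul nuw
    offset += scaled;
    if (offset > kU32Max) return std::nullopt;       // add nuw (offsets only)
    addr += scaled;
    if (addr > kU32Max) return std::nullopt;         // add nuw (address)
  }
  return static_cast<std::uint32_t>(addr);
}
```

Under these assumptions, ``gepNusw(100, {{-1, 4}})`` yields address 96, while the same step passed to ``gepNuw`` is rejected, because a negative index read as an unsigned number does not fit the index type.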

llvm/docs/ReleaseNotes.rst

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ Changes to the LLVM IR
 ----------------------
 
 * Added Memory Model Relaxation Annotations (MMRAs).
+* Added ``nusw`` and ``nuw`` flags to ``getelementptr`` instruction.
 * Renamed ``llvm.experimental.vector.reverse`` intrinsic to ``llvm.vector.reverse``.
 * Renamed ``llvm.experimental.vector.splice`` intrinsic to ``llvm.vector.splice``.
 * Renamed ``llvm.experimental.vector.interleave2`` intrinsic to ``llvm.vector.interleave2``.

llvm/include/llvm/AsmParser/LLToken.h

Lines changed: 1 addition & 0 deletions
@@ -109,6 +109,7 @@ enum Kind {
   kw_fast,
   kw_nuw,
   kw_nsw,
+  kw_nusw,
   kw_exact,
   kw_disjoint,
   kw_inbounds,

llvm/include/llvm/Bitcode/LLVMBitCodes.h

Lines changed: 8 additions & 0 deletions
@@ -524,6 +524,14 @@ enum PossiblyExactOperatorOptionalFlags { PEO_EXACT = 0 };
 /// PossiblyDisjointInst's SubclassOptionalData contents.
 enum PossiblyDisjointInstOptionalFlags { PDI_DISJOINT = 0 };
 
+/// GetElementPtrOptionalFlags - Flags for serializing
+/// GEPOperator's SubclassOptionalData contents.
+enum GetElementPtrOptionalFlags {
+  GEP_INBOUNDS = 0,
+  GEP_NUSW = 1,
+  GEP_NUW = 2,
+};
+
 /// Encoded AtomicOrdering values.
 enum AtomicOrderingCodes {
   ORDERING_NOTATOMIC = 0,

llvm/include/llvm/IR/Instructions.h

Lines changed: 14 additions & 0 deletions
@@ -1171,9 +1171,23 @@ class GetElementPtrInst : public Instruction {
   /// See LangRef.html for the meaning of inbounds on a getelementptr.
   void setIsInBounds(bool b = true);
 
+  /// Set or clear the nusw flag on this GEP instruction.
+  /// See LangRef.html for the meaning of nusw on a getelementptr.
+  void setHasNoUnsignedSignedWrap(bool B = true);
+
+  /// Set or clear the nuw flag on this GEP instruction.
+  /// See LangRef.html for the meaning of nuw on a getelementptr.
+  void setHasNoUnsignedWrap(bool B = true);
+
   /// Determine whether the GEP has the inbounds flag.
   bool isInBounds() const;
 
+  /// Determine whether the GEP has the nusw flag.
+  bool hasNoUnsignedSignedWrap() const;
+
+  /// Determine whether the GEP has the nuw flag.
+  bool hasNoUnsignedWrap() const;
+
   /// Accumulate the constant address offset of this GEP if possible.
   ///
   /// This routine accepts an APInt into which it will accumulate the constant

llvm/include/llvm/IR/Operator.h

Lines changed: 25 additions & 1 deletion
@@ -405,11 +405,27 @@ class GEPOperator
 
   enum {
     IsInBounds = (1 << 0),
+    HasNoUnsignedSignedWrap = (1 << 1),
+    HasNoUnsignedWrap = (1 << 2),
   };
 
   void setIsInBounds(bool B) {
+    // Also set nusw when inbounds is set.
+    SubclassOptionalData = (SubclassOptionalData & ~IsInBounds) |
+                           (B * (IsInBounds | HasNoUnsignedSignedWrap));
+  }
+
+  void setHasNoUnsignedSignedWrap(bool B) {
+    // Also unset inbounds when nusw is unset.
+    if (B)
+      SubclassOptionalData |= HasNoUnsignedSignedWrap;
+    else
+      SubclassOptionalData &= ~(IsInBounds | HasNoUnsignedSignedWrap);
+  }
+
+  void setHasNoUnsignedWrap(bool B) {
     SubclassOptionalData =
-        (SubclassOptionalData & ~IsInBounds) | (B * IsInBounds);
+        (SubclassOptionalData & ~HasNoUnsignedWrap) | (B * HasNoUnsignedWrap);
   }
 
 public:
@@ -421,6 +437,14 @@ class GEPOperator
     return SubclassOptionalData & IsInBounds;
   }
 
+  bool hasNoUnsignedSignedWrap() const {
+    return SubclassOptionalData & HasNoUnsignedSignedWrap;
+  }
+
+  bool hasNoUnsignedWrap() const {
+    return SubclassOptionalData & HasNoUnsignedWrap;
+  }
+
   /// Returns the offset of the index with an inrange attachment, or
   /// std::nullopt if none.
   std::optional<ConstantRange> getInRange() const;
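
The setters in this hunk couple two of the three bits: setting inbounds also sets nusw, and clearing nusw also clears inbounds. A standalone sketch of that coupling follows. The class name `GepFlagBits` and its plain `unsigned` storage are assumptions made for illustration; only the bit layout and the setter logic are taken from the diff.

```cpp
// Hypothetical stand-in for the flag storage in GEPOperator; the bit layout
// and the setter logic mirror the hunk above, everything else is illustrative.
class GepFlagBits {
  enum : unsigned {
    IsInBounds = 1u << 0,
    HasNoUnsignedSignedWrap = 1u << 1,
    HasNoUnsignedWrap = 1u << 2,
  };
  unsigned Bits = 0;

public:
  // Setting inbounds also sets nusw; clearing inbounds leaves nusw untouched.
  void setIsInBounds(bool B) {
    Bits &= ~IsInBounds;
    if (B)
      Bits |= IsInBounds | HasNoUnsignedSignedWrap;
  }

  // Clearing nusw also clears inbounds, since inbounds implies nusw.
  void setHasNoUnsignedSignedWrap(bool B) {
    if (B)
      Bits |= HasNoUnsignedSignedWrap;
    else
      Bits &= ~(IsInBounds | HasNoUnsignedSignedWrap);
  }

  // nuw is independent of the other two bits.
  void setHasNoUnsignedWrap(bool B) {
    Bits &= ~HasNoUnsignedWrap;
    if (B)
      Bits |= HasNoUnsignedWrap;
  }

  bool isInBounds() const { return (Bits & IsInBounds) != 0; }
  bool hasNoUnsignedSignedWrap() const {
    return (Bits & HasNoUnsignedSignedWrap) != 0;
  }
  bool hasNoUnsignedWrap() const { return (Bits & HasNoUnsignedWrap) != 0; }
};
```

One consequence worth noting: in this model, calling `setIsInBounds(true)` followed by `setIsInBounds(false)` leaves nusw set, so code that wants to drop every flag has to clear nusw explicitly.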

llvm/lib/AsmParser/LLLexer.cpp

Lines changed: 1 addition & 0 deletions
@@ -566,6 +566,7 @@ lltok::Kind LLLexer::LexIdentifier() {
   KEYWORD(fast);
   KEYWORD(nuw);
   KEYWORD(nsw);
+  KEYWORD(nusw);
   KEYWORD(exact);
   KEYWORD(disjoint);
   KEYWORD(inbounds);

llvm/lib/AsmParser/LLParser.cpp

Lines changed: 18 additions & 3 deletions
@@ -8340,7 +8340,17 @@ int LLParser::parseGetElementPtr(Instruction *&Inst, PerFunctionState &PFS) {
   Value *Val = nullptr;
   LocTy Loc, EltLoc;
 
-  bool InBounds = EatIfPresent(lltok::kw_inbounds);
+  bool InBounds = false, NUSW = false, NUW = false;
+  while (true) {
+    if (EatIfPresent(lltok::kw_inbounds))
+      InBounds = true;
+    else if (EatIfPresent(lltok::kw_nusw))
+      NUSW = true;
+    else if (EatIfPresent(lltok::kw_nuw))
+      NUW = true;
+    else
+      break;
+  }
 
   Type *Ty = nullptr;
   if (parseType(Ty) ||
@@ -8393,9 +8403,14 @@ int LLParser::parseGetElementPtr(Instruction *&Inst, PerFunctionState &PFS) {
 
   if (!GetElementPtrInst::getIndexedType(Ty, Indices))
     return error(Loc, "invalid getelementptr indices");
-  Inst = GetElementPtrInst::Create(Ty, Ptr, Indices);
+  GetElementPtrInst *GEP = GetElementPtrInst::Create(Ty, Ptr, Indices);
+  Inst = GEP;
   if (InBounds)
-    cast<GetElementPtrInst>(Inst)->setIsInBounds(true);
+    GEP->setIsInBounds(true);
+  if (NUSW)
+    GEP->setHasNoUnsignedSignedWrap(true);
+  if (NUW)
+    GEP->setHasNoUnsignedWrap(true);
   return AteExtraComma ? InstExtraComma : InstNormal;
 }
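
The new parser loop accepts the three keywords in any order and tolerates repeats, stopping at the first token that is not a flag. A rough standalone sketch of that loop is shown here. It works on a pre-split list of strings rather than on `lltok` tokens, and `ParsedGepFlags` and `parseGepFlags` are hypothetical names, not part of LLVM.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: the flags seen and the position of the first
// token that is not a GEP flag keyword.
struct ParsedGepFlags {
  bool InBounds = false;
  bool NUSW = false;
  bool NUW = false;
  std::size_t Next = 0;
};

// Consumes leading "inbounds", "nusw" and "nuw" tokens in any order,
// mirroring the while (true) loop in the hunk above.
ParsedGepFlags parseGepFlags(const std::vector<std::string>& Toks) {
  ParsedGepFlags R;
  while (R.Next < Toks.size()) {
    const std::string& T = Toks[R.Next];
    if (T == "inbounds")
      R.InBounds = true;
    else if (T == "nusw")
      R.NUSW = true;
    else if (T == "nuw")
      R.NUW = true;
    else
      break;  // first non-flag token: the type operand starts here
    ++R.Next;
  }
  return R;
}
```

With this shape, both `nuw inbounds` and `inbounds nuw` produce the same flag set, which matches the behavior of the loop in the diff.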

llvm/lib/Bitcode/Reader/BitcodeReader.cpp

Lines changed: 16 additions & 4 deletions
@@ -5062,10 +5062,17 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
 
       unsigned TyID;
       Type *Ty;
-      bool InBounds;
+      bool InBounds = false, NUSW = false, NUW = false;
 
       if (BitCode == bitc::FUNC_CODE_INST_GEP) {
-        InBounds = Record[OpNum++];
+        uint64_t Flags = Record[OpNum++];
+        if (Flags & (1 << bitc::GEP_INBOUNDS))
+          InBounds = true;
+        if (Flags & (1 << bitc::GEP_NUSW))
+          NUSW = true;
+        if (Flags & (1 << bitc::GEP_NUW))
+          NUW = true;
+
         TyID = Record[OpNum++];
         Ty = getTypeByID(TyID);
       } else {
@@ -5096,7 +5103,8 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
         GEPIdx.push_back(Op);
       }
 
-      I = GetElementPtrInst::Create(Ty, BasePtr, GEPIdx);
+      auto *GEP = GetElementPtrInst::Create(Ty, BasePtr, GEPIdx);
+      I = GEP;
 
       ResTypeID = TyID;
       if (cast<GEPOperator>(I)->getNumIndices() != 0) {
@@ -5123,7 +5131,11 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
 
       InstructionList.push_back(I);
       if (InBounds)
-        cast<GetElementPtrInst>(I)->setIsInBounds(true);
+        GEP->setIsInBounds(true);
+      if (NUSW)
+        GEP->setHasNoUnsignedSignedWrap(true);
+      if (NUW)
+        GEP->setHasNoUnsignedWrap(true);
       break;
     }

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

Lines changed: 9 additions & 2 deletions
@@ -2961,7 +2961,14 @@ void ModuleBitcodeWriter::writeInstruction(const Instruction &I,
     Code = bitc::FUNC_CODE_INST_GEP;
     AbbrevToUse = FUNCTION_INST_GEP_ABBREV;
     auto &GEPInst = cast<GetElementPtrInst>(I);
-    Vals.push_back(GEPInst.isInBounds());
+    uint64_t Flags = 0;
+    if (GEPInst.isInBounds())
+      Flags |= 1 << bitc::GEP_INBOUNDS;
+    if (GEPInst.hasNoUnsignedSignedWrap())
+      Flags |= 1 << bitc::GEP_NUSW;
+    if (GEPInst.hasNoUnsignedWrap())
+      Flags |= 1 << bitc::GEP_NUW;
+    Vals.push_back(Flags);
     Vals.push_back(VE.getTypeID(GEPInst.getSourceElementType()));
     for (unsigned i = 0, e = I.getNumOperands(); i != e; ++i)
       pushValueAndType(I.getOperand(i), InstID, Vals);
@@ -3859,7 +3866,7 @@ void ModuleBitcodeWriter::writeBlockInfo() {
   {
     auto Abbv = std::make_shared<BitCodeAbbrev>();
     Abbv->Add(BitCodeAbbrevOp(bitc::FUNC_CODE_INST_GEP));
-    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1));
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 3));
     Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, // dest ty
                               Log2_32_Ceil(VE.getTypes().size() + 1)));
     Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
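
The writer and reader hunks agree on one record field that packs the three flags as a bitmask, with the enumerators from `LLVMBitCodes.h` used as bit positions. A self-contained sketch of that round trip follows. The enumerator names and values come from the diff; `GepFlagSet`, `encodeGepFlags` and `decodeGepFlags` are hypothetical helpers, not LLVM functions.

```cpp
#include <cstdint>

// Bit positions, copied from the GetElementPtrOptionalFlags enum in the diff.
enum GetElementPtrOptionalFlags : unsigned {
  GEP_INBOUNDS = 0,
  GEP_NUSW = 1,
  GEP_NUW = 2,
};

// Hypothetical plain struct holding the three flags.
struct GepFlagSet {
  bool InBounds;
  bool NUSW;
  bool NUW;
};

// Packs the flags the way the writer hunk does: one bit per flag.
std::uint64_t encodeGepFlags(const GepFlagSet& F) {
  std::uint64_t Flags = 0;
  if (F.InBounds)
    Flags |= std::uint64_t{1} << GEP_INBOUNDS;
  if (F.NUSW)
    Flags |= std::uint64_t{1} << GEP_NUSW;
  if (F.NUW)
    Flags |= std::uint64_t{1} << GEP_NUW;
  return Flags;
}

// Unpacks the field the way the reader hunk does.
GepFlagSet decodeGepFlags(std::uint64_t Flags) {
  GepFlagSet F{};
  F.InBounds = (Flags & (std::uint64_t{1} << GEP_INBOUNDS)) != 0;
  F.NUSW = (Flags & (std::uint64_t{1} << GEP_NUSW)) != 0;
  F.NUW = (Flags & (std::uint64_t{1} << GEP_NUW)) != 0;
  return F;
}
```

The largest value this encoding can produce is 7, which fits the three-bit `Fixed` abbreviation operand in the second hunk. Because inbounds stays in bit 0, a record whose flag field is the old boolean 0 or 1 still decodes to the same inbounds value.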

llvm/lib/IR/AsmWriter.cpp

Lines changed: 4 additions & 0 deletions
@@ -1417,6 +1417,10 @@ static void WriteOptimizationInfo(raw_ostream &Out, const User *U) {
   } else if (const GEPOperator *GEP = dyn_cast<GEPOperator>(U)) {
     if (GEP->isInBounds())
       Out << " inbounds";
+    else if (GEP->hasNoUnsignedSignedWrap())
+      Out << " nusw";
+    if (GEP->hasNoUnsignedWrap())
+      Out << " nuw";
     if (auto InRange = GEP->getInRange()) {
       Out << " inrange(" << InRange->getLower() << ", " << InRange->getUpper()
           << ")";
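
Because inbounds implies nusw, the printer emits `nusw` only when inbounds is absent, and `nuw` independently of both. A small sketch of that selection logic, using a hypothetical free function `gepFlagSuffix` that returns a string instead of writing to a `raw_ostream`:

```cpp
#include <string>

// Hypothetical helper mirroring the branch structure in the hunk above:
// "nusw" is printed only when "inbounds" is not, "nuw" is printed separately.
std::string gepFlagSuffix(bool InBounds, bool NUSW, bool NUW) {
  std::string Out;
  if (InBounds)
    Out += " inbounds";
  else if (NUSW)
    Out += " nusw";
  if (NUW)
    Out += " nuw";
  return Out;
}
```

A GEP with both inbounds and nusw set therefore prints only ` inbounds`, and the parser side restores nusw when it calls `setIsInBounds(true)`.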

llvm/lib/IR/Instruction.cpp

Lines changed: 18 additions & 4 deletions
@@ -442,6 +442,8 @@ void Instruction::dropPoisonGeneratingFlags() {
 
   case Instruction::GetElementPtr:
     cast<GetElementPtrInst>(this)->setIsInBounds(false);
+    cast<GetElementPtrInst>(this)->setHasNoUnsignedSignedWrap(false);
+    cast<GetElementPtrInst>(this)->setHasNoUnsignedWrap(false);
     break;
 
   case Instruction::UIToFP:
@@ -658,9 +660,15 @@ void Instruction::copyIRFlags(const Value *V, bool IncludeWrapFlags) {
     if (isa<FPMathOperator>(this))
       copyFastMathFlags(FP->getFastMathFlags());
 
-  if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V))
-    if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this))
+  if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V)) {
+    if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this)) {
       DestGEP->setIsInBounds(SrcGEP->isInBounds() || DestGEP->isInBounds());
+      DestGEP->setHasNoUnsignedSignedWrap(SrcGEP->hasNoUnsignedSignedWrap() ||
+                                          DestGEP->hasNoUnsignedSignedWrap());
+      DestGEP->setHasNoUnsignedWrap(SrcGEP->hasNoUnsignedWrap() ||
+                                    DestGEP->hasNoUnsignedWrap());
+    }
+  }
 
   if (auto *NNI = dyn_cast<PossiblyNonNegInst>(V))
     if (isa<PossiblyNonNegInst>(this))
@@ -698,9 +706,15 @@ void Instruction::andIRFlags(const Value *V) {
     }
   }
 
-  if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V))
-    if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this))
+  if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V)) {
+    if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this)) {
       DestGEP->setIsInBounds(SrcGEP->isInBounds() && DestGEP->isInBounds());
+      DestGEP->setHasNoUnsignedSignedWrap(SrcGEP->hasNoUnsignedSignedWrap() &&
+                                          DestGEP->hasNoUnsignedSignedWrap());
+      DestGEP->setHasNoUnsignedWrap(SrcGEP->hasNoUnsignedWrap() &&
+                                    DestGEP->hasNoUnsignedWrap());
+    }
+  }
 
   if (auto *NNI = dyn_cast<PossiblyNonNegInst>(V))
     if (isa<PossiblyNonNegInst>(this))

llvm/lib/IR/Instructions.cpp

Lines changed: 16 additions & 0 deletions
@@ -2047,10 +2047,26 @@ void GetElementPtrInst::setIsInBounds(bool B) {
   cast<GEPOperator>(this)->setIsInBounds(B);
 }
 
+void GetElementPtrInst::setHasNoUnsignedSignedWrap(bool B) {
+  cast<GEPOperator>(this)->setHasNoUnsignedSignedWrap(B);
+}
+
+void GetElementPtrInst::setHasNoUnsignedWrap(bool B) {
+  cast<GEPOperator>(this)->setHasNoUnsignedWrap(B);
+}
+
 bool GetElementPtrInst::isInBounds() const {
   return cast<GEPOperator>(this)->isInBounds();
 }
 
+bool GetElementPtrInst::hasNoUnsignedSignedWrap() const {
+  return cast<GEPOperator>(this)->hasNoUnsignedSignedWrap();
+}
+
+bool GetElementPtrInst::hasNoUnsignedWrap() const {
+  return cast<GEPOperator>(this)->hasNoUnsignedWrap();
+}
+
 bool GetElementPtrInst::accumulateConstantOffset(const DataLayout &DL,
                                                  APInt &Offset) const {
   // Delegate to the generic GEPOperator implementation.

llvm/lib/IR/Operator.cpp

Lines changed: 2 additions & 1 deletion
@@ -42,7 +42,8 @@ bool Operator::hasPoisonGeneratingFlags() const {
   case Instruction::GetElementPtr: {
     auto *GEP = cast<GEPOperator>(this);
     // Note: inrange exists on constexpr only
-    return GEP->isInBounds() || GEP->getInRange() != std::nullopt;
+    return GEP->isInBounds() || GEP->hasNoUnsignedSignedWrap() ||
+           GEP->hasNoUnsignedWrap() || GEP->getInRange() != std::nullopt;
   }
   case Instruction::UIToFP:
   case Instruction::ZExt:

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp

Lines changed: 8 additions & 0 deletions
@@ -1111,6 +1111,9 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
   // possible. GEPs with inbounds are more friendly to alias analysis.
   bool GEPWasInBounds = GEP->isInBounds();
   GEP->setIsInBounds(false);
+  // TODO(gep_nowrap): Try to preserve these.
+  GEP->setHasNoUnsignedSignedWrap(false);
+  GEP->setHasNoUnsignedWrap(false);
 
   // Lowers a GEP to either GEPs with a single index or arithmetic operations.
   if (LowerGEP) {
@@ -1386,6 +1389,11 @@ void SeparateConstOffsetFromGEP::swapGEPOperand(GetElementPtrInst *First,
       Offset.ugt(ObjectSize)) {
     First->setIsInBounds(false);
     Second->setIsInBounds(false);
+    // TODO(gep_nowrap): Make flag preservation more precise.
+    First->setHasNoUnsignedSignedWrap(false);
+    Second->setHasNoUnsignedSignedWrap(false);
+    First->setHasNoUnsignedWrap(false);
+    Second->setHasNoUnsignedWrap(false);
   } else
     First->setIsInBounds(true);
 }
