-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RFC] IR: Define noalias.addrspace metadata #102461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
63ad716
a2e9181
d1384fe
4e9ec21
763f178
4a6df18
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8047,6 +8047,43 @@ it will contain a list of ids, including the ids of the callsites in the | |
full inline sequence, in order from the leaf-most call's id to the outermost | ||
inlined call. | ||
|
||
|
||
'``noalias.addrspace``' Metadata | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
The ``noalias.addrspace`` metadata is used to identify memory | ||
operations which cannot access objects allocated in a range of address | ||
spaces. It is attached to memory instructions, including | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is not clear to me from this what the intended behaviour is when this metadata is used to specify an object is not in the generic address space. Does that effectively mean this code must be unreachable, or does that say nothing because the concrete address space any object is in will not be the generic address space? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not exactly sure how to parse this question.
Do you mean, in the amdgpu case, !{i32 0, i32 1}?
We avoid creating objects in the generic address space (except functions), but it shouldn't really matter? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suppose you just have !noalias.addrspace !0 with !0 = !{i32 1}.
If you have an object in the global address space 0, which is also addressable using the generic address space 1 (please correct me if I am getting the AMDGPU address spaces wrong), would the above !noalias.addrspace specify that it can or that it cannot alias that object? I cannot see a good reason why a frontend would generate that directly, but I can imagine future additions where a frontend would generate that based on user-specified address spaces in source code, so I think it would be good to have that answered upfront. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
These are backwards, 0 is generic and 1 is global. To simplify your question with an example, I would interpret
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This example is confusing for me. Since addr space 0 is a union of all concrete addr spaces including 1, 2, 3, 5, etc, an actual global pointer with MD promising not alias to 0 is self-contradictory. If in this example the MD is promising not alias to 5 then it has no confusion. Then it comes the question: does it really meaningful to have a MD promising not alias to generic addr space 0? Should that be disallowed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The IR does not have a concept of generic address spaces, and this description is not written in terms of a generic address space. It merely states the address space of the underlying object allocation There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok. got it |
||
:ref:`atomicrmw <i_atomicrmw>`, :ref:`cmpxchg <i_cmpxchg>`, and | ||
:ref:`call <i_call>` instructions. | ||
|
||
This follows the same form as :ref:`range metadata <range-metadata>`, | ||
except the field entries must be of type `i32`. The interpretation is | ||
the same numeric address spaces as applied to IR values. | ||
|
||
Example: | ||
|
||
.. code-block:: llvm | ||
|
||
; %ptr cannot point to an object allocated in addrspace(5) | ||
%rmw.valid = atomicrmw and ptr %ptr, i64 %value seq_cst, !noalias.addrspace !0 | ||
|
||
; Undefined behavior. The underlying object is allocated in one of the listed | ||
; address spaces. | ||
%alloca = alloca i64, addrspace(5) | ||
%alloca.cast = addrspacecast ptr addrspace(5) %alloca to ptr | ||
%rmw.ub = atomicrmw and ptr %alloca.cast, i64 %value seq_cst, !noalias.addrspace !0 | ||
|
||
!0 = !{i32 5, i32 6} ; Exclude addrspace(5) only | ||
|
||
|
||
This is intended for use on targets with a notion of generic address | ||
spaces, which at runtime resolve to different physical memory | ||
spaces. The interpretation of the address space values is target | ||
specific. The behavior is undefined if the runtime memory address does | ||
resolve to an object defined in one of the indicated address spaces. | ||
|
||
|
||
Module Flags Metadata | ||
===================== | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -492,6 +492,14 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport { | |
/// Whether a metadata node is allowed to be, or contain, a DILocation. | ||
enum class AreDebugLocsAllowed { No, Yes }; | ||
|
||
/// Metadata that should be treated as a range, with slightly different | ||
/// requirements. | ||
enum class RangeLikeMetadataKind { | ||
Range, // MD_range | ||
AbsoluteSymbol, // MD_absolute_symbol | ||
NoaliasAddrspace // MD_noalias_addrspace | ||
}; | ||
|
||
// Verification methods... | ||
void visitGlobalValue(const GlobalValue &GV); | ||
void visitGlobalVariable(const GlobalVariable &GV); | ||
|
@@ -515,9 +523,10 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport { | |
void visitModuleFlagCGProfileEntry(const MDOperand &MDO); | ||
void visitFunction(const Function &F); | ||
void visitBasicBlock(BasicBlock &BB); | ||
void verifyRangeMetadata(const Value &V, const MDNode *Range, Type *Ty, | ||
bool IsAbsoluteSymbol); | ||
void verifyRangeLikeMetadata(const Value &V, const MDNode *Range, Type *Ty, | ||
RangeLikeMetadataKind Kind); | ||
void visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty); | ||
void visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range, Type *Ty); | ||
void visitDereferenceableMetadata(Instruction &I, MDNode *MD); | ||
void visitProfMetadata(Instruction &I, MDNode *MD); | ||
void visitCallStackMetadata(MDNode *MD); | ||
|
@@ -760,8 +769,9 @@ void Verifier::visitGlobalValue(const GlobalValue &GV) { | |
// FIXME: Why is getMetadata on GlobalValue protected? | ||
if (const MDNode *AbsoluteSymbol = | ||
GO->getMetadata(LLVMContext::MD_absolute_symbol)) { | ||
verifyRangeMetadata(*GO, AbsoluteSymbol, DL.getIntPtrType(GO->getType()), | ||
true); | ||
verifyRangeLikeMetadata(*GO, AbsoluteSymbol, | ||
DL.getIntPtrType(GO->getType()), | ||
RangeLikeMetadataKind::AbsoluteSymbol); | ||
} | ||
} | ||
|
||
|
@@ -4136,8 +4146,8 @@ static bool isContiguous(const ConstantRange &A, const ConstantRange &B) { | |
|
||
/// Verify !range and !absolute_symbol metadata. These have the same | ||
/// restrictions, except !absolute_symbol allows the full set. | ||
void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range, | ||
Type *Ty, bool IsAbsoluteSymbol) { | ||
void Verifier::verifyRangeLikeMetadata(const Value &I, const MDNode *Range, | ||
Type *Ty, RangeLikeMetadataKind Kind) { | ||
unsigned NumOperands = Range->getNumOperands(); | ||
Check(NumOperands % 2 == 0, "Unfinished range!", Range); | ||
unsigned NumRanges = NumOperands / 2; | ||
|
@@ -4154,8 +4164,14 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range, | |
|
||
Check(High->getType() == Low->getType(), "Range pair types must match!", | ||
&I); | ||
Check(High->getType() == Ty->getScalarType(), | ||
"Range types must match instruction type!", &I); | ||
|
||
if (Kind == RangeLikeMetadataKind::NoaliasAddrspace) { | ||
Check(High->getType()->isIntegerTy(32), | ||
"noalias.addrspace type must be i32!", &I); | ||
} else { | ||
Check(High->getType() == Ty->getScalarType(), | ||
"Range types must match instruction type!", &I); | ||
} | ||
|
||
APInt HighV = High->getValue(); | ||
APInt LowV = Low->getValue(); | ||
|
@@ -4166,7 +4182,9 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range, | |
"The upper and lower limits cannot be the same value", &I); | ||
|
||
ConstantRange CurRange(LowV, HighV); | ||
Check(!CurRange.isEmptySet() && (IsAbsoluteSymbol || !CurRange.isFullSet()), | ||
Check(!CurRange.isEmptySet() && | ||
(Kind == RangeLikeMetadataKind::AbsoluteSymbol || | ||
!CurRange.isFullSet()), | ||
"Range must not be empty!", Range); | ||
if (i != 0) { | ||
Check(CurRange.intersectWith(LastRange).isEmptySet(), | ||
|
@@ -4194,7 +4212,15 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range, | |
void Verifier::visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty) { | ||
assert(Range && Range == I.getMetadata(LLVMContext::MD_range) && | ||
"precondition violation"); | ||
verifyRangeMetadata(I, Range, Ty, false); | ||
verifyRangeLikeMetadata(I, Range, Ty, RangeLikeMetadataKind::Range); | ||
} | ||
|
||
void Verifier::visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range, | ||
Type *Ty) { | ||
assert(Range && Range == I.getMetadata(LLVMContext::MD_noalias_addrspace) && | ||
"precondition violation"); | ||
verifyRangeLikeMetadata(I, Range, Ty, | ||
RangeLikeMetadataKind::NoaliasAddrspace); | ||
} | ||
|
||
void Verifier::checkAtomicMemAccessSize(Type *Ty, const Instruction *I) { | ||
|
@@ -5187,6 +5213,13 @@ void Verifier::visitInstruction(Instruction &I) { | |
visitRangeMetadata(I, Range, I.getType()); | ||
} | ||
|
||
if (MDNode *Range = I.getMetadata(LLVMContext::MD_noalias_addrspace)) { | ||
Check(isa<LoadInst>(I) || isa<StoreInst>(I) || isa<AtomicRMWInst>(I) || | ||
isa<AtomicCmpXchgInst>(I) || isa<CallInst>(I), | ||
"noalias.addrspace are only for memory operations!", &I); | ||
visitNoaliasAddrspaceMetadata(I, Range, I.getType()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we also add a check to the effect that if there is an operand in address space 'n', 'n' cannot be listed in the noalias.addrspace metadata for that instruction? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, because the rule only is for the address space of the underlying object. You may be accessing it through an addrspacecasted pointer. If we wanted something like that, it would be more appropriate to put in the Lint analysis. Undefined behavior that transforms can happen to introduce generally should not be IR verifier errors There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok. I can see that for say things like function or intrinsics, where they accept a pointer in address space 'n' but also assert that its underlying address space is not 'n' and access it internally using another address space (so the pointer is pure bit container to pass some data into the function or intrinsic). But for native LLVM insts, like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Address spaces may overlap, a pointer to one address space may point to an object defined in another address space. A There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just some thought: The spec language is: The In general, given a pointer in AS 'n', we can have set of address spaces that potentially alias with 'n', and then any memory access using AS Assuming address spaces are organized as a tree (or a DAG), where each node is an address space, and its children/directed neighbors are all address spaces that are contained within the parent, Aliases(P) is all address spaces k such that k is on a path from some root to some leaf that contains node (p). So, Aliases(p) capture all sub-address spaces within (p) as well as all address spaces that p is a sub-address space of. Note that we can have another definition where Aliases(P) is only its sub-address spaces looking down in this tree but not up (Descendents). So, whenever you access a pointer in AS p, you can potentially access any of the address spaces in Aliases(P). And then noalias.addrspace drops some address spaces from that. If we have native LoadInst whose pointer AS is P, and then noalias.addrespace, the set of legal address spaces its allowed to access is: Aliases(P) - noalias.addrspace If this set happens to be empty, then I'd think the IR is invalid (we have not left any address space for the load to do its memory access. Or maybe undefined)? And if Alias(P) includes parents as well as descendent address spaces, then load with pointer p and noalias.addresspace = p can be well defined, since Alias(P) - {p) will include say a parent address space, and the load is "required" to upcast the pointer from AS p to that parent address space (say generic) to do the access. If OTOH we define the set of legal address spaces to be: Descendents(P) - (union(Descendents(noalias.addrspace)) which I think is the more logical meaning, then load with pointer p and noalias.addresspace = p will always be undefined as the set of legal address spaces left will always be null. Essentially, I am defining legality by first defining the set of legal address spaces that a load can access, and then making a load/store legal only if that set is non-empty. Whether we use AscendentsAndDescendepts(p) or just Descendents(p) I guess depends on the intended semantics. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The intent was to only restrict the underlying object's address space, which is most obviously definable in terms of generic IR rules we already have. The IR has no generic rules for address space aliasing relationships. Any address space is assumed to potentially alias any other address space, and this is only relaxed by target AliasAnalysis extensions. An accessing LoadInst with an explicit noalias.addrspace forbidding the exact address space of the load has to be assumed valid, as we know nothing about what address spaces can alias others.
We can't make these situations invalid IR without requiring more target information in the IR, which I want to avoid. The assumption is the IR producer knows what it's doing for the target, and any address space value may have been produced through a valid chain of addrspacecasts. I've tried to reword the LangRef to be more explicit this only says anything about the original object's address space, not the access |
||
} | ||
|
||
if (I.hasMetadata(LLVMContext::MD_invariant_group)) { | ||
Check(isa<LoadInst>(I) || isa<StoreInst>(I), | ||
"invariant.group metadata is only for loads and stores", &I); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
; RUN: llvm-as < %s | llvm-dis | FileCheck %s | ||
|
||
define i64 @atomicrmw_noalias_addrspace__0_1(ptr %ptr, i64 %val) { | ||
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__0_1( | ||
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) { | ||
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META0:![0-9]+]] | ||
; CHECK-NEXT: ret i64 [[RET]] | ||
; | ||
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !0 | ||
ret i64 %ret | ||
} | ||
|
||
define i64 @atomicrmw_noalias_addrspace__0_2(ptr %ptr, i64 %val) { | ||
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__0_2( | ||
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) { | ||
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META1:![0-9]+]] | ||
; CHECK-NEXT: ret i64 [[RET]] | ||
; | ||
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !1 | ||
ret i64 %ret | ||
} | ||
|
||
define i64 @atomicrmw_noalias_addrspace__1_3(ptr %ptr, i64 %val) { | ||
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__1_3( | ||
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) { | ||
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META2:![0-9]+]] | ||
; CHECK-NEXT: ret i64 [[RET]] | ||
; | ||
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !2 | ||
ret i64 %ret | ||
} | ||
|
||
define i64 @atomicrmw_noalias_addrspace__multiple_ranges(ptr %ptr, i64 %val) { | ||
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__multiple_ranges( | ||
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) { | ||
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META3:![0-9]+]] | ||
; CHECK-NEXT: ret i64 [[RET]] | ||
; | ||
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !3 | ||
ret i64 %ret | ||
} | ||
|
||
define i64 @load_noalias_addrspace__5_6(ptr %ptr) { | ||
; CHECK-LABEL: define i64 @load_noalias_addrspace__5_6( | ||
; CHECK-SAME: ptr [[PTR:%.*]]) { | ||
; CHECK-NEXT: [[RET:%.*]] = load i64, ptr [[PTR]], align 4, !noalias.addrspace [[META4:![0-9]+]] | ||
; CHECK-NEXT: ret i64 [[RET]] | ||
; | ||
%ret = load i64, ptr %ptr, align 4, !noalias.addrspace !4 | ||
ret i64 %ret | ||
} | ||
|
||
define void @store_noalias_addrspace__5_6(ptr %ptr, i64 %val) { | ||
; CHECK-LABEL: define void @store_noalias_addrspace__5_6( | ||
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) { | ||
; CHECK-NEXT: store i64 [[VAL]], ptr [[PTR]], align 4, !noalias.addrspace [[META4]] | ||
; CHECK-NEXT: ret void | ||
; | ||
store i64 %val, ptr %ptr, align 4, !noalias.addrspace !4 | ||
ret void | ||
} | ||
|
||
define { i64, i1 } @cmpxchg_noalias_addrspace__5_6(ptr %ptr, i64 %val0, i64 %val1) { | ||
; CHECK-LABEL: define { i64, i1 } @cmpxchg_noalias_addrspace__5_6( | ||
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL0:%.*]], i64 [[VAL1:%.*]]) { | ||
; CHECK-NEXT: [[RET:%.*]] = cmpxchg ptr [[PTR]], i64 [[VAL0]], i64 [[VAL1]] monotonic monotonic, align 8, !noalias.addrspace [[META4]] | ||
; CHECK-NEXT: ret { i64, i1 } [[RET]] | ||
; | ||
%ret = cmpxchg ptr %ptr, i64 %val0, i64 %val1 monotonic monotonic, align 8, !noalias.addrspace !4 | ||
ret { i64, i1 } %ret | ||
} | ||
|
||
declare void @foo() | ||
|
||
define void @call_noalias_addrspace__5_6(ptr %ptr) { | ||
; CHECK-LABEL: define void @call_noalias_addrspace__5_6( | ||
; CHECK-SAME: ptr [[PTR:%.*]]) { | ||
; CHECK-NEXT: call void @foo(), !noalias.addrspace [[META4]] | ||
; CHECK-NEXT: ret void | ||
; | ||
call void @foo(), !noalias.addrspace !4 | ||
ret void | ||
} | ||
|
||
define void @call_memcpy_intrinsic_addrspace__5_6(ptr %dst, ptr %src, i64 %size) { | ||
; CHECK-LABEL: define void @call_memcpy_intrinsic_addrspace__5_6( | ||
; CHECK-SAME: ptr [[DST:%.*]], ptr [[SRC:%.*]], i64 [[SIZE:%.*]]) { | ||
; CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64(ptr [[DST]], ptr [[SRC]], i64 [[SIZE]], i1 false), !noalias.addrspace [[META4]] | ||
; CHECK-NEXT: ret void | ||
; | ||
call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %size, i1 false), !noalias.addrspace !4 | ||
ret void | ||
} | ||
|
||
declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) #0 | ||
|
||
attributes #0 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) } | ||
|
||
!0 = !{i32 0, i32 1} | ||
!1 = !{i32 0, i32 2} | ||
!2 = !{i32 1, i32 3} | ||
!3 = !{i32 4, i32 6, i32 10, i32 55} | ||
!4 = !{i32 5, i32 6} | ||
;. | ||
; CHECK: [[META0]] = !{i32 0, i32 1} | ||
; CHECK: [[META1]] = !{i32 0, i32 2} | ||
; CHECK: [[META2]] = !{i32 1, i32 3} | ||
; CHECK: [[META3]] = !{i32 4, i32 6, i32 10, i32 55} | ||
; CHECK: [[META4]] = !{i32 5, i32 6} | ||
;. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -316,10 +316,80 @@ out: | |
ret void | ||
} | ||
|
||
define void @hoist_noalias_addrspace_both(i1 %c, ptr %p, i64 %val) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I assume the various tests you added here are just to show conservatively correct behavior, and you plan to implement proper merging support (via combineMetadataForCSE and friends) in a followup PR? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes |
||
; CHECK-LABEL: @hoist_noalias_addrspace_both( | ||
; CHECK-NEXT: if: | ||
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8 | ||
; CHECK-NEXT: ret void | ||
; | ||
if: | ||
br i1 %c, label %then, label %else | ||
|
||
then: | ||
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4 | ||
br label %out | ||
|
||
else: | ||
%e = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4 | ||
br label %out | ||
|
||
out: | ||
ret void | ||
} | ||
|
||
define void @hoist_noalias_addrspace_one(i1 %c, ptr %p, i64 %val) { | ||
; CHECK-LABEL: @hoist_noalias_addrspace_one( | ||
; CHECK-NEXT: if: | ||
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8 | ||
; CHECK-NEXT: ret void | ||
; | ||
if: | ||
br i1 %c, label %then, label %else | ||
|
||
then: | ||
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4 | ||
br label %out | ||
|
||
else: | ||
%e = atomicrmw add ptr %p, i64 %val seq_cst | ||
br label %out | ||
|
||
out: | ||
ret void | ||
} | ||
|
||
define void @hoist_noalias_addrspace_switch(i64 %i, ptr %p, i64 %val) { | ||
; CHECK-LABEL: @hoist_noalias_addrspace_switch( | ||
; CHECK-NEXT: out: | ||
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8 | ||
; CHECK-NEXT: ret void | ||
; | ||
switch i64 %i, label %bb0 [ | ||
i64 1, label %bb1 | ||
i64 2, label %bb2 | ||
] | ||
bb0: | ||
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4 | ||
br label %out | ||
bb1: | ||
%e = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !5 | ||
br label %out | ||
bb2: | ||
%f = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !6 | ||
br label %out | ||
out: | ||
ret void | ||
} | ||
|
||
|
||
!0 = !{ i8 0, i8 1 } | ||
!1 = !{ i8 3, i8 5 } | ||
!2 = !{} | ||
!3 = !{ i8 7, i8 9 } | ||
!4 = !{i32 5, i32 6} | ||
!5 = !{i32 5, i32 7} | ||
!6 = !{i32 4, i32 8} | ||
|
||
;. | ||
; CHECK: [[RNG0]] = !{i8 0, i8 1, i8 3, i8 5} | ||
; CHECK: [[RNG1]] = !{i8 0, i8 1, i8 3, i8 5, i8 7, i8 9} | ||
|
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this say both 5 and 6 ?
The comment says:
Should maybe be
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, ranges are specified as an open range on the end. 5, 6 only excludes 5. { 5, 7 } would exclude 5 and 6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably make sense to add a comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this is not obvious, I am only now realising I messed that up when I asked my question to clarify another aspect of this, and it went unnoticed.
Are ranges of address spaces, in general, a meaningful thing? I would not have assumed that current address space numbering is intended to be meaningful, there is no presumption that two address spaces being consecutive means they are any more or less likely to make sense to combine than two non-consecutive address spaces. And would excluding address spaces 5 and 7, but not 6, then require using two
!noalias.addrspace
annotations on the same instruction, or is there a better way? Whatever the answer is, could you add an example of that?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only a way to add flexibility in the encoding for future use. There's no inherent meaning of consecutive address spaces.
In the current AMDGPU/PTX use cases, there's only one stack address space to consider. If in the future we wanted to reserve some range of address space values for software uses that may be allocated as stack, you could represent this without listing every possible value. Plus reusing all of the range parse and check code is nicer than needing to brute force search through a list of entries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification.
FWIW I think that encoding this as a range instead of a list is fine.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can think of use cases where the set of address spaces is non-contiguous and the range base representation can be clunky as each range will be of size 1 and any coalescing is coincidental. Would it be possible to support both list and ranges, for example:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this is syntactic sugar that can some later if desired.