Skip to content

[RFC] IR: Define noalias.addrspace metadata #102461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions llvm/docs/LangRef.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8047,6 +8047,43 @@ it will contain a list of ids, including the ids of the callsites in the
full inline sequence, in order from the leaf-most call's id to the outermost
inlined call.


'``noalias.addrspace``' Metadata
Copy link
Contributor

@gonzalobg gonzalobg Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this say both 5 and 6 ?

The comment says:

%ptr cannot point to an object allocated in addrspace(5)

Should maybe be

%ptr cannot point to an object allocated in addrspace(5) or addrspace(6)

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, ranges are specified as an open range on the end. 5, 6 only excludes 5. { 5, 7 } would exclude 5 and 6

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably make sense to add a comment here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is not obvious, I am only now realising I messed that up when I asked my question to clarify another aspect of this, and it went unnoticed.

Are ranges of address spaces, in general, a meaningful thing? I would not have assumed that current address space numbering is intended to be meaningful, there is no presumption that two address spaces being consecutive means they are any more or less likely to make sense to combine than two non-consecutive address spaces. And would excluding address spaces 5 and 7, but not 6, then require using two !noalias.addrspace annotations on the same instruction, or is there a better way? Whatever the answer is, could you add an example of that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only a way to add flexibility in the encoding for future use. There's no inherent meaning of consecutive address spaces.

In the current AMDGPU/PTX use cases, there's only one stack address space to consider. If in the future we wanted to reserve some range of address space values for software uses that may be allocated as stack, you could represent this without listing every possible value. Plus reusing all of the range parse and check code is nicer than needing to brute force search through a list of entries

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification.
FWIW I think that encoding this as a range instead of a list is fine.

Copy link
Contributor

@jurahul jurahul Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of use cases where the set of address spaces is non-contiguous and the range base representation can be clunky as each range will be of size 1 and any coalescing is coincidental. Would it be possible to support both list and ranges, for example:

!0 = !{!"range" i32 5, i32 6} // a list of ranges follows
!0 = !{!"list" i32 5} // a list of address spaces follows

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!0 = !{!"range" i32 5, i32 6} // a list of ranges follows

Though this is syntactic sugar that can some later if desired.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``noalias.addrspace`` metadata is used to identify memory
operations which cannot access objects allocated in a range of address
spaces. It is attached to memory instructions, including
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me from this what the intended behaviour is when this metadata is used to specify an object is not in the generic address space. Does that effectively mean this code must be unreachable, or does that say nothing because the concrete address space any object is in will not be the generic address space?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure how to parse this question.

when this metadata is used to specify an object is not in the generic address space

Do you mean, in the amdgpu case, !{i32 0, i32 1}?

because the concrete address space any object is in will not be the generic address space?

We avoid creating objects in the generic address space (except functions), but it shouldn't really matter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, in the amdgpu case, !{i32 0, i32 1}?

Suppose you just have !noalias.addrspace !0 with !0 = !{i32 1}.

We avoid creating objects in the generic address space (except functions), but it shouldn't really matter?

If you have an object in the global address space 0, which is also addressable using the generic address space 1 (please correct me if I am getting the AMDGPU address spaces wrong), would the above !noalias.addrspace specify that it can or that it cannot alias that object? I cannot see a good reason why a frontend would generate that directly, but I can imagine future additions where a frontend would generate that based on user-specified address spaces in source code, so I think it would be good to have that answered upfront.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have an object in the global address space 0, which is also addressable using the generic address space 1

These are backwards, 0 is generic and 1 is global.

To simplify your question with an example, I would interpret

@globalvar.in.0 = addrspace(0) constant i64

@globalvar.in.1 = addrspace(1) constant i64

; instant undefined behavior, can fold to trap
%rmw.ub = atomicrmw and ptr @globalvar.in.0, i64 %value seq_cst, !noalias.addrspace !0

; OK, underlying object is in addrspace(1)
%rmw.ok0 = atomicrmw and ptr addrspace(1) @globalvar.in.1, i64 %value seq_cst, !noalias.addrspace !0

; OK, underlying object is in addrspace(1) even though access is through forbidden addrspace(0) the object ; definition address space is what matters 
%cast.gv.in.1 = addrspacecast ptr addrspace(1) @globalvar.in.1 to ptr
%rmw.ok0 = atomicrmw and ptr %cast.gv.in.1, i64 %value seq_cst, !noalias.addrspace !0

!0 = !{i32 0, i32 1} ; cannot alias 0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify your question with an example, I would interpret

@globalvar.in.0 = addrspace(0) constant i64

@globalvar.in.1 = addrspace(1) constant i64

; instant undefined behavior, can fold to trap
%rmw.ub = atomicrmw and ptr @globalvar.in.0, i64 %value seq_cst, !noalias.addrspace !0

; OK, underlying object is in addrspace(1)
%rmw.ok0 = atomicrmw and ptr addrspace(1) @globalvar.in.1, i64 %value seq_cst, !noalias.addrspace !0

; OK, underlying object is in addrspace(1) even though access is through forbidden addrspace(0) the object ; definition address space is what matters 
%cast.gv.in.1 = addrspacecast ptr addrspace(1) @globalvar.in.1 to ptr
%rmw.ok0 = atomicrmw and ptr %cast.gv.in.1, i64 %value seq_cst, !noalias.addrspace !0

!0 = !{i32 0, i32 1} ; cannot alias 0

This example is confusing for me. Since addr space 0 is a union of all concrete addr spaces including 1, 2, 3, 5, etc, an actual global pointer with MD promising not alias to 0 is self-contradictory.

If in this example the MD is promising not alias to 5 then it has no confusion.

Then it comes the question: does it really meaningful to have a MD promising not alias to generic addr space 0? Should that be disallowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IR does not have a concept of generic address spaces, and this description is not written in terms of a generic address space. It merely states the address space of the underlying object allocation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. got it

:ref:`atomicrmw <i_atomicrmw>`, :ref:`cmpxchg <i_cmpxchg>`, and
:ref:`call <i_call>` instructions.

This follows the same form as :ref:`range metadata <range-metadata>`,
except the field entries must be of type `i32`. The interpretation is
the same numeric address spaces as applied to IR values.

Example:

.. code-block:: llvm

; %ptr cannot point to an object allocated in addrspace(5)
%rmw.valid = atomicrmw and ptr %ptr, i64 %value seq_cst, !noalias.addrspace !0

; Undefined behavior. The underlying object is allocated in one of the listed
; address spaces.
%alloca = alloca i64, addrspace(5)
%alloca.cast = addrspacecast ptr addrspace(5) %alloca to ptr
%rmw.ub = atomicrmw and ptr %alloca.cast, i64 %value seq_cst, !noalias.addrspace !0

!0 = !{i32 5, i32 6} ; Exclude addrspace(5) only


This is intended for use on targets with a notion of generic address
spaces, which at runtime resolve to different physical memory
spaces. The interpretation of the address space values is target
specific. The behavior is undefined if the runtime memory address does
resolve to an object defined in one of the indicated address spaces.


Module Flags Metadata
=====================

Expand Down
2 changes: 2 additions & 0 deletions llvm/docs/ReleaseNotes.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,8 @@ Changes to the LLVM IR

* Added `usub_cond` and `usub_sat` operations to `atomicrmw`.

* Introduced `noalias.addrspace` metadata.

* Remove the following intrinsics which can be replaced with a `bitcast`:

* `llvm.nvvm.bitcast.f2i`
Expand Down
1 change: 1 addition & 0 deletions llvm/include/llvm/IR/FixedMetadataKinds.def
Original file line number Diff line number Diff line change
Expand Up @@ -52,3 +52,4 @@ LLVM_FIXED_MD_KIND(MD_pcsections, "pcsections", 37)
LLVM_FIXED_MD_KIND(MD_DIAssignID, "DIAssignID", 38)
LLVM_FIXED_MD_KIND(MD_coro_outside_frame, "coro.outside.frame", 39)
LLVM_FIXED_MD_KIND(MD_mmra, "mmra", 40)
LLVM_FIXED_MD_KIND(MD_noalias_addrspace, "noalias.addrspace", 41)
53 changes: 43 additions & 10 deletions llvm/lib/IR/Verifier.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -492,6 +492,14 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport {
/// Whether a metadata node is allowed to be, or contain, a DILocation.
enum class AreDebugLocsAllowed { No, Yes };

/// Metadata that should be treated as a range, with slightly different
/// requirements.
enum class RangeLikeMetadataKind {
Range, // MD_range
AbsoluteSymbol, // MD_absolute_symbol
NoaliasAddrspace // MD_noalias_addrspace
};

// Verification methods...
void visitGlobalValue(const GlobalValue &GV);
void visitGlobalVariable(const GlobalVariable &GV);
Expand All @@ -515,9 +523,10 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport {
void visitModuleFlagCGProfileEntry(const MDOperand &MDO);
void visitFunction(const Function &F);
void visitBasicBlock(BasicBlock &BB);
void verifyRangeMetadata(const Value &V, const MDNode *Range, Type *Ty,
bool IsAbsoluteSymbol);
void verifyRangeLikeMetadata(const Value &V, const MDNode *Range, Type *Ty,
RangeLikeMetadataKind Kind);
void visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty);
void visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range, Type *Ty);
void visitDereferenceableMetadata(Instruction &I, MDNode *MD);
void visitProfMetadata(Instruction &I, MDNode *MD);
void visitCallStackMetadata(MDNode *MD);
Expand Down Expand Up @@ -760,8 +769,9 @@ void Verifier::visitGlobalValue(const GlobalValue &GV) {
// FIXME: Why is getMetadata on GlobalValue protected?
if (const MDNode *AbsoluteSymbol =
GO->getMetadata(LLVMContext::MD_absolute_symbol)) {
verifyRangeMetadata(*GO, AbsoluteSymbol, DL.getIntPtrType(GO->getType()),
true);
verifyRangeLikeMetadata(*GO, AbsoluteSymbol,
DL.getIntPtrType(GO->getType()),
RangeLikeMetadataKind::AbsoluteSymbol);
}
}

Expand Down Expand Up @@ -4136,8 +4146,8 @@ static bool isContiguous(const ConstantRange &A, const ConstantRange &B) {

/// Verify !range and !absolute_symbol metadata. These have the same
/// restrictions, except !absolute_symbol allows the full set.
void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,
Type *Ty, bool IsAbsoluteSymbol) {
void Verifier::verifyRangeLikeMetadata(const Value &I, const MDNode *Range,
Type *Ty, RangeLikeMetadataKind Kind) {
unsigned NumOperands = Range->getNumOperands();
Check(NumOperands % 2 == 0, "Unfinished range!", Range);
unsigned NumRanges = NumOperands / 2;
Expand All @@ -4154,8 +4164,14 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,

Check(High->getType() == Low->getType(), "Range pair types must match!",
&I);
Check(High->getType() == Ty->getScalarType(),
"Range types must match instruction type!", &I);

if (Kind == RangeLikeMetadataKind::NoaliasAddrspace) {
Check(High->getType()->isIntegerTy(32),
"noalias.addrspace type must be i32!", &I);
} else {
Check(High->getType() == Ty->getScalarType(),
"Range types must match instruction type!", &I);
}

APInt HighV = High->getValue();
APInt LowV = Low->getValue();
Expand All @@ -4166,7 +4182,9 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,
"The upper and lower limits cannot be the same value", &I);

ConstantRange CurRange(LowV, HighV);
Check(!CurRange.isEmptySet() && (IsAbsoluteSymbol || !CurRange.isFullSet()),
Check(!CurRange.isEmptySet() &&
(Kind == RangeLikeMetadataKind::AbsoluteSymbol ||
!CurRange.isFullSet()),
"Range must not be empty!", Range);
if (i != 0) {
Check(CurRange.intersectWith(LastRange).isEmptySet(),
Expand Down Expand Up @@ -4194,7 +4212,15 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,
void Verifier::visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty) {
assert(Range && Range == I.getMetadata(LLVMContext::MD_range) &&
"precondition violation");
verifyRangeMetadata(I, Range, Ty, false);
verifyRangeLikeMetadata(I, Range, Ty, RangeLikeMetadataKind::Range);
}

void Verifier::visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range,
Type *Ty) {
assert(Range && Range == I.getMetadata(LLVMContext::MD_noalias_addrspace) &&
"precondition violation");
verifyRangeLikeMetadata(I, Range, Ty,
RangeLikeMetadataKind::NoaliasAddrspace);
}

void Verifier::checkAtomicMemAccessSize(Type *Ty, const Instruction *I) {
Expand Down Expand Up @@ -5187,6 +5213,13 @@ void Verifier::visitInstruction(Instruction &I) {
visitRangeMetadata(I, Range, I.getType());
}

if (MDNode *Range = I.getMetadata(LLVMContext::MD_noalias_addrspace)) {
Check(isa<LoadInst>(I) || isa<StoreInst>(I) || isa<AtomicRMWInst>(I) ||
isa<AtomicCmpXchgInst>(I) || isa<CallInst>(I),
"noalias.addrspace are only for memory operations!", &I);
visitNoaliasAddrspaceMetadata(I, Range, I.getType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add a check to the effect that if there is an operand in address space 'n', 'n' cannot be listed in the noalias.addrspace metadata for that instruction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because the rule only is for the address space of the underlying object. You may be accessing it through an addrspacecasted pointer. If we wanted something like that, it would be more appropriate to put in the Lint analysis. Undefined behavior that transforms can happen to introduce generally should not be IR verifier errors

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I can see that for say things like function or intrinsics, where they accept a pointer in address space 'n' but also assert that its underlying address space is not 'n' and access it internally using another address space (so the pointer is pure bit container to pass some data into the function or intrinsic). But for native LLVM insts, like LoadInst when would it make sense to have its pointer operand in addspace(n) and also have the noalias.addrspace metadata listing n? I guess we are calling it undefined behavior but still valid IR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address spaces may overlap, a pointer to one address space may point to an object defined in another address space. A load instruction that acts on a pointer to one address space, and asserts that the pointer does not point to an object defined in that address space, could presumably still have well-defined behaviour if the pointed-to object is defined in another address space but is accessible through multiple address spaces.

Copy link
Contributor

@jurahul jurahul Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some thought:

The spec language is: The noalias.addrspace metadata is used to identify memory
operations which cannot access a range of address spaces
.

In general, given a pointer in AS 'n', we can have set of address spaces that potentially alias with 'n', and then any memory access using AS n pointer is a potential access to any of the aliasing address spaces. And then this metadata is a guarantee that some of those aliasing address spaces can be dropped from the set, right? If Aliases(P) is all address spaces that can potentially alias with P (including P itself), then a LoadInst for example would be invalid if Aliases(P) - noalias.addrspace is an empty set, i.e., no address space is left to define the access?

Assuming address spaces are organized as a tree (or a DAG), where each node is an address space, and its children/directed neighbors are all address spaces that are contained within the parent, Aliases(P) is all address spaces k such that k is on a path from some root to some leaf that contains node (p). So, Aliases(p) capture all sub-address spaces within (p) as well as all address spaces that p is a sub-address space of. Note that we can have another definition where Aliases(P) is only its sub-address spaces looking down in this tree but not up (Descendents).

So, whenever you access a pointer in AS p, you can potentially access any of the address spaces in Aliases(P). And then noalias.addrspace drops some address spaces from that. If we have native LoadInst whose pointer AS is P, and then noalias.addrespace, the set of legal address spaces its allowed to access is:

Aliases(P) - noalias.addrspace

If this set happens to be empty, then I'd think the IR is invalid (we have not left any address space for the load to do its memory access. Or maybe undefined)?

And if Alias(P) includes parents as well as descendent address spaces, then

load with pointer p and noalias.addresspace = p can be well defined, since Alias(P) - {p) will include say a parent address space, and the load is "required" to upcast the pointer from AS p to that parent address space (say generic) to do the access.

If OTOH we define the set of legal address spaces to be:

Descendents(P) - (union(Descendents(noalias.addrspace))

which I think is the more logical meaning, then load with pointer p and noalias.addresspace = p will always be undefined as the set of legal address spaces left will always be null.

Essentially, I am defining legality by first defining the set of legal address spaces that a load can access, and then making a load/store legal only if that set is non-empty. Whether we use AscendentsAndDescendepts(p) or just Descendents(p) I guess depends on the intended semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent was to only restrict the underlying object's address space, which is most obviously definable in terms of generic IR rules we already have.

The IR has no generic rules for address space aliasing relationships. Any address space is assumed to potentially alias any other address space, and this is only relaxed by target AliasAnalysis extensions. An accessing LoadInst with an explicit noalias.addrspace forbidding the exact address space of the load has to be assumed valid, as we know nothing about what address spaces can alias others.

If this set happens to be empty, then I'd think the IR is invalid (we have not left any address space for the load to do its memory access. Or maybe undefined)?

We can't make these situations invalid IR without requiring more target information in the IR, which I want to avoid. The assumption is the IR producer knows what it's doing for the target, and any address space value may have been produced through a valid chain of addrspacecasts.

I've tried to reword the LangRef to be more explicit this only says anything about the original object's address space, not the access

}

if (I.hasMetadata(LLVMContext::MD_invariant_group)) {
Check(isa<LoadInst>(I) || isa<StoreInst>(I),
"invariant.group metadata is only for loads and stores", &I);
Expand Down
110 changes: 110 additions & 0 deletions llvm/test/Assembler/noalias-addrspace-md.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
; RUN: llvm-as < %s | llvm-dis | FileCheck %s

define i64 @atomicrmw_noalias_addrspace__0_1(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__0_1(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META0:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !0
ret i64 %ret
}

define i64 @atomicrmw_noalias_addrspace__0_2(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__0_2(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META1:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !1
ret i64 %ret
}

define i64 @atomicrmw_noalias_addrspace__1_3(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__1_3(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META2:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !2
ret i64 %ret
}

define i64 @atomicrmw_noalias_addrspace__multiple_ranges(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__multiple_ranges(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META3:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !3
ret i64 %ret
}

define i64 @load_noalias_addrspace__5_6(ptr %ptr) {
; CHECK-LABEL: define i64 @load_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = load i64, ptr [[PTR]], align 4, !noalias.addrspace [[META4:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = load i64, ptr %ptr, align 4, !noalias.addrspace !4
ret i64 %ret
}

define void @store_noalias_addrspace__5_6(ptr %ptr, i64 %val) {
; CHECK-LABEL: define void @store_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: store i64 [[VAL]], ptr [[PTR]], align 4, !noalias.addrspace [[META4]]
; CHECK-NEXT: ret void
;
store i64 %val, ptr %ptr, align 4, !noalias.addrspace !4
ret void
}

define { i64, i1 } @cmpxchg_noalias_addrspace__5_6(ptr %ptr, i64 %val0, i64 %val1) {
; CHECK-LABEL: define { i64, i1 } @cmpxchg_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL0:%.*]], i64 [[VAL1:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = cmpxchg ptr [[PTR]], i64 [[VAL0]], i64 [[VAL1]] monotonic monotonic, align 8, !noalias.addrspace [[META4]]
; CHECK-NEXT: ret { i64, i1 } [[RET]]
;
%ret = cmpxchg ptr %ptr, i64 %val0, i64 %val1 monotonic monotonic, align 8, !noalias.addrspace !4
ret { i64, i1 } %ret
}

declare void @foo()

define void @call_noalias_addrspace__5_6(ptr %ptr) {
; CHECK-LABEL: define void @call_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]]) {
; CHECK-NEXT: call void @foo(), !noalias.addrspace [[META4]]
; CHECK-NEXT: ret void
;
call void @foo(), !noalias.addrspace !4
ret void
}

define void @call_memcpy_intrinsic_addrspace__5_6(ptr %dst, ptr %src, i64 %size) {
; CHECK-LABEL: define void @call_memcpy_intrinsic_addrspace__5_6(
; CHECK-SAME: ptr [[DST:%.*]], ptr [[SRC:%.*]], i64 [[SIZE:%.*]]) {
; CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64(ptr [[DST]], ptr [[SRC]], i64 [[SIZE]], i1 false), !noalias.addrspace [[META4]]
; CHECK-NEXT: ret void
;
call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %size, i1 false), !noalias.addrspace !4
ret void
}

declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) #0

attributes #0 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }

!0 = !{i32 0, i32 1}
!1 = !{i32 0, i32 2}
!2 = !{i32 1, i32 3}
!3 = !{i32 4, i32 6, i32 10, i32 55}
!4 = !{i32 5, i32 6}
;.
; CHECK: [[META0]] = !{i32 0, i32 1}
; CHECK: [[META1]] = !{i32 0, i32 2}
; CHECK: [[META2]] = !{i32 1, i32 3}
; CHECK: [[META3]] = !{i32 4, i32 6, i32 10, i32 55}
; CHECK: [[META4]] = !{i32 5, i32 6}
;.
14 changes: 14 additions & 0 deletions llvm/test/Transforms/InstCombine/loadstore-metadata.ll
Original file line number Diff line number Diff line change
Expand Up @@ -173,6 +173,19 @@ define i32 @test_load_cast_combine_noundef(ptr %ptr) {
ret i32 %c
}

define i32 @test_load_cast_combine_noalias_addrspace(ptr %ptr) {
; Ensure (cast (load (...))) -> (load (cast (...))) preserves TBAA.
; CHECK-LABEL: @test_load_cast_combine_noalias_addrspace(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[L1:%.*]] = load i32, ptr [[PTR:%.*]], align 4
; CHECK-NEXT: ret i32 [[L1]]
;
entry:
%l = load float, ptr %ptr, align 4, !noalias.addrspace !11
%c = bitcast float %l to i32
ret i32 %c
}

!0 = !{!1, !1, i64 0}
!1 = !{!"scalar type", !2}
!2 = !{!"root"}
Expand All @@ -184,3 +197,4 @@ define i32 @test_load_cast_combine_noundef(ptr %ptr) {
!8 = !{i32 1}
!9 = !{i64 8}
!10 = distinct !{}
!11 = !{i32 5, i32 6}
70 changes: 70 additions & 0 deletions llvm/test/Transforms/SimplifyCFG/hoist-with-metadata.ll
Original file line number Diff line number Diff line change
Expand Up @@ -316,10 +316,80 @@ out:
ret void
}

define void @hoist_noalias_addrspace_both(i1 %c, ptr %p, i64 %val) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the various tests you added here are just to show conservatively correct behavior, and you plan to implement proper merging support (via combineMetadataForCSE and friends) in a followup PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

; CHECK-LABEL: @hoist_noalias_addrspace_both(
; CHECK-NEXT: if:
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8
; CHECK-NEXT: ret void
;
if:
br i1 %c, label %then, label %else

then:
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4
br label %out

else:
%e = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4
br label %out

out:
ret void
}

define void @hoist_noalias_addrspace_one(i1 %c, ptr %p, i64 %val) {
; CHECK-LABEL: @hoist_noalias_addrspace_one(
; CHECK-NEXT: if:
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8
; CHECK-NEXT: ret void
;
if:
br i1 %c, label %then, label %else

then:
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4
br label %out

else:
%e = atomicrmw add ptr %p, i64 %val seq_cst
br label %out

out:
ret void
}

define void @hoist_noalias_addrspace_switch(i64 %i, ptr %p, i64 %val) {
; CHECK-LABEL: @hoist_noalias_addrspace_switch(
; CHECK-NEXT: out:
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8
; CHECK-NEXT: ret void
;
switch i64 %i, label %bb0 [
i64 1, label %bb1
i64 2, label %bb2
]
bb0:
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !4
br label %out
bb1:
%e = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !5
br label %out
bb2:
%f = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !6
br label %out
out:
ret void
}


!0 = !{ i8 0, i8 1 }
!1 = !{ i8 3, i8 5 }
!2 = !{}
!3 = !{ i8 7, i8 9 }
!4 = !{i32 5, i32 6}
!5 = !{i32 5, i32 7}
!6 = !{i32 4, i32 8}

;.
; CHECK: [[RNG0]] = !{i8 0, i8 1, i8 3, i8 5}
; CHECK: [[RNG1]] = !{i8 0, i8 1, i8 3, i8 5, i8 7, i8 9}
Expand Down
Loading
Loading