Skip to content

[clang][CodeGen] Add query for a target's flat address space #95728

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

AlexVlx
Copy link
Contributor

@AlexVlx AlexVlx commented Jun 17, 2024

Often, targets which are not address space agnostic expose a flat/generic address space, which acts as a shared, legal target for address space casts. Whilst today we accidentally (e.g. by using PointerType::getUnqual) treat 0 as corresponding to this flat address space, there is no binding requirement placed on targets in this regard, which leads to issues such as those reflected in #93601 and #93914. This patch adds a getFlatPtrAddressSpace() interface in TargetInfo, allowing targets to inform the front-end. A possible alternative name would be getGenericPtrAddressSpace(), but that was not chosen since generic has a fairly specific meaning in C++ and it seemed somewhat confusing in this context.

The interface is not used anywhere at the moment, but the intention is to employ it, for example, to specify the pointer type for the llvm.used array's elements. Overrides are provided for the SPIR-V and AMDGPU targets, which do specify a flat/generic address space.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jun 17, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 17, 2024

@llvm/pr-subscribers-backend-spir-v
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: Alex Voicu (AlexVlx)

Changes

Often, targets which are not address space agnostic expose a flat/generic address space, which acts as a shared, legal target for address space casts. Whilst today we accidentally (e.g. by using PointerType::getUnqual) treat 0 as corresponding to this flat address space, there is no binding requirement placed on targets in this regard, which leads to issues such as those reflected in #93601 and #93914. This patch adds a getFlatPtrAddressSpace() interface in TargetInfo, allowing targets to inform the front-end. A possible alternative name would be getGenericPtrAddressSpace(), but that was not chosen since generic has a fairly specific meaning in C++ and it seemed somewhat confusing in this context.

The interface is not used anywhere at the moment, but the intention is to employ it, for example, to specify the pointer type for the llvm.used array's elements.


Full diff: https://github.com/llvm/llvm-project/pull/95728.diff

3 Files Affected:

  • (modified) clang/include/clang/Basic/TargetInfo.h (+7)
  • (modified) clang/lib/Basic/Targets/AMDGPU.h (+6)
  • (modified) clang/lib/Basic/Targets/SPIR.h (+4)
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index 8a6511b9ced83..8841ec5f910d9 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1764,6 +1764,13 @@ class TargetInfo : public TransferrableTargetInfo,
     return 0;
   }
 
+  /// \returns Target specific flat ptr address space; a flat ptr is a ptr that
+  /// can be casted to / from all other target address spaces. If the target
+  /// exposes no such address space / does not care, we return 0.
+  virtual unsigned getFlatPtrAddressSpace() const {
+    return 0;
+  }
+
   /// \returns If a target requires an address within a target specific address
   /// space \p AddressSpace to be converted in order to be used, then return the
   /// corresponding target specific DWARF address space.
diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index 94d9ba93ed226..d06c7d58fe94c 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -379,6 +379,12 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
     return static_cast<unsigned>(llvm::AMDGPUAS::CONSTANT_ADDRESS);
   }
 
+  /// \returns Target specific flat ptr address space; a flat ptr is a ptr that
+  /// can be casted to / from all other target address spaces.
+  unsigned getFlatPtrAddressSpace() const override {
+    return static_cast<unsigned>(llvm::AMDGPUAS::FLAT_ADDRESS);
+  }
+
   /// \returns If a target requires an address within a target specific address
   /// space \p AddressSpace to be converted in order to be used, then return the
   /// corresponding target specific DWARF address space.
diff --git a/clang/lib/Basic/Targets/SPIR.h b/clang/lib/Basic/Targets/SPIR.h
index 37cf9d7921bac..14d235bace960 100644
--- a/clang/lib/Basic/Targets/SPIR.h
+++ b/clang/lib/Basic/Targets/SPIR.h
@@ -182,6 +182,10 @@ class LLVM_LIBRARY_VISIBILITY BaseSPIRTargetInfo : public TargetInfo {
     return TargetInfo::VoidPtrBuiltinVaList;
   }
 
+  unsigned getFlatPtrAddressSpace() const override {
+    return 4u; // 4 is generic i.e. flat for SPIR & SPIR-V.
+  }
+
   std::optional<unsigned>
   getDWARFAddressSpace(unsigned AddressSpace) const override {
     return AddressSpace;

@AlexVlx AlexVlx added clang:codegen IR generation bugs: mangling, exceptions, etc. backend:SPIR-V and removed clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jun 17, 2024
Copy link

github-actions bot commented Jun 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jun 17, 2024
@@ -1764,6 +1764,13 @@ class TargetInfo : public TransferrableTargetInfo,
return 0;
}

/// \returns Target specific flat ptr address space; a flat ptr is a ptr that
/// can be casted to / from all other target address spaces. If the target
/// exposes no such address space / does not care, we return 0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This further spreads the notion that 0 is a lack of address space, which it is not. You shouldn't need a new thing for this; it should be equivalent to querying the target address space map for the LangAS Generic thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no such thing as LangAS::Generic, there's LangAS::Default. Default doesn't have to map to something that's actually useful, and it actually does not. There's no actual mechanism to get a flat pointer. And I have no idea what notion spreading you are talking about - this merely maintains the status quo, wherein everybody grabs an AS 0 pointer and hopes for the best. So I have no idea what you are talking about, if I'm honest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's no such thing as LangAS::Generic, there's LangAS::Default.

Right, so query the target address space map for that.

The IR does not have the concept of a generic, flat, or no address space. 0 is the "default address space", which has its own specific properties.

this merely maintains the status quo, wherein everybody grabs an AS 0 pointer and hopes for the best.

For the most part this isn't true, at least in llvm proper.

From the comment here, I'm not even sure what this is returning (either an IR address space or a clang address space)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some targets won't even have a flat address space. But that's fine, because in these specific uses the address space's underlying range/representation/meaning/whatever is completely irrelevant, all that matters is the GlobalObject (GlobalValue?) you get once you strip away an addrspacecasts. So I object to this on several grounds:

  1. A flat address space may not exist
  2. These uses don't care about the address space beyond being consistent across files
  3. The existence of this API invites people to start using it for other things, and relying on properties of it that you've documented as being true which I don't think we want to be guaranteeing

If you really object to the arbitrary address space in use, I would rather see the special arrays be made overloaded/polymorphic instead so that the addrspacecasts can be avoided entirely .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some targets won't even have a flat address space. But that's fine, because in these specific uses the address space's underlying range/representation/meaning/whatever is completely irrelevant, all that matters is the GlobalObject (GlobalValue?) you get once you strip away an addrspacecasts. So I object to this on several grounds:

  1. A flat address space may not exist
  2. These uses don't care about the address space beyond being consistent across files
  3. The existence of this API invites people to start using it for other things, and relying on properties of it that you've documented as being true which I don't think we want to be guaranteeing

If you really object to the arbitrary address space in use, I would rather see the special arrays be made overloaded/polymorphic instead so that the addrspacecasts can be avoided entirely .

@jrtc27 this is not purely bespoke for the special arrays (sadly). The core issue is the prevalent, unguarded/unassuming use of AS 0. Since targets are not really bound to do anything special with 0 (there are no words anywhere saying 0 has any specific properties), problematic / invalid situations arise. This leads to a perpetual game of whack-a-mole as e.g. one brings up a more general purpose language, which is not quite ideal.

Strictly regarding the special purpose arrays, I object to the arbitrary address space because it actually breaks targets as things stand today, for example: SPIR-V uses 0 for private, if we just use 0 we end up trying to generate invalid SPIR-V that casts from global (CrossWorkgroup) to private (Function) when populating them. Yes we can then go and add special handling just for the special arrays etc. in the SPIR-V BE/the downstream Translator, but that seems less than ideal to me.

I'd like to address (1) and (3) (I did worry about those, but we have an interface that returns a vtbl ptr address space, and it was not clear why 0, which on some targets points to a limited visibility/capacity/private/unambiguously non-global private AS, would be valid for a vtable) - instead of unconditionally returning an integer, we could return an optional, with the default being to return std::nullopt. This'd constrain to uses to folks who know what they want, and why they want it, as opposed to trivial drop-in replacement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's no such thing as LangAS::Generic, there's LangAS::Default.

Right, so query the target address space map for that.

Ok, I queried LangAS::Default and I got this because of this. What's next?

The IR does not have the concept of a generic, flat, or no address space. 0 is the "default address space", which has its own specific properties.

What are these specific properties and where are they documented? Also, 0 doesn't quite look like the default address space, at least not de jure, see above; it's not even the default address space for all uses of AMDGPU.

this merely maintains the status quo, wherein everybody grabs an AS 0 pointer and hopes for the best.

For the most part this isn't true, at least in llvm proper.

Sure, but this is in Clang, which pretty liberally uses things like getUnqualPtrTy and hopes it will work. Elsewhere you are essentially arguing for exactly the same thing, namely the FE should just use 0 unconditionally. I am saying that is wrong in general, and the next best thing is to query the target if it happens to have any opinions regarding what the AS one should use everywhere is. Just like we're already asking if it has any opinions about something far more niche, namely vtable ptrs), and then revert to the status quo which is use 0. In Clang. Because, otherwise, your request for the FE to "just do the right thing" ends up unsolvable.

Copy link
Contributor Author

@AlexVlx AlexVlx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrtc27 @arsenm any additional comments? Are things more palatable in this form? Should this be turned into an RFC? Thanks.

@efriedma-quic
Copy link
Collaborator

I mean, this is conceptually fine as far as it goes, but I'm not sure where we actually want to use it. As noted on the other PRs, llvm.used/llvm.global_ctors/llvm.global_dtors don't really want a flat address-space. The "address-space" for llvm.used, and for the "associated global" for global_ctors/global_dtors, is not a real address-space; it's just there because the IR representation forces the use of some address-space.

For most places using address-space "0", there's some appropriate address-space, it's just nobody has spent any time thinking about what that address-space is, and everything defaulted to "0" when address-spaces were introduced into LLVM.

@arsenm
Copy link
Contributor

arsenm commented Jun 28, 2024

I still think we should not need this. DefaultIsPrivate is junk that needs to be deleted. Querying for LangAS::Default should always give the answer 0 for AMDGPU, which is what this is working around.

This clang notion of address space has nothing to do with your troubles with llvm.used in the IR or SPIRV

@yxsamliu
Copy link
Collaborator

yxsamliu commented Jul 2, 2024

I still think we should not need this. DefaultIsPrivate is junk that needs to be deleted. Querying for LangAS::Default should always give the answer 0 for AMDGPU, which is what this is working around.

This clang notion of address space has nothing to do with your troubles with llvm.used in the IR or SPIRV

LangAS::Default is not just determined by target. It also depends on language. For OpenCL 1.2 it is private. For example, the argument of void foo(int*) by language spec points to private addr space, and it translates to addr space 5 instead of 0 in IR. (https://godbolt.org/z/E71E3Wb5e). Due to this, it is hard to let LangAS::Default maps to 0 for amdgpu target for all languages.

@arsenm
Copy link
Contributor

arsenm commented Jul 2, 2024

LangAS::Default is not just determined by target. It also depends on language. For OpenCL 1.2 it is private.

I would have hoped this would be implemented by default assuming a hidden private addrspace qualifier

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU backend:SPIR-V clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants