Skip to content

[DataLayout] Refactor storage of non-integral address spaces #105734

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

arichardson
Copy link
Member

Instead of storing this as a separate array of non-integral pointers,
add it to the PointerSpec class instead. This will allow for future
simplifications such as splitting the non-integral property into
multiple distinct ones: relocatable (i.e. non-stable representation) and
non-integral representation (i.e. pointers with metadata).

Created using spr 1.3.6-beta.1
@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-llvm-ir

Author: Alexander Richardson (arichardson)

Changes

Instead of storing this as a separate array of non-integral pointers,
add it to the PointerSpec class instead. This will allow for future
simplifications such as splitting the non-integral property into
multiple distinct ones: relocatable (i.e. non-stable representation) and
non-integral representation (i.e. pointers with metadata).


Full diff: https://github.com/llvm/llvm-project/pull/105734.diff

2 Files Affected:

  • (modified) llvm/include/llvm/IR/DataLayout.h (+17-11)
  • (modified) llvm/lib/IR/DataLayout.cpp (+21-9)
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 2f06bda6c30a51..953deb6653cc9a 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -78,7 +78,11 @@ class DataLayout {
     Align ABIAlign;
     Align PrefAlign;
     uint32_t IndexBitWidth;
-
+    /// Pointers in this address space don't have a well-defined bitwise
+    /// representation (e.g. may be relocated by a copying garbage collector).
+    /// Additionally, they may also be non-integral (i.e. containing additional
+    /// metadata such as bounds information/permissions).
+    bool IsNonIntegral = false;
     bool operator==(const PointerSpec &Other) const;
   };
 
@@ -133,10 +137,6 @@ class DataLayout {
   // The StructType -> StructLayout map.
   mutable void *LayoutMap = nullptr;
 
-  /// Pointers in these address spaces are non-integral, and don't have a
-  /// well-defined bitwise representation.
-  SmallVector<unsigned, 8> NonIntegralAddressSpaces;
-
   /// Sets or updates the specification for the given primitive type.
   void setPrimitiveSpec(char Specifier, uint32_t BitWidth, Align ABIAlign,
                         Align PrefAlign);
@@ -147,7 +147,8 @@ class DataLayout {
 
   /// Sets or updates the specification for pointer in the given address space.
   void setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth, Align ABIAlign,
-                      Align PrefAlign, uint32_t IndexBitWidth);
+                      Align PrefAlign, uint32_t IndexBitWidth,
+                      bool IsNonIntegral);
 
   /// Internal helper to get alignment for integer of given bitwidth.
   Align getIntegerAlignment(uint32_t BitWidth, bool abi_or_pref) const;
@@ -165,7 +166,8 @@ class DataLayout {
   Error parsePointerSpec(StringRef Spec);
 
   /// Attempts to parse a single specification.
-  Error parseSpecification(StringRef Spec);
+  Error parseSpecification(StringRef Spec,
+                           SmallVectorImpl<unsigned> &NonIntegralAddressSpaces);
 
   /// Attempts to parse a data layout string.
   Error parseLayoutString(StringRef LayoutString);
@@ -343,13 +345,17 @@ class DataLayout {
 
   /// Return the address spaces containing non-integral pointers.  Pointers in
   /// this address space don't have a well-defined bitwise representation.
-  ArrayRef<unsigned> getNonIntegralAddressSpaces() const {
-    return NonIntegralAddressSpaces;
+  SmallVector<unsigned, 8> getNonIntegralAddressSpaces() const {
+    SmallVector<unsigned, 8> AddrSpaces;
+    for (const PointerSpec &PS : PointerSpecs) {
+      if (PS.IsNonIntegral)
+        AddrSpaces.push_back(PS.AddrSpace);
+    }
+    return AddrSpaces;
   }
 
   bool isNonIntegralAddressSpace(unsigned AddrSpace) const {
-    ArrayRef<unsigned> NonIntegralSpaces = getNonIntegralAddressSpaces();
-    return is_contained(NonIntegralSpaces, AddrSpace);
+    return getPointerSpec(AddrSpace).IsNonIntegral;
   }
 
   bool isNonIntegralPointerType(PointerType *PT) const {
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index d295d1f5785eb9..a7a1fa8ef03ede 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -151,7 +151,8 @@ bool DataLayout::PrimitiveSpec::operator==(const PrimitiveSpec &Other) const {
 bool DataLayout::PointerSpec::operator==(const PointerSpec &Other) const {
   return AddrSpace == Other.AddrSpace && BitWidth == Other.BitWidth &&
          ABIAlign == Other.ABIAlign && PrefAlign == Other.PrefAlign &&
-         IndexBitWidth == Other.IndexBitWidth;
+         IndexBitWidth == Other.IndexBitWidth &&
+         IsNonIntegral == Other.IsNonIntegral;
 }
 
 namespace {
@@ -206,7 +207,8 @@ constexpr DataLayout::PrimitiveSpec DefaultVectorSpecs[] = {
 
 // Default pointer type specifications.
 constexpr DataLayout::PointerSpec DefaultPointerSpecs[] = {
-    {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64} // p0:64:64:64:64
+    {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64,
+     false} // p0:64:64:64:64
 };
 
 DataLayout::DataLayout()
@@ -239,13 +241,11 @@ DataLayout &DataLayout::operator=(const DataLayout &Other) {
   PointerSpecs = Other.PointerSpecs;
   StructABIAlignment = Other.StructABIAlignment;
   StructPrefAlignment = Other.StructPrefAlignment;
-  NonIntegralAddressSpaces = Other.NonIntegralAddressSpaces;
   return *this;
 }
 
 bool DataLayout::operator==(const DataLayout &Other) const {
   // NOTE: StringRepresentation might differ, it is not canonicalized.
-  // FIXME: NonIntegralAddressSpaces isn't compared.
   return BigEndian == Other.BigEndian &&
          AllocaAddrSpace == Other.AllocaAddrSpace &&
          ProgramAddrSpace == Other.ProgramAddrSpace &&
@@ -454,11 +454,13 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
     return createStringError(
         "index size cannot be larger than the pointer size");
 
-  setPointerSpec(AddrSpace, BitWidth, ABIAlign, PrefAlign, IndexBitWidth);
+  setPointerSpec(AddrSpace, BitWidth, ABIAlign, PrefAlign, IndexBitWidth,
+                 false);
   return Error::success();
 }
 
-Error DataLayout::parseSpecification(StringRef Spec) {
+Error DataLayout::parseSpecification(
+    StringRef Spec, SmallVectorImpl<unsigned> &NonIntegralAddressSpaces) {
   // The "ni" specifier is the only two-character specifier. Handle it first.
   if (Spec.starts_with("ni")) {
     // ni:<address space>[:<address space>]...
@@ -614,12 +616,21 @@ Error DataLayout::parseLayoutString(StringRef LayoutString) {
 
   // Split the data layout string into specifications separated by '-' and
   // parse each specification individually, updating internal data structures.
+  SmallVector<unsigned, 8> NonIntegralAddressSpaces;
   for (StringRef Spec : split(LayoutString, '-')) {
     if (Spec.empty())
       return createStringError("empty specification is not allowed");
-    if (Error Err = parseSpecification(Spec))
+    if (Error Err = parseSpecification(Spec, NonIntegralAddressSpaces))
       return Err;
   }
+  // Mark all address spaces that were qualified as non-integral now. This has
+  // to be done later since the non-integral property is not part of the data
+  // layout pointer specification.
+  for (unsigned AS : NonIntegralAddressSpaces) {
+    const PointerSpec &PS = getPointerSpec(AS);
+    setPointerSpec(AS, PS.BitWidth, PS.ABIAlign, PS.PrefAlign, PS.IndexBitWidth,
+                   true);
+  }
 
   return Error::success();
 }
@@ -666,16 +677,17 @@ DataLayout::getPointerSpec(uint32_t AddrSpace) const {
 
 void DataLayout::setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth,
                                 Align ABIAlign, Align PrefAlign,
-                                uint32_t IndexBitWidth) {
+                                uint32_t IndexBitWidth, bool IsNonIntegral) {
   auto I = lower_bound(PointerSpecs, AddrSpace, LessPointerAddrSpace());
   if (I == PointerSpecs.end() || I->AddrSpace != AddrSpace) {
     PointerSpecs.insert(I, PointerSpec{AddrSpace, BitWidth, ABIAlign, PrefAlign,
-                                       IndexBitWidth});
+                                       IndexBitWidth, IsNonIntegral});
   } else {
     I->BitWidth = BitWidth;
     I->ABIAlign = ABIAlign;
     I->PrefAlign = PrefAlign;
     I->IndexBitWidth = IndexBitWidth;
+    I->IsNonIntegral = IsNonIntegral;
   }
 }
 

@arichardson arichardson requested review from nikic and arsenm August 22, 2024 21:32
Copy link
Contributor

@s-barannikov s-barannikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea makes sense to me, the "nonintegrality" is a pointer property.
I wish they were combined into one "pn" spec.

return Error::success();
}

Error DataLayout::parseSpecification(StringRef Spec) {
Error DataLayout::parseSpecification(
StringRef Spec, SmallVectorImpl<unsigned> &NonIntegralAddressSpaces) {
// The "ni" specifier is the only two-character specifier. Handle it first.
if (Spec.starts_with("ni")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) Since "ni" is special, may it make sense to extract this into a function and call it from parseLayoutString loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this and it added more extra code so keeping as is.

@s-barannikov
Copy link
Contributor

lld/wasm/lto tests are failing for some reason:

wasm-ld: /var/lib/buildkite-agent/builds/linux-56-59b8f5d88-q7g2n-1/llvm-project/github-pull-requests/llvm/lib/CodeGen/MachineFunction.cpp:248: void llvm::MachineFunction::init(): Assertion `Target.isCompatibleDataLayout(getDataLayout()) && "Can't create a MachineFunction using a Module with a " "Target-incompatible DataLayout attached\n"' failed.

@arichardson
Copy link
Member Author

The idea makes sense to me, the "nonintegrality" is a pointer property. I wish they were combined into one "pn" spec.

Agreed, it makes more sense to me to have this be part of the pointer spec: #105735 (with the additional change of splitting the non-integral properties).

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes more sense

@@ -343,13 +345,17 @@ class DataLayout {

/// Return the address spaces containing non-integral pointers. Pointers in
/// this address space don't have a well-defined bitwise representation.
ArrayRef<unsigned> getNonIntegralAddressSpaces() const {
return NonIntegralAddressSpaces;
SmallVector<unsigned, 8> getNonIntegralAddressSpaces() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually used for anything now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is only used in tests - I could drop those if you prefer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it might be useful somewhere I suppose

Created using spr 1.3.6-beta.1
arichardson added a commit that referenced this pull request Oct 25, 2024
These mismatched data layouts are exposed by the refactoring in
#105734 that now also
compares the non-integral pointers list of the data layout.

I dropped the explicit definition from all tests that use opt/llc since
those tools will insert the correct value, but for the llvm-as based
tests and explicit layout is required.

Reviewed By: MaskRay

Pull Request: #107276
Created using spr 1.3.6-beta.1
@arichardson arichardson merged commit 305a1ce into main Oct 25, 2024
8 checks passed
@arichardson arichardson deleted the users/arichardson/spr/datalayout-refactor-storage-of-non-integral-address-spaces branch October 25, 2024 17:02
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
These mismatched data layouts are exposed by the refactoring in
llvm#105734 that now also
compares the non-integral pointers list of the data layout.

I dropped the explicit definition from all tests that use opt/llc since
those tools will insert the correct value, but for the llvm-as based
tests and explicit layout is required.

Reviewed By: MaskRay

Pull Request: llvm#107276
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Instead of storing this as a separate array of non-integral pointers,
add it to the PointerSpec class instead. This will allow for future
simplifications such as splitting the non-integral property into
multiple distinct ones: relocatable (i.e. non-stable representation) and
non-integral representation (i.e. pointers with metadata).

Reviewed By: arsenm

Pull Request: llvm#105734
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants