Skip to content

Metadata: Optimize metadata queries #70700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 31, 2023
Merged

Metadata: Optimize metadata queries #70700

merged 3 commits into from
Oct 31, 2023

Conversation

MatzeB
Copy link
Contributor

@MatzeB MatzeB commented Oct 30, 2023

Optimize metadata query code:

  • Avoid DenseMap::operator[] in situations where it is known that the key exists in the map. Instead use DenseMap::at()/ DenseMap::find()->second. This avoids code-bloat and bad inlining decisions for the unused insertion/growing code in operator[].
  • Directly query the map in Instruction::getMetadataImpl instead of using Value::getMetadata to avoid a redundant hasMetadata() check.
  • Move the KindID == LLVMContext::MD_dbg case to Instruction::getMetadata and check it first assuming that it can be constant folded after inlining in many situations.

The motivation for this change is a regression triggered by e3cf80c which could attributed to the compiler inlining the insertion part of DenseMap::operator[] in more cases while unbeknownst to a compiler (without PGO) that code is never used in this context. This change improves performance and eliminates difference before and after that change in my measurements.

Optimize metadata query code:
- Avoid `DenseMap::operator[]` in situations where it is known that the
  key exists in the map. Instead use `DenseMap::at()`/
  `DenseMap::find()->second`. This avoids code-bloat leading to missed
  inlining opportunities just because the unnecessary map growing code
  was around.
- Directly query the map in `Instruction::getMetadataImpl` instead of
  using `Value::getMetadata` to avoid an unnecessary `hasMetadata()`
  check.
- Move `KindID == LLVMContext::MD_dbg` to `Instruction::getMetadata`
  and check this case first assuming that it can be inlined and constant
  folded in many situations.
@llvmbot
Copy link
Member

llvmbot commented Oct 30, 2023

@llvm/pr-subscribers-llvm-ir

Author: Matthias Braun (MatzeB)

Changes

Optimize metadata query code:

  • Avoid DenseMap::operator[] in situations where it is known that the key exists in the map. Instead use DenseMap::at()/ DenseMap::find()->second. This avoids code-bloat and bad inlining decisions for the unused insertion/growing code in operator[].
  • Directly query the map in Instruction::getMetadataImpl instead of using Value::getMetadata to avoid an unnecessary hasMetadata() check.
  • Move KindID == LLVMContext::MD_dbg to Instruction::getMetadata and check this case first assuming that it can be inlined and constant folded in many situations.

The motivation for this change is a regression triggered by e3cf80c which could attributed to the compiler inlining the insertion part of DenseMap::operator[] in more cases while unbeknownst to a compiler (without PGO) that code is never used in this context. This change improves performance and eliminates difference before and after that change in my measurements.


Full diff: https://github.com/llvm/llvm-project/pull/70700.diff

2 Files Affected:

  • (modified) llvm/include/llvm/IR/Instruction.h (+8-2)
  • (modified) llvm/lib/IR/Metadata.cpp (+27-19)
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 5142847fa75fff2..fbeafd645a5a038 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -303,8 +303,14 @@ class Instruction : public User,
   /// Get the metadata of given kind attached to this Instruction.
   /// If the metadata is not found then return null.
   MDNode *getMetadata(unsigned KindID) const {
-    if (!hasMetadata()) return nullptr;
-    return getMetadataImpl(KindID);
+    // Handle 'dbg' as a special case since it is not stored in the hash table.
+    if (KindID == LLVMContext::MD_dbg) {
+      return DbgLoc.getAsMDNode();
+    }
+    if (hasMetadataOtherThanDebugLoc()) {
+      return getMetadataImpl(KindID);
+    }
+    return nullptr;
   }
 
   /// Get the metadata of given kind attached to this Instruction.
diff --git a/llvm/lib/IR/Metadata.cpp b/llvm/lib/IR/Metadata.cpp
index c153ffb71a73bba..a3f171fe88e983e 100644
--- a/llvm/lib/IR/Metadata.cpp
+++ b/llvm/lib/IR/Metadata.cpp
@@ -56,6 +56,12 @@
 
 using namespace llvm;
 
+static inline MDNode *getValueMetadata(const Value &Value, unsigned KindID,
+                                       const LLVMContext &Ctx) {
+  const MDAttachments &Attachements = Ctx.pImpl->ValueMetadata.at(&Value);
+  return Attachements.lookup(KindID);
+}
+
 MetadataAsValue::MetadataAsValue(Type *Ty, Metadata *MD)
     : Value(Ty, MetadataAsValueVal), MD(MD) {
   track();
@@ -1354,22 +1360,20 @@ bool MDAttachments::erase(unsigned ID) {
 MDNode *Value::getMetadata(unsigned KindID) const {
   if (!hasMetadata())
     return nullptr;
-  const auto &Info = getContext().pImpl->ValueMetadata[this];
-  assert(!Info.empty() && "bit out of sync with hash table");
-  return Info.lookup(KindID);
+  return getValueMetadata(*this, KindID, getContext());
 }
 
 MDNode *Value::getMetadata(StringRef Kind) const {
   if (!hasMetadata())
     return nullptr;
-  const auto &Info = getContext().pImpl->ValueMetadata[this];
-  assert(!Info.empty() && "bit out of sync with hash table");
-  return Info.lookup(getContext().getMDKindID(Kind));
+  const LLVMContext &Ctx = getContext();
+  unsigned KindID = Ctx.getMDKindID(Kind);
+  return getValueMetadata(*this, KindID, Ctx);
 }
 
 void Value::getMetadata(unsigned KindID, SmallVectorImpl<MDNode *> &MDs) const {
   if (hasMetadata())
-    getContext().pImpl->ValueMetadata[this].get(KindID, MDs);
+    getContext().pImpl->ValueMetadata.at(this).get(KindID, MDs);
 }
 
 void Value::getMetadata(StringRef Kind, SmallVectorImpl<MDNode *> &MDs) const {
@@ -1382,8 +1386,7 @@ void Value::getAllMetadata(
   if (hasMetadata()) {
     assert(getContext().pImpl->ValueMetadata.count(this) &&
            "bit out of sync with hash table");
-    const auto &Info = getContext().pImpl->ValueMetadata.find(this)->second;
-    assert(!Info.empty() && "Shouldn't have called this");
+    const MDAttachments &Info = getContext().pImpl->ValueMetadata.at(this);
     Info.getAll(MDs);
   }
 }
@@ -1393,7 +1396,7 @@ void Value::setMetadata(unsigned KindID, MDNode *Node) {
 
   // Handle the case when we're adding/updating metadata on a value.
   if (Node) {
-    auto &Info = getContext().pImpl->ValueMetadata[this];
+    MDAttachments &Info = getContext().pImpl->ValueMetadata[this];
     assert(!Info.empty() == HasMetadata && "bit out of sync with hash table");
     if (Info.empty())
       HasMetadata = true;
@@ -1406,7 +1409,7 @@ void Value::setMetadata(unsigned KindID, MDNode *Node) {
          "bit out of sync with hash table");
   if (!HasMetadata)
     return; // Nothing to remove!
-  auto &Info = getContext().pImpl->ValueMetadata[this];
+  MDAttachments &Info = getContext().pImpl->ValueMetadata.find(this)->second;
 
   // Handle removal of an existing value.
   Info.erase(KindID);
@@ -1438,7 +1441,7 @@ bool Value::eraseMetadata(unsigned KindID) {
   if (!HasMetadata)
     return false;
 
-  auto &Store = getContext().pImpl->ValueMetadata[this];
+  MDAttachments &Store = getContext().pImpl->ValueMetadata.find(this)->second;
   bool Changed = Store.erase(KindID);
   if (Store.empty())
     clearMetadata();
@@ -1461,7 +1464,15 @@ void Instruction::setMetadata(StringRef Kind, MDNode *Node) {
 }
 
 MDNode *Instruction::getMetadataImpl(StringRef Kind) const {
-  return getMetadataImpl(getContext().getMDKindID(Kind));
+  const LLVMContext &Ctx = getContext();
+  unsigned KindID = Ctx.getMDKindID(Kind);
+  if (KindID == LLVMContext::MD_dbg) {
+    return DbgLoc.getAsMDNode();
+  }
+  if (hasMetadataOtherThanDebugLoc()) {
+    return getValueMetadata(*this, KindID, Ctx);
+  }
+  return nullptr;
 }
 
 void Instruction::dropUnknownNonDebugMetadata(ArrayRef<unsigned> KnownIDs) {
@@ -1475,7 +1486,7 @@ void Instruction::dropUnknownNonDebugMetadata(ArrayRef<unsigned> KnownIDs) {
   KnownSet.insert(LLVMContext::MD_DIAssignID);
 
   auto &MetadataStore = getContext().pImpl->ValueMetadata;
-  auto &Info = MetadataStore[this];
+  MDAttachments &Info = MetadataStore.find(this)->second;
   assert(!Info.empty() && "bit out of sync with hash table");
   Info.remove_if([&KnownSet](const MDAttachments::Attachment &I) {
     return !KnownSet.count(I.MDKind);
@@ -1598,7 +1609,7 @@ AAMDNodes Instruction::getAAMetadata() const {
   // Not using Instruction::hasMetadata() because we're not interested in
   // DebugInfoMetadata.
   if (Value::hasMetadata()) {
-    const auto &Info = getContext().pImpl->ValueMetadata[this];
+    const MDAttachments &Info = getContext().pImpl->ValueMetadata.at(this);
     Result.TBAA = Info.lookup(LLVMContext::MD_tbaa);
     Result.TBAAStruct = Info.lookup(LLVMContext::MD_tbaa_struct);
     Result.Scope = Info.lookup(LLVMContext::MD_alias_scope);
@@ -1620,10 +1631,7 @@ void Instruction::setNoSanitizeMetadata() {
 }
 
 MDNode *Instruction::getMetadataImpl(unsigned KindID) const {
-  // Handle 'dbg' as a special case since it is not stored in the hash table.
-  if (KindID == LLVMContext::MD_dbg)
-    return DbgLoc.getAsMDNode();
-  return Value::getMetadata(KindID);
+  return getValueMetadata(*this, KindID, getContext());
 }
 
 void Instruction::getAllMetadataImpl(

@MatzeB
Copy link
Contributor Author

MatzeB commented Oct 30, 2023

In my ad-hoc benchmarking I am measuring clang -O3 -c llvm-test-suite/CTMark/sqlite3/sqlite3.c compile-time as instructions counted by valgrind/callgrind:

Before e3cf80c              37,821,687,947
After e3cf80c               38,133,312,923
this change before e3cf80c  37,268,867,198
this change after e3cf80c   37,268,867,064

@MatzeB MatzeB mentioned this pull request Oct 30, 2023
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic
Copy link
Contributor

nikic commented Oct 30, 2023

In my ad-hoc benchmarking I am measuring clang -O3 -c llvm-test-suite/CTMark/sqlite3/sqlite3.c compile-time as instructions counted by valgrind/callgrind:

Before e3cf80c              37,821,687,947
After 3cf80c                38,133,312,923
this change before e3cf80c  37,268,867,198
this change after 3cf80c    37,268,867,064

I just tested this as well, and the numbers I'm seeing aren't quite this good. On sqlite3 in O3 configuration, the original change was a +0.66% regression, and this one is a -0.33% win. (The geomean was +0.68% and -0.20%).

(But in any case, this is a win. And for GCC-built clang this is actually a slightly larger win.)

return DbgLoc.getAsMDNode();
}
if (hasMetadataOtherThanDebugLoc()) {
return getMetadataImpl(KindID);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this call check the debug kind and later hasMetadata again?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From Instruction.h:

  /// Return true if this instruction has any metadata attached to it.
  bool hasMetadata() const { return DbgLoc || Value::hasMetadata(); }

  /// Return true if this instruction has metadata attached to it other than a
  /// debug location.
  bool hasMetadataOtherThanDebugLoc() const { return Value::hasMetadata(); }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Value::hasMetadata() is a protected member, accessing a bit stored on Value that caches whether anything is in the DenseMap stored in the LLVMContext.

Copy link
Contributor Author

@MatzeB MatzeB Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my changes applied getMetadataImpl() will not perform any additional hasMetadata() checks anymore (there used to be a redundant Value::hasMetadata() check before my changes).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right. I missed the change in Value::getMetadata().

Copy link
Collaborator

@dexonsmith dexonsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too.

return DbgLoc.getAsMDNode();
}
if (hasMetadataOtherThanDebugLoc()) {
return getMetadataImpl(KindID);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right. I missed the change in Value::getMetadata().

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nit

Copy link
Collaborator

@dexonsmith dexonsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM again; thanks for the extra cleanup.

@MatzeB MatzeB merged commit af010e7 into llvm:main Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants