Skip to content

[IR] Don't verify module flags on every access #102153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 6, 2024

Conversation

aengelke
Copy link
Contributor

@aengelke aengelke commented Aug 6, 2024

8b4306c introduced validity checks for every module flag access, because the auto-upgrader uses named metadata before verifying the module.

This causes overhead for all other accesses, and the check is, in fact, only need at that single place. Change the upgrader to be careful when accessing module flags before the module is verified and remove the checks on all other occasions.

There are two tangential optimizations included: first, when querying a specific flag, don't enumerate all other flags into a vector as well. Second, don't use a Twine for getNamedMetadata(), which has materialization overhead -- all call sites use simple strings that can be implicitly converted to a StringRef.

c-t-t -0.19% in stage2-O3, -0.14% in stage2-O0-g

8b4306c introduced validity checks for every module flag access,
because the auto-upgrader uses named metadata before verifying the
module.

This causes overhead for all other accesses, and the check is, in fact,
only need at that single place. Change the upgrader to be careful when
accessing module flags before the module is verified and remove the
checks on all other occasions.

There are two tangential optimizations included: first, when querying a
specific flag, don't enumerate all other flags into a vector as well.
Second, don't use a Twine for getNamedMetadata(), which has
materialization overhead -- all call sites use simple strings that can
be implicitly converted to a StringRef.
@aengelke aengelke requested a review from nikic August 6, 2024 14:24
@llvmbot llvmbot added the llvm:ir label Aug 6, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 6, 2024

@llvm/pr-subscribers-llvm-ir

Author: Alexis Engelke (aengelke)

Changes

8b4306c introduced validity checks for every module flag access, because the auto-upgrader uses named metadata before verifying the module.

This causes overhead for all other accesses, and the check is, in fact, only need at that single place. Change the upgrader to be careful when accessing module flags before the module is verified and remove the checks on all other occasions.

There are two tangential optimizations included: first, when querying a specific flag, don't enumerate all other flags into a vector as well. Second, don't use a Twine for getNamedMetadata(), which has materialization overhead -- all call sites use simple strings that can be implicitly converted to a StringRef.

c-t-t -0.19% in stage2-O3, -0.14% in stage2-O0-g


Full diff: https://github.com/llvm/llvm-project/pull/102153.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/Module.h (+1-6)
  • (modified) llvm/lib/IR/AutoUpgrade.cpp (+19-1)
  • (modified) llvm/lib/IR/Module.cpp (+15-35)
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h
index d2b2fe40b1ada..7042a54a59e7d 100644
--- a/llvm/include/llvm/IR/Module.h
+++ b/llvm/include/llvm/IR/Module.h
@@ -158,11 +158,6 @@ class LLVM_EXTERNAL_VISIBILITY Module {
   /// converted result in MFB.
   static bool isValidModFlagBehavior(Metadata *MD, ModFlagBehavior &MFB);
 
-  /// Check if the given module flag metadata represents a valid module flag,
-  /// and store the flag behavior, the key string and the value metadata.
-  static bool isValidModuleFlag(const MDNode &ModFlag, ModFlagBehavior &MFB,
-                                MDString *&Key, Metadata *&Val);
-
   struct ModuleFlagEntry {
     ModFlagBehavior Behavior;
     MDString *Key;
@@ -502,7 +497,7 @@ class LLVM_EXTERNAL_VISIBILITY Module {
 
   /// Return the first NamedMDNode in the module with the specified name. This
   /// method returns null if a NamedMDNode with the specified name is not found.
-  NamedMDNode *getNamedMetadata(const Twine &Name) const;
+  NamedMDNode *getNamedMetadata(StringRef Name) const;
 
   /// Return the named MDNode in the module with the specified name. This method
   /// returns a new NamedMDNode if a NamedMDNode with the specified name is not
diff --git a/llvm/lib/IR/AutoUpgrade.cpp b/llvm/lib/IR/AutoUpgrade.cpp
index 53de9eef516b3..ec719754183d5 100644
--- a/llvm/lib/IR/AutoUpgrade.cpp
+++ b/llvm/lib/IR/AutoUpgrade.cpp
@@ -4886,7 +4886,25 @@ bool llvm::UpgradeDebugInfo(Module &M) {
   if (DisableAutoUpgradeDebugInfo)
     return false;
 
-  unsigned Version = getDebugMetadataVersionFromModule(M);
+  // We need to get metadata before the module is verified (i.e., getModuleFlag
+  // makes assumptions that we haven't verified yet). Carefully extract the flag
+  // from the metadata.
+  unsigned Version = 0;
+  if (NamedMDNode *ModFlags = M.getModuleFlagsMetadata()) {
+    auto OpIt = find_if(ModFlags->operands(), [](const MDNode *Flag) {
+      if (Flag->getNumOperands() < 3)
+        return false;
+      if (MDString *K = dyn_cast_or_null<MDString>(Flag->getOperand(1)))
+        return K->getString() == "Debug Info Version";
+      return false;
+    });
+    if (OpIt != ModFlags->op_end()) {
+      const MDOperand &ValOp = (*OpIt)->getOperand(2);
+      if (auto *CI = mdconst::dyn_extract_or_null<ConstantInt>(ValOp))
+        Version = CI->getZExtValue();
+    }
+  }
+
   if (Version == DEBUG_METADATA_VERSION) {
     bool BrokenDebugInfo = false;
     if (verifyModule(M, &llvm::errs(), &BrokenDebugInfo))
diff --git a/llvm/lib/IR/Module.cpp b/llvm/lib/IR/Module.cpp
index c966c53d09baf..80b5408b61eda 100644
--- a/llvm/lib/IR/Module.cpp
+++ b/llvm/lib/IR/Module.cpp
@@ -259,10 +259,8 @@ GlobalIFunc *Module::getNamedIFunc(StringRef Name) const {
 /// getNamedMetadata - Return the first NamedMDNode in the module with the
 /// specified name. This method returns null if a NamedMDNode with the
 /// specified name is not found.
-NamedMDNode *Module::getNamedMetadata(const Twine &Name) const {
-  SmallString<256> NameData;
-  StringRef NameRef = Name.toStringRef(NameData);
-  return NamedMDSymTab.lookup(NameRef);
+NamedMDNode *Module::getNamedMetadata(StringRef Name) const {
+  return NamedMDSymTab.lookup(Name);
 }
 
 /// getOrInsertNamedMetadata - Return the first named MDNode in the module
@@ -296,20 +294,6 @@ bool Module::isValidModFlagBehavior(Metadata *MD, ModFlagBehavior &MFB) {
   return false;
 }
 
-bool Module::isValidModuleFlag(const MDNode &ModFlag, ModFlagBehavior &MFB,
-                               MDString *&Key, Metadata *&Val) {
-  if (ModFlag.getNumOperands() < 3)
-    return false;
-  if (!isValidModFlagBehavior(ModFlag.getOperand(0), MFB))
-    return false;
-  MDString *K = dyn_cast_or_null<MDString>(ModFlag.getOperand(1));
-  if (!K)
-    return false;
-  Key = K;
-  Val = ModFlag.getOperand(2);
-  return true;
-}
-
 /// getModuleFlagsMetadata - Returns the module flags in the provided vector.
 void Module::
 getModuleFlagsMetadata(SmallVectorImpl<ModuleFlagEntry> &Flags) const {
@@ -317,25 +301,24 @@ getModuleFlagsMetadata(SmallVectorImpl<ModuleFlagEntry> &Flags) const {
   if (!ModFlags) return;
 
   for (const MDNode *Flag : ModFlags->operands()) {
-    ModFlagBehavior MFB;
-    MDString *Key = nullptr;
-    Metadata *Val = nullptr;
-    if (isValidModuleFlag(*Flag, MFB, Key, Val)) {
-      // Check the operands of the MDNode before accessing the operands.
-      // The verifier will actually catch these failures.
-      Flags.push_back(ModuleFlagEntry(MFB, Key, Val));
-    }
+    // The verifier will catch errors, so no need to check them here.
+    auto *MFBConstant = mdconst::extract<ConstantInt>(Flag->getOperand(0));
+    auto MFB = static_cast<ModFlagBehavior>(MFBConstant->getLimitedValue());
+    MDString *Key = cast<MDString>(Flag->getOperand(1));
+    Metadata *Val = Flag->getOperand(2);
+    Flags.push_back(ModuleFlagEntry(MFB, Key, Val));
   }
 }
 
 /// Return the corresponding value if Key appears in module flags, otherwise
 /// return null.
 Metadata *Module::getModuleFlag(StringRef Key) const {
-  SmallVector<Module::ModuleFlagEntry, 8> ModuleFlags;
-  getModuleFlagsMetadata(ModuleFlags);
-  for (const ModuleFlagEntry &MFE : ModuleFlags) {
-    if (Key == MFE.Key->getString())
-      return MFE.Val;
+  const NamedMDNode *ModFlags = getModuleFlagsMetadata();
+  if (!ModFlags)
+    return nullptr;
+  for (const MDNode *Flag : ModFlags->operands()) {
+    if (Key == cast<MDString>(Flag->getOperand(1))->getString())
+      return Flag->getOperand(2);
   }
   return nullptr;
 }
@@ -388,10 +371,7 @@ void Module::setModuleFlag(ModFlagBehavior Behavior, StringRef Key,
   NamedMDNode *ModFlags = getOrInsertModuleFlagsMetadata();
   // Replace the flag if it already exists.
   for (MDNode *Flag : ModFlags->operands()) {
-    ModFlagBehavior MFB;
-    MDString *K = nullptr;
-    Metadata *V = nullptr;
-    if (isValidModuleFlag(*Flag, MFB, K, V) && K->getString() == Key) {
+    if (cast<MDString>(Flag->getOperand(1))->getString() == Key) {
       Flag->replaceOperandWith(2, Val);
       return;
     }

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if CI is happy.

@aengelke aengelke merged commit b7cd564 into llvm:main Aug 6, 2024
9 checks passed
@aengelke aengelke deleted the fastermoduleflags branch August 6, 2024 16:33
@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 6, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/3447

Here is the relevant piece of the build log for the reference:

Step 7 (Add check check-offload) failure: test (failure)
...
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/test_libc.cpp (828 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/bug53727.cpp (829 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/bug50022.cpp (830 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/wtime.c (831 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu :: offloading/bug49021.cpp (832 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu :: offloading/std_complex_arithmetic.cpp (833 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/complex_reduction.cpp (834 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/bug49021.cpp (835 of 837)
PASS: libomptarget :: x86_64-pc-linux-gnu-LTO :: offloading/std_complex_arithmetic.cpp (836 of 837)
TIMEOUT: libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp (837 of 837)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp' FAILED ********************
Exit Code: -9
Timeout: Reached timeout of 100 seconds

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# note: command had no output on stdout or stderr
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True

--

********************
Slowest Tests:
--------------------------------------------------------------------------
100.05s: libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp
16.59s: libomptarget :: amdgcn-amd-amdhsa :: offloading/bug49021.cpp
12.99s: libomptarget :: amdgcn-amd-amdhsa :: offloading/parallel_target_teams_reduction_min.cpp
12.95s: libomptarget :: amdgcn-amd-amdhsa :: offloading/parallel_target_teams_reduction_max.cpp
11.25s: libomptarget :: amdgcn-amd-amdhsa :: offloading/complex_reduction.cpp
10.07s: libomptarget :: amdgcn-amd-amdhsa :: jit/empty_kernel_lvl2.c
9.23s: libomptarget :: x86_64-pc-linux-gnu :: offloading/bug49021.cpp

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Aug 22, 2024
reverts: [IR] Don't verify module flags on every access (llvm#102153)
Change-Id: I0c1ccc88f8617501d70cf402882145e8c5aa4d3c
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 3, 2024
8b4306c introduced validity checks for every module flag access,
because the auto-upgrader uses named metadata before verifying the
module.

This causes overhead for all other accesses, and the check is, in fact,
only need at that single place. Change the upgrader to be careful when
accessing module flags before the module is verified and remove the
checks on all other occasions.

There are two tangential optimizations included: first, when querying a
specific flag, don't enumerate all other flags into a vector as well.
Second, don't use a Twine for getNamedMetadata(), which has
materialization overhead -- all call sites use simple strings that can
be implicitly converted to a StringRef.

(cherry picked from commit b7cd564)

To minimize the diff I just copied the "checked" version of
getDebugMetadataVersionFromModule that was inlined into UpgradeDebugInfo
in the upstream patch. One could also delay the checks that rely on the
version in the Verifier until after we've already verified the module
flags metadata, but that seemed needlessly complicated.

Change-Id: Id98ec158770d5b3e839e3de1c558f507c28cc82c
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 7, 2024
8b4306c introduced validity checks for every module flag access,
because the auto-upgrader uses named metadata before verifying the
module.

This causes overhead for all other accesses, and the check is, in fact,
only need at that single place. Change the upgrader to be careful when
accessing module flags before the module is verified and remove the
checks on all other occasions.

There are two tangential optimizations included: first, when querying a
specific flag, don't enumerate all other flags into a vector as well.
Second, don't use a Twine for getNamedMetadata(), which has
materialization overhead -- all call sites use simple strings that can
be implicitly converted to a StringRef.

Change-Id: I724c67e52f080caba6bc16ec983f6118429408a1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants