Skip to content

[ADT] Restore handwritten vector find in SmallSet #110254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 30, 2024

Conversation

vhscampos
Copy link
Member

This patch restores handwritten linear searches instead of the use of std::find.

After PR #109412, a performance regression was observed that's caused by the use of std::find for linear searches.

The exact cause wasn't pinpointed, but, at the time of writing, the most likely culprit is the forced loop unrolling in the definition of libstdc++'s std::find. Presumably this is done to optimise for larger containers.

However for the case of small containers such as SmallVector, this actually hurts performance.

This patch restores handwritten linear searches instead of the use of
std::find.

After PR llvm#109412, a performance regression was observed that's caused by
the use of std::find for linear searches.

The exact cause wasn't pinpointed, but, at the time of writing, the most
likely culprit is the forced loop unrolling in the definition of
libstdc++'s std::find. Presumably this is done to optimise for larger
containers.

However for the case of small containers such as SmallVector, this
actually hurts performance.
@vhscampos vhscampos marked this pull request as ready for review September 27, 2024 13:29
@llvmbot
Copy link
Member

llvmbot commented Sep 27, 2024

@llvm/pr-subscribers-llvm-adt

Author: Victor Campos (vhscampos)

Changes

This patch restores handwritten linear searches instead of the use of std::find.

After PR #109412, a performance regression was observed that's caused by the use of std::find for linear searches.

The exact cause wasn't pinpointed, but, at the time of writing, the most likely culprit is the forced loop unrolling in the definition of libstdc++'s std::find. Presumably this is done to optimise for larger containers.

However for the case of small containers such as SmallVector, this actually hurts performance.


Full diff: https://github.com/llvm/llvm-project/pull/110254.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/SmallSet.h (+10-4)
diff --git a/llvm/include/llvm/ADT/SmallSet.h b/llvm/include/llvm/ADT/SmallSet.h
index 97ea4c642bf556..3bc101882dab79 100644
--- a/llvm/include/llvm/ADT/SmallSet.h
+++ b/llvm/include/llvm/ADT/SmallSet.h
@@ -193,7 +193,7 @@ class SmallSet {
   bool erase(const T &V) {
     if (!isSmall())
       return Set.erase(V);
-    auto I = std::find(Vector.begin(), Vector.end(), V);
+    auto I = vfind(V);
     if (I != Vector.end()) {
       Vector.erase(I);
       return true;
@@ -221,7 +221,7 @@ class SmallSet {
   /// Check if the SmallSet contains the given element.
   bool contains(const T &V) const {
     if (isSmall())
-      return std::find(Vector.begin(), Vector.end(), V) != Vector.end();
+      return vfind(V) != Vector.end();
     return Set.find(V) != Set.end();
   }
 
@@ -237,20 +237,26 @@ class SmallSet {
       return {const_iterator(I), Inserted};
     }
 
-    auto I = std::find(Vector.begin(), Vector.end(), V);
+    auto I = vfind(V);
     if (I != Vector.end()) // Don't reinsert if it already exists.
       return {const_iterator(I), false};
     if (Vector.size() < N) {
       Vector.push_back(std::forward<ArgType>(V));
       return {const_iterator(std::prev(Vector.end())), true};
     }
-
     // Otherwise, grow from vector to set.
     Set.insert(std::make_move_iterator(Vector.begin()),
                std::make_move_iterator(Vector.end()));
     Vector.clear();
     return {const_iterator(Set.insert(std::forward<ArgType>(V)).first), true};
   }
+
+  auto vfind(const T &V) const {
+    for (auto I = Vector.begin(), E = Vector.end(); I != E; ++I)
+      if (*I == V)
+        return I;
+    return Vector.end();
+  }
 };
 
 /// If this set is of pointer values, transparently switch over to using

@vhscampos vhscampos requested a review from nikic September 27, 2024 13:45
- Add inline comment.
- Spellout `vfind` return type for clarity.
@vhscampos vhscampos requested a review from nikic September 27, 2024 16:07
@vhscampos vhscampos merged commit fb6feb8 into llvm:main Sep 30, 2024
6 of 8 checks passed
@vhscampos vhscampos deleted the smallset-restore-handwritten-find branch September 30, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants