Skip to content

[SmallPtrSet] Add remove_if() method #96468

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 25, 2024
Merged

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Jun 24, 2024

Add remove_if() method, similar to the one already present on SetVector. It is intended to replace the following pattern:

for (Foo *Ptr : Set)
  if (Pred(Ptr))
    Set.erase(Ptr);

With:

Set.remove_if(Pred);

This pattern is commonly used for set intersection, where Pred is something like !OtherSet.contains(Ptr).

The implementation provided here is a bit more efficient than the naive loop, because it does not require looking up the bucket during the erase() operation again.

However, my actual motivation for this is to have a way to perform this operation without relying on the current std::set-style guarantee that erase() does not invalidate iterators. I'd like to stop making use of tombstones in the small regime, which will make insertion operations a good bit more efficient. However, this will invalidate iterators during erase().

Add remove_if() method, similar to the one already present on
SetVector. It is intended to replace the following pattern:

    for (Foo *Ptr : Set)
      if (Pred(Ptr))
        Set.erase(Ptr);

This pattern is particularly commonly used for set intersections.

The implementation provided here is a bit more efficient than the
naive loop, because it does not require looking up the bucket
during the erase() operation again.

However, my actual motivation for this is to have a way to perform
this operation without relying on the current `std::set`-style
guarantee that erase() does not invalidate iterators. I'd like to
stop making use of tombstones in the small regime, which will make
insertion operations a good bit more efficient. However, this will
invalidate iterators during erase().
@nikic nikic requested review from MaskRay, kuhar and dwblaikie June 24, 2024 09:16
@llvmbot
Copy link
Member

llvmbot commented Jun 24, 2024

@llvm/pr-subscribers-llvm-adt

Author: Nikita Popov (nikic)

Changes

Add remove_if() method, similar to the one already present on SetVector. It is intended to replace the following pattern:

for (Foo *Ptr : Set)
  if (Pred(Ptr))
    Set.erase(Ptr);

With:

Set.remove_if(Pred);

This pattern is commonly used for set intersection, where Pred is something like OtherSet.contains(Ptr).

The implementation provided here is a bit more efficient than the naive loop, because it does not require looking up the bucket during the erase() operation again.

However, my actual motivation for this is to have a way to perform this operation without relying on the current std::set-style guarantee that erase() does not invalidate iterators. I'd like to stop making use of tombstones in the small regime, which will make insertion operations a good bit more efficient. However, this will invalidate iterators during erase().


Full diff: https://github.com/llvm/llvm-project/pull/96468.diff

2 Files Affected:

  • (modified) llvm/include/llvm/ADT/SmallPtrSet.h (+16)
  • (modified) llvm/unittests/ADT/SmallPtrSetTest.cpp (+33)
diff --git a/llvm/include/llvm/ADT/SmallPtrSet.h b/llvm/include/llvm/ADT/SmallPtrSet.h
index 69076ed326927..ed78c3da82fd5 100644
--- a/llvm/include/llvm/ADT/SmallPtrSet.h
+++ b/llvm/include/llvm/ADT/SmallPtrSet.h
@@ -356,6 +356,22 @@ class SmallPtrSetImpl : public SmallPtrSetImplBase {
   bool erase(PtrType Ptr) {
     return erase_imp(PtrTraits::getAsVoidPointer(Ptr));
   }
+
+  /// Remove elements that match the given predicate.
+  template <typename UnaryPredicate>
+  void remove_if(UnaryPredicate P) {
+    for (const void **APtr = CurArray, **E = EndPointer(); APtr != E; ++APtr) {
+      const void *Value = *APtr;
+      if (Value == getTombstoneMarker() || Value == getEmptyMarker())
+        continue;
+      PtrType Ptr = PtrTraits::getFromVoidPointer(const_cast<void *>(Value));
+      if (P(Ptr)) {
+        *APtr = getTombstoneMarker();
+        ++NumTombstones;
+      }
+    }
+  }
+
   /// count - Return 1 if the specified pointer is in the set, 0 otherwise.
   size_type count(ConstPtrType Ptr) const {
     return find_imp(ConstPtrTraits::getAsVoidPointer(Ptr)) != EndPointer();
diff --git a/llvm/unittests/ADT/SmallPtrSetTest.cpp b/llvm/unittests/ADT/SmallPtrSetTest.cpp
index a97f2617cbf70..6aced96c217ca 100644
--- a/llvm/unittests/ADT/SmallPtrSetTest.cpp
+++ b/llvm/unittests/ADT/SmallPtrSetTest.cpp
@@ -409,3 +409,36 @@ TEST(SmallPtrSetTest, InsertIterator) {
   for (const auto *Ptr : Buf)
     EXPECT_TRUE(Set.contains(Ptr));
 }
+
+TEST(SmallPtrSetTest, RemoveIf) {
+  SmallPtrSet<int *, 5> Set;
+  int Vals[6] = {0, 1, 2, 3, 4, 5};
+
+  // Stay in small regime.
+  Set.insert(&Vals[0]);
+  Set.insert(&Vals[1]);
+  Set.insert(&Vals[2]);
+  Set.insert(&Vals[3]);
+  Set.erase(&Vals[0]); // Leave a tombstone.
+
+  // Remove odd elements.
+  Set.remove_if([](int *Ptr) { return *Ptr % 2 != 0; });
+  // We should only have element 2 left now.
+  EXPECT_EQ(Set.size(), 1u);
+  EXPECT_TRUE(Set.contains(&Vals[2]));
+
+  // Switch to big regime.
+  Set.insert(&Vals[0]);
+  Set.insert(&Vals[1]);
+  Set.insert(&Vals[3]);
+  Set.insert(&Vals[4]);
+  Set.insert(&Vals[5]);
+  Set.erase(&Vals[0]); // Leave a tombstone.
+
+  // Remove odd elements.
+  Set.remove_if([](int *Ptr) { return *Ptr % 2 != 0; });
+  // We should only have elements 2 and 4 left now.
+  EXPECT_EQ(Set.size(), 2u);
+  EXPECT_TRUE(Set.contains(&Vals[2]));
+  EXPECT_TRUE(Set.contains(&Vals[4]));
+}

@nikic nikic merged commit f019581 into llvm:main Jun 25, 2024
9 checks passed
@nikic nikic deleted the smallptrset-remove-if branch June 25, 2024 07:01
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
Add remove_if() method, similar to the one already present on SetVector.
It is intended to replace the following pattern:

    for (Foo *Ptr : Set)
      if (Pred(Ptr))
        Set.erase(Ptr);

With:

    Set.remove_if(Pred);

This pattern is commonly used for set intersection, where `Pred` is
something like `!OtherSet.contains(Ptr)`.

The implementation provided here is a bit more efficient than the naive
loop, because it does not require looking up the bucket during the
erase() operation again.

However, my actual motivation for this is to have a way to perform this
operation without relying on the current `std::set`-style guarantee that
erase() does not invalidate iterators. I'd like to stop making use of
tombstones in the small regime, which will make insertion operations a
good bit more efficient. However, this will invalidate iterators during
erase().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants