Skip to content

[SmallPtrSet] Don't leave tombstones in small mode #96762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions llvm/docs/ProgrammersManual.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2066,8 +2066,10 @@ insertion/deleting/queries with low constant factors) and is very stingy with
malloc traffic.

Note that, unlike :ref:`std::set <dss_set>`, the iterators of ``SmallPtrSet``
are invalidated whenever an insertion occurs. Also, the values visited by the
iterators are not visited in sorted order.
are invalidated whenever an insertion or erasure occurs. The ``remove_if``
method can be used to remove elements while iterating over the set.

Also, the values visited by the iterators are not visited in sorted order.

.. _dss_stringset:

Expand Down
74 changes: 53 additions & 21 deletions llvm/include/llvm/ADT/SmallPtrSet.h
Original file line number Diff line number Diff line change
Expand Up @@ -127,22 +127,11 @@ class SmallPtrSetImplBase : public DebugEpochBase {
std::pair<const void *const *, bool> insert_imp(const void *Ptr) {
if (isSmall()) {
// Check to see if it is already in the set.
const void **LastTombstone = nullptr;
for (const void **APtr = SmallArray, **E = SmallArray + NumNonEmpty;
APtr != E; ++APtr) {
const void *Value = *APtr;
if (Value == Ptr)
return std::make_pair(APtr, false);
if (Value == getTombstoneMarker())
LastTombstone = APtr;
}

// Did we find any tombstone marker?
if (LastTombstone != nullptr) {
*LastTombstone = Ptr;
--NumTombstones;
incrementEpoch();
return std::make_pair(LastTombstone, true);
}

// Nope, there isn't. If we stay small, just 'pushback' now.
Expand All @@ -161,14 +150,27 @@ class SmallPtrSetImplBase : public DebugEpochBase {
/// that the derived class can check that the right type of pointer is passed
/// in.
bool erase_imp(const void * Ptr) {
const void *const *P = find_imp(Ptr);
if (P == EndPointer())
if (isSmall()) {
for (const void **APtr = SmallArray, **E = SmallArray + NumNonEmpty;
APtr != E; ++APtr) {
if (*APtr == Ptr) {
*APtr = SmallArray[--NumNonEmpty];
incrementEpoch();
return true;
}
}
return false;
}

const void **Loc = const_cast<const void **>(P);
assert(*Loc == Ptr && "broken find!");
*Loc = getTombstoneMarker();
auto *Bucket = FindBucketFor(Ptr);
if (*Bucket != Ptr)
return false;

*const_cast<const void **>(Bucket) = getTombstoneMarker();
NumTombstones++;
// Treat this consistently from an API perspective, even if we don't
// actually invalidate iterators here.
incrementEpoch();
return true;
}

Expand All @@ -192,9 +194,9 @@ class SmallPtrSetImplBase : public DebugEpochBase {
return EndPointer();
}

private:
bool isSmall() const { return CurArray == SmallArray; }

private:
std::pair<const void *const *, bool> insert_imp_big(const void *Ptr);

const void * const *FindBucketFor(const void *Ptr) const;
Expand Down Expand Up @@ -351,17 +353,46 @@ class SmallPtrSetImpl : public SmallPtrSetImplBase {
return insert(Ptr).first;
}

/// erase - If the set contains the specified pointer, remove it and return
/// true, otherwise return false.
/// Remove pointer from the set.
///
/// Returns whether the pointer was in the set. Invalidates iterators if
/// true is returned. To remove elements while iterating over the set, use
/// remove_if() instead.
bool erase(PtrType Ptr) {
return erase_imp(PtrTraits::getAsVoidPointer(Ptr));
}

/// Remove elements that match the given predicate. Returns whether anything
/// was removed.
/// Remove elements that match the given predicate.
///
/// This method is a safe replacement for the following pattern, which is not
/// valid, because the erase() calls would invalidate the iterator:
///
/// for (PtrType *Ptr : Set)
/// if (Pred(P))
/// Set.erase(P);
///
/// Returns whether anything was removed. It is safe to read the set inside
/// the predicate function. However, the predicate must not modify the set
/// itself, only indicate a removal by returning true.
template <typename UnaryPredicate>
bool remove_if(UnaryPredicate P) {
Comment on lines 377 to 378
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably document what happens to iterators to explain what the difference is compared to (a sequence of) erase.

bool Removed = false;
if (isSmall()) {
const void **APtr = SmallArray, **E = SmallArray + NumNonEmpty;
while (APtr != E) {
PtrType Ptr = PtrTraits::getFromVoidPointer(const_cast<void *>(*APtr));
if (P(Ptr)) {
*APtr = *--E;
--NumNonEmpty;
incrementEpoch();
Removed = true;
} else {
++APtr;
}
}
return Removed;
}

for (const void **APtr = CurArray, **E = EndPointer(); APtr != E; ++APtr) {
const void *Value = *APtr;
if (Value == getTombstoneMarker() || Value == getEmptyMarker())
Expand All @@ -370,6 +401,7 @@ class SmallPtrSetImpl : public SmallPtrSetImplBase {
if (P(Ptr)) {
*APtr = getTombstoneMarker();
++NumTombstones;
incrementEpoch();
Removed = true;
}
}
Expand Down
39 changes: 0 additions & 39 deletions llvm/unittests/ADT/SmallPtrSetTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -243,45 +243,6 @@ TEST(SmallPtrSetTest, SwapTest) {
EXPECT_TRUE(a.count(&buf[3]));
}

void checkEraseAndIterators(SmallPtrSetImpl<int*> &S) {
int buf[3];

S.insert(&buf[0]);
S.insert(&buf[1]);
S.insert(&buf[2]);

// Iterators must still be valid after erase() calls;
auto B = S.begin();
auto M = std::next(B);
auto E = S.end();
EXPECT_TRUE(*B == &buf[0] || *B == &buf[1] || *B == &buf[2]);
EXPECT_TRUE(*M == &buf[0] || *M == &buf[1] || *M == &buf[2]);
EXPECT_TRUE(*B != *M);
int *Removable = *std::next(M);
// No iterator points to Removable now.
EXPECT_TRUE(Removable == &buf[0] || Removable == &buf[1] ||
Removable == &buf[2]);
EXPECT_TRUE(Removable != *B && Removable != *M);

S.erase(Removable);

// B,M,E iterators should still be valid
EXPECT_EQ(B, S.begin());
EXPECT_EQ(M, std::next(B));
EXPECT_EQ(E, S.end());
EXPECT_EQ(std::next(M), E);
}

TEST(SmallPtrSetTest, EraseTest) {
// Test when set stays small.
SmallPtrSet<int *, 8> B;
checkEraseAndIterators(B);

// Test when set grows big.
SmallPtrSet<int *, 2> A;
checkEraseAndIterators(A);
}

// Verify that dereferencing and iteration work.
TEST(SmallPtrSetTest, dereferenceAndIterate) {
int Ints[] = {0, 1, 2, 3, 4, 5, 6, 7};
Expand Down
Loading