Skip to content

[ADT] Use range-based helper functions in SmallSet #108585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
77 changes: 30 additions & 47 deletions llvm/include/llvm/ADT/SmallSet.h
Original file line number Diff line number Diff line change
Expand Up @@ -46,33 +46,33 @@ class SmallSetIterator
VecIterTy VecIter;
};

bool isSmall;
bool IsSmall;

public:
SmallSetIterator(SetIterTy SetIter) : SetIter(SetIter), isSmall(false) {}
SmallSetIterator(SetIterTy SetIter) : SetIter(SetIter), IsSmall(false) {}

SmallSetIterator(VecIterTy VecIter) : VecIter(VecIter), isSmall(true) {}
SmallSetIterator(VecIterTy VecIter) : VecIter(VecIter), IsSmall(true) {}

// Spell out destructor, copy/move constructor and assignment operators for
// MSVC STL, where set<T>::const_iterator is not trivially copy constructible.
~SmallSetIterator() {
if (isSmall)
if (IsSmall)
VecIter.~VecIterTy();
else
SetIter.~SetIterTy();
}

SmallSetIterator(const SmallSetIterator &Other) : isSmall(Other.isSmall) {
if (isSmall)
SmallSetIterator(const SmallSetIterator &Other) : IsSmall(Other.IsSmall) {
if (IsSmall)
VecIter = Other.VecIter;
else
// Use placement new, to make sure SetIter is properly constructed, even
// if it is not trivially copy-able (e.g. in MSVC).
new (&SetIter) SetIterTy(Other.SetIter);
}

SmallSetIterator(SmallSetIterator &&Other) : isSmall(Other.isSmall) {
if (isSmall)
SmallSetIterator(SmallSetIterator &&Other) : IsSmall(Other.IsSmall) {
if (IsSmall)
VecIter = std::move(Other.VecIter);
else
// Use placement new, to make sure SetIter is properly constructed, even
Expand All @@ -83,11 +83,11 @@ class SmallSetIterator
SmallSetIterator& operator=(const SmallSetIterator& Other) {
// Call destructor for SetIter, so it gets properly destroyed if it is
// not trivially destructible in case we are setting VecIter.
if (!isSmall)
if (!IsSmall)
SetIter.~SetIterTy();

isSmall = Other.isSmall;
if (isSmall)
IsSmall = Other.IsSmall;
if (IsSmall)
VecIter = Other.VecIter;
else
new (&SetIter) SetIterTy(Other.SetIter);
Expand All @@ -97,34 +97,34 @@ class SmallSetIterator
SmallSetIterator& operator=(SmallSetIterator&& Other) {
// Call destructor for SetIter, so it gets properly destroyed if it is
// not trivially destructible in case we are setting VecIter.
if (!isSmall)
if (!IsSmall)
SetIter.~SetIterTy();

isSmall = Other.isSmall;
if (isSmall)
IsSmall = Other.IsSmall;
if (IsSmall)
VecIter = std::move(Other.VecIter);
else
new (&SetIter) SetIterTy(std::move(Other.SetIter));
return *this;
}

bool operator==(const SmallSetIterator &RHS) const {
if (isSmall != RHS.isSmall)
if (IsSmall != RHS.IsSmall)
return false;
if (isSmall)
if (IsSmall)
return VecIter == RHS.VecIter;
return SetIter == RHS.SetIter;
}

SmallSetIterator &operator++() { // Preincrement
if (isSmall)
VecIter++;
if (IsSmall)
++VecIter;
else
SetIter++;
++SetIter;
return *this;
}

const T &operator*() const { return isSmall ? *VecIter : *SetIter; }
const T &operator*() const { return IsSmall ? *VecIter : *SetIter; }
};

/// SmallSet - This maintains a set of unique values, optimizing for the case
Expand All @@ -139,10 +139,6 @@ class SmallSet {
SmallVector<T, N> Vector;
std::set<T, C> Set;

using VIterator = typename SmallVector<T, N>::const_iterator;
using SIterator = typename std::set<T, C>::const_iterator;
using mutable_iterator = typename SmallVector<T, N>::iterator;

// In small mode SmallPtrSet uses linear search for the elements, so it is
// not a good idea to choose this value too high. You may consider using a
// DenseSet<> instead if you expect many elements in the set.
Expand All @@ -163,14 +159,7 @@ class SmallSet {
}

/// count - Return 1 if the element is in the set, 0 otherwise.
size_type count(const T &V) const {
if (isSmall()) {
// Since the collection is small, just do a linear search.
return vfind(V) == Vector.end() ? 0 : 1;
} else {
return Set.count(V);
}
}
size_type count(const T &V) const { return contains(V) ? 1 : 0; }

/// insert - Insert an element into the set if it isn't already there.
/// Returns a pair. The first value of it is an iterator to the inserted
Expand All @@ -182,7 +171,7 @@ class SmallSet {
return std::make_pair(const_iterator(I), Inserted);
}

VIterator I = vfind(V);
auto I = llvm::find(Vector, V);
if (I != Vector.end()) // Don't reinsert if it already exists.
return std::make_pair(const_iterator(I), false);
if (Vector.size() < N) {
Expand All @@ -207,11 +196,12 @@ class SmallSet {
bool erase(const T &V) {
if (!isSmall())
return Set.erase(V);
for (mutable_iterator I = Vector.begin(), E = Vector.end(); I != E; ++I)
if (*I == V) {
Vector.erase(I);
return true;
}

auto It = llvm::find(Vector, V);
if (It != Vector.end()) {
Comment on lines +200 to +201
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is not used outside of the if

Suggested change
auto It = llvm::find(Vector, V);
if (It != Vector.end()) {
if (auto It = llvm::find(Vector, V); It != Vector.end()) {

Vector.erase(It);
return true;
}
return false;
}

Expand All @@ -235,19 +225,12 @@ class SmallSet {
/// Check if the SmallSet contains the given element.
bool contains(const T &V) const {
if (isSmall())
return vfind(V) != Vector.end();
return Set.find(V) != Set.end();
return llvm::is_contained(Vector, V);
return llvm::is_contained(Set, V);
Copy link
Contributor

@nikic nikic Sep 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not use is_contained with types that do not require linear scan. This should be return Set.contains(V).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. I've checked and llvm::is_contained indeed does a linear scan on a std::set because the latter does not have a contains method before C++20.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect it to call set::find:

else if constexpr (detail::HasMemberFind<R, E>)
return Range.find(Element) != Range.end();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. @kuhar is right. It calls std::set::find which is as good as it gets with std::set.

Copy link
Contributor

@nikic nikic Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter what it actually calls, we should never use is_contained() for set structures, unless there is some very strong reason (e.g. generic context that really does not care whether it gets an O(1) set or O(n) set).

Copy link
Collaborator

@dwblaikie dwblaikie Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This advice seems at least inconsistent with the intent of the code per the comment:

This is intended as the canonical way to check if an
element exists in a range in generic code or range type that does not
expose a `.contains(Element)` member.

In this case, the set meets the second requirement - that it does not expose a contains function? (in C++17, at least - it is in C++20, but LLVM isn't using that yet)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to have an abstraction over contains() and find() != end(), I'd suggest adding another helper, which does not also include the linear scan case. Though I don't think this is really worthwhile, as the existing find() != end() code works fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think is_contained is more legible than find!=end and seems fine to use here & I agree with the comment on is_contained suggesting it for uses like this.

}

private:
bool isSmall() const { return Set.empty(); }

VIterator vfind(const T &V) const {
for (VIterator I = Vector.begin(), E = Vector.end(); I != E; ++I)
if (*I == V)
return I;
return Vector.end();
}
};

/// If this set is of pointer values, transparently switch over to using
Expand Down
Loading