-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Runtime] Change protocol conformance scanning to use a concurrent array rather than a locked vector. #16254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Runtime] Change protocol conformance scanning to use a concurrent array rather than a locked vector. #16254
Conversation
…ray rather than a locked vector. rdar://problem/37173156
@swift-ci please benchmark |
Build comment file:Optimized (O)Regression (2)
Improvement (8)
No Changes (414)
Unoptimized (Onone)Regression (7)
Improvement (56)
No Changes (361)
Hardware Overview
|
include/swift/Runtime/Concurrent.h
Outdated
|
||
Storage *allocate(size_t capacity) { | ||
auto size = sizeof(Storage) + (capacity - 1) * sizeof(Storage().Elem); | ||
auto *ptr = reinterpret_cast<Storage *>(malloc(size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth checking if malloc
failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely!
include/swift/Runtime/Concurrent.h
Outdated
auto *storage = Elements.load(std::memory_order_relaxed); | ||
if (storage == nullptr) { | ||
storage = allocate(16); | ||
Capacity = 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a seqcst
write? (And below too, and the various reads of Capacity
and storage->Count
) In fact, I suspect Capacity
doesn't need to be atomic at all, since it's protected by WriterLock
like FreeList
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I really have no idea why I made Capacity
atomic. I blame my cat.
include/swift/Runtime/Concurrent.h
Outdated
ScopedLock guard(WriterLock); | ||
|
||
auto *storage = Elements.load(std::memory_order_relaxed); | ||
if (storage == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two branches could possibly be combined:
auto count = storage ? storage->Count : 0;
if (count >= Capacity) {
auto newCapacity = min(16, count * 2);
auto *newStorage = allocate(newCapacity);
if (storage) {
std::copy(storage->data(), storage->data() + count, newStorage->data());
FreeList.push_back(storage);
}
storage = newStorage;
Capacity = newCapacity;
Elements.store(storage, std::memory_order_release);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good, better than my way.
Should this also be benchmarked on Linux? |
…k for malloc failure. Use placement new when appending values. rdar://problem/37173156
Benchmarking on Linux would be good, but it seems we don't have automation set up for that. |
@swift-ci please test |
Build failed |
Build failed |
include/swift/Runtime/Concurrent.h
Outdated
if (count >= Capacity) { | ||
auto newCapacity = std::max((size_t)16, count * 2); | ||
auto *newStorage = allocate(newCapacity); | ||
if (storage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch needs to set newStorage->Count
.
include/swift/Runtime/Concurrent.h
Outdated
Elements.store(storage, std::memory_order_release); | ||
} | ||
|
||
auto Count = storage->Count.load(std::memory_order_relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be reloaded? I think the other count
will have the same value purpose, because there can't be concurrent modifications to storage->Count
.
include/swift/Runtime/Concurrent.h
Outdated
|
||
decltype(f(nullptr, 0)) result = f(ptr, count); | ||
|
||
ReaderCount.fetch_sub(1, std::memory_order_relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression is that this should be release
(and the load
in push_back
above an acquire
), so that there's a barrier for f
s reads and thus ensure that they have finished before calling deallocate
? Or am I missing something?
Hm, now that I think about it, the fetch_add
on ReaderCount
above should probably also be acquire
, because I don't think there's anything stopping that being moved all the way past the Count
load and past the f
call. (I may be misremembering something special about atomic accesses before acquire
ones, though.)
…rentReadableArray. Remove a redundant load of Count. Fix memory ordering on ReaderCount. rdar://problem/37173156
@huonw Thanks, I'm finally coming back to this and I fixed the stuff you pointed out. I think your evaluation of the memory ordering is correct. I would greatly appreciate if you could double-check the fixes in case I messed anything else up. This stuff is sorta hard! |
@swift-ci please test |
Build failed |
include/swift/Runtime/Concurrent.h
Outdated
Mutex WriterLock; | ||
std::vector<Storage *> FreeList; | ||
|
||
Storage *allocate(size_t capacity) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocate
should be a static function of struct Storage. We don't want it to touch any existing fields.
include/swift/Runtime/Concurrent.h
Outdated
return ptr; | ||
} | ||
|
||
void deallocate(Storage *storage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deallocate
should be a member of struct Storage
. We don't want it to access the fields of any ConcurrentReadableArray
.
/// An append-only array that can be read without taking locks. Writes | ||
/// are still locked and serialized, but only with respect to other | ||
/// writes. | ||
template <class ElemTy> struct ConcurrentReadableArray { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type should be no-assign no-copy no-delete. (I don't remember how those are expressed in Swift's C++ code.)
include/swift/Runtime/Concurrent.h
Outdated
/// `read` was called. The pointer becomes invalid after `f` returns. | ||
template <class F> auto read(F f) -> decltype(f(nullptr, 0)) { | ||
ReaderCount.fetch_add(1, std::memory_order_acquire); | ||
auto *storage = Elements.load(std::memory_order_consume); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that current compilers implement memory_order_consume
as memory_order_acquire
, which is unnecessarily expensive on some CPUs. If this code is sufficiently performance-sensitive then it might need to use SWIFT_MEMORY_ORDER_CONSUME
instead. (That macro expands to memory_order_relaxed
and we (1) abandon DEC Alpha compatibility and (2) cross our fingers that the compiler doesn't break anything.)
include/swift/Runtime/Concurrent.h
Outdated
ReaderCount.fetch_add(1, std::memory_order_acquire); | ||
auto *storage = Elements.load(std::memory_order_consume); | ||
auto count = storage->Count.load(std::memory_order_acquire); | ||
auto *ptr = storage->data(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think making this const auto *ptr
would force the reader function to use a pointer to const
elements, which would help prevent thread-unsafe writes inside the reader function.
…s into Storage. Mark ConcurrentReadableArray as un-copyable, un-assignable, and un-movable. Use SWIFT_MEMORY_ORDER_CONSUME instead of std::memory_order_consume. Make read() pass a const pointer. rdar://problem/37173156
@swift-ci please test |
Build failed |
Build failed |
This is a different approach to #16219. We'll race them and see who wins.
rdar://problem/37173156