[Runtime] Fix incorrect memory ordering in ConcurrentReadableArray/HashMap #35348
Conversation
@swift-ci please test
@swift-ci please test windows platform
Build failed
For whatever it may be worth, I think this bug would have been avoided if push_back were made fully concurrent instead of using a mutex. That is easy to say, of course, because that kind of redesign would be a lot of work.
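To illustrate what a fully concurrent push_back could look like, here is a minimal sketch, not the Swift runtime's code: a hypothetical fixed-capacity, append-only array (the names `LockFreeAppendArray`, `ReservedCount`, and `PublishedCount` are invented for this example). Writers claim a slot with `fetch_add` and publish it with a release store, so no mutex is needed; a real redesign would also have to handle growing the storage, which is where most of the work the comment alludes to would be.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical sketch: a fixed-capacity append-only array with a
// lock-free push_back. Writers claim a slot, write the element, then
// publish the new count with a release store so readers that
// acquire-load the count see fully written elements.
template <class ElemTy, std::size_t Capacity>
struct LockFreeAppendArray {
  std::array<ElemTy, Capacity> Storage;
  std::atomic<std::size_t> ReservedCount{0};   // slots claimed by writers
  std::atomic<std::size_t> PublishedCount{0};  // slots visible to readers

  bool push_back(const ElemTy &elem) {
    // Claim a unique slot index. relaxed is fine: the slot is private
    // to this writer until it is published below.
    std::size_t idx = ReservedCount.fetch_add(1, std::memory_order_relaxed);
    if (idx >= Capacity)
      return false; // full; a real design would reallocate here
    Storage[idx] = elem;
    // Publish counts in order: wait until all earlier slots are
    // published, then release-store idx + 1 so readers see the element.
    std::size_t expected = idx;
    while (!PublishedCount.compare_exchange_weak(
        expected, idx + 1, std::memory_order_release,
        std::memory_order_relaxed))
      expected = idx;
    return true;
  }

  std::size_t snapshotCount() const {
    return PublishedCount.load(std::memory_order_acquire);
  }
};
```

Readers load `snapshotCount()` once and index only below it, so they never observe a partially written element.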
```diff
@@ -527,7 +527,7 @@ template <class ElemTy> struct ConcurrentReadableArray {
       storage = newStorage;
       Capacity = newCapacity;
-      Elements.store(storage, std::memory_order_release);
+      Elements.store(storage, std::memory_order_seq_cst);
```
I think the fix is correct but the reasoning doesn't feel right to me at all. The problem isn't that the readers are picking up the old value but that the writer isn't reasoning correctly about whether there are readers or not. After all, this proposed fix is on the writer side of the design.
On x86, the only C++11 memory model hint that matters is "sequential consistency", and even then, it only matters for stores. There are two reasons for this: 1) atomic read-modify-write x86 instructions (i.e. "LOCK prefixed") don't support anything other than sequential consistency, and 2) the only reordering otherwise allowed by x86's "total store order" memory model is newer loads being moved ahead of older stores.
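That store-load reordering is exactly what the classic "store buffering" litmus test exercises. As a sketch (not code from this PR): with seq_cst on all four operations, the two stores cannot be reordered after the later loads, so it is impossible for both threads to read 0; with release/acquire or relaxed ordering, both reading 0 is a permitted outcome.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test. With memory_order_seq_cst throughout,
// the outcome r1 == 0 && r2 == 0 is forbidden: each thread's store is
// ordered before its own load in the single total order of seq_cst
// operations, so at least one load must observe the other's store.
std::atomic<int> x{0}, y{0};
int r1 = -1, r2 = -1;

void threadA() {
  x.store(1, std::memory_order_seq_cst);
  r1 = y.load(std::memory_order_seq_cst);
}

void threadB() {
  y.store(1, std::memory_order_seq_cst);
  r2 = x.load(std::memory_order_seq_cst);
}
```

Weakening the stores to `memory_order_release` would allow each load to move ahead of the other thread's store, which is the same class of reordering the release-ordered `Elements.store` permitted here.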
By switching Elements.store(...) to sequential consistency, we're forcing ReaderCount.load() to happen after the store.
I think if one is going to use std::memory_order_seq_cst, then one really ought to document, in a comment, at least one example of an actual subsequent load in the same thread that must come after the given store.
Here is the ASCII art of what I think is happening:
Effective program order due to out-of-order loads.
Writer | Reader
-----------------------+--------------------
............ ReaderCount == 0 ..............
|
-> ReaderCount.load() |
/ |
| | incrementReaders()
| | storage = Elements.load()
| |
| Elements.store() |
\ |
*ReaderCount.load() |
Now the writer wrongly thinks there are zero readers of the old data.
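The writer-side pattern from the diagram can be sketched as follows. This is a simplified illustration with hypothetical names modeled on the PR (`Elements`, `ReaderCount`, `publishAndCheckNoReaders`), not the actual runtime code; it shows the kind of comment the review asks for, naming the subsequent load that must not move ahead of the store.

```cpp
#include <atomic>
#include <cstdint>

struct Storage { int dummy; };

std::atomic<Storage *> Elements{nullptr};
std::atomic<uint32_t> ReaderCount{0};

// Publish the new storage pointer, then check whether any readers are
// active. Returns true if it is safe to free the old storage.
bool publishAndCheckNoReaders(Storage *newStorage) {
  // seq_cst (not release): the ReaderCount.load() below must not be
  // reordered ahead of this store. Otherwise the writer could observe
  // zero readers, then publish, while a reader slips in between and
  // picks up the old pointer just before it is freed.
  Elements.store(newStorage, std::memory_order_seq_cst);
  return ReaderCount.load(std::memory_order_seq_cst) == 0;
}
```

With both operations seq_cst they participate in a single total order, so the load is guaranteed to come after the store, ruling out the interleaving in the diagram above.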
@swift-ci please test
When reallocating storage, the store of the pointer to the new storage had insufficiently strong ordering. This could cause writers to check the reader count before storing the new storage pointer. If a reader then came in between that load and store, it would end up using the old storage pointer, while the writer could end up freeing it.
Also adjust the Concurrent.cpp tests to test with a variety of reader and writer counts. Counterintuitively, when freeing garbage is gated on there being zero readers, having a single reader will shake out problems that having lots of readers will not. We were testing with lots of readers in order to stress the code as much as possible, but this resulted in it being extremely rare for writers to ever see zero active readers.
rdar://69798617