-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[stdlib] Implement _copyContents on internal Array types #37914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Testing all these new implementations is a challenge, as some of them may not be actually reachable from public APIs. The tests I added only cover One easy way to exercise these would be to go and replace existing whole-buffer copies calling The ideal thing would be to validate the Sequence (and RandomAccessCollection / MutableCollection) conformances of these internal types by running them through the protocol validation suite. Unfortunately, that is probably a bigger task that can fit into a simple bugfix. (For one thing, these internal types seem replete with intentional protocol violations; for another, I don't think the validation suite covers "recent" API additions like wCSIA very well -- as evidenced by this bug slipping through.) One practical thing we could do is to add special test methods to debug builds of the stdlib, and write dedicated tests that exercise the new methods through them. @glessard, @atrick Do you have better suggestions? (If not, I may just go with this idea.) |
@swift-ci test |
Many thanks @lorentey for fixing this and working around the backward deployment mess. I don't see any issues with your fixes. A "catastrophic" performance bug is one in which the user experiences hang rather than a crash. Generally considered worse than a crash. You've avoided that just by avoiding the generic Sequence iterator. So I'd say problem solved. Although, to be fair, repeated lazily bridging is far less likely than the situations where this currently crashes. If we're going to have robust APIs on internal types--instead of fatal errors for untested code paths--then I think we need internal test methods at least. I don't have any suggestions on using the stdlib collections unit test drivers, because frankly I never understood the intention behind how they're organized. |
I agree with specifically testing debug builds; I can't think of anything more practical. The changes look good. |
OK, I'll add a test method on Array/ArraySlice (protected by INTERNAL_CHECKS_ENABLED) that patches through to the buffer's _copyContents (say, by copying its contents to a freshly allocated array).
Interesting -- I wonder how calling through let array = bridgedFromObjC()
for i in 0 ..< array.count {
array.withUnsafeBufferPointer { buffer in print(buffer[i]) }
} Unfortunately switching to a more efficient method of copying out all elements won't get rid of this problem, just push it towards slightly higher element counts. 🤔 For contiguous NSArray subclasses, we are already using a fast enumeration trick to retrieve the underlying storage address in the range subscript -- hooking the same trick into the A trivial step towards mitigation would be to document that |
Ah, that wouldn't work in cases where we need to do deferred type checking. Not sure if |
@lorentey IIRC the original "hang" had nothing to do with verbatim-bridged arrays or non-linear scaling. It was simply that using the generic Sequence iterator was 10k times slower than advancing a pointer. So a program that was supposed to take milliseconds took minutes. I think you would only see the issue with lots of repeated array copying. I'm not sure if we care about performance when exposing a contiguous buffer for a verbatim-bridged array. Will that happen repeatedly without forcing it into a native representation? It seems the workaround for both that performance problem and for the crash that this PR fixes is forcing the array into a native representation. |
`_copyContents(initializing:)` is a core method of Sequence, and it is used surprisingly often to copy stuff out of sequences. Array’s internal types currently have explicit implementations of it that trap (to prevent a performance bug due to the default iterator-based implementation. This has proved a bad idea, as not all code paths that end up calling `_copyContents` have actually been expunged — so we replaced a performance bug with a catastrophic correctness bug. 😥 Rather than trying to play whack-a-mole with code paths that end up in `_copyContents`, replace the traps with (relatively) efficient implementations, based on the ancient `_copyContents(subRange:initializing)` methods that have already been there all this time. This resolves https://bugs.swift.org/browse/SR-14663. I expect specialization will make this fix deploy back to earlier OSes in most (but unfortunately not all) cases.
4027deb
to
466e26a
Compare
@swift-ci test |
(AFAIK the PR tests won't run the new ArrayBuffer_CopyContents test, as it requires a stdlib with assertions enabled. It did pass my local testing and I verified it exercises all new |
_copyContents(initializing:)
is a core method of Sequence, and it is used surprisingly often to copy stuff out of sequences. Array’s internal types currently have explicit implementations of it that trap (to prevent a “catastrophic” performance bug due to the default iterator-based implementation. This has proved a bad idea, as not all code paths that end up calling_copyContents
have actually been expunged — so we replaced a performance bug with a catastrophic correctness bug. 😥Rather than trying to play whack-a-mole with code paths that end up in
_copyContents
, replace the traps with (relatively) efficient implementations, based on the ancient_copyContents(subRange:initializing)
methods that have already been there all this time.This resolves https://bugs.swift.org/browse/SR-14663 a.k.a. rdar://78640103.
I expect specialization will make this fix deploy back to earlier OSes in most (but unfortunately not all) cases.