Skip to content

[WIP] [stdlib] Improve performance of Array initialization with known capacity #17311

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Jun 18, 2018

Currently the initialisation of an array with a known number of elements has two performance problems:

  • An _EmptyArrayStorage is allocated and directly released once the first element is added (rdar://41214707)
  • For each append, _ContiguousArrayStorage is retained and released when append is exited. Since the Array already holds a reference to its storage this shouldn't be necessary.

The fix

I've optimised the Collection.map and Sequence.map to directly allocate the required storage and write to its memory directly by calling to _allocateUninitialized instead of ContiguousArray.init().

@ahoppen
Copy link
Member Author

ahoppen commented Jun 18, 2018

@swift-ci Please smoke test

@airspeedswift
Copy link
Member

This doesn't seem like the right solution. I was under the impression that there was a single _EmptyArrayStorage instance singleton (here: https://github.com/apple/swift/blob/master/stdlib/public/core/ContiguousArrayBuffer.swift#L68) that would be used, so it shouldn't be causing allocations every time. If that's not working (or if I'm missing something and it doesn't work that way) maybe there's a fundamental fix, because this problem will also affect user code and outside the std lib you don't have this option.

@ahoppen
Copy link
Member Author

ahoppen commented Jun 20, 2018 via email

result.reserveCapacity(n)
let result: ContiguousArray<T>
var p: UnsafeMutablePointer<T>
(result, p) = ContiguousArray<T>._allocateUninitialized(count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you switch count to n here, in case a collection is super broken and returns a different count here compared to up above? Otherwise we might write past the buffer in the loop below.

@ahoppen ahoppen force-pushed the array-init-performance branch from e764b0f to 3727020 Compare June 24, 2018 05:49
@ahoppen
Copy link
Member Author

ahoppen commented Jun 24, 2018

@swift-ci Please smoke test

ahoppen added 2 commits June 24, 2018 07:12
AnySequence just forwards the call to map to its underlying storage
which used to be an Array (and thus a collection).
stride(from:through:by:) creates a proper Sequence.
@ahoppen ahoppen force-pushed the array-init-performance branch from 3727020 to fc36bed Compare June 24, 2018 14:14
@ahoppen
Copy link
Member Author

ahoppen commented Jun 24, 2018

@swift-ci Please smoke test

@airspeedswift
Copy link
Member

Why don't you think this is the right solution (I haven't looked into the test failure yet)? And would you know of a different way to mitigate the issue?

Like I said, this is a narrow fix that will only work within the standard library for a general problem that affects all users. We need to figure out a way of solving it for everyone not just paper over it. Either in code inside the library (find some way that starting with an empty array is free) or via changing the API (i.e. allowing one-shot create-and-reserve).

Once we have that solution, we can consider a localized temporary fix like this but we need to do it in that order.

@ahoppen
Copy link
Member Author

ahoppen commented Jun 25, 2018

@swift-ci Please build toolchain

@airspeedswift
Copy link
Member

So #17389 should provide what's needed – the ability to init with a capacity. This needs evolution approval, but maybe we can implement it with underscores and use it in the mean-time. In fact, it would be interesting to see if it yields a performance benefit to do use the full proposal to implement map and similar functions.

@ahoppen
Copy link
Member Author

ahoppen commented Jun 25, 2018

This seems to be exactly what I need. Closing this and I'll wait for @natecook1000's implementation.

@ahoppen ahoppen closed this Jun 25, 2018
@airspeedswift
Copy link
Member

cc @natecook1000 – looks like this would be a great candidate for proving out the changes you're proposing for one shot initialization. It might even make sense to land underscored implementations in master and then use them to implement map, wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants