Skip to content

[stdlib] ManagedBuffer independent of malloc_size. #31686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 10, 2020

Conversation

3405691582
Copy link
Member

ManagedBuffer and ManagedBufferPointer no longer requires having a
functional malloc_size implementation to work. When malloc_size is
unavailable, we store the capacity in the tail allocation used by either
class.

Adding extra stored properties is not feasible, since there are
preconditions and invariants that check for these. This would also mean
that platforms which do have malloc_size would be wasting an unnecessary
stored property per class.

Wrapping the header object is not feasible, since there are certain
assumptions that the header object (and thus not the wrapper) and the
tail-allocated memory are adjacent.

Creating a static dictionary to track allocations and their sizes would
be possible, but this seems too brittle; any allocation tracking would
have to be static since ManagedBuffer and ManagedBufferPointer are
implemented with extensions, restricting us to only computed properties.

Thus, we are left with having to use the dynamic allocation in
ManagedBuffer and ManagedBufferPointer itself. This leaves us with one
gap: the single-argument initializer in ManagedBufferPointer,
init(unsafeBufferObject:). This is fine when called from
init(bufferClass:minimumCapacity:makingHeaderWith:), but if only
initialized with the single-argument initializer, the base capacity
implementation will not be correct. This may be okay, since generally
clients subclassing ManagedBufferPointer should override capacity for
their own tracking; indeed, this is what the unit test actually does.
Furthermore, it is the caller's responsibility to supply enough memory
to store the allocated elements in any case; the way the initializer is
written currently, we have no other way of knowing the size of the
allocation from the caller without malloc_size.

@3405691582
Copy link
Member Author

3405691582 commented May 10, 2020

As a side note, this version of the commit converts to UnsafeMutableRawPointers and does the store/load for the capacity that way rather than using UnsafeMutablePointer<Int> and initialize and pointee, because the latter trips an assertion getValue() && "called on object with zero alignment". I don't quite understand why that's happening, but converting to UnsafeMutableRawPointers works, so I've left it like that.

@3405691582 3405691582 force-pushed the ManagedBuffer_WithoutMallocSize branch from 5011e31 to 3da070a Compare July 18, 2020 20:25
@theblixguy theblixguy requested a review from lorentey July 18, 2020 20:27
@3405691582
Copy link
Member Author

3405691582 commented Jul 18, 2020

While this was awaiting review, the strange issue with load/store doesn't seem to be recurring any more, so those have been reverted back to more direct pointee accesses. Some more idiomatic tweaks were also made.

@3405691582
Copy link
Member Author

Ping; please let me know if there's anything that can be improved here, I know this is rather complex but it is necessary for stdlib to build on platforms without malloc_size.

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding extra stored properties is not feasible, since there are
preconditions and invariants that check for these. This would also mean
that platforms which do have malloc_size would be wasting an unnecessary
stored property per class.

I think we should really consider making capacity an actual stored property on these platforms -- trying to emulate one based on a (more or less) runtime condition seems highly error-prone to me. The easiest way to go about it would be to add #if os(openBSD) conditionals, augmented with _debugPrecondition([!]_swift_stdlib_has_malloc_size()) where appropriate to detect bit rot.

(The preconditions/tests/etc will need to be updated to follow the same logic.)

This would be a great usecase for a compile time #if _has_feature(mallocSize) feature. (Along with the queries for pointer size etc.) Alas, we don't have it...

On a higher level, I wonder if ManagedBuffer's API is the right one for these platforms -- it pretty much forces people to store the capacity in the header, which leads to duplication on these platforms. What if we deprecated capacity instead, and added an alternative factory method that passes in the actual capacity?

}

@inlinable
internal final var capacityAddress: UnsafeMutablePointer<Int>? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small styling nit: Could you please add a leading underscore to this name? (Per the Leading Underscore Rule.)

(The other entry points here don't follow that convention. We can't change those, but we can make sure new additions are correct.)

let offset: Int = _headerOffset + MemoryLayout<Header>.size + increment
return _roundUp(
offset,
toAlignment: MemoryLayout<Element>.alignment)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the no-mallocsize case, this calculation does not take the capacity's alignment into account, so I believe it may produce incorrect results. Consider the case where MemoryLayout<Header>.size == 1 and MemoryLayout<Element>.alignment == 1:

let offset: Int = 16 + 1 + 8
_roundUp(25, toAlignment: 1) == 25

But the first element would be stored at offset 32.

@lorentey
Copy link
Member

lorentey commented Aug 4, 2020

What if we deprecated capacity instead, and added an alternative factory method that passes in the actual capacity?

Expanding on this: we should not store the capacity anywhere on OpenBSD. Rather, ManagedBuffer.capacity should be unavailable there (and deprecated on existing platforms). To emulate its primary usecase, we should add a new public create overload that gets rid of the inherent safety issue of the existing one (as well as fixing its return type):

extension ManagedBuffer {
  public class func create(minimumCapacity: Int, makingHeaderWith factory: (UnsafeBufferPointer<Element>) throws -> Header) rethrows -> Self
}

This isn't a one-to-one replacement to the existing create, but it's very close -- it requires a malloc_size call (where available) while in the original this was controlled by the closure. Adding another overload would fix that:

extension ManagedBuffer {
  public class func create(exactCapacity: Int, makingHeaderWith factory: (UnsafeBufferPointer<Element>) throws -> Header) rethrows -> Self
}

At the same time, we should deprecate the existing create method.

ManagedBufferPointer would need the same sort of updates.

This would require a pitch to the evolution forums and a subsequent proposal. It seems like a relatively straightforward change that improves managed buffer usability everywhere -- but the deprecations could be controversial.

@3405691582
Copy link
Member Author

In the case where such a pitch would be rejected, would we then adopt this strategy (pending the #if os(OpenBSD) changes you mentioned)?

@lorentey
Copy link
Member

lorentey commented Aug 5, 2020

Well, the current APIs implicitly assume that malloc_size (or equivalent) exist, so I think the proposal can make a good case for their replacement. The options as I see them are:

  1. Make ManagedBuffer and ManagedBufferPointer unavailable on OpenBSD. (Feasible but unsatisfying)
  2. On os(OpenBSD), make capacity a stored property. (Unnecessarily wastes memory on OpenBSD, isn't really an option for ManagedBufferPointer)
  3. Tail-allocate the capacity based on a runtime condition, as implemented in this PR. (This wastes memory on OpenBSD and has a potential to affect codegen on all platforms, especially in debug builds.)
  4. Tail-allocate the capacity based on a compile-time conditional. (Wastes memory, but doesn't affect other platforms.)
  5. Fix the managed buffer APIs to be implementable on all platforms. (Requires an evolution proposal.)

In the (unlikely, but technically possible) case that #5 gets completely rejected, #4 would probably be the least disruptive option.

@lorentey
Copy link
Member

lorentey commented Aug 5, 2020

I started a quick pitch topic at https://forums.swift.org/t/ironing-out-managedbuffer-api-wrinkles/39072. Let's see what people think!

@lorentey
Copy link
Member

lorentey commented Aug 5, 2020

A pragmatic stop-gap solution would be to simply mark ManagedBuffer.capacity and ManagedBufferPointer.capacity as unavailable on OpenBSD. Not sure why I didn't think of this above -- this may cause some source compatibility headaches, but it seems like the least bad option overall.

@3405691582
Copy link
Member Author

3405691582 commented Aug 5, 2020

Marking ManagedBufferPointer.capacity unavailable unfortunately doesn't compile, since .capacity is used in ManagedBufferPointer.init (quoting, ManagedBufferPointer(unsafeBufferObject: $0).capacity).

(We also can't make the entirety of ManagedBufferPointer unavailable since BridgingBuffer utilizes it.)

@3405691582 3405691582 force-pushed the ManagedBuffer_WithoutMallocSize branch from 3da070a to c82103c Compare August 21, 2020 22:49
@3405691582
Copy link
Member Author

3405691582 commented Aug 21, 2020

Actually, I wasn't quite testing the @available API right. We do need to introduce a new availability platform symbol to flag the capacity methods however, then we can mark the relevant parts of this as unavailable as you suggest. Here are two commits instead to do just that.

@3405691582 3405691582 force-pushed the ManagedBuffer_WithoutMallocSize branch 2 times, most recently from 9a07af2 to 41739f6 Compare August 22, 2020 14:17
@3405691582
Copy link
Member Author

Ping; let me know if we should proceed with the @available stopgap as the PR stands (this would get a successful build on OpenBSD).

@benrimmington
Copy link
Contributor

@swift-ci Please test

@swift-ci

This comment has been minimized.

@benrimmington

This comment has been minimized.

@swift-ci

This comment has been minimized.

@3405691582 3405691582 force-pushed the ManagedBuffer_WithoutMallocSize branch from 41739f6 to f248233 Compare September 9, 2020 22:48
@3405691582
Copy link
Member Author

Fixed, sorry about that. Please try again.

(Dispatch isn't properly ported for BSD, so it's not built.)

@benrimmington
Copy link
Contributor

@swift-ci Please test

@swift-ci

This comment has been minimized.

@swift-ci

This comment has been minimized.

In swiftlang#31686 changes were introduced to ensure that capacity was stored in
the ManagedBuffer allocation, and @lorentey sugested that as a stopgap
measure for addressing the lack of platform malloc introspection on
OpenBSD, we use Swift availability attributes instead on the relevant
parts of ManagedBuffer and friends.

Since platform availability symbols must be specifically set up to be
used, this commit does so in advance of the above change.
On OpenBSD, malloc introspection (e.g., malloc_usable_size or
malloc_size) is not provided by the platform allocator. Since allocator
introspection is currently a load-bearing piece of functionality for
ManagedBuffer and ManagedBufferPointer, pending any API changes, as a
stopgap measure, this commit marks methods in ManagedBuffer and
ManagedBufferPointer calling _swift_stdlib_malloc_size and methods
dependent thereon unavailable on OpenBSD.

This may induce some compatibility issues for some files, but at least
this change ensures that we can get stdlib to build on this platform
until the evolution process addresses this problem more thoroughly.
@3405691582 3405691582 force-pushed the ManagedBuffer_WithoutMallocSize branch from f248233 to cd7570f Compare September 9, 2020 22:58
@3405691582
Copy link
Member Author

3405691582 commented Sep 10, 2020

The CI runs look good after checking manually (I had to push again because I messed up the amend, but there were no actual changes).

@benrimmington

This comment has been minimized.

@benrimmington
Copy link
Contributor

@swift-ci Please smoke test

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! The ManagedBufferPointer initializer could be left available (with an #if os(OpenBSD) check to pass the original count to the closure), but I think we can leave that for another day!

@lorentey lorentey merged commit a084d04 into swiftlang:master Sep 10, 2020
3405691582 added a commit to 3405691582/swift that referenced this pull request Dec 11, 2020
In swiftlang#31686 changes were introduced to ensure that capacity was stored in
the ManagedBuffer allocation, and @lorentey sugested that as a stopgap
measure for addressing the lack of platform malloc introspection on
OpenBSD, we use Swift availability attributes instead on the relevant
parts of ManagedBuffer and friends.

Since platform availability symbols must be specifically set up to be
used, this commit does so in advance of the above change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants