-
Notifications
You must be signed in to change notification settings - Fork 1.2k
[string] Wean off of no-longer-supported String non-API #1484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci please test |
1 similar comment
@swift-ci please test |
@itaiferber, this code makes incorrect assumptions about Do you have any idea why these SPI are even here? Do they matter? If so, would you prefer a real API and/or SPI instead? Could you also take a look at the "SQUASH ME" fix, and let me know how best to stop using |
@swift-ci please test |
Do we have any replacement available? This is going to be a big performance loss. |
I just want to understand what you mean here. There are two interpretations that I see:
|
_fastCStringContents is specifically for CF bridging; CFStringGetCStringPtr is the fast path for accessing CFString contents, which is quite a hot path. There’s a years-old trail of requests from our team for a better way to do this with Swift string, e.g. 26236614. |
@airspeedswift The second — no one externally should be using But again, I didn't write this, so I don't know the specifics. @phausler would know best. |
OK great that's a relief! So here's how I'd suggest we proceed: we know that we want to give extension String { // or probably rather StringProtocol
/// If the string is held in a form that can provide a null-terminated buffer of UTF8 code
/// units in O(1), pass that into the supplied closure, and return the result. If not, return `nil`
func withUnsafeUTF8<Result>(_ body: (UnsafeBufferPointer<UInt8>) throws -> Result) rethrows -> Result?
/// If the string is held in a form that can provide a null-terminated buffer of UTF16 code
/// units in O(1), pass that into the supplied closure, and return the result. If not, return `nil`
func withUnsafeUTF16<Result>(_ body: (UnsafeBufferPointer<UInt16>) throws -> Result) rethrows -> Result?
} With the current string internals this should be straightforward to implement, but we need to get it through evolution (bike shed if it should be |
@Catfish-Man Sure, but why is this important on Linux? Or does corelibs-foundation need to bridge to CFString to use CoreFoundation to implement NSString? Assuming we underscore them, @parkera @itaiferber @Catfish-Man, do these two closure-based SPIs work for you? |
So the implementation of NSString is primarily used to be the bridge to CFString even in swift-corelibs-foundation on linux. This is the conduit in which constant strings and property list encoding and the such are exposed outwardly to Foundation APIs. So in short: yes it is used on Linux, and I don't think we can avoid _fastCStringContents being called since that is part of the CF source (which is shared with Darwin and would require massive changes to avoid that). This will be a pretty harsh perf hit to the linux serialization. |
@phausler how do these 2 SPI look to you? Note that escaping such pointers violates Swift's programming model. |
How would a non-escaping SPI work for CFStringGetCStringPtr? |
That was the point of @airspeedswift's question about whether this is important for the implementation of Foundation, or for exposing to users of Foundation API. In this example, perhaps he should of asked whether the result of Foundation-internal uses of CFStringGetCStringPtr could ideally be migrated onto something scoped, or worst-case we allow Foundation to violate Swift's model because "it just works and we'll change the implementation when it doesn't". This approach doesn't work when an escaped pointer is important API. Why would Linux user-code be heavily reliant on CFStringGetCStringPtr? If String provided this kind of access, but correctly scoped to not violate Swift's model, then users would just use that. |
Here is one of many examples: When writing out a property list, if we have an ASCII string, then there is no need to copy the string into a temporary buffer so we can then copy it again into the data buffer for the property list we're writing. If |
I understand the scoped access API you're going towards, but it's a poor fit for the existing C code, which is shared across all platforms. |
I think in order to make use of it we'd have to add a new C API which took a closure itself and replace all |
I guess it would be nice if we had a mechanism to allow a pointer to be valid until end of scope, like the return inner pointer feature on Darwin -- although that of course relies on autorelease pools. |
We do have lifetimes and there's some interest (CC @atrick) in better dependent lifetime support. For now, we do this with However, correct usage of that also requires reasoning about any surrounding effects. For example, such a pointer would not be considered a use for COW, so a String append could deallocate the memory the pointer is pointing at, even if the string value is kept alive. A short term solution might be that we trust Foundation to properly reason through effects and extend lifetimes, and we ... erm... hope that user code does as well? |
CF is at least expected to be fairly sharp-edged 🤷♂️ |
Wait a second, isn't this code just wrong (and has always been)? The code doesn't check for nul-termination, but |
How likely is it that we would get the lifetime-encapsulated-closure version of that string API along with these other changes? |
If you mean the SPI that @airspeedswift mentioned, I merged it this morning at swiftlang/swift#15442. Guaranteeing lifetime is up to the caller, there is no way for the callee to sensibly do so. Making this API would require clarifying behavior surrounding small strings. One approach is, if small, spill to the stack and execute the closure on that. However, if you're planning on escaping the pointer, then that'd be a dangling pointer into an old stack frame. |
Hmm, how does one productively develop corelibs-foundation on Mac while using recent changes? It seems like my options are:
Is there a fourth option? |
(2) and a little patience are what you want, generally. I've found the release build to be a lot faster than the debug build, if the changes on the stdlib side look solid. |
(This may be more appropriate for forums than in a patch discussion.) |
Switch off of old _StringCore (which no longer exists)'s isASCII SPI to a check whether the String explicitly stores contiguous ASCII, as opposed to small or non-contiguous contents.
713b4bb
to
4de88de
Compare
@jrose-apple argues that escaping the pointer is more bad rather than less bad. Debating this point further is not useful. I've changed this to just use less bad queries, namely |
@swift-ci please test |
1 similar comment
@swift-ci please test |
@parkera @itaiferber @millenomi @phausler @Catfish-Man what do you think of this far more minimal change? This should be sufficient (for now) to ensure correct behavior with small strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good minimal change to me
Switch off of old _StringCore (which no longer exists)'s isASCII SPI to a check whether the String explicitly stores contiguous ASCII, as opposed to small or non-contiguous contents.
String's _core disappeared, but we kept around a mock-up of it as a
faux-SPI to avoid breaking Foundation. However, it does not fully
model String, and the illusion is broken from small string support.