-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[string] Fix string implementation for big endian platforms #21077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[string] Fix string implementation for big endian platforms #21077
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to store SmallString as little endian in memory, and swap on read? Wouldn't we want SmallString to be the in-memory representation on both platforms?
stdlib/public/core/SmallString.swift
Outdated
// Give raw, nul-terminated code units. This is only for limited internal | ||
// usage: it always clears the discriminator and count (in case it's full) | ||
@inlinable | ||
internal var zeroTerminatedRawCodeUnits: RawBitPattern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you removing this? Wouldn't this simplify your code as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this is an ABI break
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure about this one. I think the new code is easier to understand without this property but I'm happy to reinstate it.
Why does this break the ABI? (EDIT: Ah, I see, because it's inlinable) Is there any way to know when such breakages occur? This code isn't in a released version of Swift yet right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you run the test suite, you'll get a failure inside the api-digester.
edit: This code is in the 5.0 release branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the validation test suite (specifically utils/build-script --release-debuginfo -T
) on this PR on a Linux x86-64 box running Ubuntu 18.04 and it passed the tests. Should it have failed or will it only fail in the release branch?
Is the ABI still modifiable at this stage or is it already locked down?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is modifiable in case of emergency, and worse case here we just keep this method around.
stdlib/public/core/SmallString.swift
Outdated
internal init(_ object: _SmallString) { | ||
self.init(raw: object.rawBits) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this init do over self = object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing, will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it does mean the code does an invariant check, whereas an assignment wouldn't I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, if you want to do a final invariant check at the end. Alternatively, we could have the discriminator setter do the check. The discriminator property was added by @lorentey after I wrote much of this code, so there's some potential cleanups like the one you found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding an invariant check to the discriminator setter could misfire in the ASCII check if the string contents haven't yet been fully updated at the time the discriminator is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. We do generally want to do an invariant check at the very end of creation, and at the very end of a mutating (e.g. verify isASCII after any RRC operation). Dealing with this was why I started to shy away from settable properties in _StringObject
and _SmallString
over always calling inits, as it's easy to forget to make the check.
Outside the scope of this PR obviously, we'll want to come up with a general programming pattern and codify it in https://github.com/apple/swift/blob/master/docs/StandardLibraryProgrammersManual.md
stdlib/public/core/SmallString.swift
Outdated
@@ -210,7 +204,10 @@ extension _SmallString { | |||
internal func withUTF8<Result>( | |||
_ f: (UnsafeBufferPointer<UInt8>) throws -> Result | |||
) rethrows -> Result { | |||
var raw = self.zeroTerminatedRawCodeUnits | |||
// Code points need to be in little endian byte order and zero terminated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code units
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, thanks!
stdlib/public/core/SmallString.swift
Outdated
} | ||
|
||
@usableFromInline // @testable | ||
internal init?(_ base: _SmallString, appending other: _SmallString) { | ||
let totalCount = base.count + other.count | ||
guard totalCount <= _SmallString.capacity else { return nil } | ||
|
||
// TODO(SIMD): The below can be replaced with just be a couple vector ops |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this incorrect? Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concatenating two 64 bit words to form one 128bit word and then treating it as a vector of bytes is likely to cause more endianess issues so I'd rather we avoided adding SIMD to this file. I also think this code could be made considerably faster without resorting to SIMD (using rotates and logical ops). I think that would be a better approach as all platforms could benefit regardless of whether they have the necessary SIMD instructions in the base ISA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but should Linux, macOS, and iOS suffer a several X performance regression in hot-paths to facilitate big-endian systems? At the very least, we would #if
if necessary, assuming SIMD doesn't abstract it away sufficiently.
stdlib/public/core/SmallString.swift
Outdated
// Code points need to be in little endian byte order and zero terminated. | ||
let l = self.leadingRawBits | ||
let t = self.trailingRawBits & _StringObject.Nibbles.largeAddressMask | ||
var raw = (l.littleEndian, t.littleEndian) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we convert the bytes to little endian on read? If the plan is to store the contents in big-endian on big-endian systems, then we don't want to convert on read, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if not, then would it make more sense to just have a littleEndianBytes
computed property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trouble is that the discriminator is stored in the most significant byte of the _object
pointer. On little endian systems this means the discriminator is stored in the 'last' byte of the string objects memory:
[ x x x x x x x x x x x x x x x d ]
However on big endian systems this means that the discriminator is stored in the middle byte of the string object:
[ x x x x x x x x d x x x x x x x ]
Since the discriminator is in the middle we can only access the first 8 code units with a withUnsafeBytes
call if we use big endian byte order. However if we byte swap to little endian byte order before any withUnsafeBytes
calls then we can access all 15 code units and so big endian machines can do all the same operations a little endian machine can do. It also means we can add code units to the small string in the same order on both big and little endian platforms which is much cleaner and easier to manage.
We could add a littleEndianBytes
computed property but I don't think the code saving is worth the additional obfuscation. We should only ever need to do the little endian conversion before a withUnsafeBytes
call and ideally we won't add any more direct calls to that and instead always go through withUTF8
and withMutableCapacity
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also go back to the approach I took in 2866b4a and always store _SmallString._storage
in little endian byte order. That avoids byte swapping before a read but means that _StringObject
and _SmallString
have different raw values for the same string. Though this is also true for 32-bit architectures, so maybe we should go down that path. It would have the nice side effect of making withUnsafeBytes
always safe on _SmallString
values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that's the path we'll want to go down. For 64-bit LE, _StringObject
and _SmallString
have equivalent layout, but otherwise we need to do extra unpacking / byte-swapping as needed.
That will probably make further maintenance easier as well, as we're more likely to add more readers than creators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and might let any SIMD, which we really want to do (on LE platforms at the very least), at least have a chance of being portable.
ba3c7b9
to
8289e16
Compare
I've reworked this PR to minimize the code changes and to remove the ABI changes. The code units in small strings are now always stored in little-endian byte order rather than being re-ordered when accessed as an array. Please take a look when you get a chance. Thanks! |
I’m a little confused. We had said:
My reading of that was that we would byte swap when converting between In that sense, A read from a |
In the new code The little-endian byte order of the raw bits in storage means that on big-endian systems the bytes in each 64-bit chunk of the stored tuple are reversed. This is great for operations such as Ideally I'd change the data type of the |
I feel like we’re talking past each other, and it might be due to conflicting terminology. Rather than say “it’s in little endian”, let’s use the term “memory order” to mean the order in memory that it would be laid out as if it were a 16-byte Array of UInt8 starting at the first code unit and finishing with the discriminator and count. To me, “memory order” seems like it would be equivalent to platform-native endianness, and I think we’re trying to say that We both agree that we do a byte swap on BE platforms when creating a Back to the code. I find it pretty confusing and brittle to change |
8289e16
to
9587582
Compare
Thanks for your patience. I've updated the PR with your suggestions. Is it a bit closer to what you had in mind now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great! Thanks for your patience as well.
Review notes are just details about comments, which I don't mind doing in a followup PR. I'll start testing and benchmarking now.
@@ -74,7 +89,8 @@ extension _SmallString { | |||
|
|||
@inlinable @inline(__always) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is worth a comment stating it's back to _StringObject
's notion of a raw discriminated object. I can add that in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a brief comment. Happy to drop this additional commit if you'd rather do a comment follow up yourself.
// Fetches the `i`th byte, from least-significant to most-significant | ||
// | ||
// TODO: endianess awareness day | ||
// Fetches the `i`th byte, from left to right. | ||
@inlinable @inline(__always) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "left to right" mean, memory-order/host-endianess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarified the comment (memory-order).
@swift-ci please test |
@swift-ci please benchmark |
Build failed |
Build failed |
Build comment file:Performance: -O
Performance: -Osize
Performance: -Onone
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
Clarify the expected behavior of some small string helper functions.
@swift-ci please test |
Build failed |
Build failed |
@milseman Is there a way to restart the CI tests? I'm not sure what went wrong with them but I don't think it was anything to do with this change. Thanks! |
@swift-ci please test os x platform |
@swift-ci please test linux platform |
Build failed |
@swift-ci please test os x platform |
This commit supersedes 2866b4a which was overwritten by 4ab45df.
When code points are stored in multi-byte types (e.g. UInt64) we
need to convert the order of the bytes in the value to little
endian before we can access the value as raw bytes.