-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Fix undefined behavior in SmallString.withUTF8 #33698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Actually there is still one regression that I'll just file a separate bug for: |
Incidentally, some of the BufferPointer code could be cleaner by relying on convenience methods that calculate capacity, but I think this code should stick to the most primitive operations to guarantee performance. |
@swift-ci benchmark |
@swift-ci test |
The current stdlib build does not appear to miscompile, but the miscompile is easily exposed by tweaking the optimizer in unrelated ways, and it could happen in user code since withUTF8 is inlinable. |
Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
Build failed |
#33693 fixes macOS test failure. |
@swift-ci Please test macOS platform |
@milseman do you have any concerns with this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits :)
withUTF8 currently vends a typed UInt8 pointer to the underlying SmallString. That pointer type differs from SmallString's representation. It should simply vend a raw pointer, which would be both type safe and convenient for UTF8 data. However, since this method is already @inlinable, I added calls to bindMemory to prevent the optimizer from reasoning about access to the typed pointer that we vend. rdar://67983613 (Undefinied behavior in SmallString.withUTF8 is miscompiled) Additional commentary: SmallString creates a situation where there are two types, the in-memory type, (UInt64, UInt64), vs. the element type, UInt8. `UnsafePointer<T>` specifies the in-memory type of the pointee, because that's how C works. If you want to specify an element type, not the in-memory type, then you need to use something other than UnsafePointer to view the memory. A trivial `BufferView<UInt8>` would be fine, although, frankly, I think UnsafeRawPointer is a perfectly good type on its own for UTF8 bytes. Unfortunately, a lot of the UTF8 helper code is ABI-exposed, so to work around this, we need to insert calls to bindMemory at strategic points to avoid undefined behavior. This is high-risk and can negatively affect performance. So far, I was able to resolve the regressions in our microbenchmarks just by tweaking the inliner.
bind_memory has no actual code size cost, and this is the only way to allow rebinding memory within critical standard library code like SmallString without regressing performance.
303f183
to
42bf92a
Compare
Corrections have all been committed. Much appreciated. |
@swift-ci test |
@swift-ci benchmark |
Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
I haven't changed these commits, but now there are large regressions. |
I just hit another miscompile because of this after making an a small change to inlining, so I really need to merge it without investigating regressions. The performance regressions that popped up recently weren't originally a problem, so I'm sure they are fixable but will need a little time to investigate. |
@swift-ci benchmark |
@swift-ci test |
Performance: -O
Code size: -OPerformance: -Osize
Code size: -OsizePerformance: -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
Fix undefined behavior in SmallString.withUTF8