-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[stdlib] Implement native Unicode.Scalar binary properties #39597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci please test |
@swift-ci please benchmark |
Performance (x86_64): -O
Code size: -OPerformance (x86_64): -Osize
Code size: -OsizePerformance (x86_64): -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might also need to beef up our testing, as before we only had to make sure we passed the right constant to ICU.
Build failed |
@@ -27,7 +27,7 @@ public func formatCollection<C: Collection>( | |||
result += " " | |||
} | |||
|
|||
if rowLength + string.count + 1 > 80 { | |||
if rowLength + string.count + 1 > 100 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Why 100?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had figured that this would stuff more data on a single line making it somewhat easier to get access to the actual lookup implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just decide to bump the stdlib's line length limit to 100... It seems to be the line length that GitHub uses for its web interface. (Although side-by-side diffs could become less useful.)
Increasing the limit would reduce the number of source files that can be comfortably displayed/edited side by side, but this isn't necessarily a big deal.
(OTOH, the 80 char limit also applies to the compiler, and if we do change it, it should probably apply to the entire repository. FWIW it comes from LLVM.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbf this is only increasing the line limit of the Unicode data arrays and nothing else. The implementation and all the stdlib changes still abide by the 80 rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good!
Have you looked into testing at all?
@milseman It seems we have some tests regarding these binary properties in https://github.com/apple/swift/blob/main/test/stdlib/UnicodeScalarPropertiesTests.swift#L10 , so I don't know if that alleviates your concerns, or if you think we should have beefy tests. |
Those are nice to have, but they're sanity check to make sure we're not passing the wrong constant to ICU. Bugs in the generator script, binary search, etc., would have a decent likelihood of hiding out. We probably want to have a long-running test that walks each scalar and checks the property vs ICU, but I'm not sure where this would live. |
3cdbfe3
to
fe568dd
Compare
fe568dd
to
0630208
Compare
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main point of feedback is that I want to update our testing, even if just a little bit. Have you looked at our testing coverage? Can we expand it at least a little so that we feel more confident our data and way of accessing that data is correct?
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This adds data tables for all binary properties for every scalar removing yet another function dependency on ICU. These binary property data tables are barely under 21KB.