Skip to content

[String] Fix corner case in comparison fast-path. #20937

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 3, 2018

Conversation

milseman
Copy link
Member

@milseman milseman commented Dec 1, 2018

When in a post-binary-prefix-scan fast-path, we need to make sure we
are comparing a full-segment scalar, otherwise we miss situations
where a combining end-of-segment scalar would be reordered with a
prior combining scalar in the same segment under normalization in one
string but not the other.

This was hidden by the fact that many combining scalars are
NFC_QC=maybe, but those which are not present in any precomposed form
have NFC_QC=yes. Added tests.

When in a post-binary-prefix-scan fast-path, we need to make sure we
are comparing a full-segment scalar, otherwise we miss situations
where a combining end-of-segment scalar would be reordered with a
prior combining scalar in the same segment under normalization in one
string but not the other.

This was hidden by the fact that many combining scalars are not
NFC_QC=maybe, but those which are not present in any precomposed form
have NFC_QC=yes. Added tests.
@milseman milseman requested a review from lancep December 1, 2018 01:42
@milseman
Copy link
Member Author

milseman commented Dec 1, 2018

This regresses performance of some code that hits this fast-path, but I have improvements coming soon that compensates for this fact.

@milseman
Copy link
Member Author

milseman commented Dec 1, 2018

@swift-ci please benchmark

@milseman
Copy link
Member Author

milseman commented Dec 1, 2018

@swift-ci please test

@swift-ci
Copy link
Contributor

swift-ci commented Dec 1, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
SubstringEquatable 928 1319 +42.1% 0.70x
SubstringEqualString 523 686 +31.2% 0.76x
StringComparison_slowerPrenormal 1550 2031 +31.0% 0.76x
StringComparison_emoji 733 891 +21.6% 0.82x
StringComparison_nonBMPSlowestPrenormal 1423 1690 +18.8% 0.84x
SortStringsUnicode 8589 9877 +15.0% 0.87x
StringComparison_latin1 1787 1987 +11.2% 0.90x
StringComparison_fastPrenormal 2406 2673 +11.1% 0.90x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringComparison_slowerPrenormal 1561 2128 +36.3% 0.73x
SubstringEqualString 508 669 +31.7% 0.76x
SubstringEquatable 922 1171 +27.0% 0.79x
StringComparison_emoji 803 963 +19.9% 0.83x
StringComparison_nonBMPSlowestPrenormal 1578 1846 +17.0% 0.85x
SortStringsUnicode 8602 9897 +15.1% 0.87x
StringComparison_latin1 1752 1983 +13.2% 0.88x
StringComparison_fastPrenormal 2422 2688 +11.0% 0.90x
StringComparison_longSharedPrefix 603 664 +10.1% 0.91x
Improvement
NopDeinit 52957 44444 -16.1% 1.19x (?)

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
SortStringsUnicode 10441 11760 +12.6% 0.89x
Improvement
DictionaryKeysContainsNative 67 58 -13.4% 1.16x
DictionaryKeysContainsCocoa 68 60 -11.8% 1.13x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB
--------------

@milseman milseman merged commit 94942c5 into swiftlang:master Dec 3, 2018
@milseman milseman deleted the start_combining branch December 3, 2018 18:42
milseman added a commit to milseman/swift that referenced this pull request Dec 4, 2018
When in a post-binary-prefix-scan fast-path, we need to make sure we
are comparing a full-segment scalar, otherwise we miss situations
where a combining end-of-segment scalar would be reordered with a
prior combining scalar in the same segment under normalization in one
string but not the other.

This was hidden by the fact that many combining scalars are not
NFC_QC=maybe, but those which are not present in any precomposed form
have NFC_QC=yes. Added tests.
airspeedswift pushed a commit that referenced this pull request Dec 5, 2018
* [String] Fix corner case in comparison fast-path. (#20937)

When in a post-binary-prefix-scan fast-path, we need to make sure we
are comparing a full-segment scalar, otherwise we miss situations
where a combining end-of-segment scalar would be reordered with a
prior combining scalar in the same segment under normalization in one
string but not the other.

This was hidden by the fact that many combining scalars are not
NFC_QC=maybe, but those which are not present in any precomposed form
have NFC_QC=yes. Added tests.

* [String] Normalization-boundary-before UTF-8 fast path.

All latiny (<0x300) scalars have normalization boundaries before them,
so when asking if memory containing a scalar encoded in valid UTF-8
has a boundary before it, check if the leading byte definitely encodes
a scalar less than 0x300.

* [String] Refactor and fast-path normalization

Refactor some normalization queries into StringNormalization.swift,
and add more latiny (<0x300) fast-paths.

* [String] Speed up constant factors on comparison.

Include some tuning and tweaking to reduce the constant factors
involved in string comparison. This yields considerable improvement on
our micro-benchmarks, and allows us to make less inlinable code and
have a smaller ABI surface area.

Adds more extensive testing of corner cases in our existing
fast-paths.

* [String] Hand-increment loop variable for perf.

Hand-incrementing the loop variable allows us to skip overflow
detection, and will permit more vectorization improvements in the
future. For now, it gives us perf improvements in nano-benchmarks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants