[stdlib] Speed up short UTF-16 distance calculations #62717

Merged: 12 commits, Jan 5, 2023

lorentey
Member

@lorentey lorentey commented Dec 21, 2022

Previously we insisted on using UTF-16 breadcrumbs even if we only needed to travel a very short way. This could be as much as forty times slower than the naive algorithm of simply visiting all the Unicode scalars in between the start and the end.

(Using breadcrumbs generally means that we need to walk to both endpoints from their nearest breadcrumb, which on average requires walking half the distance between breadcrumbs, twice — and this can mean visiting vastly more Unicode scalars than if we simply walked through the ones that are lying in between the endpoints themselves.)
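
As a rough illustration of the "naive algorithm" mentioned above, here is a minimal sketch that counts UTF-16 code units by visiting each Unicode scalar between two indices. The function name `naiveUTF16Distance` is hypothetical and uses only public API; it is not the standard library's internal implementation.

```swift
// Hypothetical helper: counts UTF-16 code units by visiting every Unicode
// scalar between two indices. The cost is proportional to the distance walked.
func naiveUTF16Distance(
    in string: String,
    from start: String.Index,
    to end: String.Index
) -> Int {
    precondition(start <= end, "this sketch only walks forward")
    var count = 0
    for scalar in string.unicodeScalars[start..<end] {
        // Scalars outside the Basic Multilingual Plane need a surrogate pair.
        count += scalar.value >= 0x1_0000 ? 2 : 1
    }
    return count
}

let text = "naïve 🌳 park 🌳"
let firstTree = text.firstIndex(of: "🌳")!
let lastTree = text.lastIndex(of: "🌳")!
let walked = naiveUTF16Distance(in: text, from: firstTree, to: lastTree)
// The scalar walk should agree with the standard library's own answer.
assert(walked == text.utf16.distance(from: firstTree, to: lastTree))
```

For two nearby indices this loop touches only the handful of scalars between them, which is the case the PR description argues should bypass breadcrumbs.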

To put it another way, when we want to measure how long it takes to walk between two trees within a nearby park, it probably isn't a great idea to start by separately measuring each of their distances from the nearest airport. 😛

rdar://103575481

@lorentey lorentey requested a review from Catfish-Man December 21, 2022 03:11
@lorentey
Member Author

@swift-ci test

@lorentey
Member Author

@swift-ci benchmark

@lorentey
Member Author

Benchmark results are... peculiar.

------- Performance (x86_64): -O -------

REGRESSION                                OLD        NEW        DELTA    RATIO    
Breadcrumbs.IdxToUTF16.longASCII          128.556    198.545    +54.4%   **0.65x**
FlattenListFlatMap                        3041.0     4311.0     +41.8%   **0.71x (?)**
Breadcrumbs.MutatedIdxToUTF16.ASCII       3.588      4.316      +20.3%   **0.83x**
MapReduceAnyCollection                    123.667    143.7      +16.2%   **0.86x (?)**
SortAdjacentIntPyramids                   763.333    865.714    +13.4%   **0.88x (?)**
SortIntPyramid                            484.333    545.333    +12.6%   **0.89x**
Set.isDisjoint.Seq.Int.Empty              39.757     44.286     +11.4%   **0.90x (?)**
Set.isSuperset.Seq.Empty.Int              43.0       47.538     +10.6%   **0.90x (?)**
Set.isDisjoint.Box.Empty                  49.021     54.19      +10.5%   **0.90x (?)**
Set.isSubset.Seq.Int25                    47.281     51.08      +8.0%    **0.93x (?)**

IMPROVEMENT                               OLD        NEW        DELTA    RATIO    
ArrayAppendGenericStructs                 1522.0     610.0      -59.9%   **2.50x (?)**
UTF8Decode_InitDecoding                   163.556    131.0      -19.9%   **1.25x (?)**
UTF8Decode_InitFromCustom_contiguous      160.2      131.231    -18.1%   **1.22x (?)**
MapReduceClass2                           12.479     10.634     -14.8%   **1.17x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed     397.0      341.333    -14.0%   **1.16x**
UTF8Decode_InitFromCustom_noncontiguous   283.286    249.375    -12.0%   **1.14x (?)**
DataAppendDataSmallToSmall                2723.333   2518.462   -7.5%    **1.08x (?)**
ObjectiveCBridgeStubToNSStringRef         92.808     86.667     -6.6%    **1.07x (?)**

------- Code size: -O -------

REGRESSION        OLD     NEW     DELTA    RATIO  
Diffing.o         7549    7667    +1.6%    **0.98x**

IMPROVEMENT       OLD     NEW     DELTA    RATIO  
ReduceInto.o      14328   10896   -24.0%   **1.31x**
IndexPathTest.o   8386    7946    -5.2%    **1.06x**
MapReduce.o       27754   26723   -3.7%    **1.04x**
RemoveWhere.o     15171   14811   -2.4%    **1.02x**

------- Performance (x86_64): -Osize -------

REGRESSION                                               OLD        NEW        DELTA    RATIO    
PrefixWhileSequenceLazy                                  26.846     53.619     +99.7%   **0.50x**
PrefixWhileArrayLazy                                     20.229     40.25      +99.0%   **0.50x**
PrefixArrayLazy                                          13.958     26.852     +92.4%   **0.52x**
DropFirstCountableRangeLazy                              14.657     26.852     +83.2%   **0.55x**
Breadcrumbs.IdxToUTF16Range.longASCII                    34.435     62.649     +81.9%   **0.55x**
DropLastArrayLazy                                        5.115      9.0        +75.9%   **0.57x**
DropWhileCountableRangeLazy                              40.273     67.053     +66.5%   **0.60x**
Breadcrumbs.IdxToUTF16.longASCII                         128.556    201.545    +56.8%   **0.64x**
DropLastCountableRangeLazy                               5.844      9.0        +54.0%   **0.65x (?)**
DropWhileArrayLazy                                       44.926     67.207     +49.6%   **0.67x**
DropWhileArray                                           26.627     38.169     +43.3%   **0.70x**
MapReduceLazySequence                                    65.933     87.722     +33.0%   **0.75x (?)**
Breadcrumbs.MutatedIdxToUTF16.ASCII                      3.59       4.316      +20.2%   **0.83x (?)**
PrefixWhileAnySequence                                   185.4      219.889    +18.6%   **0.84x (?)**
PrefixWhileSequence                                      185.714    219.75     +18.3%   **0.85x (?)**
MapReduceAnyCollection                                   160.556    184.375    +14.8%   **0.87x (?)**
DropWhileAnySeqCntRange                                  107.308    120.625    +12.4%   **0.89x (?)**
PrefixWhileAnySeqCRangeIter                              128.727    142.6      +10.8%   **0.90x (?)**
PrefixWhileAnySeqCntRange                                129.0      142.769    +10.7%   **0.90x (?)**
PrefixWhileAnyCollectionLazy                             121.143    134.0      +10.6%   **0.90x (?)**
DropFirstAnyCollection                                   114.0      123.875    +8.7%    **0.92x (?)**

IMPROVEMENT                                              OLD        NEW        DELTA    RATIO    
DropWhileCountableRange                                  26.852     13.575     -49.4%   **1.98x**
DropFirstCountableRange                                  26.852     13.58      -49.4%   **1.98x**
PrefixCountableRange                                     26.852     13.717     -48.9%   **1.96x**
PrefixArray                                              26.844     13.854     -48.4%   **1.94x**
DropLastCountableRange                                   9.034      4.74       -47.5%   **1.91x**
DropLastArray                                            9.022      4.84       -46.3%   **1.86x**
DropWhileSequence                                        26.842     14.555     -45.8%   **1.84x (?)**
SuffixCountableRange                                     9.034      4.924      -45.5%   **1.83x (?)**
SuffixArray                                              9.0        5.25       -41.7%   **1.71x**
MapReduceLazyCollectionShort                             50.083     31.25      -37.6%   **1.60x (?)**
PrefixSequence                                           40.25      26.885     -33.2%   **1.50x (?)**
PrefixSequenceLazy                                       40.214     26.885     -33.1%   **1.50x**
RemoveWhereSwapInts                                      13.822     9.605      -30.5%   **1.44x (?)**
UTF8Decode_InitFromCustom_noncontiguous                  359.0      258.429    -28.0%   **1.39x**
UTF8Decode_InitFromCustom_noncontiguous_ascii            831.5      618.333    -25.6%   **1.34x (?)**
DropFirstSequence                                        44.864     33.52      -25.3%   **1.34x (?)**
DropFirstSequenceLazy                                    44.864     33.56      -25.2%   **1.34x (?)**
DropWhileSequenceLazy                                    76.176     58.238     -23.5%   **1.31x (?)**
SortAdjacentIntPyramids                                  945.833    726.111    -23.2%   **1.30x**
UTF8Decode_InitFromCustom_noncontiguous_ascii_as_ascii   932.5      720.333    -22.8%   **1.29x (?)**
UTF8Decode_InitDecoding                                  161.273    129.909    -19.4%   **1.24x (?)**
PrefixWhileArray                                         54.057     44.455     -17.8%   **1.22x (?)**
SortIntPyramid                                           612.692    504.5      -17.7%   **1.21x (?)**
UTF8Decode_InitFromCustom_contiguous                     161.455    133.846    -17.1%   **1.21x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed                    396.5      339.667    -14.3%   **1.17x (?)**
FlattenListLoop                                          1614.0     1385.0     -14.2%   **1.17x (?)**
RemoveWhereMoveInts                                      13.643     11.735     -14.0%   **1.16x (?)**
DropWhileAnyCollectionLazy                               161.286    143.727    -10.9%   **1.12x (?)**
PrefixWhileAnyCollection                                 151.909    138.667    -8.7%    **1.10x (?)**
Chars2                                                   3450.0     3168.519   -8.2%    **1.09x (?)**
StringSwitch                                             237.375    221.0      -6.9%    **1.07x (?)**
DataCreateEmptyArray                                     1610.714   1501.786   -6.8%    **1.07x (?)**

------- Code size: -Osize -------

REGRESSION        OLD     NEW     DELTA   RATIO  
RemoveWhere.o     12053   12531   +4.0%   **0.96x**
Diffing.o         6718    6838    +1.8%   **0.98x**
RandomTree.o      11032   11168   +1.2%   **0.99x**
BufferFill.o      9424    9522    +1.0%   **0.99x**

IMPROVEMENT       OLD     NEW     DELTA   RATIO  
IndexPathTest.o   7326    7058    -3.7%   **1.04x**
ReduceInto.o      7987    7871    -1.5%   **1.01x**
MapReduce.o       20602   20307   -1.4%   **1.01x**

------- Performance (x86_64): -Onone -------

REGRESSION                               OLD       NEW       DELTA    RATIO    
Breadcrumbs.IdxToUTF16Range.longASCII    230.667   255.625   +10.8%   **0.90x**
Breadcrumbs.IdxToUTF16.longASCII         1016.0    1103.5    +8.6%    **0.92x (?)**

IMPROVEMENT                              OLD       NEW       DELTA    RATIO    
UTF8Decode_InitFromCustom_contiguous     173.308   138.417   -20.1%   **1.25x (?)**
UTF8Decode_InitDecoding                  167.25    136.75    -18.2%   **1.22x (?)**
DataSubscriptMedium                      65.313    56.632    -13.3%   **1.15x (?)**
ArrayAppendLatin1Substring               26136.0   23052.0   -11.8%   **1.13x (?)**
ArrayAppendAsciiSubstring                25776.0   22752.0   -11.7%   **1.13x (?)**
RC4                                      12832.0   11346.0   -11.6%   **1.13x (?)**
ArrayAppendUTF16Substring                25788.0   22836.0   -11.4%   **1.13x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed    610.0     555.0     -9.0%    **1.10x (?)**
UTF8Decode_InitDecoding_ascii_as_ascii   207.545   192.3     -7.3%    **1.08x (?)**
TypeName                                 1533.0    1427.0    -6.9%    **1.07x (?)**

------- Code size: -swiftlibs -------

@stephentyrone
Contributor

How stale is the baseline for the perf tests?

@@ -201,6 +201,17 @@ extension String.UTF16View: BidirectionalCollection {
return _foreignIndex(i, offsetBy: n)
}

if n.magnitude <= _StringBreadcrumbs.breadcrumbStride {
Contributor

On average, using breadcrumbs ought to win when |n| > breadcrumbStride/2, right?

Member Author

Yep, the average cost of using breadcrumbs is breadcrumbStride/2 each time we do it. In this case, we're doing it twice, though, so I expect the cutoff to be doubled. 🤔
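
The cutoff reasoning in this thread can be sketched as a tiny cost model. Everything here is a hypothetical illustration: the type `DistanceCostModel`, its member names, and the stride value of 64 used in the example are assumptions, not the standard library's actual definitions.

```swift
// Hypothetical cost model for choosing between a direct walk and breadcrumbs.
struct DistanceCostModel {
    // Spacing between breadcrumbs, in UTF-16 code units (assumed value supplied by the caller).
    var breadcrumbStride: Int

    // Expected scalars visited when resolving one endpoint from its nearest breadcrumb.
    var expectedCostPerLookup: Double { Double(breadcrumbStride) / 2 }

    // index(_:offsetBy:) resolves two endpoints, so it pays for two lookups.
    var expectedBreadcrumbCost: Double { 2 * expectedCostPerLookup }

    // A direct walk visits roughly |n| code units.
    func naiveCost(offsetBy n: Int) -> Double { Double(n.magnitude) }

    // Prefer the direct walk while it is no more expensive than two lookups.
    func prefersNaiveWalk(offsetBy n: Int) -> Bool {
        naiveCost(offsetBy: n) <= expectedBreadcrumbCost
    }
}

let model = DistanceCostModel(breadcrumbStride: 64)
assert(model.prefersNaiveWalk(offsetBy: 64))    // at the cutoff: walk directly
assert(!model.prefersNaiveWalk(offsetBy: -65))  // past the cutoff: use breadcrumbs
```

Under this model the break-even point is `|n| == breadcrumbStride`, which matches the `n.magnitude <= breadcrumbStride` condition in the diff rather than the `breadcrumbStride/2` cutoff that a single lookup would suggest.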

@lorentey lorentey force-pushed the string-utf16-speedup branch from 8e85017 to 3b340dc Compare December 28, 2022 03:09
@lorentey
Member Author

@swift-ci benchmark

@lorentey
Member Author

lorentey commented Dec 28, 2022

> How stale is the baseline for the perf tests?

We are building & measuring the baseline on the fly with each benchmark run, so it should always be up to date. Huh, unless we somehow select the wrong commit for the baseline builds, perhaps?

Edit: Hm, maybe: the logs indicate that CI is for some reason rebuilding swiftAST and swiftParse after switching to the PR branch, but this PR only contains stdlib changes. 🤔

Previously we insisted on using breadcrumbs even if we only needed to
travel a very short way. This could be as much as ten times slower
than the naive algorithm of simply visiting all the Unicode scalars
in between the start and the end.

(Using breadcrumbs generally means that we need to walk to both
endpoints from their nearest breadcrumb, which on average requires
walking half the distance between breadcrumbs — and this can mean
visiting vastly more Unicode scalars than the ones that are simply
lying in between the endpoints themselves.)
… ranges

Instead of calling `_toUTF16Index` twice, call it once and then use
`index(_:offsetBy:)` to potentially avoid another breadcrumbs lookup.
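
The shape of this change can be illustrated with public API. The sketch below is an assumption-laden stand-in: `indexRange(forUTF16Offsets:in:)` is a hypothetical name, and it uses `String.UTF16View.index(_:offsetBy:)` where the real code calls the internal `_toUTF16Index`.

```swift
// Hypothetical helper: converts a range of UTF-16 offsets into a range of
// string indices with one absolute lookup plus one relative step.
func indexRange(
    forUTF16Offsets offsets: Range<Int>,
    in string: String
) -> Range<String.Index> {
    let utf16 = string.utf16
    // One absolute lookup for the lower bound...
    let lower = utf16.index(utf16.startIndex, offsetBy: offsets.lowerBound)
    // ...then a relative step for the upper bound, which can stay cheap
    // when the range is short.
    let upper = utf16.index(lower, offsetBy: offsets.count)
    return lower..<upper
}

let sample = "a🌳bc"
let treeRange = indexRange(forUTF16Offsets: 1..<3, in: sample)
assert(String(sample[treeRange]) == "🌳")
```

Resolving the upper bound relative to the lower one means a short range pays for at most one long-distance lookup instead of two.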
@lorentey lorentey force-pushed the string-utf16-speedup branch from a8ab24c to 6fee1b3 Compare December 28, 2022 04:22
@lorentey
Member Author

In fact, let's see whether the rebuild set changes after rebasing this on top of current main.

@swift-ci benchmark

@lorentey
Member Author

@swift-ci benchmark

@lorentey
Member Author

@swift-ci test

@lorentey
Member Author

This has still rebuilt a lot more source files between main & the PR branch than I'd have expected.

------- Performance (x86_64): -O -------

REGRESSION                                  OLD       NEW       DELTA     RATIO    
StringWithCString2                          0.0       0.002     +200.0%   **0.33x**
Breadcrumbs.UTF16ToIdxRange.longMixed       163.0     264.778   +62.4%    **0.62x (?)**
SumUsingReduce                              66.571    83.222    +25.0%    **0.80x (?)**
NSStringConversion.InlineBuffer.UTF8        786.0     959.0     +22.0%    **0.82x (?)**
Set.isSuperset.Seq.Empty.Int                43.577    50.08     +14.9%    **0.87x (?)**
Set.isDisjoint.Seq.Int.Empty                46.837    53.093    +13.4%    **0.88x (?)**
SortAdjacentIntPyramids                     768.571   868.125   +13.0%    **0.89x (?)**
BridgeString.find.native.longNonASCII       467.0     524.0     +12.2%    **0.89x (?)**
SortIntPyramid                              478.438   532.188   +11.2%    **0.90x (?)**
Breadcrumbs.UTF16ToIdxRange.longASCII       42.923    47.563    +10.8%    **0.90x (?)**

IMPROVEMENT                                 OLD       NEW       DELTA     RATIO    
MapReduceAnyCollection                      127.0     86.583    -31.8%    **1.47x (?)**
Breadcrumbs.CopyUTF16CodeUnits.Mixed        88.68     63.516    -28.4%    **1.40x (?)**
UTF8Decode_InitDecoding                     156.273   132.25    -15.4%    **1.18x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed       429.5     368.333   -14.2%    **1.17x (?)**
Set.isSubset.Int.Empty                      51.192    44.091    -13.9%    **1.16x (?)**
MapReduceClass2                             12.517    10.803    -13.7%    **1.16x (?)**
UTF8Decode_InitFromCustom_contiguous        155.857   134.529   -13.7%    **1.16x (?)**
NormalizedIterator_nonBMPSlowestPrenormal   453.333   403.455   -11.0%    **1.12x (?)**
NormalizedIterator_slowerPrenormal          316.338   286.386   -9.5%     **1.10x (?)**
Breadcrumbs.MutatedIdxToUTF16.Mixed         219.3     198.636   -9.4%     **1.10x (?)**

------- Code size: -O -------

REGRESSION        OLD     NEW     DELTA    RATIO  
Diffing.o         7533    7651    +1.6%    **0.98x**

IMPROVEMENT       OLD     NEW     DELTA    RATIO  
ReduceInto.o      14311   10879   -24.0%   **1.32x**
IndexPathTest.o   8290    7850    -5.3%    **1.06x**
MapReduce.o       27626   26435   -4.3%    **1.05x**
RemoveWhere.o     14948   14581   -2.5%    **1.03x**

------- Performance (x86_64): -Osize -------

REGRESSION                                      OLD        NEW        DELTA    RATIO    
DropFirstArray                                  13.88      26.853     +93.5%   **0.52x (?)**
RemoveWhereMoveInts                             7.92       13.675     +72.7%   **0.58x (?)**
SuffixArray                                     5.216      9.0        +72.5%   **0.58x**
PrefixCountableRangeLazy                        15.72      26.839     +70.7%   **0.59x (?)**
FlattenListLoop                                 991.5      1626.0     +64.0%   **0.61x (?)**
PrefixAnyCollection                             85.267     138.714    +62.7%   **0.61x**
Breadcrumbs.UTF16ToIdxRange.longMixed           163.5      265.375    +62.3%   **0.62x**
Data.init.Sequence.64kB.Count.RE                19.191     29.524     +53.8%   **0.65x**
MapReduceLazyCollectionShort                    31.24      47.905     +53.3%   **0.65x**
Data.init.Sequence.64kB.Count.RE.I              19.429     29.531     +52.0%   **0.66x**
FlattenListFlatMap                              2977.0     4523.0     +51.9%   **0.66x (?)**
StringWithCString2                              0.001      0.002      +50.0%   **0.67x**
PrefixWhileArrayLazy                            26.861     40.261     +49.9%   **0.67x**
Data.append.Sequence.64kB.Count.RE.I            20.009     29.969     +49.8%   **0.67x**
MapReduceLazySequence                           44.091     65.947     +49.6%   **0.67x**
Data.append.Sequence.64kB.Count.RE              20.471     30.0       +46.5%   **0.68x**
DropFirstSequenceLazy                           33.0       47.333     +43.4%   **0.70x**
DropFirstSequence                               33.0       47.318     +43.4%   **0.70x**
SuffixCountableRangeLazy                        4.653      6.365      +36.8%   **0.73x (?)**
RemoveWhereSwapInts                             10.02      13.699     +36.7%   **0.73x (?)**
SuffixAnyCollection                             30.903     41.724     +35.0%   **0.74x**
DropWhileArray                                  25.979     33.667     +29.6%   **0.77x (?)**
DropWhileAnyCollectionLazy                      152.667    197.1      +29.1%   **0.77x (?)**
DropFirstAnySeqCntRangeLazy                     134.333    170.143    +26.7%   **0.79x**
DropFirstAnySeqCRangeIterLazy                   134.385    170.0      +26.5%   **0.79x**
SortAdjacentIntPyramids                         717.5      906.667    +26.4%   **0.79x**
SortIntPyramid                                  431.875    545.714    +26.4%   **0.79x**
DropLastAnyCollection                           35.625     44.5       +24.9%   **0.80x (?)**
PrefixAnySeqCntRange                            107.308    134.0      +24.9%   **0.80x**
DropFirstAnyCollection                          94.375     117.444    +24.4%   **0.80x (?)**
Data.init.Sequence.809B.Count.RE.I              45.923     56.417     +22.9%   **0.81x (?)**
Data.init.Sequence.809B.Count.RE                46.0       56.25      +22.3%   **0.82x (?)**
NSStringConversion.InlineBuffer.UTF8            793.0      963.0      +21.4%   **0.82x (?)**
SequenceAlgosArray                              2077.273   2520.0     +21.3%   **0.82x**
Data.append.Sequence.809B.Count.RE.I            58.87      69.438     +18.0%   **0.85x (?)**
Data.append.Sequence.809B.Count.RE              58.818     69.19      +17.6%   **0.85x (?)**
Set.filter.Int100.20k                           26.354     30.301     +15.0%   **0.87x (?)**
PrefixAnySeqCntRangeLazy                        107.538    123.5      +14.8%   **0.87x (?)**
PrefixAnySeqCRangeIterLazy                      107.571    123.429    +14.7%   **0.87x (?)**
PrefixWhileAnyCollectionLazy                    107.6      122.333    +13.7%   **0.88x (?)**
PrefixWhileAnySeqCRangeIterLazy                 107.643    122.308    +13.6%   **0.88x (?)**
PrefixWhileAnySeqCntRangeLazy                   107.643    122.308    +13.6%   **0.88x (?)**
Set.filter.Int100.16k                           21.652     24.543     +13.4%   **0.88x (?)**
DropFirstCountableRangeLazy                     13.58      15.353     +13.1%   **0.88x (?)**
DataAppendSequence                              6197.297   6989.474   +12.8%   **0.89x (?)**
DropWhileAnyCollection                          119.154    133.692    +12.2%   **0.89x (?)**
Set.isSuperset.Seq.Empty.Int                    45.731     51.192     +11.9%   **0.89x (?)**
Set.filter.Int100.28k                           38.593     42.878     +11.1%   **0.90x (?)**
StringWalk                                      1265.455   1380.69    +9.1%    **0.92x (?)**
PrefixWhileAnyCollection                        152.364    165.3      +8.5%    **0.92x (?)**
Set.intersection.Seq.Int0                       35.464     38.34      +8.1%    **0.92x (?)**

IMPROVEMENT                                     OLD        NEW        DELTA    RATIO    
PrefixSequenceLazy                              53.7       26.903     -49.9%   **2.00x (?)**
PrefixSequence                                  53.688     26.903     -49.9%   **2.00x**
DropFirstCountableRange                         26.87      13.578     -49.5%   **1.98x**
PrefixCountableRange                            26.865     13.711     -49.0%   **1.96x**
DropWhileCountableRange                         26.852     13.908     -48.2%   **1.93x**
DropLastCountableRangeLazy                      9.034      4.733      -47.6%   **1.91x (?)**
PrefixArray                                     26.853     14.096     -47.5%   **1.90x**
DropLastCountableRange                          9.02       4.74       -47.4%   **1.90x**
DropLastArrayLazy                               9.031      4.824      -46.6%   **1.87x (?)**
SuffixCountableRange                            9.034      4.93       -45.4%   **1.83x**
SuffixArrayLazy                                 9.019      5.26       -41.7%   **1.71x**
SumUsingReduceInto                              441.5      281.5      -36.2%   **1.57x**
DropWhileSequenceLazy                           71.667     49.429     -31.0%   **1.45x (?)**
Breadcrumbs.CopyUTF16CodeUnits.Mixed            88.52      63.69      -28.0%   **1.39x**
DropWhileAnySeqCRangeIter                       120.727    94.167     -22.0%   **1.28x (?)**
UTF8Decode_InitFromCustom_noncontiguous         325.833    269.571    -17.3%   **1.21x**
UTF8Decode_InitDecoding                         156.6      131.462    -16.1%   **1.19x (?)**
DropWhileAnySeqCntRange                         111.917    94.071     -15.9%   **1.19x (?)**
UTF8Decode_InitFromCustom_contiguous            155.786    132.0      -15.3%   **1.18x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed           436.0      372.0      -14.7%   **1.17x (?)**
SuffixSequence                                  127.769    110.692    -13.4%   **1.15x (?)**
SuffixSequenceLazy                              128.5      113.125    -12.0%   **1.14x (?)**
UTF8Decode_InitFromCustom_noncontiguous_ascii   733.0      648.667    -11.5%   **1.13x (?)**
NormalizedIterator_emoji                        360.769    322.0      -10.7%   **1.12x (?)**
NormalizedIterator_nonBMPSlowestPrenormal       453.636    406.034    -10.5%   **1.12x (?)**
StaticArray                                     1.91       1.719      -10.0%   **1.11x (?)**

------- Code size: -Osize -------

REGRESSION        OLD     NEW     DELTA   RATIO  
RemoveWhere.o     11852   12330   +4.0%   **0.96x**
Diffing.o         6696    6816    +1.8%   **0.98x**
RandomTree.o      11040   11180   +1.3%   **0.99x**
BufferFill.o      9347    9445    +1.0%   **0.99x**

IMPROVEMENT       OLD     NEW     DELTA   RATIO  
IndexPathTest.o   7256    6988    -3.7%   **1.04x**
ReduceInto.o      7908    7792    -1.5%   **1.01x**
MapReduce.o       20475   20180   -1.4%   **1.01x**

------- Performance (x86_64): -Onone -------

REGRESSION                              OLD       NEW       DELTA    RATIO    
Breadcrumbs.UTF16ToIdxRange.longMixed   377.8     480.0     +27.1%   **0.79x (?)**
BridgeString.find.native.longNonASCII   466.75    523.667   +12.2%   **0.89x (?)**
ObjectiveCBridgeStubDateAccess          2991.0    3237.0    +8.2%    **0.92x (?)**

IMPROVEMENT                             OLD       NEW       DELTA    RATIO    
UTF8Decode_InitDecoding                 165.667   137.75    -16.9%   **1.20x (?)**
UTF8Decode_InitFromCustom_contiguous    165.75    141.4     -14.7%   **1.17x (?)**
Breadcrumbs.CopyUTF16CodeUnits.Mixed    188.364   162.0     -14.0%   **1.16x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed   653.5     592.0     -9.4%    **1.10x (?)**
Breadcrumbs.MutatedUTF16ToIdx.Mixed     230.889   211.714   -8.3%    **1.09x (?)**
Breadcrumbs.MutatedIdxToUTF16.Mixed     233.556   214.8     -8.0%    **1.09x (?)**

@lorentey
Member Author

Hm; it seems the CI benchmarks are no longer providing a usable signal.

Even if we accept that the timer results are unreliable (given all the (?) marks), these runs should at least be able to reliably measure code size changes in benchmark object files. Unfortunately, the code size results are nonsensical -- given that this PR only touches non-inlinable parts in the stdlib, I don't see how code generation for clients could possibly change by as much as 24%, especially in modules that don't even mention or exercise String's UTF-16 view:

------- Code size: -O -------

REGRESSION        OLD     NEW     DELTA    RATIO  
Diffing.o         7533    7651    +1.6%    **0.98x**

IMPROVEMENT       OLD     NEW     DELTA    RATIO  
ReduceInto.o      14311   10879   -24.0%   **1.32x**
IndexPathTest.o   8290    7850    -5.3%    **1.06x**
MapReduce.o       27626   26435   -4.3%    **1.05x**
RemoveWhere.o     14948   14581   -2.5%    **1.03x**

@lorentey
Member Author

The code size differences don't seem to reproduce in local benchmark builds -- evidently the "swift-ci benchmark" command is not doing the right thing.

@stephentyrone
Contributor

stephentyrone commented Dec 29, 2022

Is any of the code in question inlinable? Last time I poked at it, benchmarks didn't properly rebuild in the face of inlinable/transparent code changes.

(I'm pretty sure all of this is behind ABI, but that's the first thing to check.)

@lorentey
Member Author

No, the changes in this PR only affect non-inlinable functions.

The logs also indicate that the benchmarks are fully rebuilt twice -- once for the base commit (presumably from the head of main), and once for the PR's head: I see two separate build log entries for every benchmark module & optimization level.

@lorentey
Member Author

Lovely, so build-script can run benchmarks locally, but naturally its local benchmark comparisons are silently failing with no output.

--- check-swift-benchmark-macosx-arm64 ---
+ /opt/homebrew/bin/cmake --build /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64 -- -j10 check-swift-benchmark-macosx-arm64
[1/1][100%][906.330s] cd /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark && /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/bin/Benchmark_Driver run -o O --output-dir /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark/logs --architecture arm64 --swift-repo /Users/klorentey/Swift/swift --independent-samples 3 && /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/bin/Benchmark_Driver run -o Onone --output-dir /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark/logs --swift-repo /Users/klorentey/Swift/swift --architecture arm64 --independent-samples 3 && /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/bin/Benchmark_Driver compare --log-dir /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark/logs --swift-repo /Users/klorentey/Swift/swift --compare-script /Users/klorentey/Swift/swift/benchmark/scripts/compare_perf_tests.py
[...]
branch/branch comparison skipped: no previous branch logs
Comparing main/Benchmark_O-arm64-apple-macosx10.9-20221228163616-54aa1055e54.log string-utf16-speedup/Benchmark_O-arm64-apple-macosx10.9-20221228165658-9e11f382680.log ...
Comparing main/Benchmark_Onone-arm64-apple-macosx10.9-20221228164918-54aa1055e54.log string-utf16-speedup/Benchmark_Onone-arm64-apple-macosx10.9-20221228170958-9e11f382680.log ...
-- check-swift-benchmark-macosx-arm64 finished --
--- Finished tests for swift ---

We need a reliable way to track the performance of the stdlib. This is not it.

We commonly start from the `startIndex`, in which case
`_nativeGetOffset` is essentially free. Consider this
case when calculating the threshold for using breadcrumbs.

Speed up conversion between UTF-16 offset ranges
and string index ranges, by carefully switching
between absolute and relative index calculations,
depending on the distance we need to go.

It is a surprisingly tricky puzzle to do this
correctly while avoiding redundant calculations.
Offset ranges within substrings add the additional
complication of having to bias offset values with
the absolute offset of the substring’s start index.
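
The substring bias described above can be shown with public API. This is a minimal sketch with a hypothetical function name, `utf16Offsets(of:in:)`; the actual change works on internal index and breadcrumb machinery, not on these calls.

```swift
// Hypothetical helper: returns UTF-16 offsets relative to the substring's
// start by subtracting the substring's absolute offset (the "bias").
func utf16Offsets(
    of range: Range<String.Index>,
    in substring: Substring
) -> Range<Int> {
    let base = substring.base.utf16
    // Absolute offset of the substring's start index within its base string.
    let bias = base.distance(from: base.startIndex, to: substring.startIndex)
    // Absolute offset of the range's lower bound within the base string.
    let absoluteLower = base.distance(from: base.startIndex, to: range.lowerBound)
    // The length is a relative measurement, so it needs no bias.
    let length = base.distance(from: range.lowerBound, to: range.upperBound)
    let relativeLower = absoluteLower - bias
    return relativeLower..<(relativeLower + length)
}

let whole = "🌳ab🌳cd"
let tail = whole.dropFirst()            // "ab🌳cd"
let treeStart = tail.firstIndex(of: "🌳")!
let treeEnd = tail.index(after: treeStart)
let relative = utf16Offsets(of: treeStart..<treeEnd, in: tail)
assert(relative.lowerBound == tail.utf16.distance(from: tail.utf16.startIndex, to: treeStart))
```

Only the absolute measurement needs the bias; the length is already relative, which is one reason mixing the two kinds of calculation takes care.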
@lorentey lorentey force-pushed the string-utf16-speedup branch from b7d1174 to d00f8ed Compare December 29, 2022 04:08
@lorentey
Member Author

lorentey commented Dec 29, 2022

Evidently compare_perf_tests.py only produces output if the benchmark results are saved in the newer JSON format, which for some reason build-script -B does not use. (cc @tbkka)

Local benchmark results, including the new benchmarks in #62783:

Regression (30)
TEST OLD NEW DELTA RATIO
Breadcrumbs.IdxToUTF16.longASCII 59.04680187207488 84.07816907816908 +42.4% 0.70x
ArrayAppendFromGeneric 198.4879614767255 269.80059663997486 +35.9% 0.74x (?)
Data.append.Sequence.64kB.Count.RE 1.8508014796547472 2.2590885816692268 +22.0% 0.82x
StringDistance.utf16.ascii 6.20728 7.45696 +20.1% 0.83x
EqualSubstringSubstringGenericEquatable 19.57496532461857 22.51707220486646 +15.0% 0.87x
EqualSubstringSubstring 19.576396576396576 22.516187742816143 +15.0% 0.87x
EqualSubstringString 19.577644354087894 22.50591807101685 +15.0% 0.87x
EqualStringSubstring 19.578646718803128 22.500672 +14.9% 0.87x
LessSubstringSubstring 19.56947856947857 22.41163776292868 +14.5% 0.87x
LessSubstringSubstringGenericComparable 19.57826 22.387092322553936 +14.3% 0.87x
DataAppendDataSmallToLarge 14425.007425007425 16477.34988844699 +14.2% 0.88x (?)
MapReduceLazyCollectionShort 19.98152 22.214637287823727 +11.2% 0.90x
NSError 65.3837619334308 72.25508961197146 +10.5% 0.90x
DropWhileAnySeqCntRangeLazy 42.01219007314044 46.4047432379459 +10.5% 0.91x
DropWhileAnySeqCRangeIterLazy 42.025872310467726 46.3759 +10.4% 0.91x
DataSubscriptMedium 31.000099000099 34.14411514411515 +10.1% 0.91x
Breadcrumbs.MutatedIdxToUTF16.ASCII 2.047295047295047 2.249588 +9.9% 0.91x
SevenBoom 381.6302146618064 412.55905861456483 +8.1% 0.93x
KeyPathMutatingGetset 108.8859627016129 117.5388401170524 +7.9% 0.93x
DataAppendArray 1521.8304 1639.819037104594 +7.8% 0.93x
StringFromLongWholeSubstring 2.029816 2.183003183003183 +7.5% 0.93x
PopFrontArrayGeneric 1115.5941237649506 1190.6787844362398 +6.7% 0.94x (?)
ParseInt.UInt64.Hex 177.93763389669127 189.75278502014694 +6.6% 0.94x (?)
Breadcrumbs.MutatedUTF16ToIdx.ASCII 1.9687158748634994 2.0922980922980923 +6.3% 0.94x
ArrayAppendGenericStructs 428.54389438943895 455.2130076682761 +6.2% 0.94x (?)
KeyPathNestedClasses 107.853875 114.2388933226922 +5.9% 0.94x
RemoveWhereFilterInts 22.514584514584513 23.795157951579515 +5.7% 0.95x (?)
ObjectiveCBridgeStubToNSString 886.6678867740361 937.0666270666271 +5.7% 0.95x
Set.filter.Int50.20k 154.8442532942899 163.24413720686033 +5.4% 0.95x
StringDistance.characters.ascii 79.83466135458167 84.15054403264196 +5.4% 0.95x

Improvement (32)
TEST OLD NEW DELTA RATIO
StringDistance.utf16.mixed 3078.7476923076924 68.6646025345622 -97.8% 44.84x
ArrayAppendToGeneric 630.865521638041 192.96312593505235 -69.4% 3.27x
Breadcrumbs.UTF16ToIdxRange.longASCII 18.023405046810094 8.73442 -51.5% 2.06x
Breadcrumbs.IdxToUTF16Range.longASCII 21.808675 10.844276754214034 -50.3% 2.01x
DropWhileAnySeqCRangeIter 38.456924456924455 19.289611475337704 -49.8% 1.99x
DropWhileAnySeqCntRange 38.38824538824539 19.279018069198763 -49.8% 1.99x
Breadcrumbs.CopyUTF16CodeUnits.ASCII 15.17908379083791 9.088257176514354 -40.1% 1.67x
Breadcrumbs.CopyUTF16CodeUnits.Mixed 56.37827377112631 33.78485670914026 -40.1% 1.67x
Breadcrumbs.IdxToUTF16Range.longMixed 392.5728088336784 244.31128848346637 -37.8% 1.61x
ArrayAppendSequence 804.7557512070434 576.1588541666666 -28.4% 1.40x
RecursiveOwnedParameter 91.17581340650725 71.49056411302662 -21.6% 1.28x
DataReplaceLarge 14340.860215053763 11252.50505050505 -21.5% 1.27x
MapReduceAnyCollection 96.26754118364856 77.65943919139224 -19.3% 1.24x
ArrayAppendLazyMap 1286.3860693034653 1042.9625918503675 -18.9% 1.23x
Diffing.Myers.Similar 170.30787001386108 140.04724702380952 -17.8% 1.22x
ArrayAppendStrings 1748.4741097539995 1495.2138654794085 -14.5% 1.17x
ObjectiveCBridgeStubDataAppend 1283.2317397583033 1114.6469092938187 -13.1% 1.15x
ObserverUnappliedMethod 408.359375 354.93889716840533 -13.1% 1.15x
DataAppendDataMediumToLarge 16704.413619167717 14531.748998664887 -13.0% 1.15x (?)
ObserverForwarderStruct 286.27021464275214 254.11041990668738 -11.2% 1.13x (?)
DropWhileAnyCollectionLazy 46.40428425705542 42.343245967741936 -8.8% 1.10x
BufferFillFromSlice 15.37899437899438 14.13347287071325 -8.1% 1.09x
DataAppendDataLargeToMedium 16164.151164151164 14948.926403835363 -7.5% 1.08x (?)
Breadcrumbs.UTF16ToIdxRange.longMixed 107.66192532385065 100.14149162143026 -7.0% 1.08x
DictionaryOfAnyHashableStrings_insert 1815.0216284987278 1690.962402856543 -6.8% 1.07x
DataCreateMediumArray 556.3105963105963 520.218401747214 -6.5% 1.07x
DataCopyBytesSmall 49.60020800832033 46.539434104154914 -6.2% 1.07x
DataReplaceMedium 1948.034 1831.911631911632 -6.0% 1.06x
ArrayAppendArrayOfInt 210.16216216216213 197.6499495967742 -6.0% 1.06x
StringBuilderLong 524.0519691433211 493.8352679281285 -5.8% 1.06x
DictionaryOfAnyHashableStrings_lookup 1209.2004 1145.8896942351078 -5.2% 1.06x
Data.init.Sequence.64kB.Count 3.4351065414990316 3.2685272685272686 -4.8% 1.05x

@stephentyrone
Contributor

I'd like to understand what's going on with Breadcrumbs.IdxToUTF16.longASCII, but I'm happy to take this in the meantime.

@@ -201,6 +204,14 @@ extension String.UTF16View: BidirectionalCollection {
      return _foreignIndex(i, offsetBy: n)
    }

    let threshold = (
      i == startIndex ? _breadcrumbStride / 2 : _breadcrumbStride)
    if n.magnitude < threshold, !_guts.isASCII {
Contributor

Is the sense of this condition right? Isn't direct computation always faster when _guts.isASCII is true? What am I misreading?

Member Author

The _nativeGetIndex/_nativeGetOffset calls below have O(1) ASCII fast paths, so calling the advancing loop in _index(_:offsetBy:) would make that case worse. (I used to have a special case for _guts.isASCII inside this branch, but it seems simpler to just let the original code take care of it.)
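
To illustrate why the ASCII case has a constant-time path: in all-ASCII text every scalar is exactly one UTF-8 code unit and one UTF-16 code unit, so a UTF-16 distance reduces to an offset difference, whereas mixed text needs a walk over the scalars in between. The sketch below uses only public API. `walkedUTF16Distance` and `asciiUTF16Distance` are hypothetical names chosen for illustration; they are not the stdlib's internal `_nativeGetOffset` or `_index(_:offsetBy:)`.

```swift
/// Hypothetical helper: UTF-16 distance computed by visiting every Unicode
/// scalar between two scalar-aligned indices (the "naive" walk).
func walkedUTF16Distance(
  in s: String, from start: String.Index, to end: String.Index
) -> Int {
  precondition(start <= end, "expects non-decreasing indices")
  var count = 0
  for scalar in s.unicodeScalars[start..<end] {
    // 1 for scalars in the Basic Multilingual Plane, 2 for surrogate pairs.
    count += UTF16.width(scalar)
  }
  return count
}

/// Hypothetical helper: valid only when the caller knows `s` is all ASCII.
/// Each scalar is then one UTF-8 and one UTF-16 code unit, so the UTF-16
/// distance equals the UTF-8 distance between the same two indices.
func asciiUTF16Distance(
  in s: String, from start: String.Index, to end: String.Index
) -> Int {
  s.utf8.distance(from: start, to: end)
}
```

For ASCII input both helpers agree with `s.utf16.distance(from:to:)`, but only the second avoids touching the text in between, which is why sending the ASCII case through an advancing loop would be a pessimization.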

@lorentey
Member Author

The generated code looked reasonable enough, but the Breadcrumbs.IdxToUTF16.longASCII and StringDistance.utf16.ascii regressions may have been caused by the extra work to calculate and check the breadcrumb threshold. Moving the ASCII case into an up-front check eliminated the regressions.
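
The resulting order of checks can be sketched as a small decision function. This is a toy model rather than the stdlib code: `OffsetStrategy`, `strategy(forOffset:fromStartIndex:isASCII:)`, and the stride value of 64 are assumptions made for illustration. The halved threshold at `startIndex` mirrors the diff hunk above.

```swift
// Toy model of the dispatch order; every name here is hypothetical.
// Assumption: the real breadcrumb stride is a stdlib-internal detail.
let assumedBreadcrumbStride = 64

enum OffsetStrategy {
  case asciiOffsetArithmetic  // constant-time path for all-ASCII strings
  case directScalarWalk       // visit the scalars between the endpoints
  case breadcrumbLookup       // consult the UTF-16 breadcrumbs
}

func strategy(
  forOffset n: Int, fromStartIndex: Bool, isASCII: Bool
) -> OffsetStrategy {
  // 1. Handle ASCII up front, before any threshold is computed.
  if isASCII { return .asciiOffsetArithmetic }

  // 2. Use half a stride when starting at startIndex, a full stride otherwise.
  let threshold = fromStartIndex
    ? assumedBreadcrumbStride / 2
    : assumedBreadcrumbStride

  // 3. Short trips walk directly; longer ones use breadcrumbs.
  return abs(n) < threshold ? .directScalarWalk : .breadcrumbLookup
}
```

With this ordering an ASCII string never pays for the threshold computation, which is the cost suspected above of causing the ASCII regressions.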

Frustratingly, though, this resulted in 130-135% regressions for StringDistance.scalars.ascii, StringDistance.utf8.ascii, and StringDistance.characters.ascii, whose code paths aren't even affected by this PR. Re-running the baseline benchmarks made these disappear, so perhaps my machine just doesn't feel like doing ASCII workloads today. ¯\_(ツ)_/¯

(FWIW, the new -20% improvement to Breadcrumbs.IdxToUTF16.longASCII remained the same with both baselines.)

Regression (14)
TEST OLD NEW DELTA RATIO
ArrayAppendLazyMap 869.5159515951594 1095.4270670826832 +26.0% 0.79x
ObjectiveCBridgeStubDataAppend 1127.2771254941295 1338.6282086668507 +18.7% 0.84x
DataAppendDataLargeToMedium 9997.725302187995 11734.009360374415 +17.4% 0.85x
ObserverForwarderStruct 278.77053140096615 324.9079453972698 +16.6% 0.86x
RawBufferCopyBytes 13.382943829438295 14.951712 +11.7% 0.90x
RemoveWhereFilterStrings 149.8489108910891 163.7511528064875 +9.3% 0.92x (?)
ArrayAppendAscii 1717.6146839635662 1845.3976 +7.4% 0.93x
ArrayInitFromSlice 187.10742574257426 199.95243282498186 +6.9% 0.94x
ArrayAppendLatin1Substring 9303.454016298021 9873.432494279177 +6.1% 0.94x (?)
Breadcrumbs.UTF16ToIdx.longASCII 54.91656766389459 58.213640922768306 +6.0% 0.94x
String.replaceSubrange.String 8.290125160500642 8.78018678018678 +5.9% 0.94x
KeyPathNestedClasses 110.76098552197104 116.9216559675095 +5.6% 0.95x
Data.init.Sequence.809B.Count.RE 12.5098685888173 13.177822956052516 +5.3% 0.95x
SortStringsUnicode 1584.103471520054 1667.9918323863637 +5.3% 0.95x
Improvement (46)
TEST OLD NEW DELTA RATIO
StringDistance.utf16.mixed 3104.5216049382716 70.24611175115207 -97.7% 44.19x
Breadcrumbs.IdxToUTF16Range.longASCII 21.951630709784258 5.664139664139664 -74.2% 3.88x
ArrayAppendFromGeneric 560.7700477960701 194.43880428652 -65.3% 2.88x
StringDistance.utf16.ascii 14.596465964659647 7.646499646499646 -47.6% 1.91x
Breadcrumbs.CopyUTF16CodeUnits.Mixed 57.14503219070819 34.69724770642202 -39.3% 1.65x
Breadcrumbs.IdxToUTF16Range.longMixed 394.3499653499654 245.21128259712614 -37.8% 1.61x
ArraySetElement 353.1861471861472 221.96106275767292 -37.2% 1.59x
Breadcrumbs.UTF16ToIdxRange.longASCII 18.58256 12.890522591497012 -30.6% 1.44x
ArrayAppendSequence 812.8057094162762 598.9427135678392 -26.3% 1.36x
StringEqualPointerComparison 125.63169642857143 93.95024181547619 -25.2% 1.34x
Breadcrumbs.CopyUTF16CodeUnits.ASCII 15.385172466552199 11.926485411883295 -22.5% 1.29x
Breadcrumbs.IdxToUTF16.longASCII 59.73165589935396 47.25780321849012 -20.9% 1.26x
MapReduceAnyCollection 96.97548926467795 77.6663849923948 -19.9% 1.25x
DataToStringEmpty 546.2048292173148 442.1126845073803 -19.1% 1.24x
ArrayAppendStrings 1811.895657809462 1480.6139111320458 -18.3% 1.22x
EqualStringSubstring 22.844157324719742 19.583134573594105 -14.3% 1.17x
LessSubstringSubstring 22.67857307143843 19.56976 -13.7% 1.16x
EqualSubstringSubstring 22.716634716634715 19.616857869725916 -13.6% 1.16x
EqualSubstringSubstringGenericEquatable 22.715052580630967 19.61774 -13.6% 1.16x
LessSubstringSubstringGenericComparable 22.654744928469572 19.58526 -13.5% 1.16x
EqualSubstringString 22.614241685450114 19.58864 -13.4% 1.15x
DataAppendDataMediumToLarge 10306.288570408527 9105.70152543356 -11.6% 1.13x (?)
Breadcrumbs.UTF16ToIdxRange.longMixed 110.3634699853587 99.31144335825186 -10.0% 1.11x
DataToStringSmall 1081.4801036816589 979.1837499999999 -9.5% 1.10x
DataAppendDataLargeToSmall 9846.957033054727 8925.952653348588 -9.4% 1.10x (?)
DataReplaceLarge 10516.208851100937 9537.983674334782 -9.3% 1.10x (?)
Prims.NonStrongRef.UnownedUnsafe 107.8287087273291 97.86537915364953 -9.2% 1.10x
FindString.Loop1.Substring 273.5505780346821 249.09390125847048 -8.9% 1.10x
Prims.NonStrongRef.UnownedUnsafe.Closure 107.61694595232476 98.0281887892861 -8.9% 1.10x
RemoveWhereQuadraticInts 680.3831026948288 619.997383911053 -8.9% 1.10x
ArrayAppendToGeneric 635.9151156232374 579.7549980399843 -8.8% 1.10x
ParseInt.UInt64.Hex 213.81696428571428 196.3216401990993 -8.2% 1.09x
Data.init.Sequence.64kB.Count.RE 2.205452821811287 2.0374790749581497 -7.6% 1.08x
DataToStringMedium 1667.6226581891128 1550.8512 -7.0% 1.08x
Diffing.Similar 163.00455486542444 152.12760952660983 -6.7% 1.07x
Set.subtracting.Seq.Empty.Int 97.17290706605223 90.76932769327694 -6.6% 1.07x
DataCopyBytesSmall 50.27175 47.0286 -6.5% 1.07x
Diffing.Myers.Similar 154.30286771507863 144.3515050959943 -6.4% 1.07x
LineSink.scalars.alpha 34.33507871204651 32.155296 -6.3% 1.07x
ArrayAppendUTF16Substring 9867.688258064516 9247.880184331798 -6.3% 1.07x
FlattenListFlatMap 4848.395721925133 4557.5641025641025 -6.0% 1.06x
Diffing.Pangrams 1592.9009584664536 1498.0873493975903 -6.0% 1.06x
Set.subtracting.Seq.Int.Empty 103.92119901112484 97.8462 -5.8% 1.06x
Diffing.ReversedLorem 563.8752107925801 532.61899503037 -5.5% 1.06x
Diffing.ReversedAlphabets 115.1374007936508 109.21288851554063 -5.1% 1.05x
Diffing.Disparate 94.24801315428884 89.5383295194508 -5.0% 1.05x

@lorentey
Member Author

@swift-ci test

@stephentyrone
Contributor

OK. At this point I'm happy to take this once we get a clean test run.

Evidently we did not have any tests that exercised
`distance(from:to:)` and `index(_:offsetBy:)`. :-O
- Align input indices to scalar boundaries
- Don’t pass decreasing indices to _utf16Distance
@lorentey lorentey force-pushed the string-utf16-speedup branch from c564b1f to 5d354ce Compare January 2, 2023 04:58
@lorentey
Member Author

lorentey commented Jan 2, 2023

It turns out we did not have proper test coverage for String.UTF16View.distance(from:to:) and .index(_:offsetBy:). 😨
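
A minimal sketch of the kind of exhaustive check that would cover these two methods, assuming scalar-aligned inputs only. `checkUTF16Navigation` is a hypothetical helper and not the test that was added to the repository. It builds a reference table of UTF-16 offsets by walking scalars, then compares every pair of indices in both directions.

```swift
/// Hypothetical test helper: cross-checks `distance(from:to:)` and
/// `index(_:offsetBy:)` on the UTF-16 view against reference offsets
/// obtained by walking the Unicode scalars one at a time.
func checkUTF16Navigation(_ s: String) -> Bool {
  // Scalar-aligned indices (including endIndex) and their UTF-16 offsets.
  var indices: [String.Index] = []
  var offsets: [Int] = []
  var offset = 0
  for i in s.unicodeScalars.indices {
    indices.append(i)
    offsets.append(offset)
    offset += UTF16.width(s.unicodeScalars[i])
  }
  indices.append(s.endIndex)
  offsets.append(offset)

  // Every ordered pair, so both forward and backward travel is exercised.
  for a in 0 ..< indices.count {
    for b in 0 ..< indices.count {
      let expected = offsets[b] - offsets[a]
      if s.utf16.distance(from: indices[a], to: indices[b]) != expected {
        return false
      }
      if s.utf16.index(indices[a], offsetBy: expected) != indices[b] {
        return false
      }
    }
  }
  return true
}
```

Running it on inputs longer than one breadcrumb stride, for example `String(repeating: "a\u{1F600}\u{E9}", count: 50)`, exercises both short and long trips.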

The last commits do not materially change benchmark results. The exception is the ASCII StringDistance tasks, which uniformly slowed down again by a factor of 10-20 or so, in both the baseline and PR measurements. My top guess is that we may not be properly configuring the QoS level of the benchmark process, so we get unstable or inconsistent scheduling and affinity. In any case, the measurements seem to be stable across short timespans, so the relative differences seem valid.

Regression (17)
TEST OLD NEW DELTA RATIO
ExclusivityGlobal 0.0 0.096899 +9689.9% 0.01x
ObjectiveCBridgeASCIIStringFromFile 0.0 0.004024 +402.4% 0.20x
ArrayAppendGenericStructs 391.9276927692769 1389.1610347238407 +254.4% 0.28x
RemoveWhereFilterStrings 141.9654919236417 182.38476190476192 +28.5% 0.78x
Data.append.Sequence.64kB.Count.RE 2.201877070298123 2.7967163534240655 +27.0% 0.79x
ArrayAppendStrings 1468.2893054963783 1817.191887675507 +23.8% 0.81x
ArrayAppendToGeneric 600.2321778940484 728.305424099017 +21.3% 0.82x
PrefixWhileArray 48.08232131714107 57.8755037510075 +20.4% 0.83x
ObjectiveCBridgeStubDataAppend 1120.3197158081705 1318.650780872645 +17.7% 0.85x
Data.append.Sequence.64kB.Count0 162.36250236250237 190.43015963511974 +17.3% 0.85x
ProtocolDispatch 186.39279869067104 217.65001380071763 +16.8% 0.86x
BufferFillFromSlice 13.994899872496813 15.785969334173492 +12.8% 0.89x (?)
DataAppendBytesMedium 1525.1499210941608 1695.712749946248 +11.2% 0.90x
DataAppendDataMediumToMedium 1637.6751721344676 1781.0147849462364 +8.8% 0.92x
Data.init.Sequence.511B.Count.I 15.715491318265594 17.030176565008027 +8.4% 0.92x
MapReduceClass2 7.854055416221665 8.330853323413294 +6.1% 0.94x (?)
String.replaceSubrange.String 8.23093200846444 8.679880439043512 +5.5% 0.95x
Improvement (63)
TEST OLD NEW DELTA RATIO
StringDistance.utf16.mixed 3084.3258785942494 67.40059642147118 -97.8% 45.76x
SubstringRemoveLast1 0.038172038172038174 0.0 -97.4% 39.17x
Breadcrumbs.IdxToUTF16Range.longASCII 21.762324197187155 4.983150983150983 -77.1% 4.37x
StringDistance.utf16.ascii 267.38345070422537 67.98451100870456 -74.6% 3.93x
Breadcrumbs.IdxToUTF16.longASCII 59.16057134971018 27.98367957566897 -52.7% 2.11x
Breadcrumbs.CopyUTF16CodeUnits.Mixed 55.945609945609945 33.54770154770155 -40.0% 1.67x
Breadcrumbs.IdxToUTF16Range.longMixed 391.2633033863165 243.45650999459752 -37.8% 1.61x
ArrayAppendSequence 753.0925000000001 492.6215621562156 -34.6% 1.53x (?)
Breadcrumbs.UTF16ToIdxRange.longASCII 18.345242761942096 12.442959657757946 -32.2% 1.47x
DataToStringEmpty 569.0881976991905 420.6482706482706 -26.1% 1.35x
DataAppendDataMediumToLarge 16161.067832034962 12075.701839303001 -25.3% 1.34x
StringEqualPointerComparison 124.20721671238658 93.23564615672339 -24.9% 1.33x
Breadcrumbs.CopyUTF16CodeUnits.ASCII 15.173112558013022 11.411979119791198 -24.8% 1.33x
ObjectiveCBridgeStubDateAccess 157.60916442048517 124.43265380414998 -21.0% 1.27x
MapReduceAnyCollection 96.22638999733972 77.74146649810366 -19.2% 1.24x
Breadcrumbs.MutatedIdxToUTF16.ASCII 2.022464 1.665037665037665 -17.7% 1.21x
ArrayOfPOD 350.18941798941796 289.24116424116426 -17.4% 1.21x
DataToStringSmall 1117.1036275285344 947.758556547619 -15.2% 1.18x
EqualStringSubstring 22.663642663642662 19.460661662496523 -14.1% 1.16x
LessSubstringSubstringGenericComparable 22.557711557711556 19.41238729909839 -13.9% 1.16x
LessSubstringSubstring 22.494775 19.416030099190515 -13.7% 1.16x
EqualSubstringSubstring 22.5104643557881 19.457371643900142 -13.6% 1.16x
EqualSubstringString 22.45770080044162 19.41766 -13.5% 1.16x
EqualSubstringSubstringGenericEquatable 22.441564896336267 19.443331443331445 -13.4% 1.15x
Diffing.Similar 175.83179723502303 153.80505693881491 -12.5% 1.14x
Breadcrumbs.UTF16ToIdxRange.longMixed 107.91 95.26484918793504 -11.7% 1.13x
Diffing.Pangrams 1708.518450184502 1509.771285475793 -11.6% 1.13x
RemoveWhereFilterInts 25.285674855092466 22.59664 -10.6% 1.12x
DataToStringMedium 1689.2884645002312 1521.499488976493 -9.9% 1.11x
CharIteration_utf16_unicodeScalars 1893.7649880095923 1707.4799065154193 -9.8% 1.11x
DataAppendDataSmallToMedium 1551.1783517835179 1401.3136213136213 -9.7% 1.11x
NSStringConversion.MutableCopy.Rebridge.UTF8 234.9656946826758 213.0114521300962 -9.3% 1.10x
Diffing.Myers.Similar 157.1981512206684 143.80248265790433 -8.5% 1.09x
MapReduceClassShort2 52.74447447924948 48.408788351534064 -8.2% 1.09x
Diffing.ReversedLorem 590.7385670731708 544.2202312138728 -7.9% 1.09x
OpenClose 46.94776558212466 43.40500381000762 -7.5% 1.08x
Prims.NonStrongRef.UnownedUnsafe 103.13078976291412 95.8390214436726 -7.1% 1.08x
Prims.NonStrongRef.UnownedUnsafe.Closure 102.9244159413651 95.97832444520901 -6.7% 1.07x
Breadcrumbs.IdxToUTF16.longMixed 740.9079878665318 691.5965965965966 -6.7% 1.07x
MapReduceShortString 5.3356083356083355 4.98558648111332 -6.6% 1.07x
UTF8Decode_InitFromBytes_ascii_as_ascii 258.40736607142856 242.27836611195158 -6.2% 1.07x
Data.init.Sequence.809B.Count.RE 13.232027176489177 12.4134943875061 -6.2% 1.07x
DataToStringLargeUnicode 2340.9364480261142 2198.406374501992 -6.1% 1.06x
Dict.CopyKeyValue.16k 653.0077101002313 613.7950949367089 -6.0% 1.06x (?)
ArrayPlusEqualSingleElementCollection 382.17956 360.44699717855707 -5.7% 1.06x (?)
Data.append.Sequence.64kB.Count.I 3.5885543295305777 3.386308 -5.6% 1.06x
NSStringConversion.MutableCopy.UTF8 310.12393767705385 292.7287784679089 -5.6% 1.06x
Data.init.Sequence.809B.Count.RE.I 13.14628136200717 12.410035250881272 -5.6% 1.06x
DataAppendArray 1631.0764430577224 1541.2030258662762 -5.5% 1.06x
LineSink.scalars.alpha 33.90805362080431 32.03970203970204 -5.5% 1.06x
Dict.CopyKeyValue.20k 771.5848506919156 729.1044650379107 -5.5% 1.06x
NSStringConversion.MutableCopy.Rebridge.LongUTF8 205.6728742955158 194.56545454545454 -5.4% 1.06x
NSStringConversion.MutableCopy.Rebridge.Medium 273.20762398223536 258.5820895522388 -5.4% 1.06x
ArrayAppendFromGeneric 197.9175101214575 187.41782868525894 -5.3% 1.06x (?)
SIMDReduce.Int8x64.Cast 57.92093250733616 54.88939828080229 -5.2% 1.06x
Data.append.Sequence.809B.Count.RE 23.575889106692802 22.35225522552255 -5.2% 1.05x
Diffing.PangramToAlphabet 681.0186170212766 646.2713414634146 -5.1% 1.05x
Set.filter.Int50.20k 165.60588235294117 157.25940594059406 -5.0% 1.05x
SetIntersectionBox25 96.7695007800312 91.91776746400095 -5.0% 1.05x
Prims.NonStrongRef.UnownedSafe.Closure 251.12207527975585 238.57288732394366 -5.0% 1.05x
DropWhileAnySequenceLazy 510.15220048899755 484.6727073036793 -5.0% 1.05x
DataReplaceLarge 13504.339473386824 12838.700114025085 -4.9% 1.05x (?)
MapReduceAnyCollectionShort 528.2943416757346 502.3500588004704 -4.9% 1.05x
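
For readers checking the tables by hand, the DELTA and RATIO columns appear to be derived from OLD and NEW with a small smoothing term. The sketch below is a reconstruction inferred from the rows above, including the ones where OLD or NEW is 0.0. The `compare` function and the epsilon of 0.001 are assumptions, not a quotation of the benchmark tooling.

```swift
/// Hypothetical reconstruction of the DELTA and RATIO columns.
/// Assumption: a small epsilon is added to both measurements, which keeps
/// the result finite when one of them is reported as 0.0.
func compare(
  old: Double, new: Double, epsilon: Double = 0.001
) -> (deltaPercent: Double, ratio: Double) {
  // RATIO above 1 means NEW is faster than OLD.
  let ratio = (old + epsilon) / (new + epsilon)
  // DELTA is the relative change of NEW with respect to OLD, in percent.
  let deltaPercent = ((new + epsilon) / (old + epsilon) - 1) * 100
  return (deltaPercent, ratio)
}
```

Plugging in the StringDistance.utf16.mixed row (OLD 3084.3258785942494, NEW 67.40059642147118) gives roughly -97.8% and 45.76x, and the ExclusivityGlobal row (OLD 0.0, NEW 0.096899) gives roughly +9689.9% and 0.01x, both matching the table.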

@lorentey
Member Author

lorentey commented Jan 2, 2023

@swift-ci test

…ithms

[Bidirectional]Collection’s default index manipulation methods (as
well as _utf16Distance) do not expect to be given unreachable
indices, and they tend to fail when operating on them. Round indices
down to the nearest scalar boundary before calling these.
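
The rounding step in this commit message can be illustrated on raw UTF-8 bytes. This is a minimal sketch that assumes valid UTF-8 input; `scalarAlignedOffset` is a hypothetical name and not the stdlib's internal alignment routine.

```swift
/// Hypothetical helper: rounds a UTF-8 code unit offset down to the start
/// of the Unicode scalar that contains it. Assumes `utf8` is valid UTF-8.
func scalarAlignedOffset(_ offset: Int, in utf8: [UInt8]) -> Int {
  precondition(offset >= 0 && offset <= utf8.count, "offset out of bounds")
  var i = offset
  // Continuation bytes have the bit pattern 10xxxxxx; step back past them
  // until a lead byte (or the end position) is reached.
  while i > 0, i < utf8.count, utf8[i] & 0xC0 == 0x80 {
    i -= 1
  }
  return i
}
```

For the bytes of `"a\u{1F600}"`, offsets 2, 3, and 4 fall inside the four-byte emoji and all round down to 1, while offsets 0, 1, and 5 are already on scalar boundaries and are returned unchanged.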
@lorentey
Member Author

lorentey commented Jan 4, 2023

@swift-ci test

@stephentyrone
Contributor

Let's clone this for 5.8 too, once you're ready.

@lorentey
Member Author

lorentey commented Jan 4, 2023

Failed Tests (16):
  Swift(macosx-x86_64) :: bindings-build-record.swift
  Swift(macosx-x86_64) :: check-interface-implementation-fine.swift
  Swift(macosx-x86_64) :: crash-added-fine.swift
  Swift(macosx-x86_64) :: crash-simple-fine.swift
  Swift(macosx-x86_64) :: dependencies-preservation-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-arguments-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-conflicting-arguments-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-inputs-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-malformed-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-mutual-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-swift-version-fine.swift
  Swift(macosx-x86_64) :: fail-added-fine.swift
  Swift(macosx-x86_64) :: fail-chained-fine.swift
  Swift(macosx-x86_64) :: fail-interface-hash-fine.swift
  Swift(macosx-x86_64) :: fail-simple-fine.swift
  Swift(macosx-x86_64) :: independent-fine.swift

@lorentey
Member Author

lorentey commented Jan 4, 2023

@swift-ci test macOS platform

@lorentey
Member Author

lorentey commented Jan 4, 2023

@swift-ci test macOS platform

@lorentey
Member Author

lorentey commented Jan 5, 2023

java.nio.file.FileSystemException: /Users/ec2-user/jenkins/workspace/swift-PR-macos@tmp/durable-45f8247f: No space left on device

@lorentey
Member Author

lorentey commented Jan 5, 2023

@swift-ci smoke test macOS platform

@lorentey lorentey merged commit 4ffc5fe into swiftlang:main Jan 5, 2023
@lorentey lorentey deleted the string-utf16-speedup branch January 5, 2023 05:20
lorentey added a commit to lorentey/swift that referenced this pull request Feb 11, 2023
This is a wild guess at what might be causing our persistent, random
String failures on the main branch:

```
  Swift(macosx-x86_64) :: Prototypes/CollectionTransformers.swift
  Swift(macosx-x86_64) :: stdlib/NSSlowString.swift
  Swift(macosx-x86_64) :: stdlib/NSStringAPI.swift
  Swift(macosx-x86_64) :: stdlib/StringIndex.swift
  Swift-validation(macosx-x86_64) :: stdlib/String.swift
  Swift-validation(macosx-x86_64) :: stdlib/StringBreadcrumbs.swift
  Swift-validation(macosx-x86_64) :: stdlib/StringUTF8.swift
```

FWIW, it appears this is *not* caused by swiftlang#62717:
that change has also landed on release/5.8, and I haven’t seen these
issues on that branch.

Our atomic breadcrumbs initialization vs its non-atomic loading
gives me an uneasy feeling that this may in fact be a long-standing
synchronization issue that is only now causing problems (for whatever
reason). I am unable to reproduce these issues locally, so this guess
may be (and probably is) wildly off the mark, but this PR is likely
to be a good idea anyway, if only to rule out this possibility.

rdar://104751936
atrick pushed a commit to atrick/swift that referenced this pull request Feb 13, 2023
(cherry picked from commit 73f349c)