[stdlib] Speed up short UTF-16 distance calculations #62717

lorentey · 2022-12-21T03:11:45Z

Previously we insisted on using UTF-16 breadcrumbs even if we only needed to travel a very short way. This could be as much as ~~ten~~ forty times slower than the naive algorithm of simply visiting all the Unicode scalars in between the start and the end.

(Using breadcrumbs generally means that we need to walk to both endpoints from their nearest breadcrumb, which on average requires walking half the distance between breadcrumbs, twice — and this can mean visiting vastly more Unicode scalars than if we simply walked through the ones that are lying in between the endpoints themselves.)

To put it another way, when we want to measure how long it takes to walk between two trees within a nearby park, it probably isn't a great idea to start by separately measuring each of their distances from the nearest airport. 😛
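To make the trade-off concrete, here is a toy model, not the stdlib implementation. It spaces breadcrumbs by scalar count rather than by UTF-16 offset, purely to keep the sketch short. `ToyBreadcrumbs`, `utf16Width`, and the visit counters are hypothetical names used only for illustration.

```swift
/// Number of UTF-16 code units needed to encode one scalar.
func utf16Width(_ scalar: Unicode.Scalar) -> Int {
    scalar.value > 0xFFFF ? 2 : 1
}

/// Toy stand-in for breadcrumbs. NOT the stdlib's data structure.
struct ToyBreadcrumbs {
    let scalars: [Unicode.Scalar]
    let stride: Int
    /// crumbs[k] is the UTF-16 offset of scalar position k * stride.
    let crumbs: [Int]

    init(_ string: String, stride: Int) {
        let scalars = Array(string.unicodeScalars)
        var crumbs: [Int] = []
        var offset = 0
        for position in 0...scalars.count {
            if position % stride == 0 { crumbs.append(offset) }
            if position < scalars.count { offset += utf16Width(scalars[position]) }
        }
        self.scalars = scalars
        self.stride = stride
        self.crumbs = crumbs
    }

    /// Naive approach: visit only the scalars between the endpoints.
    func directDistance(_ range: Range<Int>) -> (distance: Int, visited: Int) {
        let distance = scalars[range].reduce(0) { $0 + utf16Width($1) }
        return (distance, range.count)
    }

    /// UTF-16 offset of a scalar position, walking forward from the preceding crumb.
    func offset(of position: Int) -> (offset: Int, visited: Int) {
        let crumb = position / stride
        let start = crumb * stride
        let walked = scalars[start..<position].reduce(0) { $0 + utf16Width($1) }
        return (crumbs[crumb] + walked, position - start)
    }

    /// Breadcrumb approach: resolve both endpoints separately, then subtract.
    func breadcrumbDistance(_ range: Range<Int>) -> (distance: Int, visited: Int) {
        let lower = offset(of: range.lowerBound)
        let upper = offset(of: range.upperBound)
        return (upper.offset - lower.offset, lower.visited + upper.visited)
    }
}

let text = String(repeating: "a😀", count: 100)  // 200 scalars
let toy = ToyBreadcrumbs(text, stride: 64)
let direct = toy.directDistance(70..<73)
let viaCrumbs = toy.breadcrumbDistance(70..<73)
print(direct, viaCrumbs)
```

In this toy setup both approaches agree on the distance, but the direct walk touches 3 scalars while the breadcrumb route touches 15, because each endpoint is measured from the crumb at position 64.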

rdar://103575481

lorentey · 2022-12-21T03:12:50Z

@swift-ci test

lorentey · 2022-12-21T03:12:58Z

@swift-ci benchmark

lorentey · 2022-12-21T17:43:33Z

Benchmark results are... peculiar.

------- Performance (x86_64): -O -------

REGRESSION                                OLD        NEW        DELTA    RATIO    
Breadcrumbs.IdxToUTF16.longASCII          128.556    198.545    +54.4%   **0.65x**
FlattenListFlatMap                        3041.0     4311.0     +41.8%   **0.71x (?)**
Breadcrumbs.MutatedIdxToUTF16.ASCII       3.588      4.316      +20.3%   **0.83x**
MapReduceAnyCollection                    123.667    143.7      +16.2%   **0.86x (?)**
SortAdjacentIntPyramids                   763.333    865.714    +13.4%   **0.88x (?)**
SortIntPyramid                            484.333    545.333    +12.6%   **0.89x**
Set.isDisjoint.Seq.Int.Empty              39.757     44.286     +11.4%   **0.90x (?)**
Set.isSuperset.Seq.Empty.Int              43.0       47.538     +10.6%   **0.90x (?)**
Set.isDisjoint.Box.Empty                  49.021     54.19      +10.5%   **0.90x (?)**
Set.isSubset.Seq.Int25                    47.281     51.08      +8.0%    **0.93x (?)**

IMPROVEMENT                               OLD        NEW        DELTA    RATIO    
ArrayAppendGenericStructs                 1522.0     610.0      -59.9%   **2.50x (?)**
UTF8Decode_InitDecoding                   163.556    131.0      -19.9%   **1.25x (?)**
UTF8Decode_InitFromCustom_contiguous      160.2      131.231    -18.1%   **1.22x (?)**
MapReduceClass2                           12.479     10.634     -14.8%   **1.17x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed     397.0      341.333    -14.0%   **1.16x**
UTF8Decode_InitFromCustom_noncontiguous   283.286    249.375    -12.0%   **1.14x (?)**
DataAppendDataSmallToSmall                2723.333   2518.462   -7.5%    **1.08x (?)**
ObjectiveCBridgeStubToNSStringRef         92.808     86.667     -6.6%    **1.07x (?)**

------- Code size: -O -------

REGRESSION        OLD     NEW     DELTA    RATIO  
Diffing.o         7549    7667    +1.6%    **0.98x**

IMPROVEMENT       OLD     NEW     DELTA    RATIO  
ReduceInto.o      14328   10896   -24.0%   **1.31x**
IndexPathTest.o   8386    7946    -5.2%    **1.06x**
MapReduce.o       27754   26723   -3.7%    **1.04x**
RemoveWhere.o     15171   14811   -2.4%    **1.02x**

------- Performance (x86_64): -Osize -------

REGRESSION                                               OLD        NEW        DELTA    RATIO    
PrefixWhileSequenceLazy                                  26.846     53.619     +99.7%   **0.50x**
PrefixWhileArrayLazy                                     20.229     40.25      +99.0%   **0.50x**
PrefixArrayLazy                                          13.958     26.852     +92.4%   **0.52x**
DropFirstCountableRangeLazy                              14.657     26.852     +83.2%   **0.55x**
Breadcrumbs.IdxToUTF16Range.longASCII                    34.435     62.649     +81.9%   **0.55x**
DropLastArrayLazy                                        5.115      9.0        +75.9%   **0.57x**
DropWhileCountableRangeLazy                              40.273     67.053     +66.5%   **0.60x**
Breadcrumbs.IdxToUTF16.longASCII                         128.556    201.545    +56.8%   **0.64x**
DropLastCountableRangeLazy                               5.844      9.0        +54.0%   **0.65x (?)**
DropWhileArrayLazy                                       44.926     67.207     +49.6%   **0.67x**
DropWhileArray                                           26.627     38.169     +43.3%   **0.70x**
MapReduceLazySequence                                    65.933     87.722     +33.0%   **0.75x (?)**
Breadcrumbs.MutatedIdxToUTF16.ASCII                      3.59       4.316      +20.2%   **0.83x (?)**
PrefixWhileAnySequence                                   185.4      219.889    +18.6%   **0.84x (?)**
PrefixWhileSequence                                      185.714    219.75     +18.3%   **0.85x (?)**
MapReduceAnyCollection                                   160.556    184.375    +14.8%   **0.87x (?)**
DropWhileAnySeqCntRange                                  107.308    120.625    +12.4%   **0.89x (?)**
PrefixWhileAnySeqCRangeIter                              128.727    142.6      +10.8%   **0.90x (?)**
PrefixWhileAnySeqCntRange                                129.0      142.769    +10.7%   **0.90x (?)**
PrefixWhileAnyCollectionLazy                             121.143    134.0      +10.6%   **0.90x (?)**
DropFirstAnyCollection                                   114.0      123.875    +8.7%    **0.92x (?)**

IMPROVEMENT                                              OLD        NEW        DELTA    RATIO    
DropWhileCountableRange                                  26.852     13.575     -49.4%   **1.98x**
DropFirstCountableRange                                  26.852     13.58      -49.4%   **1.98x**
PrefixCountableRange                                     26.852     13.717     -48.9%   **1.96x**
PrefixArray                                              26.844     13.854     -48.4%   **1.94x**
DropLastCountableRange                                   9.034      4.74       -47.5%   **1.91x**
DropLastArray                                            9.022      4.84       -46.3%   **1.86x**
DropWhileSequence                                        26.842     14.555     -45.8%   **1.84x (?)**
SuffixCountableRange                                     9.034      4.924      -45.5%   **1.83x (?)**
SuffixArray                                              9.0        5.25       -41.7%   **1.71x**
MapReduceLazyCollectionShort                             50.083     31.25      -37.6%   **1.60x (?)**
PrefixSequence                                           40.25      26.885     -33.2%   **1.50x (?)**
PrefixSequenceLazy                                       40.214     26.885     -33.1%   **1.50x**
RemoveWhereSwapInts                                      13.822     9.605      -30.5%   **1.44x (?)**
UTF8Decode_InitFromCustom_noncontiguous                  359.0      258.429    -28.0%   **1.39x**
UTF8Decode_InitFromCustom_noncontiguous_ascii            831.5      618.333    -25.6%   **1.34x (?)**
DropFirstSequence                                        44.864     33.52      -25.3%   **1.34x (?)**
DropFirstSequenceLazy                                    44.864     33.56      -25.2%   **1.34x (?)**
DropWhileSequenceLazy                                    76.176     58.238     -23.5%   **1.31x (?)**
SortAdjacentIntPyramids                                  945.833    726.111    -23.2%   **1.30x**
UTF8Decode_InitFromCustom_noncontiguous_ascii_as_ascii   932.5      720.333    -22.8%   **1.29x (?)**
UTF8Decode_InitDecoding                                  161.273    129.909    -19.4%   **1.24x (?)**
PrefixWhileArray                                         54.057     44.455     -17.8%   **1.22x (?)**
SortIntPyramid                                           612.692    504.5      -17.7%   **1.21x (?)**
UTF8Decode_InitFromCustom_contiguous                     161.455    133.846    -17.1%   **1.21x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed                    396.5      339.667    -14.3%   **1.17x (?)**
FlattenListLoop                                          1614.0     1385.0     -14.2%   **1.17x (?)**
RemoveWhereMoveInts                                      13.643     11.735     -14.0%   **1.16x (?)**
DropWhileAnyCollectionLazy                               161.286    143.727    -10.9%   **1.12x (?)**
PrefixWhileAnyCollection                                 151.909    138.667    -8.7%    **1.10x (?)**
Chars2                                                   3450.0     3168.519   -8.2%    **1.09x (?)**
StringSwitch                                             237.375    221.0      -6.9%    **1.07x (?)**
DataCreateEmptyArray                                     1610.714   1501.786   -6.8%    **1.07x (?)**

------- Code size: -Osize -------

REGRESSION        OLD     NEW     DELTA   RATIO  
RemoveWhere.o     12053   12531   +4.0%   **0.96x**
Diffing.o         6718    6838    +1.8%   **0.98x**
RandomTree.o      11032   11168   +1.2%   **0.99x**
BufferFill.o      9424    9522    +1.0%   **0.99x**

IMPROVEMENT       OLD     NEW     DELTA   RATIO  
IndexPathTest.o   7326    7058    -3.7%   **1.04x**
ReduceInto.o      7987    7871    -1.5%   **1.01x**
MapReduce.o       20602   20307   -1.4%   **1.01x**

------- Performance (x86_64): -Onone -------

REGRESSION                               OLD       NEW       DELTA    RATIO    
Breadcrumbs.IdxToUTF16Range.longASCII    230.667   255.625   +10.8%   **0.90x**
Breadcrumbs.IdxToUTF16.longASCII         1016.0    1103.5    +8.6%    **0.92x (?)**

IMPROVEMENT                              OLD       NEW       DELTA    RATIO    
UTF8Decode_InitFromCustom_contiguous     173.308   138.417   -20.1%   **1.25x (?)**
UTF8Decode_InitDecoding                  167.25    136.75    -18.2%   **1.22x (?)**
DataSubscriptMedium                      65.313    56.632    -13.3%   **1.15x (?)**
ArrayAppendLatin1Substring               26136.0   23052.0   -11.8%   **1.13x (?)**
ArrayAppendAsciiSubstring                25776.0   22752.0   -11.7%   **1.13x (?)**
RC4                                      12832.0   11346.0   -11.6%   **1.13x (?)**
ArrayAppendUTF16Substring                25788.0   22836.0   -11.4%   **1.13x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed    610.0     555.0     -9.0%    **1.10x (?)**
UTF8Decode_InitDecoding_ascii_as_ascii   207.545   192.3     -7.3%    **1.08x (?)**
TypeName                                 1533.0    1427.0    -6.9%    **1.07x (?)**

------- Code size: -swiftlibs -------

stephentyrone · 2022-12-21T17:52:07Z

How stale is the baseline for the perf tests?

stephentyrone · 2022-12-21T17:53:29Z

stdlib/public/core/StringUTF16View.swift

@@ -201,6 +201,17 @@ extension String.UTF16View: BidirectionalCollection {
      return _foreignIndex(i, offsetBy: n)
    }

+    if n.magnitude <= _StringBreadcrumbs.breadcrumbStride {


On average, using breadcrumbs ought to win when |n| > breadcrumbStride/2, right?

Yep, the average cost of using breadcrumbs is breadcrumbStride/2 each time we do it. In this case, we're doing it twice, though, so I expect the cutoff to be doubled. 🤔
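A back-of-the-envelope version of this argument, as a minimal sketch. It assumes breadcrumbs are spaced `stride` UTF-16 code units apart, that endpoints fall uniformly between them, and that a direct walk visits at most |n| scalars. The function names and the value 64 are illustrative assumptions, not taken from the stdlib.

```swift
/// Expected scalars visited when resolving `lookups` endpoints through
/// breadcrumbs, each costing stride / 2 on average under the uniform assumption.
func expectedBreadcrumbVisits(stride: Int, lookups: Int) -> Double {
    Double(lookups) * Double(stride) / 2
}

/// Upper bound on scalars visited by walking directly: every scalar
/// contributes at least one UTF-16 code unit.
func maxDirectVisits(distance n: Int) -> Int {
    abs(n)
}

let assumedStride = 64  // assumed value, for illustration only
let twoLookups = expectedBreadcrumbVisits(stride: assumedStride, lookups: 2)
// Under this model a direct walk is expected to win while |n| stays below
// the break-even point, which equals the full stride when two lookups are needed.
let directWins = Double(maxDirectVisits(distance: 40)) < twoLookups
print(twoLookups, directWins)
```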

lorentey · 2022-12-28T03:42:33Z

@swift-ci benchmark

lorentey · 2022-12-28T03:59:10Z

> How stale is the baseline for the perf tests?

We are building & measuring the baseline on the fly with each benchmark run, so it should be always up to date. Huh, unless we somehow select the wrong commit for the baseline builds, perhaps?

Edit: Hm, maybe: the logs indicate that CI is for some reason rebuilding swiftAST and swiftParse after switching to the PR branch, but this PR only contains stdlib changes. 🤔

Previously we insisted on using breadcrumbs even if we only needed to travel a very short way. This could be as much as ten times slower than the naive algorithm of simply visiting all the Unicode scalars in between the start and the end. (Using breadcrumbs generally means that we need to walk to both endpoints from their nearest breadcrumb, which on average requires walking half the distance between breadcrumbs — and this can mean visiting vastly more Unicode scalars than the ones that are simply lying in between the endpoints themselves.)

… ranges

Instead of calling `_toUTF16Index` twice, call it once and then use `index(_:offsetBy:)` to potentially avoid another breadcrumbs lookup.

lorentey · 2022-12-28T04:23:41Z

In fact, let's see whether the rebuild set changes after rebasing this on top of current main.

@swift-ci benchmark

lorentey · 2022-12-28T04:29:32Z

@swift-ci benchmark

lorentey · 2022-12-28T04:30:07Z

@swift-ci test

lorentey · 2022-12-28T05:29:39Z

This has still rebuilt a lot more source files between main & the PR branch than I'd have expected.

------- Performance (x86_64): -O -------

REGRESSION                                  OLD       NEW       DELTA     RATIO    
StringWithCString2                          0.0       0.002     +200.0%   **0.33x**
Breadcrumbs.UTF16ToIdxRange.longMixed       163.0     264.778   +62.4%    **0.62x (?)**
SumUsingReduce                              66.571    83.222    +25.0%    **0.80x (?)**
NSStringConversion.InlineBuffer.UTF8        786.0     959.0     +22.0%    **0.82x (?)**
Set.isSuperset.Seq.Empty.Int                43.577    50.08     +14.9%    **0.87x (?)**
Set.isDisjoint.Seq.Int.Empty                46.837    53.093    +13.4%    **0.88x (?)**
SortAdjacentIntPyramids                     768.571   868.125   +13.0%    **0.89x (?)**
BridgeString.find.native.longNonASCII       467.0     524.0     +12.2%    **0.89x (?)**
SortIntPyramid                              478.438   532.188   +11.2%    **0.90x (?)**
Breadcrumbs.UTF16ToIdxRange.longASCII       42.923    47.563    +10.8%    **0.90x (?)**

IMPROVEMENT                                 OLD       NEW       DELTA     RATIO    
MapReduceAnyCollection                      127.0     86.583    -31.8%    **1.47x (?)**
Breadcrumbs.CopyUTF16CodeUnits.Mixed        88.68     63.516    -28.4%    **1.40x (?)**
UTF8Decode_InitDecoding                     156.273   132.25    -15.4%    **1.18x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed       429.5     368.333   -14.2%    **1.17x (?)**
Set.isSubset.Int.Empty                      51.192    44.091    -13.9%    **1.16x (?)**
MapReduceClass2                             12.517    10.803    -13.7%    **1.16x (?)**
UTF8Decode_InitFromCustom_contiguous        155.857   134.529   -13.7%    **1.16x (?)**
NormalizedIterator_nonBMPSlowestPrenormal   453.333   403.455   -11.0%    **1.12x (?)**
NormalizedIterator_slowerPrenormal          316.338   286.386   -9.5%     **1.10x (?)**
Breadcrumbs.MutatedIdxToUTF16.Mixed         219.3     198.636   -9.4%     **1.10x (?)**

------- Code size: -O -------

REGRESSION        OLD     NEW     DELTA    RATIO  
Diffing.o         7533    7651    +1.6%    **0.98x**

IMPROVEMENT       OLD     NEW     DELTA    RATIO  
ReduceInto.o      14311   10879   -24.0%   **1.32x**
IndexPathTest.o   8290    7850    -5.3%    **1.06x**
MapReduce.o       27626   26435   -4.3%    **1.05x**
RemoveWhere.o     14948   14581   -2.5%    **1.03x**

------- Performance (x86_64): -Osize -------

REGRESSION                                      OLD        NEW        DELTA    RATIO    
DropFirstArray                                  13.88      26.853     +93.5%   **0.52x (?)**
RemoveWhereMoveInts                             7.92       13.675     +72.7%   **0.58x (?)**
SuffixArray                                     5.216      9.0        +72.5%   **0.58x**
PrefixCountableRangeLazy                        15.72      26.839     +70.7%   **0.59x (?)**
FlattenListLoop                                 991.5      1626.0     +64.0%   **0.61x (?)**
PrefixAnyCollection                             85.267     138.714    +62.7%   **0.61x**
Breadcrumbs.UTF16ToIdxRange.longMixed           163.5      265.375    +62.3%   **0.62x**
Data.init.Sequence.64kB.Count.RE                19.191     29.524     +53.8%   **0.65x**
MapReduceLazyCollectionShort                    31.24      47.905     +53.3%   **0.65x**
Data.init.Sequence.64kB.Count.RE.I              19.429     29.531     +52.0%   **0.66x**
FlattenListFlatMap                              2977.0     4523.0     +51.9%   **0.66x (?)**
StringWithCString2                              0.001      0.002      +50.0%   **0.67x**
PrefixWhileArrayLazy                            26.861     40.261     +49.9%   **0.67x**
Data.append.Sequence.64kB.Count.RE.I            20.009     29.969     +49.8%   **0.67x**
MapReduceLazySequence                           44.091     65.947     +49.6%   **0.67x**
Data.append.Sequence.64kB.Count.RE              20.471     30.0       +46.5%   **0.68x**
DropFirstSequenceLazy                           33.0       47.333     +43.4%   **0.70x**
DropFirstSequence                               33.0       47.318     +43.4%   **0.70x**
SuffixCountableRangeLazy                        4.653      6.365      +36.8%   **0.73x (?)**
RemoveWhereSwapInts                             10.02      13.699     +36.7%   **0.73x (?)**
SuffixAnyCollection                             30.903     41.724     +35.0%   **0.74x**
DropWhileArray                                  25.979     33.667     +29.6%   **0.77x (?)**
DropWhileAnyCollectionLazy                      152.667    197.1      +29.1%   **0.77x (?)**
DropFirstAnySeqCntRangeLazy                     134.333    170.143    +26.7%   **0.79x**
DropFirstAnySeqCRangeIterLazy                   134.385    170.0      +26.5%   **0.79x**
SortAdjacentIntPyramids                         717.5      906.667    +26.4%   **0.79x**
SortIntPyramid                                  431.875    545.714    +26.4%   **0.79x**
DropLastAnyCollection                           35.625     44.5       +24.9%   **0.80x (?)**
PrefixAnySeqCntRange                            107.308    134.0      +24.9%   **0.80x**
DropFirstAnyCollection                          94.375     117.444    +24.4%   **0.80x (?)**
Data.init.Sequence.809B.Count.RE.I              45.923     56.417     +22.9%   **0.81x (?)**
Data.init.Sequence.809B.Count.RE                46.0       56.25      +22.3%   **0.82x (?)**
NSStringConversion.InlineBuffer.UTF8            793.0      963.0      +21.4%   **0.82x (?)**
SequenceAlgosArray                              2077.273   2520.0     +21.3%   **0.82x**
Data.append.Sequence.809B.Count.RE.I            58.87      69.438     +18.0%   **0.85x (?)**
Data.append.Sequence.809B.Count.RE              58.818     69.19      +17.6%   **0.85x (?)**
Set.filter.Int100.20k                           26.354     30.301     +15.0%   **0.87x (?)**
PrefixAnySeqCntRangeLazy                        107.538    123.5      +14.8%   **0.87x (?)**
PrefixAnySeqCRangeIterLazy                      107.571    123.429    +14.7%   **0.87x (?)**
PrefixWhileAnyCollectionLazy                    107.6      122.333    +13.7%   **0.88x (?)**
PrefixWhileAnySeqCRangeIterLazy                 107.643    122.308    +13.6%   **0.88x (?)**
PrefixWhileAnySeqCntRangeLazy                   107.643    122.308    +13.6%   **0.88x (?)**
Set.filter.Int100.16k                           21.652     24.543     +13.4%   **0.88x (?)**
DropFirstCountableRangeLazy                     13.58      15.353     +13.1%   **0.88x (?)**
DataAppendSequence                              6197.297   6989.474   +12.8%   **0.89x (?)**
DropWhileAnyCollection                          119.154    133.692    +12.2%   **0.89x (?)**
Set.isSuperset.Seq.Empty.Int                    45.731     51.192     +11.9%   **0.89x (?)**
Set.filter.Int100.28k                           38.593     42.878     +11.1%   **0.90x (?)**
StringWalk                                      1265.455   1380.69    +9.1%    **0.92x (?)**
PrefixWhileAnyCollection                        152.364    165.3      +8.5%    **0.92x (?)**
Set.intersection.Seq.Int0                       35.464     38.34      +8.1%    **0.92x (?)**

IMPROVEMENT                                     OLD        NEW        DELTA    RATIO    
PrefixSequenceLazy                              53.7       26.903     -49.9%   **2.00x (?)**
PrefixSequence                                  53.688     26.903     -49.9%   **2.00x**
DropFirstCountableRange                         26.87      13.578     -49.5%   **1.98x**
PrefixCountableRange                            26.865     13.711     -49.0%   **1.96x**
DropWhileCountableRange                         26.852     13.908     -48.2%   **1.93x**
DropLastCountableRangeLazy                      9.034      4.733      -47.6%   **1.91x (?)**
PrefixArray                                     26.853     14.096     -47.5%   **1.90x**
DropLastCountableRange                          9.02       4.74       -47.4%   **1.90x**
DropLastArrayLazy                               9.031      4.824      -46.6%   **1.87x (?)**
SuffixCountableRange                            9.034      4.93       -45.4%   **1.83x**
SuffixArrayLazy                                 9.019      5.26       -41.7%   **1.71x**
SumUsingReduceInto                              441.5      281.5      -36.2%   **1.57x**
DropWhileSequenceLazy                           71.667     49.429     -31.0%   **1.45x (?)**
Breadcrumbs.CopyUTF16CodeUnits.Mixed            88.52      63.69      -28.0%   **1.39x**
DropWhileAnySeqCRangeIter                       120.727    94.167     -22.0%   **1.28x (?)**
UTF8Decode_InitFromCustom_noncontiguous         325.833    269.571    -17.3%   **1.21x**
UTF8Decode_InitDecoding                         156.6      131.462    -16.1%   **1.19x (?)**
DropWhileAnySeqCntRange                         111.917    94.071     -15.9%   **1.19x (?)**
UTF8Decode_InitFromCustom_contiguous            155.786    132.0      -15.3%   **1.18x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed           436.0      372.0      -14.7%   **1.17x (?)**
SuffixSequence                                  127.769    110.692    -13.4%   **1.15x (?)**
SuffixSequenceLazy                              128.5      113.125    -12.0%   **1.14x (?)**
UTF8Decode_InitFromCustom_noncontiguous_ascii   733.0      648.667    -11.5%   **1.13x (?)**
NormalizedIterator_emoji                        360.769    322.0      -10.7%   **1.12x (?)**
NormalizedIterator_nonBMPSlowestPrenormal       453.636    406.034    -10.5%   **1.12x (?)**
StaticArray                                     1.91       1.719      -10.0%   **1.11x (?)**

------- Code size: -Osize -------

REGRESSION        OLD     NEW     DELTA   RATIO  
RemoveWhere.o     11852   12330   +4.0%   **0.96x**
Diffing.o         6696    6816    +1.8%   **0.98x**
RandomTree.o      11040   11180   +1.3%   **0.99x**
BufferFill.o      9347    9445    +1.0%   **0.99x**

IMPROVEMENT       OLD     NEW     DELTA   RATIO  
IndexPathTest.o   7256    6988    -3.7%   **1.04x**
ReduceInto.o      7908    7792    -1.5%   **1.01x**
MapReduce.o       20475   20180   -1.4%   **1.01x**

------- Performance (x86_64): -Onone -------

REGRESSION                              OLD       NEW       DELTA    RATIO    
Breadcrumbs.UTF16ToIdxRange.longMixed   377.8     480.0     +27.1%   **0.79x (?)**
BridgeString.find.native.longNonASCII   466.75    523.667   +12.2%   **0.89x (?)**
ObjectiveCBridgeStubDateAccess          2991.0    3237.0    +8.2%    **0.92x (?)**

IMPROVEMENT                             OLD       NEW       DELTA    RATIO    
UTF8Decode_InitDecoding                 165.667   137.75    -16.9%   **1.20x (?)**
UTF8Decode_InitFromCustom_contiguous    165.75    141.4     -14.7%   **1.17x (?)**
Breadcrumbs.CopyUTF16CodeUnits.Mixed    188.364   162.0     -14.0%   **1.16x (?)**
Breadcrumbs.IdxToUTF16Range.longMixed   653.5     592.0     -9.4%    **1.10x (?)**
Breadcrumbs.MutatedUTF16ToIdx.Mixed     230.889   211.714   -8.3%    **1.09x (?)**
Breadcrumbs.MutatedIdxToUTF16.Mixed     233.556   214.8     -8.0%    **1.09x (?)**

lorentey · 2022-12-28T06:01:39Z

Hm; it seems the CI benchmarks are no longer providing a usable signal.

Even if we accept that the timer results are unreliable (given all the (?) marks), these runs should at least be able to reliably measure code size changes in benchmark object files. Unfortunately, the code size results are nonsensical -- given that this PR only touches non-inlinable parts in the stdlib, I don't see how code generation for clients could possibly change by as much as 24%, especially in modules that don't even mention or exercise String's UTF-16 view:

------- Code size: -O -------

REGRESSION        OLD     NEW     DELTA    RATIO  
Diffing.o         7533    7651    +1.6%    **0.98x**

IMPROVEMENT       OLD     NEW     DELTA    RATIO  
ReduceInto.o      14311   10879   -24.0%   **1.32x**
IndexPathTest.o   8290    7850    -5.3%    **1.06x**
MapReduce.o       27626   26435   -4.3%    **1.05x**
RemoveWhere.o     14948   14581   -2.5%    **1.03x**

lorentey · 2022-12-28T23:26:40Z

The code size differences don't seem to reproduce in local benchmark builds -- evidently the "swift-ci benchmark" command is not doing the right thing.

stephentyrone · 2022-12-29T00:11:23Z

Is any of the code in question inlineable? Last time I poked at it, benchmarks didn't properly rebuild in the face of inlineable/transparent code changes.

(I'm pretty sure all of this is behind ABI, but that's the first thing to check.)

lorentey · 2022-12-29T00:17:53Z

No, the changes in this PR only affect non-inlinable functions.

The logs also indicate that the benchmarks are fully rebuilt twice -- once for the base commit (presumably from the head of main), and once for the PR's head: I see two separate build log entries for every benchmark module & optimization level.

lorentey · 2022-12-29T01:31:14Z

Lovely, so build-script can run benchmarks locally, but naturally its local benchmark comparisons are silently failing with no output.

--- check-swift-benchmark-macosx-arm64 ---
+ /opt/homebrew/bin/cmake --build /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64 -- -j10 check-swift-benchmark-macosx-arm64
[1/1][100%][906.330s] cd /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark && /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/bin/Benchmark_Driver run -o O --output-dir /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark/logs --architecture arm64 --swift-repo /Users/klorentey/Swift/swift --independent-samples 3 && /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/bin/Benchmark_Driver run -o Onone --output-dir /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark/logs --swift-repo /Users/klorentey/Swift/swift --architecture arm64 --independent-samples 3 && /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/bin/Benchmark_Driver compare --log-dir /Users/klorentey/Swift/build/Ninja-Release/swift-macosx-arm64/benchmark/logs --swift-repo /Users/klorentey/Swift/swift --compare-script /Users/klorentey/Swift/swift/benchmark/scripts/compare_perf_tests.py
[...]
branch/branch comparison skipped: no previous branch logs
Comparing main/Benchmark_O-arm64-apple-macosx10.9-20221228163616-54aa1055e54.log string-utf16-speedup/Benchmark_O-arm64-apple-macosx10.9-20221228165658-9e11f382680.log ...
Comparing main/Benchmark_Onone-arm64-apple-macosx10.9-20221228164918-54aa1055e54.log string-utf16-speedup/Benchmark_Onone-arm64-apple-macosx10.9-20221228170958-9e11f382680.log ...
-- check-swift-benchmark-macosx-arm64 finished --
--- Finished tests for swift ---

We need a reliable way to track the performance of the stdlib. This is not it.

We commonly start from the `startIndex`, in which case `_nativeGetOffset` is essentially free. Consider this case when calculating the threshold for using breadcrumbs.

Speed up conversion between UTF-16 offset ranges and string index ranges, by carefully switching between absolute and relative index calculations, depending on the distance we need to go. It is a surprisingly tricky puzzle to do this correctly while avoiding redundant calculations. Offset ranges within substrings add the additional complication of having to bias offset values with the absolute offset of the substring’s start index.
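The idea of resolving one bound absolutely and the other relative to it can be sketched with public API only. This is an illustration, not the stdlib's internal code; `indexRange(in:utf16Offsets:)` and `utf16Offsets(in:of:)` are hypothetical helper names. Going through `Substring.utf16` keeps the offsets relative to the substring's start, which corresponds to the biasing mentioned above.

```swift
/// Converts UTF-16 offsets (relative to `sub`) into an index range.
func indexRange(in sub: Substring, utf16Offsets offsets: Range<Int>) -> Range<String.Index> {
    let view = sub.utf16
    // One lookup measured from the substring's start...
    let lower = view.index(view.startIndex, offsetBy: offsets.lowerBound)
    // ...then a relative step, which is short whenever the range is short.
    let upper = view.index(lower, offsetBy: offsets.count)
    return lower..<upper
}

/// Converts an index range back into UTF-16 offsets relative to `sub`.
func utf16Offsets(in sub: Substring, of range: Range<String.Index>) -> Range<Int> {
    let view = sub.utf16
    let lower = view.distance(from: view.startIndex, to: range.lowerBound)
    let count = view.distance(from: range.lowerBound, to: range.upperBound)
    return lower..<(lower + count)
}

let whole = "ab😀cd"
let sub = whole.dropFirst()  // "b😀cd"
let indices = indexRange(in: sub, utf16Offsets: 1..<3)
print(sub[indices])
print(utf16Offsets(in: sub, of: indices))
```

Whether the relative step actually avoids a second breadcrumbs lookup depends on the distance heuristics discussed in this PR; the sketch only shows the shape of the calculation.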

lorentey · 2022-12-29T04:12:51Z

Evidently compare_perf_tests.py only produces output if the benchmark results are saved in the newer JSON format, which for some reason build-script -B does not use. (cc @tbkka)

Local benchmark results, including the new benchmarks in #62783:

Regression (30)

TEST	OLD	NEW	DELTA	RATIO
Breadcrumbs.IdxToUTF16.longASCII	59.04680187207488	84.07816907816908	+42.4%	0.70x
ArrayAppendFromGeneric	198.4879614767255	269.80059663997486	+35.9%	0.74x (?)
Data.append.Sequence.64kB.Count.RE	1.8508014796547472	2.2590885816692268	+22.0%	0.82x
StringDistance.utf16.ascii	6.20728	7.45696	+20.1%	0.83x
EqualSubstringSubstringGenericEquatable	19.57496532461857	22.51707220486646	+15.0%	0.87x
EqualSubstringSubstring	19.576396576396576	22.516187742816143	+15.0%	0.87x
EqualSubstringString	19.577644354087894	22.50591807101685	+15.0%	0.87x
EqualStringSubstring	19.578646718803128	22.500672	+14.9%	0.87x
LessSubstringSubstring	19.56947856947857	22.41163776292868	+14.5%	0.87x
LessSubstringSubstringGenericComparable	19.57826	22.387092322553936	+14.3%	0.87x
DataAppendDataSmallToLarge	14425.007425007425	16477.34988844699	+14.2%	0.88x (?)
MapReduceLazyCollectionShort	19.98152	22.214637287823727	+11.2%	0.90x
NSError	65.3837619334308	72.25508961197146	+10.5%	0.90x
DropWhileAnySeqCntRangeLazy	42.01219007314044	46.4047432379459	+10.5%	0.91x
DropWhileAnySeqCRangeIterLazy	42.025872310467726	46.3759	+10.4%	0.91x
DataSubscriptMedium	31.000099000099	34.14411514411515	+10.1%	0.91x
Breadcrumbs.MutatedIdxToUTF16.ASCII	2.047295047295047	2.249588	+9.9%	0.91x
SevenBoom	381.6302146618064	412.55905861456483	+8.1%	0.93x
KeyPathMutatingGetset	108.8859627016129	117.5388401170524	+7.9%	0.93x
DataAppendArray	1521.8304	1639.819037104594	+7.8%	0.93x
StringFromLongWholeSubstring	2.029816	2.183003183003183	+7.5%	0.93x
PopFrontArrayGeneric	1115.5941237649506	1190.6787844362398	+6.7%	0.94x (?)
ParseInt.UInt64.Hex	177.93763389669127	189.75278502014694	+6.6%	0.94x (?)
Breadcrumbs.MutatedUTF16ToIdx.ASCII	1.9687158748634994	2.0922980922980923	+6.3%	0.94x
ArrayAppendGenericStructs	428.54389438943895	455.2130076682761	+6.2%	0.94x (?)
KeyPathNestedClasses	107.853875	114.2388933226922	+5.9%	0.94x
RemoveWhereFilterInts	22.514584514584513	23.795157951579515	+5.7%	0.95x (?)
ObjectiveCBridgeStubToNSString	886.6678867740361	937.0666270666271	+5.7%	0.95x
Set.filter.Int50.20k	154.8442532942899	163.24413720686033	+5.4%	0.95x
StringDistance.characters.ascii	79.83466135458167	84.15054403264196	+5.4%	0.95x

Improvement (32)

TEST	OLD	NEW	DELTA	RATIO
StringDistance.utf16.mixed	3078.7476923076924	68.6646025345622	-97.8%	44.84x
ArrayAppendToGeneric	630.865521638041	192.96312593505235	-69.4%	3.27x
Breadcrumbs.UTF16ToIdxRange.longASCII	18.023405046810094	8.73442	-51.5%	2.06x
Breadcrumbs.IdxToUTF16Range.longASCII	21.808675	10.844276754214034	-50.3%	2.01x
DropWhileAnySeqCRangeIter	38.456924456924455	19.289611475337704	-49.8%	1.99x
DropWhileAnySeqCntRange	38.38824538824539	19.279018069198763	-49.8%	1.99x
Breadcrumbs.CopyUTF16CodeUnits.ASCII	15.17908379083791	9.088257176514354	-40.1%	1.67x
Breadcrumbs.CopyUTF16CodeUnits.Mixed	56.37827377112631	33.78485670914026	-40.1%	1.67x
Breadcrumbs.IdxToUTF16Range.longMixed	392.5728088336784	244.31128848346637	-37.8%	1.61x
ArrayAppendSequence	804.7557512070434	576.1588541666666	-28.4%	1.40x
RecursiveOwnedParameter	91.17581340650725	71.49056411302662	-21.6%	1.28x
DataReplaceLarge	14340.860215053763	11252.50505050505	-21.5%	1.27x
MapReduceAnyCollection	96.26754118364856	77.65943919139224	-19.3%	1.24x
ArrayAppendLazyMap	1286.3860693034653	1042.9625918503675	-18.9%	1.23x
Diffing.Myers.Similar	170.30787001386108	140.04724702380952	-17.8%	1.22x
ArrayAppendStrings	1748.4741097539995	1495.2138654794085	-14.5%	1.17x
ObjectiveCBridgeStubDataAppend	1283.2317397583033	1114.6469092938187	-13.1%	1.15x
ObserverUnappliedMethod	408.359375	354.93889716840533	-13.1%	1.15x
DataAppendDataMediumToLarge	16704.413619167717	14531.748998664887	-13.0%	1.15x (?)
ObserverForwarderStruct	286.27021464275214	254.11041990668738	-11.2%	1.13x (?)
DropWhileAnyCollectionLazy	46.40428425705542	42.343245967741936	-8.8%	1.10x
BufferFillFromSlice	15.37899437899438	14.13347287071325	-8.1%	1.09x
DataAppendDataLargeToMedium	16164.151164151164	14948.926403835363	-7.5%	1.08x (?)
Breadcrumbs.UTF16ToIdxRange.longMixed	107.66192532385065	100.14149162143026	-7.0%	1.08x
DictionaryOfAnyHashableStrings_insert	1815.0216284987278	1690.962402856543	-6.8%	1.07x
DataCreateMediumArray	556.3105963105963	520.218401747214	-6.5%	1.07x
DataCopyBytesSmall	49.60020800832033	46.539434104154914	-6.2%	1.07x
DataReplaceMedium	1948.034	1831.911631911632	-6.0%	1.06x
ArrayAppendArrayOfInt	210.16216216216213	197.6499495967742	-6.0%	1.06x
StringBuilderLong	524.0519691433211	493.8352679281285	-5.8%	1.06x
DictionaryOfAnyHashableStrings_lookup	1209.2004	1145.8896942351078	-5.2%	1.06x
Data.init.Sequence.64kB.Count	3.4351065414990316	3.2685272685272686	-4.8%	1.05x
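
For reading the tables above: the DELTA and RATIO columns appear consistent with the relative change of NEW against OLD and with OLD divided by NEW, respectively. The sketch below checks this against the `StringDistance.utf16.mixed` row. The formulas are inferred from the reported numbers, not taken from the benchmark driver's source, and the function names are hypothetical.

```swift
// Assumed formulas, inferred from the table rows (not from the benchmark driver).

/// DELTA column: relative change of NEW versus OLD, in percent.
func deltaPercent(old: Double, new: Double) -> Double {
  (new - old) / old * 100
}

/// RATIO column: how many times faster NEW is than OLD.
func speedRatio(old: Double, new: Double) -> Double {
  old / new
}

// StringDistance.utf16.mixed, copied from the Improvement table.
let oldTime = 3078.7476923076924
let newTime = 68.6646025345622
print(deltaPercent(old: oldTime, new: newTime))  // roughly -97.8
print(speedRatio(old: oldTime, new: newTime))    // roughly 44.84
```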

stephentyrone · 2022-12-29T14:47:25Z

I'd like to understand what's going on with Breadcrumbs.IdxToUTF16.longASCII, but I'm happy to take this in the meantime.

stdlib/public/core/StringBridge.swift

stephentyrone · 2022-12-29T14:51:13Z

stdlib/public/core/StringUTF16View.swift

@@ -201,6 +204,14 @@ extension String.UTF16View: BidirectionalCollection {
      return _foreignIndex(i, offsetBy: n)
    }

+    let threshold = (
+      i == startIndex ? _breadcrumbStride / 2 : _breadcrumbStride)
+    if n.magnitude < threshold, !_guts.isASCII {


is the sense of this condition right? Isn't direct computation always faster when _guts.isASCII is true? What am I misreading?

The _nativeGetIndex/_nativeGetOffset calls below have O(1) ASCII fast paths, so calling the advancing loop in _index(_:offsetBy:) would make that case worse. (I used to have a special case for _guts.isASCII inside this branch, but it seems simpler to just let the original code take care of it.)
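
As a rough illustration of the decision discussed here, the following sketch mirrors the shape of the condition in the diff using only plain values. The function name, its parameters, and the stride value of 64 are assumptions made for this example; the real check lives inside `String.UTF16View` and uses internal state.

```swift
// Hypothetical stand-in for the standard library's internal breadcrumb stride;
// the value 64 is an assumption made for this sketch.
let assumedBreadcrumbStride = 64

/// Returns true when walking the string directly is expected to be cheaper
/// than going through breadcrumbs.
func shouldWalkDirectly(
  offsetBy n: Int,
  fromStartIndex: Bool,
  isASCII: Bool,
  stride: Int = assumedBreadcrumbStride
) -> Bool {
  // ASCII strings already convert between offsets and indices in O(1),
  // so they are left to the original code path.
  if isASCII { return false }
  // Starting from startIndex makes one of the two breadcrumb lookups
  // essentially free, so breadcrumbs pay off sooner: halve the threshold.
  let threshold = fromStartIndex ? stride / 2 : stride
  return n.magnitude < UInt(threshold)
}

print(shouldWalkDirectly(offsetBy: 10, fromStartIndex: false, isASCII: false))
print(shouldWalkDirectly(offsetBy: 40, fromStartIndex: true, isASCII: false))
```

Checking `isASCII` first keeps the threshold arithmetic off the ASCII path entirely.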


lorentey · 2022-12-30T00:52:11Z

The generated code looked reasonable enough, but the Breadcrumbs.IdxToUTF16.longASCII and StringDistance.utf16.ascii regressions may have been caused by the extra work to calculate & check the breadcrumb threshold. Moving the ASCII case into an up-front check eliminated the regressions.

Frustratingly, though, this resulted in 130-135% regressions for StringDistance.scalars.ascii, StringDistance.utf8.ascii, and StringDistance.characters.ascii, whose code paths aren't even affected by this PR. Re-running the baseline benchmarks made these disappear, so perhaps my machine just doesn't feel like doing ASCII workloads today. ¯\_(ツ)_/¯

(FWIW, the new -20% improvement to Breadcrumbs.IdxToUTF16.longASCII remained the same with both baselines.)

Regression (14)

TEST	OLD	NEW	DELTA	RATIO
ArrayAppendLazyMap	869.5159515951594	1095.4270670826832	+26.0%	0.79x
ObjectiveCBridgeStubDataAppend	1127.2771254941295	1338.6282086668507	+18.7%	0.84x
DataAppendDataLargeToMedium	9997.725302187995	11734.009360374415	+17.4%	0.85x
ObserverForwarderStruct	278.77053140096615	324.9079453972698	+16.6%	0.86x
RawBufferCopyBytes	13.382943829438295	14.951712	+11.7%	0.90x
RemoveWhereFilterStrings	149.8489108910891	163.7511528064875	+9.3%	0.92x (?)
ArrayAppendAscii	1717.6146839635662	1845.3976	+7.4%	0.93x
ArrayInitFromSlice	187.10742574257426	199.95243282498186	+6.9%	0.94x
ArrayAppendLatin1Substring	9303.454016298021	9873.432494279177	+6.1%	0.94x (?)
Breadcrumbs.UTF16ToIdx.longASCII	54.91656766389459	58.213640922768306	+6.0%	0.94x
String.replaceSubrange.String	8.290125160500642	8.78018678018678	+5.9%	0.94x
KeyPathNestedClasses	110.76098552197104	116.9216559675095	+5.6%	0.95x
Data.init.Sequence.809B.Count.RE	12.5098685888173	13.177822956052516	+5.3%	0.95x
SortStringsUnicode	1584.103471520054	1667.9918323863637	+5.3%	0.95x

Improvement (46)

TEST	OLD	NEW	DELTA	RATIO
StringDistance.utf16.mixed	3104.5216049382716	70.24611175115207	-97.7%	44.19x
Breadcrumbs.IdxToUTF16Range.longASCII	21.951630709784258	5.664139664139664	-74.2%	3.88x
ArrayAppendFromGeneric	560.7700477960701	194.43880428652	-65.3%	2.88x
StringDistance.utf16.ascii	14.596465964659647	7.646499646499646	-47.6%	1.91x
Breadcrumbs.CopyUTF16CodeUnits.Mixed	57.14503219070819	34.69724770642202	-39.3%	1.65x
Breadcrumbs.IdxToUTF16Range.longMixed	394.3499653499654	245.21128259712614	-37.8%	1.61x
ArraySetElement	353.1861471861472	221.96106275767292	-37.2%	1.59x
Breadcrumbs.UTF16ToIdxRange.longASCII	18.58256	12.890522591497012	-30.6%	1.44x
ArrayAppendSequence	812.8057094162762	598.9427135678392	-26.3%	1.36x
StringEqualPointerComparison	125.63169642857143	93.95024181547619	-25.2%	1.34x
Breadcrumbs.CopyUTF16CodeUnits.ASCII	15.385172466552199	11.926485411883295	-22.5%	1.29x
Breadcrumbs.IdxToUTF16.longASCII	59.73165589935396	47.25780321849012	-20.9%	1.26x
MapReduceAnyCollection	96.97548926467795	77.6663849923948	-19.9%	1.25x
DataToStringEmpty	546.2048292173148	442.1126845073803	-19.1%	1.24x
ArrayAppendStrings	1811.895657809462	1480.6139111320458	-18.3%	1.22x
EqualStringSubstring	22.844157324719742	19.583134573594105	-14.3%	1.17x
LessSubstringSubstring	22.67857307143843	19.56976	-13.7%	1.16x
EqualSubstringSubstring	22.716634716634715	19.616857869725916	-13.6%	1.16x
EqualSubstringSubstringGenericEquatable	22.715052580630967	19.61774	-13.6%	1.16x
LessSubstringSubstringGenericComparable	22.654744928469572	19.58526	-13.5%	1.16x
EqualSubstringString	22.614241685450114	19.58864	-13.4%	1.15x
DataAppendDataMediumToLarge	10306.288570408527	9105.70152543356	-11.6%	1.13x (?)
Breadcrumbs.UTF16ToIdxRange.longMixed	110.3634699853587	99.31144335825186	-10.0%	1.11x
DataToStringSmall	1081.4801036816589	979.1837499999999	-9.5%	1.10x
DataAppendDataLargeToSmall	9846.957033054727	8925.952653348588	-9.4%	1.10x (?)
DataReplaceLarge	10516.208851100937	9537.983674334782	-9.3%	1.10x (?)
Prims.NonStrongRef.UnownedUnsafe	107.8287087273291	97.86537915364953	-9.2%	1.10x
FindString.Loop1.Substring	273.5505780346821	249.09390125847048	-8.9%	1.10x
Prims.NonStrongRef.UnownedUnsafe.Closure	107.61694595232476	98.0281887892861	-8.9%	1.10x
RemoveWhereQuadraticInts	680.3831026948288	619.997383911053	-8.9%	1.10x
ArrayAppendToGeneric	635.9151156232374	579.7549980399843	-8.8%	1.10x
ParseInt.UInt64.Hex	213.81696428571428	196.3216401990993	-8.2%	1.09x
Data.init.Sequence.64kB.Count.RE	2.205452821811287	2.0374790749581497	-7.6%	1.08x
DataToStringMedium	1667.6226581891128	1550.8512	-7.0%	1.08x
Diffing.Similar	163.00455486542444	152.12760952660983	-6.7%	1.07x
Set.subtracting.Seq.Empty.Int	97.17290706605223	90.76932769327694	-6.6%	1.07x
DataCopyBytesSmall	50.27175	47.0286	-6.5%	1.07x
Diffing.Myers.Similar	154.30286771507863	144.3515050959943	-6.4%	1.07x
LineSink.scalars.alpha	34.33507871204651	32.155296	-6.3%	1.07x
ArrayAppendUTF16Substring	9867.688258064516	9247.880184331798	-6.3%	1.07x
FlattenListFlatMap	4848.395721925133	4557.5641025641025	-6.0%	1.06x
Diffing.Pangrams	1592.9009584664536	1498.0873493975903	-6.0%	1.06x
Set.subtracting.Seq.Int.Empty	103.92119901112484	97.8462	-5.8%	1.06x
Diffing.ReversedLorem	563.8752107925801	532.61899503037	-5.5%	1.06x
Diffing.ReversedAlphabets	115.1374007936508	109.21288851554063	-5.1%	1.05x
Diffing.Disparate	94.24801315428884	89.5383295194508	-5.0%	1.05x

lorentey · 2022-12-30T01:01:33Z

@swift-ci test

stephentyrone · 2022-12-30T02:49:24Z

OK. I'm happy with taking this once we get a clean test run at this point.


lorentey · 2023-01-02T05:26:24Z

It turns out we did not have proper test coverage for String.UTF16.distance(from:to:) and .index(_:offsetBy:). 😨

The last commits do not materially change the benchmark results. The one exception is the ASCII StringDistance tasks, which uniformly slowed down again by a factor of 10-20 or so, in both the baseline & PR measurements. My top guess is that we may not be properly configuring the QoS level of the benchmark process, so we get unstable/inconsistent scheduling/affinity. Anyway, the measurements seem to be stable across short timespans, so the relative differences seem valid.

Regression (17)

TEST	OLD	NEW	DELTA	RATIO
ExclusivityGlobal	0.0	0.096899	+9689.9%	0.01x
ObjectiveCBridgeASCIIStringFromFile	0.0	0.004024	+402.4%	0.20x
ArrayAppendGenericStructs	391.9276927692769	1389.1610347238407	+254.4%	0.28x
RemoveWhereFilterStrings	141.9654919236417	182.38476190476192	+28.5%	0.78x
Data.append.Sequence.64kB.Count.RE	2.201877070298123	2.7967163534240655	+27.0%	0.79x
ArrayAppendStrings	1468.2893054963783	1817.191887675507	+23.8%	0.81x
ArrayAppendToGeneric	600.2321778940484	728.305424099017	+21.3%	0.82x
PrefixWhileArray	48.08232131714107	57.8755037510075	+20.4%	0.83x
ObjectiveCBridgeStubDataAppend	1120.3197158081705	1318.650780872645	+17.7%	0.85x
Data.append.Sequence.64kB.Count0	162.36250236250237	190.43015963511974	+17.3%	0.85x
ProtocolDispatch	186.39279869067104	217.65001380071763	+16.8%	0.86x
BufferFillFromSlice	13.994899872496813	15.785969334173492	+12.8%	0.89x (?)
DataAppendBytesMedium	1525.1499210941608	1695.712749946248	+11.2%	0.90x
DataAppendDataMediumToMedium	1637.6751721344676	1781.0147849462364	+8.8%	0.92x
Data.init.Sequence.511B.Count.I	15.715491318265594	17.030176565008027	+8.4%	0.92x
MapReduceClass2	7.854055416221665	8.330853323413294	+6.1%	0.94x (?)
String.replaceSubrange.String	8.23093200846444	8.679880439043512	+5.5%	0.95x

Improvement (63)

TEST	OLD	NEW	DELTA	RATIO
StringDistance.utf16.mixed	3084.3258785942494	67.40059642147118	-97.8%	45.76x
SubstringRemoveLast1	0.038172038172038174	0.0	-97.4%	39.17x
Breadcrumbs.IdxToUTF16Range.longASCII	21.762324197187155	4.983150983150983	-77.1%	4.37x
StringDistance.utf16.ascii	267.38345070422537	67.98451100870456	-74.6%	3.93x
Breadcrumbs.IdxToUTF16.longASCII	59.16057134971018	27.98367957566897	-52.7%	2.11x
Breadcrumbs.CopyUTF16CodeUnits.Mixed	55.945609945609945	33.54770154770155	-40.0%	1.67x
Breadcrumbs.IdxToUTF16Range.longMixed	391.2633033863165	243.45650999459752	-37.8%	1.61x
ArrayAppendSequence	753.0925000000001	492.6215621562156	-34.6%	1.53x (?)
Breadcrumbs.UTF16ToIdxRange.longASCII	18.345242761942096	12.442959657757946	-32.2%	1.47x
DataToStringEmpty	569.0881976991905	420.6482706482706	-26.1%	1.35x
DataAppendDataMediumToLarge	16161.067832034962	12075.701839303001	-25.3%	1.34x
StringEqualPointerComparison	124.20721671238658	93.23564615672339	-24.9%	1.33x
Breadcrumbs.CopyUTF16CodeUnits.ASCII	15.173112558013022	11.411979119791198	-24.8%	1.33x
ObjectiveCBridgeStubDateAccess	157.60916442048517	124.43265380414998	-21.0%	1.27x
MapReduceAnyCollection	96.22638999733972	77.74146649810366	-19.2%	1.24x
Breadcrumbs.MutatedIdxToUTF16.ASCII	2.022464	1.665037665037665	-17.7%	1.21x
ArrayOfPOD	350.18941798941796	289.24116424116426	-17.4%	1.21x
DataToStringSmall	1117.1036275285344	947.758556547619	-15.2%	1.18x
EqualStringSubstring	22.663642663642662	19.460661662496523	-14.1%	1.16x
LessSubstringSubstringGenericComparable	22.557711557711556	19.41238729909839	-13.9%	1.16x
LessSubstringSubstring	22.494775	19.416030099190515	-13.7%	1.16x
EqualSubstringSubstring	22.5104643557881	19.457371643900142	-13.6%	1.16x
EqualSubstringString	22.45770080044162	19.41766	-13.5%	1.16x
EqualSubstringSubstringGenericEquatable	22.441564896336267	19.443331443331445	-13.4%	1.15x
Diffing.Similar	175.83179723502303	153.80505693881491	-12.5%	1.14x
Breadcrumbs.UTF16ToIdxRange.longMixed	107.91	95.26484918793504	-11.7%	1.13x
Diffing.Pangrams	1708.518450184502	1509.771285475793	-11.6%	1.13x
RemoveWhereFilterInts	25.285674855092466	22.59664	-10.6%	1.12x
DataToStringMedium	1689.2884645002312	1521.499488976493	-9.9%	1.11x
CharIteration_utf16_unicodeScalars	1893.7649880095923	1707.4799065154193	-9.8%	1.11x
DataAppendDataSmallToMedium	1551.1783517835179	1401.3136213136213	-9.7%	1.11x
NSStringConversion.MutableCopy.Rebridge.UTF8	234.9656946826758	213.0114521300962	-9.3%	1.10x
Diffing.Myers.Similar	157.1981512206684	143.80248265790433	-8.5%	1.09x
MapReduceClassShort2	52.74447447924948	48.408788351534064	-8.2%	1.09x
Diffing.ReversedLorem	590.7385670731708	544.2202312138728	-7.9%	1.09x
OpenClose	46.94776558212466	43.40500381000762	-7.5%	1.08x
Prims.NonStrongRef.UnownedUnsafe	103.13078976291412	95.8390214436726	-7.1%	1.08x
Prims.NonStrongRef.UnownedUnsafe.Closure	102.9244159413651	95.97832444520901	-6.7%	1.07x
Breadcrumbs.IdxToUTF16.longMixed	740.9079878665318	691.5965965965966	-6.7%	1.07x
MapReduceShortString	5.3356083356083355	4.98558648111332	-6.6%	1.07x
UTF8Decode_InitFromBytes_ascii_as_ascii	258.40736607142856	242.27836611195158	-6.2%	1.07x
Data.init.Sequence.809B.Count.RE	13.232027176489177	12.4134943875061	-6.2%	1.07x
DataToStringLargeUnicode	2340.9364480261142	2198.406374501992	-6.1%	1.06x
Dict.CopyKeyValue.16k	653.0077101002313	613.7950949367089	-6.0%	1.06x (?)
ArrayPlusEqualSingleElementCollection	382.17956	360.44699717855707	-5.7%	1.06x (?)
Data.append.Sequence.64kB.Count.I	3.5885543295305777	3.386308	-5.6%	1.06x
NSStringConversion.MutableCopy.UTF8	310.12393767705385	292.7287784679089	-5.6%	1.06x
Data.init.Sequence.809B.Count.RE.I	13.14628136200717	12.410035250881272	-5.6%	1.06x
DataAppendArray	1631.0764430577224	1541.2030258662762	-5.5%	1.06x
LineSink.scalars.alpha	33.90805362080431	32.03970203970204	-5.5%	1.06x
Dict.CopyKeyValue.20k	771.5848506919156	729.1044650379107	-5.5%	1.06x
NSStringConversion.MutableCopy.Rebridge.LongUTF8	205.6728742955158	194.56545454545454	-5.4%	1.06x
NSStringConversion.MutableCopy.Rebridge.Medium	273.20762398223536	258.5820895522388	-5.4%	1.06x
ArrayAppendFromGeneric	197.9175101214575	187.41782868525894	-5.3%	1.06x (?)
SIMDReduce.Int8x64.Cast	57.92093250733616	54.88939828080229	-5.2%	1.06x
Data.append.Sequence.809B.Count.RE	23.575889106692802	22.35225522552255	-5.2%	1.05x
Diffing.PangramToAlphabet	681.0186170212766	646.2713414634146	-5.1%	1.05x
Set.filter.Int50.20k	165.60588235294117	157.25940594059406	-5.0%	1.05x
SetIntersectionBox25	96.7695007800312	91.91776746400095	-5.0%	1.05x
Prims.NonStrongRef.UnownedSafe.Closure	251.12207527975585	238.57288732394366	-5.0%	1.05x
DropWhileAnySequenceLazy	510.15220048899755	484.6727073036793	-5.0%	1.05x
DataReplaceLarge	13504.339473386824	12838.700114025085	-4.9%	1.05x (?)
MapReduceAnyCollectionShort	528.2943416757346	502.3500588004704	-4.9%	1.05x

lorentey · 2023-01-02T05:26:40Z

@swift-ci test

…ithms [Bidirectional]Collection’s default index manipulation methods (as well as _utf16Distance) do not expect to be given unreachable indices, and they tend to fail when operating on them. Round indices down to the nearest scalar boundary before calling these.

lorentey · 2023-01-04T00:12:42Z

@swift-ci test

stephentyrone · 2023-01-04T00:40:04Z

let's clone this for 5.8 too, once you're ready.

lorentey · 2023-01-04T06:18:43Z

Failed Tests (16):
  Swift(macosx-x86_64) :: bindings-build-record.swift
  Swift(macosx-x86_64) :: check-interface-implementation-fine.swift
  Swift(macosx-x86_64) :: crash-added-fine.swift
  Swift(macosx-x86_64) :: crash-simple-fine.swift
  Swift(macosx-x86_64) :: dependencies-preservation-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-arguments-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-conflicting-arguments-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-inputs-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-malformed-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-mutual-fine.swift
  Swift(macosx-x86_64) :: driver-show-incremental-swift-version-fine.swift
  Swift(macosx-x86_64) :: fail-added-fine.swift
  Swift(macosx-x86_64) :: fail-chained-fine.swift
  Swift(macosx-x86_64) :: fail-interface-hash-fine.swift
  Swift(macosx-x86_64) :: fail-simple-fine.swift
  Swift(macosx-x86_64) :: independent-fine.swift

lorentey · 2023-01-04T06:19:06Z

@swift-ci test macOS platform

lorentey · 2023-01-04T19:17:32Z

@swift-ci test macOS platform

lorentey · 2023-01-05T00:19:47Z

java.nio.file.FileSystemException: /Users/ec2-user/jenkins/workspace/swift-PR-macos@tmp/durable-45f8247f: No space left on device

lorentey · 2023-01-05T00:21:52Z

@swift-ci smoke test macOS platform

This is a wild guess at what might be causing our persistent, random String failures on the main branch:

```
Swift(macosx-x86_64) :: Prototypes/CollectionTransformers.swift
Swift(macosx-x86_64) :: stdlib/NSSlowString.swift
Swift(macosx-x86_64) :: stdlib/NSStringAPI.swift
Swift(macosx-x86_64) :: stdlib/StringIndex.swift
Swift-validation(macosx-x86_64) :: stdlib/String.swift
Swift-validation(macosx-x86_64) :: stdlib/StringBreadcrumbs.swift
Swift-validation(macosx-x86_64) :: stdlib/StringUTF8.swift
```

FWIW, it appears this is *not* caused by swiftlang#62717: that change has also landed on release/5.8, and I haven’t seen these issues on that branch.

Our atomic breadcrumbs initialization vs its non-atomic loading gives me an uneasy feeling that this may in fact be a long standing synchronization issue that is only now causing problems (for whatever reason).

I am unable to reproduce these issues locally, so this guess may be (and probably is) wildly off the mark, but this PR is likely to be a good idea anyway, if only to rule out this possibility.

rdar://104751936

(cherry picked from commit 73f349c)

lorentey requested a review from Catfish-Man December 21, 2022 03:11

stephentyrone reviewed Dec 21, 2022


lorentey force-pushed the string-utf16-speedup branch from 8e85017 to 3b340dc Compare December 28, 2022 03:09

lorentey mentioned this pull request Dec 28, 2022

[benchmark] Add some distance(from:to:) benchmarks for String views #62783

Merged

lorentey force-pushed the string-utf16-speedup branch from cf3551c to a8ab24c Compare December 28, 2022 04:20

lorentey added 4 commits December 27, 2022 20:22

[stdlib] Simplify breadcrumbs avoidance paths in String.UTF16View

f3a9305

[stdlib] StringProtocol._toUTF16Indices: Speed up conversion of short…

2423b8b

… ranges Instead of calling `_toUTF16Index` twice, call it once and then use `index(_:offsetBy:)` to potentially avoid another breadcrumbs lookup.
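
A minimal sketch of the idea in this commit message, written against the public `String.UTF16View` API instead of the internal `_toUTF16Index` helper. The function name `utf16IndexRange(in:forOffsets:)` is hypothetical. Only the lower bound is resolved from the start of the string; the upper bound is reached by a relative step from the lower bound, which is the step that can skip a second breadcrumb lookup when the range is short.

```swift
/// Converts a range of UTF-16 offsets into a range of string indices.
/// Hypothetical helper for illustration; not the standard library's code.
func utf16IndexRange(in s: String, forOffsets offsets: Range<Int>) -> Range<String.Index> {
  let utf16 = s.utf16
  // One absolute conversion for the lower bound...
  let lower = utf16.index(utf16.startIndex, offsetBy: offsets.lowerBound)
  // ...then a relative step for the upper bound, not a second absolute conversion.
  let upper = utf16.index(lower, offsetBy: offsets.count)
  return lower..<upper
}

let text = "na\u{EF}ve \u{1F600} text"
let range = utf16IndexRange(in: text, forOffsets: 9..<13)
print(text[range])
```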

[stdlib] Breadcrumbs are spaced in UTF-16 code units, not UTF-8

6fee1b3

lorentey force-pushed the string-utf16-speedup branch from a8ab24c to 6fee1b3 Compare December 28, 2022 04:22

lorentey added 2 commits December 28, 2022 20:07

[stdlib] String.UTF16View: Rework thresholds for relative indexing

ec35728

We commonly start from the `startIndex`, in which case `_nativeGetOffset` is essentially free. Consider this case when calculating the threshold for using breadcrumbs.

lorentey force-pushed the string-utf16-speedup branch from b7d1174 to d00f8ed Compare December 29, 2022 04:08

stephentyrone reviewed Dec 29, 2022


lorentey added 2 commits December 29, 2022 13:18

[stdlib] Remove @_specialize attributes obsoleted by explicit type …

7d89d62

…checks

[stdlib] String.UTF16View: Tweak ASCII paths

fce428e

stephentyrone approved these changes Dec 30, 2022


lorentey added 2 commits January 1, 2023 20:58

[test] String.UTF16View: Add some basic collection tests

051f9ed

Evidently we did not have any tests that exercised `distance(from:to:)` and `index(_:offsetBy:)`. :-O

[stdlib] Fix String.UTF16View.distance(from:to:)

5d354ce

- Align input indices to scalar boundaries - Don’t pass decreasing indices to _utf16Distance
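
The second bullet can be illustrated with a small sketch: a forward-only counting helper is only ever called with increasing indices, and the caller swaps the arguments and negates the result when the distance is negative. This uses the public `String.UTF16View` API; `signedUTF16Distance` and `forwardCount` are hypothetical names, and the sketch does not attempt the scalar-boundary alignment from the first bullet.

```swift
/// Signed UTF-16 distance built on a helper that only walks forward.
/// Illustrative only; the standard library's `_utf16Distance` is internal.
func signedUTF16Distance(
  in s: String, from start: String.Index, to end: String.Index
) -> Int {
  let utf16 = s.utf16

  // Forward-only helper: requires lower <= upper.
  func forwardCount(_ lower: String.Index, _ upper: String.Index) -> Int {
    precondition(lower <= upper, "forwardCount needs increasing indices")
    var count = 0
    var i = lower
    while i < upper {
      utf16.formIndex(after: &i)
      count += 1
    }
    return count
  }

  // Never hand decreasing indices to the helper: swap and negate.
  return start <= end ? forwardCount(start, end) : -forwardCount(end, start)
}

let greeting = "a\u{1F600}b"
print(signedUTF16Distance(in: greeting, from: greeting.endIndex, to: greeting.startIndex))
```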

lorentey force-pushed the string-utf16-speedup branch from c564b1f to 5d354ce Compare January 2, 2023 04:58

lorentey added 2 commits January 3, 2023 16:08

[test] Cleanup

cd55016

lorentey mentioned this pull request Jan 4, 2023

[5.8][stdlib] Speed up short UTF-16 distance calculations #62823

Merged

lorentey merged commit 4ffc5fe into swiftlang:main Jan 5, 2023

lorentey deleted the string-utf16-speedup branch January 5, 2023 05:20

lorentey mentioned this pull request Feb 11, 2023

[stdlib] Rework String breadcrumbs initialization/loading #63592

Merged

lorentey mentioned this pull request Feb 13, 2023

Fix potentially undefined behavior in StringObject.nativeStorage and document Builtin.unsafeBitCast #63631

Closed

lorentey mentioned this pull request May 2, 2023

Unable To Append Two Emoji Strings On macOS Ventura's Swift #63664

Open
