Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well. #74795

Merged: meg-gupta merged 6 commits into swiftlang:main from the copyelimexpand branch on Jul 3, 2024

Conversation

Contributor

@meg-gupta meg-gupta commented Jun 27, 2024

rdar://130026658

@meg-gupta
Contributor Author

@swift-ci apple silicon benchmark

@meg-gupta
Contributor Author

@swift-ci test

@meg-gupta
Contributor Author

------- Performance (arm64): -O -------

REGRESSION                                  OLD        NEW        DELTA    RATIO    
ArrayAppendGenericStructs                   906.154    1032.727   +14.0%   **0.88x (?)**
NormalizedIterator_fastPrenormal            321.642    355.156    +10.4%   **0.91x**
NormalizedIterator_latin1                   116.421    127.556    +9.6%    **0.91x**
ObserverUnappliedMethod                     249.714    273.261    +9.4%    **0.91x (?)**
NormalizedIterator_nonBMPSlowestPrenormal   263.933    288.625    +9.4%    **0.91x**
Breadcrumbs.IdxToUTF16Range.longASCII       8.281      9.04       +9.2%    **0.92x**
NormalizedIterator_emoji                    206.957    225.854    +9.1%    **0.92x**

IMPROVEMENT                                 OLD        NEW        DELTA    RATIO    
SequenceAlgosAnySequence                    7351.613   4192.157   -43.0%   **1.75x**
SuffixAnySeqCRangeIterLazy                  11904.0    6865.0     -42.3%   **1.73x**
DropLastAnySeqCRangeIterLazy                12068.0    7004.0     -42.0%   **1.72x**
SuffixAnySeqCntRangeLazy                    11708.0    6868.0     -41.3%   **1.70x**
DropLastAnySeqCntRangeLazy                  11884.0    7013.0     -41.0%   **1.69x**
PrefixAnyCollectionLazy                     19597.0    14832.0    -24.3%   **1.32x**
DropFirstAnyCollectionLazy                  19534.0    14878.0    -23.8%   **1.31x**
SuffixAnyCollectionLazy                     6485.0     4947.0     -23.7%   **1.31x**
DropLastAnyCollectionLazy                   6488.0     4951.0     -23.7%   **1.31x**
StringSwitch                                165.692    145.6      -12.1%   **1.14x**
SequenceAlgosUnfoldSequence                 296.027    268.718    -9.2%    **1.10x**
String.replaceSubrange.RepChar.Small        161.929    149.091    -7.9%    **1.09x**

------- Code size: -O -------

IMPROVEMENT                                        OLD     NEW     DELTA    RATIO  
BinaryFloatingPointConversionFromBinaryInteger.o   18454   15782   -14.5%   **1.17x**
Prefix.o                                           10086   9662    -4.2%    **1.04x**
DropFirst.o                                        9785    9381    -4.1%    **1.04x**
ChaCha.o                                           11486   11138   -3.0%    **1.03x**
DropLast.o                                         16096   15672   -2.6%    **1.03x**
Suffix.o                                           16058   15654   -2.5%    **1.03x** 

------- Performance (arm64): -Osize -------

REGRESSION                                  OLD        NEW       DELTA    RATIO    
MapReduceAnyCollection                      269.0      319.0     +18.6%   **0.84x**
MapReduceString                             30.917     35.564    +15.0%   **0.87x**
MapReduceAnyCollectionShort                 598.108    681.818   +14.0%   **0.88x (?)**
NormalizedIterator_latin1                   118.211    131.706   +11.4%   **0.90x**
NormalizedIterator_fastPrenormal            324.11     357.761   +10.4%   **0.91x**
NormalizedIterator_nonBMPSlowestPrenormal   265.465    290.233   +9.3%    **0.91x**
NormalizedIterator_emoji                    208.255    226.952   +9.0%    **0.92x**

IMPROVEMENT                                 OLD        NEW       DELTA    RATIO    
SuffixAnySeqCntRangeLazy                    11970.0    6853.0    -42.7%   **1.75x**
DropLastAnySeqCntRangeLazy                  12252.0    7032.0    -42.6%   **1.74x**
SequenceAlgosAnySequence                    7276.667   4186.0    -42.5%   **1.74x**
SuffixAnySeqCRangeIterLazy                  11905.0    6871.0    -42.3%   **1.73x**
DropLastAnySeqCRangeIterLazy                12067.0    7022.0    -41.8%   **1.72x**
StringWithCString2                          0.002      0.001     -33.3%   **1.50x**
DropLastAnyCollectionLazy                   6682.0     5006.0    -25.1%   **1.33x**
StringSwitch                                199.727    150.0     -24.9%   **1.33x**
SuffixAnyCollectionLazy                     6667.0     5009.0    -24.9%   **1.33x**
DropFirstAnyCollectionLazy                  19991.0    15073.0   -24.6%   **1.33x**
PrefixAnyCollectionLazy                     19672.0    15018.0   -23.7%   **1.31x**
MapReduceSequence                           330.429    289.75    -12.3%   **1.14x**
String.replaceSubrange.RepChar.Small        160.5      143.133   -10.8%   **1.12x (?)**
ObserverUnappliedMethod                     278.615    250.652   -10.0%   **1.11x (?)**
Set.subtracting.Box.Empty                   13.323     12.269    -7.9%    **1.09x**

------- Code size: -Osize -------

REGRESSION                                         OLD     NEW     DELTA    RATIO  
MapReduce.o                                        14137   14553   +2.9%    **0.97x**

IMPROVEMENT                                        OLD     NEW     DELTA    RATIO  
BinaryFloatingPointConversionFromBinaryInteger.o   16006   13730   -14.2%   **1.17x**
DropFirst.o                                        10047   9771    -2.7%    **1.03x**
Prefix.o                                           10328   10052   -2.7%    **1.03x**
Suffix.o                                           13432   13156   -2.1%    **1.02x**
DropLast.o                                         13434   13158   -2.1%    **1.02x**
ExistentialPerformance.o                           23330   23062   -1.1%    **1.01x**

------- Performance (arm64): -Onone -------

REGRESSION                                        OLD         NEW         DELTA    RATIO    
DataCreateEmpty                                   599.167     701.563     +17.1%   **0.85x**
NormalizedIterator_fastPrenormal                  350.333     385.172     +9.9%    **0.91x**
NormalizedIterator_nonBMPSlowestPrenormal         275.181     300.0       +9.0%    **0.92x**
NormalizedIterator_emoji                          215.1       234.359     +9.0%    **0.92x**
RawBufferCopyBytes                                11.551      12.438      +7.7%    **0.93x (?)**

IMPROVEMENT                                       OLD         NEW         DELTA    RATIO    
PrefixWhileCountableRange                         8535.0      4717.0      -44.7%   **1.81x**
PrefixWhileAnyCollection                          17459.0     9811.0      -43.8%   **1.78x**
DropWhileCountableRange                           2972.0      1697.0      -42.9%   **1.75x**
SuffixAnyCollection                               3068.0      1771.0      -42.3%   **1.73x**
DropFirstAnyCollection                            9177.0      5311.0      -42.1%   **1.73x**
SequenceAlgosAnySequence                          7520.69     4357.143    -42.1%   **1.73x**
PrefixAnyCollection                               9174.0      5322.0      -42.0%   **1.72x**
DropWhileAnyCollection                            11726.0     6820.0      -41.8%   **1.72x**
DropFirstAnySeqCntRange                           12623.0     7380.0      -41.5%   **1.71x**
DropFirstAnySeqCRangeIter                         12592.0     7371.0      -41.5%   **1.71x**
DropFirstAnySeqCRangeIterLazy                     12600.0     7410.0      -41.2%   **1.70x**
SequenceAlgosRange                                714330.0    424430.0    -40.6%   **1.68x**
DropLastAnyCollection                             2971.0      1775.0      -40.3%   **1.67x**
DropFirstAnySeqCntRangeLazy                       12335.0     7415.0      -39.9%   **1.66x**
SuffixAnySeqCntRangeLazy                          13300.0     8106.0      -39.1%   **1.64x**
SuffixAnySeqCRangeIterLazy                        13286.0     8122.0      -38.9%   **1.64x**
SuffixAnySeqCRangeIter                            13091.0     8007.0      -38.8%   **1.63x**
PrefixAnySeqCntRange                              10399.0     6433.0      -38.1%   **1.62x**
SuffixAnySeqCntRange                              12879.0     8001.0      -37.9%   **1.61x**
PrefixAnySeqCRangeIter                            10368.0     6442.0      -37.9%   **1.61x**
PrefixAnySeqCntRangeLazy                          10139.0     6363.0      -37.2%   **1.59x**
LazilyFilteredRange                               333520.0    209670.0    -37.1%   **1.59x**
PrefixAnySeqCRangeIterLazy                        10086.0     6377.0      -36.8%   **1.58x**
DropWhileAnySeqCntRangeLazy                       14495.0     9294.0      -35.9%   **1.56x**
DropWhileAnyCollectionLazy                        14496.0     9296.0      -35.9%   **1.56x**
DropWhileAnySeqCRangeIterLazy                     14463.0     9289.0      -35.8%   **1.56x**
DropWhileAnySeqCRangeIter                         15161.0     9738.0      -35.8%   **1.56x**
DropLastAnySeqCRangeIterLazy                      14297.0     9250.0      -35.3%   **1.55x**
PrefixWhileAnyCollectionLazy                      11455.0     7437.0      -35.1%   **1.54x**
DropLastAnySeqCntRangeLazy                        14199.0     9266.0      -34.7%   **1.53x**
DropLastAnySeqCRangeIter                          13915.0     9134.0      -34.4%   **1.52x**
DropWhileCountableRangeLazy                       13882.0     9114.0      -34.3%   **1.52x**
DropWhileAnySeqCntRange                           14728.0     9722.0      -34.0%   **1.51x**
DropLastAnySeqCntRange                            13761.0     9117.0      -33.7%   **1.51x**
PrefixWhileAnySeqCRangeIterLazy                   11163.0     7396.0      -33.7%   **1.51x**
PrefixWhileAnySeqCntRangeLazy                     11076.0     7392.0      -33.3%   **1.50x**
PrefixWhileCountableRangeLazy                     10864.0     7298.0      -32.8%   **1.49x**
DictionaryGroup                                   1978.0      1360.0      -31.2%   **1.45x**
PrefixWhileAnySeqCntRange                         12756.0     9075.0      -28.9%   **1.41x**
RC4                                               5491.0      3948.0      -28.1%   **1.39x**
PrefixWhileAnySeqCRangeIter                       12564.0     9046.0      -28.0%   **1.39x**
ReversedBidirectional                             21235.0     15481.0     -27.1%   **1.37x**
DropFirstCountableRangeLazy                       17524.0     12799.0     -27.0%   **1.37x**
PrefixCountableRangeLazy                          17612.0     12964.0     -26.4%   **1.36x**
SuffixCountableRangeLazy                          5936.0      4398.0      -25.9%   **1.35x**
DropLastCountableRangeLazy                        5867.0      4361.0      -25.7%   **1.35x**
ArraySubscript                                    24036.0     18644.0     -22.4%   **1.29x**
CreateObjects                                     375.333     292.75      -22.0%   **1.28x**
MonteCarloPi                                      1158750.0   916875.0    -20.9%   **1.26x**
RangeOverlapsRange                                6143.0      4932.0      -19.7%   **1.25x**
MonteCarloE                                       266300.0    215040.0    -19.2%   **1.24x**
RemoveWhereQuadraticInts                          3126.0      2529.0      -19.1%   **1.24x**
SuffixAnyCollectionLazy                           12397.0     10145.0     -18.2%   **1.22x (?)**
PolymorphicCalls                                  1428.0      1179.5      -17.4%   **1.21x**
DropLastAnyCollectionLazy                         12395.0     10242.0     -17.4%   **1.21x**
PrefixAnyCollectionLazy                           37076.0     30661.0     -17.3%   **1.21x**
RemoveWhereQuadraticStrings                       3832.0      3186.0      -16.9%   **1.20x**
ChaCha                                            11267.0     9382.0      -16.7%   **1.20x**
DropFirstAnyCollectionLazy                        36983.0     30922.0     -16.4%   **1.20x**
Data.append.Sequence.64kB.Count0.RE.I             8598.0      7246.0      -15.7%   **1.19x**
Data.append.Sequence.809B.Count0.RE               10664.0     9031.0      -15.3%   **1.18x**
Data.append.Sequence.809B.Count.RE.I              10412.0     8826.0      -15.2%   **1.18x**
Data.append.Sequence.809B.Count0.RE.I             10655.0     9044.0      -15.1%   **1.18x**
Data.init.Sequence.64kB.Count.RE.I                8384.0      7128.0      -15.0%   **1.18x**
Data.init.Sequence.809B.Count.RE                  10393.0     8837.0      -15.0%   **1.18x**
Data.append.Sequence.809B.Count.RE                10388.0     8835.0      -14.9%   **1.18x**
DataAppendSequence                                1038500.0   883800.0    -14.9%   **1.18x**
Data.append.Sequence.64kB.Count0.RE               8582.0      7309.0      -14.8%   **1.17x**
Data.append.Sequence.64kB.Count.RE.I              8385.0      7142.0      -14.8%   **1.17x**
Data.init.Sequence.809B.Count.RE.I                10392.0     8862.0      -14.7%   **1.17x**
Data.init.Sequence.64kB.Count.RE                  8366.0      7135.0      -14.7%   **1.17x**
Data.append.Sequence.64kB.Count.RE                8364.0      7147.0      -14.6%   **1.17x**
RangeOverlapsClosedRange                          6764.0      5811.0      -14.1%   **1.16x**
Data.init.Sequence.64kB.Count0.RE                 8537.0      7347.0      -13.9%   **1.16x (?)**
Data.init.Sequence.809B.Count0.RE                 10607.0     9140.0      -13.8%   **1.16x**
RangeIterationSigned                              3272.0      2825.0      -13.7%   **1.16x**
ByteSwap                                          381.833     329.857     -13.6%   **1.16x**
Data.init.Sequence.809B.Count0.RE.I               10609.0     9177.0      -13.5%   **1.16x**
Data.init.Sequence.64kB.Count0.RE.I               8535.0      7383.0      -13.5%   **1.16x**
RandomDoubleOpaqueLCG                             12931.0     11329.0     -12.4%   **1.14x**
RandomDoubleLCG                                   12693.0     11203.0     -11.7%   **1.13x**
NibbleSort                                        122940.0    109920.0    -10.6%   **1.12x**
ClosedRangeOverlapsClosedRange                    5355.0      4798.0      -10.4%   **1.12x**
FloatingPointPrinting_Float_description_uniform   5223.077    4723.256    -9.6%    **1.11x**
RandomDouble01LCG                                 8836.0      8012.0      -9.3%    **1.10x**
Data.init.Sequence.2049B.Count.I                  1170.5      1062.5      -9.2%    **1.10x**
Data.init.Sequence.64kB.Count                     741.667     674.0       -9.1%    **1.10x**
Data.append.Sequence.64kB.Count.I                 742.0       674.333     -9.1%    **1.10x**
Data.init.Sequence.809B.Count                     935.5       850.5       -9.1%    **1.10x**
Data.init.Sequence.64kB.Count.I                   741.333     674.0       -9.1%    **1.10x**
Data.append.Sequence.64kB.Count                   741.5       674.333     -9.1%    **1.10x**
Data.init.Sequence.511B.Count.I                   901.0       819.5       -9.0%    **1.10x**
Data.init.Sequence.2047B.Count.I                  1167.0      1061.5      -9.0%    **1.10x**
Data.append.Sequence.809B.Count                   949.0       864.5       -8.9%    **1.10x**
Data.append.Sequence.809B.Count.I                 950.0       865.5       -8.9%    **1.10x**
Data.init.Sequence.513B.Count.I                   903.5       823.667     -8.8%    **1.10x**
Data.init.Sequence.809B.Count.I                   933.0       851.0       -8.8%    **1.10x**
String.replaceSubrange.RepChar.Small              160.571     146.667     -8.7%    **1.09x**
LineSink.scalars.complex                          345.286     315.857     -8.5%    **1.09x**
LineSink.bytes.complex                            345.143     315.857     -8.5%    **1.09x**
RandomDoubleOpaqueDef                             16221.429   14853.333   -8.4%    **1.09x**
RandomDoubleDef                                   16050.0     14706.667   -8.4%    **1.09x**
ConvertFloatingPoint.MockFloat64Exactly           173.333     159.0       -8.3%    **1.09x**
CharIteration_utf16_unicodeScalars                43660.0     40280.0     -7.7%    **1.08x (?)**
RandomDouble01Def                                 12070.588   11182.353   -7.4%    **1.08x**
RandomIntegersDef                                 12564.706   11678.947   -7.0%    **1.08x**
FlattenDistanceFromTo.Array.String.04.08          708.333     660.0       -6.8%    **1.07x (?)**
LineSink.bytes.alpha                              101.333     94.538      -6.7%    **1.07x**
ConvertFloatingPoint.MockFloat64ToInt64           15317.0     14296.0     -6.7%    **1.07x (?)**
ConvertFloatingPoint.MockFloat64Exactly2          253.667     236.778     -6.7%    **1.07x**

------- Code size: -swiftlibs -------

IMPROVEMENT          OLD       NEW       DELTA   RATIO  
libswiftCore.dylib   5144576   5046272   -1.9%   **1.02x**
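
A note on reading the tables above: the DELTA and RATIO columns are consistent with DELTA = (NEW - OLD) / OLD and RATIO = OLD / NEW, so a ratio above 1.00x means the new build is faster (or smaller). The following is a minimal Swift sketch of that arithmetic, checked against two rows of the -O table. The helper names (`deltaPercent`, `ratio`, `round1`, `round2`) are hypothetical and are not part of the benchmark tooling.

```swift
// Hypothetical helpers for reproducing the DELTA and RATIO columns.
// Assumption: the report rounds DELTA to one decimal and RATIO to two.

/// Relative change from `old` to `new`, in percent (positive = regression).
func deltaPercent(old: Double, new: Double) -> Double {
    return (new - old) / old * 100
}

/// Improvement factor: values above 1 mean `new` is smaller than `old`.
func ratio(old: Double, new: Double) -> Double {
    return old / new
}

/// Round to one decimal place.
func round1(_ x: Double) -> Double {
    return (x * 10).rounded() / 10
}

/// Round to two decimal places.
func round2(_ x: Double) -> Double {
    return (x * 100).rounded() / 100
}

// ArrayAppendGenericStructs (-O): 906.154 -> 1032.727
let regressionDelta = round1(deltaPercent(old: 906.154, new: 1032.727))
let regressionRatio = round2(ratio(old: 906.154, new: 1032.727))

// SequenceAlgosAnySequence (-O): 7351.613 -> 4192.157
let improvementDelta = round1(deltaPercent(old: 7351.613, new: 4192.157))
let improvementRatio = round2(ratio(old: 7351.613, new: 4192.157))

print(regressionDelta, regressionRatio)
print(improvementDelta, improvementRatio)
```

The two rows come out as +14.0% / 0.88x and -43.0% / 1.75x, matching the report.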

@meg-gupta meg-gupta changed the title from "[DNM] Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well." to "Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well." on Jun 27, 2024
@meg-gupta meg-gupta marked this pull request as ready for review June 27, 2024 20:23
@meg-gupta meg-gupta requested a review from kavon as a code owner June 27, 2024 20:23
@meg-gupta meg-gupta requested review from atrick, jckarter, nate-chandler and eeckstein and removed request for kavon June 27, 2024 20:23
@meg-gupta
Contributor Author

Based on top of #74768

@meg-gupta meg-gupta changed the title from "Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well." to "[DNM] Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well." on Jun 27, 2024
@meg-gupta
Contributor Author

@swift-ci test windows platform

@meg-gupta
Contributor Author

@swift-ci build toolchain

Contributor

@atrick atrick left a comment

Please fix this comment, which no longer matches what the code does:

"For noncopyable address-only captures..."

We should enable this optimization so it is actually tested. It is high risk because this code has been extremely bug-prone, and we're suddenly enabling it on all address captures. So I wouldn't do it on a release branch.

@atrick
Contributor

atrick commented Jul 2, 2024

The Onone performance is awesome, and this "optimization" totally makes sense at Onone if we're willing to give up debugging the captured value after the closure runs!
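
For context, here is a rough Swift sketch of the general shape of code this discussion is about: a non-escaping closure that captures a value of generic type, which unspecialized code handles by address. The function and its names are hypothetical and do not come from the PR. Whether ClosureLifetimeFixup actually removes a capture copy for this exact function is an assumption; the thread does not confirm it.

```swift
// Hypothetical example; none of these names come from the PR.

/// Counts the elements of `haystack` that are equal to `needle`.
///
/// `T` is generic, so in unspecialized code `needle` is passed around by
/// address. The closure given to `forEach` is non-escaping and captures it.
func countMatches<T: Equatable>(of needle: T, in haystack: [T]) -> Int {
    var count = 0
    haystack.forEach { element in
        // Assumption: a capture like `needle` is the kind of address capture
        // whose temporary copy the pass tries to eliminate, so the closure
        // can use the original value in place.
        if element == needle {
            count += 1
        }
    }
    return count
}

let matches = countMatches(of: 2, in: [1, 2, 2, 3])
print(matches) // prints 2
```

The observable behavior is the same with or without the copy; only the work done to set up the closure's captures would differ.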

@meg-gupta
Contributor Author

@swift-ci smoke test

@meg-gupta meg-gupta changed the title from "[DNM] Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well." to "Enable ClosureLifetimeFixup's capture copy elimination for copyable types as well." on Jul 2, 2024
@meg-gupta
Contributor Author

@swift-ci smoke test

@meg-gupta meg-gupta enabled auto-merge July 2, 2024 23:28
@meg-gupta
Contributor Author

Smoke testing now since full testing passed previously

@meg-gupta meg-gupta merged commit c2fd130 into swiftlang:main Jul 3, 2024
3 checks passed
@meg-gupta meg-gupta deleted the copyelimexpand branch July 10, 2024 00:22
2 participants