[SILOptimizer] Alter FSO arg explosion heuristic. #27239

nate-chandler
Contributor

The new heuristic for deciding whether an argument will be exploded is as follows:

(1) An argument with more than 3 live leaves after considering the results of the owned-to-guaranteed transformation will not be exploded.

(2) Rule (1) permitting, an argument all of whose non-trivial leaves are live will be exploded if and only if (a) it has more than one non-trivial leaf and (b) specialization will not result in a thunk. This rule is beneficial before semantic ARC because splitting the argument keeps the leaves' RC identities distinct rather than collapsing them into one. It will not be useful once semantic ARC is in place and should be retired then.

(3) Rule (1) permitting, an argument with dead non-trivial leaves will be exploded.
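
The three rules above can be sketched as a single predicate. This is a minimal standalone illustration, not the code in FunctionSignatureOpts: the `ArgSummary` struct, its field names, and `shouldExplode` are hypothetical, and the sketch assumes the relevant leaf counts have already been computed from the argument's projection tree.

```cpp
#include <cassert>

// Hypothetical summary of one function argument. In the real pass these
// values would be derived from the argument's projection tree.
struct ArgSummary {
  unsigned liveLeaves;           // live leaves after owned-to-guaranteed
  unsigned nonTrivialLeaves;     // all non-trivial leaves, live or dead
  unsigned liveNonTrivialLeaves; // non-trivial leaves that are live
  bool specializationNeedsThunk; // would specializing introduce a thunk?
};

// Upper bound taken from rule (1).
constexpr unsigned MaxLiveLeaves = 3;

bool shouldExplode(const ArgSummary &arg) {
  // Rule (1): too many live leaves, so never explode.
  if (arg.liveLeaves > MaxLiveLeaves)
    return false;

  // Rule (3): at least one non-trivial leaf is dead, so explode.
  if (arg.liveNonTrivialLeaves < arg.nonTrivialLeaves)
    return true;

  // Rule (2): every non-trivial leaf is live. Explode only if there is
  // more than one of them and no thunk would be needed.
  return arg.nonTrivialLeaves > 1 && !arg.specializationNeedsThunk;
}
```

Note that an argument with no non-trivial leaves falls through to rule (2) and is not exploded in this sketch; that follows from the wording of the rules as listed here.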

rdar://problem/39957093

@nate-chandler
Contributor Author

@swift-ci please smoke benchmark

@swift-ci
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
BinaryFloatingPointPropertiesBinade 25 31 +24.0% 0.81x (?)
RandomShuffleLCG2 704 816 +15.9% 0.86x
RandomDoubleLCG 220 254 +15.5% 0.87x
CSVParsingAltIndices2 847 968 +14.3% 0.88x
Set.subtracting.Empty.Box 14 16 +14.3% 0.88x
Phonebook 1652 1876 +13.6% 0.88x
CharIteration_russian_unicodeScalars 3000 3400 +13.3% 0.88x
FlattenListLoop 3977 4432 +11.4% 0.90x (?)
CharIndexing_tweet_unicodeScalars 10160 11280 +11.0% 0.90x
CharIndexing_ascii_unicodeScalars 5200 5760 +10.8% 0.90x (?)
StringComparison_nonBMPSlowestPrenormal 1450 1590 +9.7% 0.91x (?)
StringComparison_emoji 752 824 +9.6% 0.91x (?)
RemoveWhereMoveInts 34 37 +8.8% 0.92x (?)
RemoveWhereSwapInts 60 65 +8.3% 0.92x (?)
StrComplexWalk 2560 2770 +8.2% 0.92x
Set.subtracting.Seq.Empty.Int 187 202 +8.0% 0.93x
Set.subtracting.Seq.Empty.Box 188 203 +8.0% 0.93x (?)
MapReduce 369 398 +7.9% 0.93x
MapReduceAnyCollection 370 399 +7.8% 0.93x (?)
StrToInt 1550 1670 +7.7% 0.93x
 
Improvement OLD NEW DELTA RATIO
RangeIterationSigned 200 171 -14.5% 1.17x
Data.init.Sequence.64kB.Count0 399 342 -14.3% 1.17x (?)
Data.init.Sequence.64kB.Count0.I 399 342 -14.3% 1.17x (?)
Data.append.Sequence.64kB.Count0.I 396 340 -14.1% 1.16x (?)
Data.append.Sequence.64kB.Count0 395 341 -13.7% 1.16x (?)
FatCompactMap 1400 1210 -13.6% 1.16x
ChainedFilterMap 1260 1089 -13.6% 1.16x
Data.init.Sequence.2047B.Count0.I 716 621 -13.3% 1.15x (?)
Data.init.Sequence.2049B.Count0.I 714 622 -12.9% 1.15x (?)
Data.init.Sequence.64kB.Count0.RE.I 401 353 -12.0% 1.14x (?)
Data.init.Sequence.64kB.Count0.RE 400 353 -11.7% 1.13x (?)
Data.append.Sequence.64kB.Count0.RE.I 397 352 -11.3% 1.13x (?)
Data.append.Sequence.64kB.Count0.RE 397 352 -11.3% 1.13x (?)
Data.append.Sequence.809B.Count0 575 511 -11.1% 1.13x (?)
Data.append.Sequence.809B.Count0.I 573 510 -11.0% 1.12x (?)
Data.init.Sequence.809B.Count0 639 570 -10.8% 1.12x (?)
Data.init.Sequence.809B.Count0.I 638 571 -10.5% 1.12x (?)
ObjectiveCBridgeStubDataAppend 5960 5360 -10.1% 1.11x
Data.append.Sequence.809B.Count0.RE 578 521 -9.9% 1.11x (?)
Data.append.Sequence.809B.Count0.RE.I 579 522 -9.8% 1.11x (?)
Data.init.Sequence.513B.Count0.I 673 609 -9.5% 1.11x (?)
Data.init.Sequence.809B.Count0.RE 642 581 -9.5% 1.10x (?)
Data.init.Sequence.809B.Count0.RE.I 641 582 -9.2% 1.10x (?)
Data.init.Sequence.511B.Count0.I 669 609 -9.0% 1.10x (?)
NSStringConversion.UTF8 833 761 -8.6% 1.09x (?)
CSVParsing.Scalar 205 191 -6.8% 1.07x (?)
LessSubstringSubstring 45 42 -6.7% 1.07x (?)
EqualStringSubstring 45 42 -6.7% 1.07x (?)
EqualSubstringSubstringGenericEquatable 45 42 -6.7% 1.07x (?)
EqualSubstringString 45 42 -6.7% 1.07x (?)
LessSubstringSubstringGenericComparable 45 42 -6.7% 1.07x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
ProtocolDispatch.o 820 910 +11.0% 0.90x
CharacterLiteralsLarge.o 938 1028 +9.6% 0.91x
IterateData.o 1811 1979 +9.3% 0.92x
CaptureProp.o 939 1013 +7.9% 0.93x
TypeFlood.o 963 1035 +7.5% 0.93x
COWArrayGuaranteedParameterOverhead.o 1246 1334 +7.1% 0.93x
DictionaryLiteral.o 1273 1361 +6.9% 0.94x
CharacterLiteralsSmall.o 1386 1476 +6.5% 0.94x
RecursiveOwnedParameter.o 1405 1490 +6.0% 0.94x
NSDictionaryCastToSwift.o 1653 1752 +6.0% 0.94x
ArraySetElement.o 1211 1283 +5.9% 0.94x
PointerArithmetics.o 1815 1922 +5.9% 0.94x
RangeIteration.o 1684 1780 +5.7% 0.95x
Sim2DArray.o 1283 1355 +5.6% 0.95x
NSError.o 1348 1420 +5.3% 0.95x
MonteCarloPi.o 1623 1706 +5.1% 0.95x
SuperChars.o 1478 1552 +5.0% 0.95x
ProtocolDispatch2.o 1778 1866 +4.9% 0.95x
Chars.o 1651 1725 +4.5% 0.96x
OpaqueConsumingUsers.o 2047 2137 +4.4% 0.96x
Fibonacci.o 1650 1722 +4.4% 0.96x
SevenBoom.o 1664 1736 +4.3% 0.96x
ByteSwap.o 1674 1746 +4.3% 0.96x
LinkedList.o 2183 2274 +4.2% 0.96x
DeadArray.o 1898 1970 +3.8% 0.96x
Ackermann.o 1906 1978 +3.8% 0.96x
Join.o 2334 2422 +3.8% 0.96x
BitCount.o 1914 1986 +3.8% 0.96x
StackPromo.o 1986 2058 +3.6% 0.97x
XorLoop.o 2066 2138 +3.5% 0.97x
Memset.o 2208 2280 +3.3% 0.97x
Integrate.o 2351 2423 +3.1% 0.97x
StrComplexWalk.o 2815 2898 +2.9% 0.97x
ArrayOfPOD.o 2442 2514 +2.9% 0.97x
SortArrayInClass.o 2722 2802 +2.9% 0.97x
AnyHashableWithAClass.o 3002 3090 +2.9% 0.97x
DictionaryBridge.o 3286 3374 +2.7% 0.97x
Calculator.o 2762 2834 +2.6% 0.97x
ObserverPartiallyAppliedMethod.o 3419 3507 +2.6% 0.97x
ObserverForwarderStruct.o 3490 3578 +2.5% 0.98x
ErrorHandling.o 3007 3079 +2.4% 0.98x
NIOChannelPipeline.o 4096 4192 +2.3% 0.98x
ObserverClosure.o 3132 3204 +2.3% 0.98x
ObjectAllocation.o 4097 4190 +2.3% 0.98x
OpenClose.o 3270 3342 +2.2% 0.98x
MonteCarloE.o 3386 3458 +2.1% 0.98x
Array2D.o 4236 4324 +2.1% 0.98x
Hanoi.o 3593 3665 +2.0% 0.98x
StrToInt.o 3597 3669 +2.0% 0.98x
ArraySubscript.o 3718 3790 +1.9% 0.98x
PopFrontGeneric.o 4684 4772 +1.9% 0.98x
SortLargeExistentials.o 20558 20944 +1.9% 0.98x
ObserverUnappliedMethod.o 4926 5014 +1.8% 0.98x
Histogram.o 4140 4212 +1.7% 0.98x
SortLettersInPlace.o 8935 9089 +1.7% 0.98x
ClassArrayGetter.o 5631 5719 +1.6% 0.98x
RangeReplaceableCollectionPlusDefault.o 6054 6142 +1.5% 0.99x
NopDeinit.o 5421 5493 +1.3% 0.99x
TwoSum.o 5462 5534 +1.3% 0.99x
RangeAssignment.o 4938 5003 +1.3% 0.99x
Phonebook.o 11633 11779 +1.3% 0.99x
PolymorphicCalls.o 7606 7694 +1.2% 0.99x
NibbleSort.o 12208 12338 +1.1% 0.99x
DiffingMyers.o 8347 8435 +1.1% 0.99x
Combos.o 6882 6954 +1.0% 0.99x
 
Improvement OLD NEW DELTA RATIO
ObjectiveCBridgingStubs.o 22756 17348 -23.8% 1.31x
ExistentialPerformance.o 46247 37063 -19.9% 1.25x
DataBenchmarks.o 84534 70895 -16.1% 1.19x
UTF8Decode.o 12422 10542 -15.1% 1.18x
ArrayAppend.o 38130 32426 -15.0% 1.18x
StringWalk.o 39215 33663 -14.2% 1.16x
StringBuilder.o 8261 7301 -11.6% 1.13x
StringReplaceSubrange.o 5270 4742 -10.0% 1.11x
StringInterpolation.o 7393 6751 -8.7% 1.10x
FloatingPointPrinting.o 6024 5530 -8.2% 1.09x
NSStringConversion.o 8741 8085 -7.5% 1.08x
SetTests.o 143659 134315 -6.5% 1.07x
Substring.o 18041 16879 -6.4% 1.07x
ArrayLiteral.o 3319 3111 -6.3% 1.07x
ObjectiveCNoBridgingStubs.o 8868 8340 -6.0% 1.06x
AngryPhonebook.o 9875 9299 -5.8% 1.06x
InsertCharacter.o 5016 4729 -5.7% 1.06x
Diffing.o 8086 7670 -5.1% 1.05x
ObjectiveCBridging.o 59441 56609 -4.8% 1.05x
StringComparison.o 40084 38228 -4.6% 1.05x
DictionaryCopy.o 10795 10315 -4.4% 1.05x
PrefixWhile.o 20517 19605 -4.4% 1.05x
StringEdits.o 12309 11773 -4.4% 1.05x
DropWhile.o 21561 20649 -4.2% 1.04x
RandomValues.o 2991 2865 -4.2% 1.04x
Prefix.o 21610 20714 -4.1% 1.04x
DropFirst.o 22501 21589 -4.1% 1.04x
FindStringNaive.o 11136 10712 -3.8% 1.04x
DropLast.o 24454 23542 -3.7% 1.04x
Suffix.o 24292 23396 -3.7% 1.04x
MapReduce.o 32139 31099 -3.2% 1.03x
COWTree.o 12346 12018 -2.7% 1.03x
StringMatch.o 4774 4655 -2.5% 1.03x
StringTests.o 8206 8014 -2.3% 1.02x
IntegerParsing.o 19646 19198 -2.3% 1.02x
RemoveWhere.o 24207 23679 -2.2% 1.02x
FloatingPointParsing.o 15480 15192 -1.9% 1.02x
SequenceAlgos.o 19895 19551 -1.7% 1.02x
ReduceInto.o 16823 16551 -1.6% 1.02x
CString.o 7954 7826 -1.6% 1.02x
DriverUtils.o 149581 147385 -1.5% 1.01x
RC4.o 3936 3880 -1.4% 1.01x
Exclusivity.o 4469 4408 -1.4% 1.01x
CSVParsing.o 57880 57096 -1.4% 1.01x
DictTest4Legacy.o 23731 23411 -1.3% 1.01x
DictionaryBridgeToObjC.o 5202 5138 -1.2% 1.01x
BinaryFloatingPointProperties.o 7554 7466 -1.2% 1.01x
DictTest4.o 22884 22628 -1.1% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
ObjectiveCBridgeFromNSSetAnyObjectToString 75500 83000 +9.9% 0.91x (?)
Set.subtracting.Seq.Empty.Int 188 205 +9.0% 0.92x
FlattenListLoop 4066 4429 +8.9% 0.92x (?)
Array2D 6912 7520 +8.8% 0.92x
 
Improvement OLD NEW DELTA RATIO
EqualSubstringString 45 42 -6.7% 1.07x

Code size: -Osize

Performance: -Onone

Regression OLD NEW DELTA RATIO
CharIteration_punctuatedJapanese_unicodeScalars_Backwards 55600 60760 +9.3% 0.92x (?)
CharIteration_ascii_unicodeScalars_Backwards 320520 347320 +8.4% 0.92x (?)
CharIteration_tweet_unicodeScalars_Backwards 642280 693400 +8.0% 0.93x (?)
CharIteration_chinese_unicodeScalars_Backwards 246600 265960 +7.9% 0.93x (?)
ArrayOfPOD 1063 1145 +7.7% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
NSStringConversion.Mutable 2656 2344 -11.7% 1.13x (?)
NSStringConversion.UTF8 2372 2123 -10.5% 1.12x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftDarwin.dylib 32768 28672 -12.5% 1.14x
libswiftSceneKit.dylib 53248 49152 -7.7% 1.08x
libswiftCloudKit.dylib 73728 69632 -5.6% 1.06x
libswiftsimd.dylib 139264 135168 -2.9% 1.03x
libswiftNetwork.dylib 172032 167936 -2.4% 1.02x
libswiftStdlibUnittest.dylib 368640 364544 -1.1% 1.01x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
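
For readers checking rows by hand, the DELTA and RATIO columns appear to be derived from OLD and NEW as shown in the sketch below. The definitions are inferred from the numbers in the tables, not taken from the benchmark driver, and the `Row` struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// One row of a benchmark table: the OLD and NEW measurements.
struct Row {
  double oldValue;
  double newValue;
};

// DELTA column: relative change from OLD to NEW, in percent.
// Positive values mean NEW is larger (slower or bigger).
double deltaPercent(const Row &r) {
  return (r.newValue - r.oldValue) / r.oldValue * 100.0;
}

// RATIO column: OLD divided by NEW, so values below 1.0 are regressions.
double ratio(const Row &r) { return r.oldValue / r.newValue; }

// Rounds to a fixed number of decimal places for display.
double roundTo(double value, int places) {
  const double scale = std::pow(10.0, places);
  return std::round(value * scale) / scale;
}
```

Applied to the BinaryFloatingPointPropertiesBinade row (OLD 25, NEW 31), `roundTo(deltaPercent(row), 1)` gives 24.0 and `roundTo(ratio(row), 2)` gives 0.81, which matches the reported +24.0% and 0.81x.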

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@nate-chandler
Contributor Author

@swift-ci please test

@swift-ci
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 2fbb9f08e63a787939147d93230aacf4f2ca11f0

Contributor

@eeckstein eeckstein left a comment

Nice!
My main concern here is that I'm not sure whether exploding arguments for the sake of reducing ARC traffic helps at all, given that we pass most of our arguments as guaranteed. Some of the assumptions in the FSO come from a time when the owned convention was the default.
I think it's worth doing some more experiments (in case you haven't done them yet), e.g. comparing benchmark results with argument explosion disabled entirely.

/// live non-trivial leaf and (b) specializing will not introduce a thunk.
/// Breaking apart the non-trivial leaves is beneficial because it imbues
/// the leaves with different RC identities, enabling the low level ARC
/// optimizer to pair retains/releases in an easier way.

Contributor

Okay, this is beneficial now, but how about when we have ownership SIL at this point? Will it still be beneficial?
Anyway, I think it's worth running the benchmarks with and without only this rule.

Contributor Author

Yup! The comment below

/// TODO: Eliminate the second case.  That splitting up is only beneficial 
///       before semantic ARC.

is about exactly that, eliminating the second rule once we have ownership SIL at this point. As you say, I'll run the benchmarks without this rule to see how it works.

Contributor Author

Okay, opened a second PR to run the benchmarks without this rule: #27252.

// If it is known without taking the owned-to-guaranteed transformation into
// account both that exploding will reduce ARC traffic (because an upper bound
// for the number of live non-trivial leaves is less than the non-trivial
// leaf count) and also that the explosion will fit within the heuristic upper

Contributor

Actually, when I think about it: is this really true, given that we pass most of the arguments as guaranteed by default? I think it mainly depends on the call site. Only if we need a retain-release pair at the call site, eliminating a dead argument will give a benefit.

Contributor

@eeckstein it is not that simple. What if we want to move that value past the call site? Consider a dead guaranteed parameter. Without eliminating the argument, we will not be able to shorten the value's lifetime or do other ARC optimizations.

I would put it this way: it only makes sense if we eliminate an ARC "constraint" on the lifetime of the non-trivially typed thing.
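
One way to picture the lifetime constraint described here is the sketch below, which uses `std::shared_ptr` as a rough stand-in for a reference-counted Swift value. It is an analogy only: `Pair`, `readWhole`, `readExploded`, and `callExploded` are hypothetical names, and C++ reference counting does not model SIL's guaranteed convention exactly.

```cpp
#include <cassert>
#include <memory>

// Hypothetical aggregate with two reference-counted (non-trivial) leaves.
struct Pair {
  std::shared_ptr<int> used;
  std::shared_ptr<int> unused;
};

// Unexploded signature: the callee only reads `used`, but the caller must
// keep the whole aggregate, including `unused`, alive across the call.
int readWhole(const Pair &p) { return *p.used; }

// Exploded signature: only the live leaf is passed, so nothing ties the
// lifetime of the dead leaf to the call.
int readExploded(const std::shared_ptr<int> &used) { return *used; }

// Caller that takes advantage of the exploded form by dropping its own
// reference to the dead leaf before the call. Returns the callee's result
// and reports through `unusedAliveAtCall` whether this caller still held
// the dead leaf at the call.
int callExploded(Pair p, bool &unusedAliveAtCall) {
  p.unused.reset(); // allowed: the callee never sees this leaf
  unusedAliveAtCall = static_cast<bool>(p.unused);
  return readExploded(p.used);
}
```

With `readWhole`, the release of `unused` cannot move above the call because the aggregate is still passed in; with `readExploded`, the caller is free to release it earlier.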

@nate-chandler
Contributor Author

@swift-ci please benchmark

@swift-ci
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
Array2D 3472 4368 +25.8% 0.79x
DataCreateEmpty 80 100 +25.0% 0.80x
ArrayAppendUTF16Substring 9432 11376 +20.6% 0.83x
ArrayAppendAsciiSubstring 9432 11376 +20.6% 0.83x (?)
RandomShuffleLCG2 336 400 +19.0% 0.84x (?)
CharIndexing_tweet_unicodeScalars_Backwards 5160 5960 +15.5% 0.87x (?)
ObjectiveCBridgeStringHash 40 46 +15.0% 0.87x
CharIndexing_punctuated_unicodeScalars_Backwards 600 680 +13.3% 0.88x
CharIteration_punctuated_unicodeScalars_Backwards 600 680 +13.3% 0.88x
CSVParsingAltIndices2 583 660 +13.2% 0.88x (?)
ObjectiveCBridgeStubNSDateRefAccess 174 196 +12.6% 0.89x (?)
SuffixSequence 286 322 +12.6% 0.89x
UTF8Decode_InitFromData 120 135 +12.5% 0.89x
SuffixSequenceLazy 287 319 +11.1% 0.90x
StrComplexWalk 1890 2100 +11.1% 0.90x
StrToInt 900 1000 +11.1% 0.90x
RangeOverlapsClosedRange 55 61 +10.9% 0.90x (?)
CharIteration_ascii_unicodeScalars_Backwards 2600 2880 +10.8% 0.90x (?)
MapReduceAnyCollection 196 217 +10.7% 0.90x (?)
CharIteration_tweet_unicodeScalars_Backwards 5160 5680 +10.1% 0.91x
UTF8Decode_InitFromBytes 129 142 +10.1% 0.91x
CharIteration_punctuated_unicodeScalars 400 440 +10.0% 0.91x
ParseInt.Small.Hex 205 225 +9.8% 0.91x
RemoveWhereSwapInts 31 34 +9.7% 0.91x (?)
CharIteration_ascii_unicodeScalars 1720 1880 +9.3% 0.91x (?)
CharIteration_punctuatedJapanese_unicodeScalars 440 480 +9.1% 0.92x (?)
DistinctClassFieldAccesses 177 193 +9.0% 0.92x
RemoveWhereFilterInts 23 25 +8.7% 0.92x (?)
Phonebook 1113 1204 +8.2% 0.92x
CharIteration_tweet_unicodeScalars 3440 3720 +8.1% 0.92x (?)
ArrayInClass 805 870 +8.1% 0.93x (?)
Fibonacci 89 96 +7.9% 0.93x
 
Improvement OLD NEW DELTA RATIO
LessSubstringSubstring 28 21 -25.0% 1.33x
EqualStringSubstring 28 21 -25.0% 1.33x (?)
EqualSubstringSubstringGenericEquatable 28 21 -25.0% 1.33x
EqualSubstringString 28 21 -25.0% 1.33x
LessSubstringSubstringGenericComparable 28 21 -25.0% 1.33x
EqualSubstringSubstring 28 22 -21.4% 1.27x
Dictionary4 197 157 -20.3% 1.25x
RangeIterationSigned 108 87 -19.4% 1.24x
PointerArithmetics 23900 19300 -19.2% 1.24x
ReversedDictionary2 261 213 -18.4% 1.23x
ReversedArray2 109 89 -18.3% 1.22x (?)
UTF8Decode_InitDecoding 123 105 -14.6% 1.17x
ClosedRangeOverlapsClosedRange 48 41 -14.6% 1.17x
Data.append.Sequence.64kB.Count0.I 268 231 -13.8% 1.16x (?)
Data.append.Sequence.64kB.Count0 268 232 -13.4% 1.16x (?)
PrefixArrayLazy 15 13 -13.3% 1.15x (?)
Data.init.Sequence.64kB.Count0 269 235 -12.6% 1.14x (?)
Data.init.Sequence.64kB.Count0.I 269 235 -12.6% 1.14x
Dictionary4OfObjects 225 197 -12.4% 1.14x
RangeOverlapsRange 65 57 -12.3% 1.14x (?)
Data.append.Sequence.809B.Count0 391 345 -11.8% 1.13x
Data.append.Sequence.809B.Count0.I 391 345 -11.8% 1.13x
Data.init.Sequence.64kB.Count0.RE.I 283 250 -11.7% 1.13x
Data.append.Sequence.64kB.Count0.RE.I 283 250 -11.7% 1.13x
Data.append.Sequence.64kB.Count0.RE 283 250 -11.7% 1.13x
ObjectiveCBridgeStubDataAppend 4680 4140 -11.5% 1.13x (?)
Data.init.Sequence.2047B.Count0.I 479 426 -11.1% 1.12x (?)
Data.init.Sequence.2049B.Count0.I 479 426 -11.1% 1.12x (?)
CharacterLiteralsSmall 215 193 -10.2% 1.11x
Data.append.Sequence.809B.Count0.RE.I 407 366 -10.1% 1.11x (?)
Data.append.Sequence.809B.Count0.RE 407 366 -10.1% 1.11x (?)
Data.init.Sequence.64kB.Count0.RE 278 250 -10.1% 1.11x
StringComparison_longSharedPrefix 357 322 -9.8% 1.11x (?)
Data.init.Sequence.809B.Count0.I 431 389 -9.7% 1.11x (?)
ProtocolDispatch 217 196 -9.7% 1.11x (?)
Data.init.Sequence.809B.Count0 430 389 -9.5% 1.11x (?)
Data.init.Sequence.809B.Count0.RE.I 447 405 -9.4% 1.10x (?)
Data.init.Sequence.809B.Count0.RE 446 405 -9.2% 1.10x (?)
Data.init.Sequence.513B.Count0.I 457 417 -8.8% 1.10x (?)
SequenceAlgosContiguousArray 1260 1150 -8.7% 1.10x (?)
Data.init.Sequence.511B.Count0.I 456 418 -8.3% 1.09x (?)
Set.isSuperset.Seq.Empty.Int 50 46 -8.0% 1.09x (?)
LuhnAlgoLazy 213 197 -7.5% 1.08x (?)
PrefixArray 14 13 -7.1% 1.08x (?)
StringHasPrefixAscii 1310 1220 -6.9% 1.07x (?)
ArrayLiteral2 76 71 -6.6% 1.07x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
ProtocolDispatch.o 820 910 +11.0% 0.90x
CharacterLiteralsLarge.o 938 1028 +9.6% 0.91x
IterateData.o 1811 1979 +9.3% 0.92x
CaptureProp.o 939 1013 +7.9% 0.93x
TypeFlood.o 963 1035 +7.5% 0.93x
COWArrayGuaranteedParameterOverhead.o 1246 1334 +7.1% 0.93x
DictionaryLiteral.o 1273 1361 +6.9% 0.94x
CharacterLiteralsSmall.o 1386 1476 +6.5% 0.94x
RecursiveOwnedParameter.o 1405 1490 +6.0% 0.94x
NSDictionaryCastToSwift.o 1653 1752 +6.0% 0.94x
PointerArithmetics.o 1799 1906 +5.9% 0.94x
ArraySetElement.o 1211 1283 +5.9% 0.94x
RangeIteration.o 1684 1780 +5.7% 0.95x
Sim2DArray.o 1283 1355 +5.6% 0.95x
NSError.o 1348 1420 +5.3% 0.95x
MonteCarloPi.o 1623 1706 +5.1% 0.95x
SuperChars.o 1478 1552 +5.0% 0.95x
ProtocolDispatch2.o 1762 1850 +5.0% 0.95x
Chars.o 1651 1725 +4.5% 0.96x
OpaqueConsumingUsers.o 2047 2137 +4.4% 0.96x
Fibonacci.o 1650 1722 +4.4% 0.96x
SevenBoom.o 1664 1736 +4.3% 0.96x
ByteSwap.o 1674 1746 +4.3% 0.96x
LinkedList.o 2183 2274 +4.2% 0.96x
DeadArray.o 1898 1970 +3.8% 0.96x
Ackermann.o 1906 1978 +3.8% 0.96x
Join.o 2334 2422 +3.8% 0.96x
BitCount.o 1914 1986 +3.8% 0.96x
StackPromo.o 1986 2058 +3.6% 0.97x
XorLoop.o 2066 2138 +3.5% 0.97x
Memset.o 2192 2264 +3.3% 0.97x
Integrate.o 2351 2423 +3.1% 0.97x
StrComplexWalk.o 2815 2898 +2.9% 0.97x
ArrayOfPOD.o 2442 2514 +2.9% 0.97x
SortArrayInClass.o 2722 2802 +2.9% 0.97x
AnyHashableWithAClass.o 3002 3090 +2.9% 0.97x
DictionaryBridge.o 3286 3374 +2.7% 0.97x
Calculator.o 2762 2834 +2.6% 0.97x
ObserverPartiallyAppliedMethod.o 3419 3507 +2.6% 0.97x
ObserverForwarderStruct.o 3490 3578 +2.5% 0.98x
ErrorHandling.o 3007 3079 +2.4% 0.98x
NIOChannelPipeline.o 4096 4192 +2.3% 0.98x
ObserverClosure.o 3132 3204 +2.3% 0.98x
ObjectAllocation.o 4097 4190 +2.3% 0.98x
OpenClose.o 3270 3342 +2.2% 0.98x
MonteCarloE.o 3386 3458 +2.1% 0.98x
Array2D.o 4236 4324 +2.1% 0.98x
Hanoi.o 3593 3665 +2.0% 0.98x
StrToInt.o 3597 3669 +2.0% 0.98x
ArraySubscript.o 3718 3790 +1.9% 0.98x
SortLargeExistentials.o 20542 20928 +1.9% 0.98x
PopFrontGeneric.o 4684 4772 +1.9% 0.98x
ObserverUnappliedMethod.o 4926 5014 +1.8% 0.98x
Histogram.o 4140 4212 +1.7% 0.98x
SortLettersInPlace.o 8919 9073 +1.7% 0.98x
ClassArrayGetter.o 5631 5719 +1.6% 0.98x
RangeReplaceableCollectionPlusDefault.o 6038 6126 +1.5% 0.99x
NopDeinit.o 5421 5493 +1.3% 0.99x
TwoSum.o 5446 5518 +1.3% 0.99x
RangeAssignment.o 4938 5003 +1.3% 0.99x
Phonebook.o 11633 11779 +1.3% 0.99x
PolymorphicCalls.o 7606 7694 +1.2% 0.99x
NibbleSort.o 12208 12338 +1.1% 0.99x
DiffingMyers.o 8347 8435 +1.1% 0.99x
Combos.o 6866 6938 +1.0% 0.99x
 
Improvement OLD NEW DELTA RATIO
ObjectiveCBridgingStubs.o 22756 17348 -23.8% 1.31x
ExistentialPerformance.o 46241 37057 -19.9% 1.25x
DataBenchmarks.o 84534 70895 -16.1% 1.19x
UTF8Decode.o 12422 10542 -15.1% 1.18x
ArrayAppend.o 38130 32426 -15.0% 1.18x
StringWalk.o 39215 33663 -14.2% 1.16x
StringBuilder.o 8261 7301 -11.6% 1.13x
StringReplaceSubrange.o 5270 4742 -10.0% 1.11x
StringInterpolation.o 7377 6735 -8.7% 1.10x
FloatingPointPrinting.o 6024 5530 -8.2% 1.09x
NSStringConversion.o 8741 8085 -7.5% 1.08x
SetTests.o 143659 134315 -6.5% 1.07x
Substring.o 18041 16879 -6.4% 1.07x
ArrayLiteral.o 3319 3111 -6.3% 1.07x
ObjectiveCNoBridgingStubs.o 8868 8340 -6.0% 1.06x
AngryPhonebook.o 9875 9299 -5.8% 1.06x
InsertCharacter.o 5016 4729 -5.7% 1.06x
Diffing.o 8086 7670 -5.1% 1.05x
ObjectiveCBridging.o 59425 56593 -4.8% 1.05x
StringComparison.o 40084 38228 -4.6% 1.05x
DictionaryCopy.o 10779 10299 -4.5% 1.05x
PrefixWhile.o 20517 19605 -4.4% 1.05x
StringEdits.o 12309 11773 -4.4% 1.05x
DropWhile.o 21561 20649 -4.2% 1.04x
RandomValues.o 2991 2865 -4.2% 1.04x
Prefix.o 21594 20698 -4.1% 1.04x
DropFirst.o 22501 21589 -4.1% 1.04x
FindStringNaive.o 11136 10712 -3.8% 1.04x
DropLast.o 24454 23542 -3.7% 1.04x
Suffix.o 24276 23380 -3.7% 1.04x
MapReduce.o 32139 31099 -3.2% 1.03x
COWTree.o 12346 12018 -2.7% 1.03x
StringMatch.o 4774 4655 -2.5% 1.03x
StringTests.o 8206 8014 -2.3% 1.02x
IntegerParsing.o 19646 19198 -2.3% 1.02x
RemoveWhere.o 24207 23679 -2.2% 1.02x
FloatingPointParsing.o 15480 15192 -1.9% 1.02x
SequenceAlgos.o 19895 19551 -1.7% 1.02x
ReduceInto.o 16823 16551 -1.6% 1.02x
CString.o 7954 7826 -1.6% 1.02x
DriverUtils.o 149565 147369 -1.5% 1.01x
RC4.o 3936 3880 -1.4% 1.01x
Exclusivity.o 4469 4408 -1.4% 1.01x
CSVParsing.o 57880 57096 -1.4% 1.01x
DictTest4Legacy.o 23731 23411 -1.3% 1.01x
DictionaryBridgeToObjC.o 5186 5122 -1.2% 1.01x
BinaryFloatingPointProperties.o 7554 7466 -1.2% 1.01x
DictTest4.o 22884 22628 -1.1% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
Array2D 3696 4320 +16.9% 0.86x
ObjectiveCBridgeStringHash 40 46 +15.0% 0.87x
RandomShuffleLCG2 368 416 +13.0% 0.88x
RemoveWhereSwapInts 31 35 +12.9% 0.89x (?)
UTF8Decode_InitFromData 119 131 +10.1% 0.91x (?)
MapReduceAnyCollection 240 261 +8.7% 0.92x (?)
MapReduce 219 238 +8.7% 0.92x (?)
DistinctClassFieldAccesses 173 188 +8.7% 0.92x (?)
UTF8Decode_InitFromBytes 129 140 +8.5% 0.92x (?)
ArraySetElement 262 283 +8.0% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
LessSubstringSubstring 28 21 -25.0% 1.33x
EqualSubstringString 28 21 -25.0% 1.33x (?)
LessSubstringSubstringGenericComparable 28 21 -25.0% 1.33x
EqualStringSubstring 29 22 -24.1% 1.32x (?)
EqualSubstringSubstring 28 22 -21.4% 1.27x (?)
EqualSubstringSubstringGenericEquatable 28 22 -21.4% 1.27x
UTF8Decode_InitDecoding 124 103 -16.9% 1.20x
DataToStringEmpty 500 450 -10.0% 1.11x (?)
StringComparison_longSharedPrefix 359 324 -9.7% 1.11x (?)
SubstringEqualString 266 246 -7.5% 1.08x

Code size: -Osize

Performance: -Onone

Regression OLD NEW DELTA RATIO
ObjectiveCBridgeStringHash 40 46 +15.0% 0.87x (?)
UTF8Decode_InitFromBytes 130 145 +11.5% 0.90x
ArrayAppend 2560 2820 +10.2% 0.91x (?)
UTF8Decode_InitFromData 120 132 +10.0% 0.91x (?)
Memset 7527 8119 +7.9% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
EqualSubstringSubstringGenericEquatable 31 25 -19.4% 1.24x (?)
LessSubstringSubstringGenericComparable 31 25 -19.4% 1.24x
EqualSubstringSubstring 32 27 -15.6% 1.19x (?)
LessSubstringSubstring 32 27 -15.6% 1.19x
UTF8Decode_InitDecoding 143 121 -15.4% 1.18x (?)
EqualStringSubstring 33 28 -15.2% 1.18x
EqualSubstringString 33 28 -15.2% 1.18x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftDarwin.dylib 32768 28672 -12.5% 1.14x
libswiftSceneKit.dylib 53248 49152 -7.7% 1.08x
libswiftCloudKit.dylib 73728 69632 -5.6% 1.06x
libswiftsimd.dylib 139264 135168 -2.9% 1.03x
libswiftNetwork.dylib 172032 167936 -2.4% 1.02x
libswiftStdlibUnittest.dylib 368640 364544 -1.1% 1.01x

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

@gottesmm
Contributor

@nate-chandler btw the <= 6 field heuristic used by loadable from address is here in this utility: https://github.com/apple/swift/blob/master/lib/SILOptimizer/Utils/Local.cpp#L1497

@nate-chandler
Contributor Author

@swift-ci please test macos

@Catfish-Man
Contributor

I'm slightly concerned the positive impact of this might be overstated a bit in the benchmark results, since it looks like most of the wins are just repeated variations of 2 tests.

@nate-chandler
Contributor Author

@swift-ci please test compiler performance

Contributor

@gottesmm gottesmm left a comment

Quick comments

@nate-chandler
Contributor Author

@Catfish-Man That's fair. Still doing measurements over here and specifically will be looking at the code size impact on more realistic targets including the compatibility suite.

Added getAllLeafTypes to ProjectionTree.  The new method vends, via an
out parameter, a vector containing the types of all the leaves in a
projection tree in the order that they appear.  The method uses a new
convenience on ProjectionTreeNode, isLeaf, to include only the types of
those nodes which are leaves.

Excerpted from @GottesM's swiftlang#16756.
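
A rough sketch of what collecting leaf types in order could look like, using a simplified stand-in for a projection tree. The `Node` struct, its fields, and the free function below are hypothetical and use `std::string` in place of SIL types; only the names `getAllLeafTypes` and `isLeaf` and the out-parameter shape come from the commit message above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for a projection tree node.
struct Node {
  std::string type;           // placeholder for the node's SIL type
  std::vector<Node> children; // child projections, in declaration order

  // A node with no child projections is a leaf.
  bool isLeaf() const { return children.empty(); }
};

// Appends the types of all leaves under `root` to `outTypes`, visiting
// children left to right so that leaves appear in tree order.
void getAllLeafTypes(const Node &root, std::vector<std::string> &outTypes) {
  if (root.isLeaf()) {
    outTypes.push_back(root.type);
    return;
  }
  for (const Node &child : root.children)
    getAllLeafTypes(child, outTypes);
}
```

Interior nodes contribute nothing themselves; only leaves are recorded, which is the filtering role the commit message assigns to `isLeaf`.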

Added getUsers to ProjectionTree.  The new method vends, via an out
parameter, a set of all the users of all of the nodes in the projection
tree that are themselves not in the projection tree by way of
getNonProjUsers.  Took this opportunity to tweak getNonProjUsers to vend
a const ArrayRef rather than a SmallVector.

Excerpted from @GottesM's swiftlang#16756.

…leases.

Added getPartiallyPostDomReleaseSet to
ConsumedArgToEpilogueReleaseMatcher.  Given an argument, the new method
returns the array of releases of the argument if there is an array
thereof and if the releases therein do not jointly post-dominate the
argument.

Excerpted from @GottesM's swiftlang#16756.

Replaced some namespace-qualified references to ArrayRef and Optional
with the unqualified type.  Reordered the includes per clang-format.

Excerpted from @GottesM's swiftlang#16756.

Added brief doc for FunctionSignatureOpts' ArgumentDescriptor's method
canOptimizeLiveArg and tweaked the style to add braces around the body
of a single-line if clause.

The command-line option for sil-fso-enable-generics was previously
visible outside the FunctionSignatureOpts translation unit.  It is not
any longer.

The new flag -sil-fso-optimize-if-not-called forces function signature
optimization to run even on functions which are not called.  Doing so is
helpful for tests because it removes the burden of writing code to
actually call a function whose function signature optimization we are
interested in.

@swift-ci
Contributor

Summary for master full

Unexpected test results, excluded stats for RxCocoa, SwifterSwift, Base64CoderSwiftUI

Regressions found (see below)

Debug-batch

debug-batch brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 45,466,081,654,868 45,640,339,021,926 174,257,367,058 0.38%
LLVM.NumLLVMBytesOutput 1,797,132,560 1,797,052,058 -80,502 -0.0%
time.swift-driver.wall 4677.4s 4676.9s -494.8ms -0.01%

debug-batch detailed

Regressed (6)
name old new delta delta_pct
Driver.NumDriverPipePolls 69,231 71,952 2,721 3.93% ⛔
Driver.NumDriverPipeReads 58,410 61,057 2,647 4.53% ⛔
Sema.AccessLevelRequest 12,517,247 12,660,691 143,444 1.15% ⛔
Sema.CollectOverriddenDeclsRequest 7,803,192 7,931,445 128,253 1.64% ⛔
Sema.ProvideDefaultImplForRequest 7,803,192 7,931,445 128,253 1.64% ⛔
Sema.USRGenerationRequest 8,968,523 9,103,851 135,328 1.51% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (167)
name old new delta delta_pct
AST.ImportSetCacheHit 1,643,060 1,643,721 661 0.04%
AST.ImportSetCacheMiss 493,267 493,387 120 0.02%
AST.ImportSetFoldHit 140,681 140,720 39 0.03%
AST.ImportSetFoldMiss 352,586 352,666 80 0.02%
AST.ModuleShadowCacheHit 2,332 2,332 0 0.0%
AST.ModuleShadowCacheMiss 1,633 1,633 0 0.0%
AST.ModuleVisibilityCacheHit 276,836 276,836 0 0.0%
AST.ModuleVisibilityCacheMiss 19,541 19,541 0 0.0%
AST.NumASTBytesAllocated 59,427,226,740 59,775,450,876 348,224,136 0.59%
AST.NumASTScopeLookups 0 0 0 0.0%
AST.NumBraceStmtASTScopeExpansions 0 0 0 0.0%
AST.NumBraceStmtASTScopes 0 0 0 0.0%
AST.NumDecls 140,692 140,692 0 0.0%
AST.NumDependencies 352,853 352,857 4 0.0%
AST.NumInfixOperators 53,844 53,844 0 0.0%
AST.NumIterableTypeBodyASTScopeExpansions 0 0 0 0.0%
AST.NumIterableTypeBodyASTScopes 0 0 0 0.0%
AST.NumLinkLibraries 0 0 0 0.0%
AST.NumLoadedModules 322,059 322,059 0 0.0%
AST.NumLocalTypeDecls 253 253 0 0.0%
AST.NumLookupInModule 6,062,392 6,081,110 18,718 0.31%
AST.NumLookupQualifiedInAnyObject 291 291 0 0.0%
AST.NumLookupQualifiedInModule 2,997,675 3,016,159 18,484 0.62%
AST.NumLookupQualifiedInNominal 7,642,618 7,652,408 9,790 0.13%
AST.NumModuleLookupClassMember 7,167 7,167 0 0.0%
AST.NumModuleLookupValue 44,804,195 44,828,124 23,929 0.05%
AST.NumObjCMethods 24,701 24,701 0 0.0%
AST.NumPostfixOperators 49 49 0 0.0%
AST.NumPrecedenceGroups 25,736 25,736 0 0.0%
AST.NumPrefixOperators 99 99 0 0.0%
AST.NumReferencedDynamicNames 199 199 0 0.0%
AST.NumReferencedMemberNames 6,347,343 6,347,343 0 0.0%
AST.NumReferencedTopLevelNames 478,041 478,041 0 0.0%
AST.NumSourceBuffers 396,644 396,644 0 0.0%
AST.NumSourceLines 4,696,405 4,696,405 0 0.0%
AST.NumSourceLinesPerSecond 3,300,920 3,286,754 -14,166 -0.43%
AST.NumTotalClangImportedEntities 5,548,939 5,562,324 13,385 0.24%
AST.NumUnqualifiedLookup 3,920,003 3,920,249 246 0.01%
Driver.ChildrenMaxRSS 202,650,640,384 202,600,824,832 -49,815,552 -0.02%
Driver.DriverDepCascadingDynamic 0 0 0 0.0%
Driver.DriverDepCascadingExternal 0 0 0 0.0%
Driver.DriverDepCascadingMember 0 0 0 0.0%
Driver.DriverDepCascadingNominal 0 0 0 0.0%
Driver.DriverDepCascadingTopLevel 0 0 0 0.0%
Driver.DriverDepDynamic 0 0 0 0.0%
Driver.DriverDepExternal 0 0 0 0.0%
Driver.DriverDepMember 0 0 0 0.0%
Driver.DriverDepNominal 0 0 0 0.0%
Driver.DriverDepTopLevel 0 0 0 0.0%
Driver.NumDriverJobsRun 27,973 27,973 0 0.0%
Driver.NumDriverJobsSkipped 0 0 0 0.0%
Driver.NumProcessFailures 0 0 0 0.0%
Frontend.MaxMallocUsage 1,025,701,731,168 1,028,412,267,040 2,710,535,872 0.26%
Frontend.NumInstructionsExecuted 45,466,081,654,868 45,640,339,021,926 174,257,367,058 0.38%
Frontend.NumProcessFailures 0 0 0 0.0%
IRModule.NumIRAliases 197,438 197,361 -77 -0.04%
IRModule.NumIRBasicBlocks 6,840,905 6,840,609 -296 -0.0%
IRModule.NumIRComdatSymbols 0 0 0 0.0%
IRModule.NumIRFunctions 3,370,390 3,370,266 -124 -0.0%
IRModule.NumIRGlobals 3,548,133 3,548,124 -9 -0.0%
IRModule.NumIRIFuncs 0 0 0 0.0%
IRModule.NumIRInsts 87,012,931 87,012,419 -512 -0.0%
IRModule.NumIRNamedMetaData 134,560 134,560 0 0.0%
IRModule.NumIRValueSymbols 6,261,156 6,260,950 -206 -0.0%
LLVM.NumLLVMBytesOutput 1,797,132,560 1,797,052,058 -80,502 -0.0%
Parse.NumFunctionsParsed 268,997 268,997 0 0.0%
Parse.NumIterableDeclContextParsed 718,304 718,306 2 0.0%
Parse.ParseAbstractFunctionBodyRequest 246,804 246,804 0 0.0%
Parse.ParseMembersRequest 576,308 576,310 2 0.0%
SILModule.NumSILGenDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILGenFunctions 1,713,544 1,713,544 0 0.0%
SILModule.NumSILGenGlobalVariables 53,886 53,886 0 0.0%
SILModule.NumSILGenVtables 18,663 18,663 0 0.0%
SILModule.NumSILGenWitnessTables 72,450 72,450 0 0.0%
SILModule.NumSILOptDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILOptFunctions 2,456,697 2,456,702 5 0.0%
SILModule.NumSILOptGlobalVariables 55,494 55,494 0 0.0%
SILModule.NumSILOptVtables 31,496 31,496 0 0.0%
SILModule.NumSILOptWitnessTables 158,369 158,369 0 0.0%
Sema.AbstractGenericSignatureRequest 24,264 24,264 0 0.0%
Sema.AttachedFunctionBuilderRequest 3 3 0 0.0%
Sema.AttachedPropertyWrapperTypeRequest 510,509 510,509 0 0.0%
Sema.AttachedPropertyWrappersRequest 2,042,941 2,042,941 0 0.0%
Sema.ClassAncestryFlagsRequest 100,894 100,894 0 0.0%
Sema.CursorInfoRequest 0 0 0 0.0%
Sema.CustomAttrNominalRequest 3 3 0 0.0%
Sema.DefaultAndMaxAccessLevelRequest 55,419 55,419 0 0.0%
Sema.DefaultDefinitionTypeRequest 7,969 7,969 0 0.0%
Sema.DefaultTypeRequest 460,363 460,363 0 0.0%
Sema.EmittedMembersRequest 26,782 26,782 0 0.0%
Sema.EnumRawTypeRequest 22,820 22,820 0 0.0%
Sema.ExistentialConformsToSelfRequest 22,098 22,146 48 0.22%
Sema.ExistentialTypeSupportedRequest 16,571 16,571 0 0.0%
Sema.ExtendedNominalRequest 620,926 620,926 0 0.0%
Sema.ExtendedTypeRequest 78,738 78,745 7 0.01%
Sema.FunctionBuilderTypeRequest 3 3 0 0.0%
Sema.FunctionOperatorRequest 62,927 62,927 0 0.0%
Sema.GenericParamListRequest 15,307,970 15,451,938 143,968 0.94%
Sema.GenericSignatureRequest 4,527,060 4,542,802 15,742 0.35%
Sema.GetDestructorRequest 27,258 27,258 0 0.0%
Sema.HasDynamicMemberLookupAttributeRequest 0 0 0 0.0%
Sema.InferredGenericSignatureRequest 168,911 168,916 5 0.0%
Sema.InheritedDeclsReferencedRequest 5,702,554 5,736,892 34,338 0.6%
Sema.InheritedTypeRequest 292,989 293,176 187 0.06%
Sema.InitKindRequest 103,432 103,432 0 0.0%
Sema.IsAccessorTransparentRequest 315,304 315,304 0 0.0%
Sema.IsDeclApplicableRequest 0 0 0 0.0%
Sema.IsDynamicRequest 1,629,450 1,629,450 0 0.0%
Sema.IsFinalRequest 2,633,647 2,648,020 14,373 0.55%
Sema.IsGetterMutatingRequest 429,051 429,051 0 0.0%
Sema.IsImplicitlyUnwrappedOptionalRequest 2,446,051 2,448,224 2,173 0.09%
Sema.IsObjCRequest 1,562,832 1,563,689 857 0.05%
Sema.IsSetterMutatingRequest 342,709 342,709 0 0.0%
Sema.LazyStoragePropertyRequest 2,401 2,401 0 0.0%
Sema.MangleLocalTypeDeclRequest 506 506 0 0.0%
Sema.NamedLazyMemberLoadFailureCount 19,842 19,860 18 0.09%
Sema.NamedLazyMemberLoadSuccessCount 29,293,191 29,301,290 8,099 0.03%
Sema.NominalTypeLookupDirectCount 35,896,766 35,959,489 62,723 0.17%
Sema.NumAccessorBodiesSynthesized 189,324 189,324 0 0.0%
Sema.NumAccessorsSynthesized 289,323 289,323 0 0.0%
Sema.NumConformancesDeserialized 9,090,428 9,154,684 64,256 0.71%
Sema.NumConstraintScopes 27,044,259 27,051,532 7,273 0.03%
Sema.NumConstraintsConsideredForEdgeContraction 86,089,534 86,091,424 1,890 0.0%
Sema.NumCyclicOneWayComponentsCollapsed 0 0 0 0.0%
Sema.NumDeclsDeserialized 71,399,628 71,883,143 483,515 0.68%
Sema.NumDeclsTypechecked 1,400,714 1,400,714 0 0.0%
Sema.NumDeclsValidated 2,507,892 2,507,905 13 0.0%
Sema.NumFunctionsTypechecked 528,560 528,560 0 0.0%
Sema.NumGenericSignatureBuilders 1,324,606 1,328,855 4,249 0.32%
Sema.NumLazyIterableDeclContexts 8,531,954 8,555,767 23,813 0.28%
Sema.NumLazyRequirementSignatures 859,408 860,050 642 0.07%
Sema.NumLazyRequirementSignaturesLoaded 571,528 572,128 600 0.1%
Sema.NumLeafScopes 17,414,884 17,420,984 6,100 0.04%
Sema.NumTypesDeserialized 20,743,260 20,830,964 87,704 0.42%
Sema.NumTypesValidated 1,712,950 1,712,963 13 0.0%
Sema.NumUnloadedLazyIterableDeclContexts 5,189,288 5,187,313 -1,975 -0.04%
Sema.OpaqueReadOwnershipRequest 273,259 273,259 0 0.0%
Sema.OverriddenDeclsRequest 2,484,305 2,499,220 14,915 0.6%
Sema.PropertyWrapperBackingPropertyInfoRequest 505,385 505,385 0 0.0%
Sema.PropertyWrapperBackingPropertyTypeRequest 510,509 510,509 0 0.0%
Sema.PropertyWrapperMutabilityRequest 602,009 602,009 0 0.0%
Sema.PropertyWrapperTypeInfoRequest 1 1 0 0.0%
Sema.ProtocolRequiresClassRequest 80,173 80,350 177 0.22%
Sema.RangeInfoRequest 0 0 0 0.0%
Sema.RequirementRequest 109,040 109,054 14 0.01%
Sema.RequirementSignatureRequest 645,243 646,043 800 0.12%
Sema.RequiresOpaqueAccessorsRequest 1,301,161 1,301,161 0 0.0%
Sema.RequiresOpaqueModifyCoroutineRequest 264,659 264,659 0 0.0%
Sema.ResilienceExpansionRequest 1,734,853 1,734,880 27 0.0%
Sema.ResolveProtocolNameRequest 0 0 0 0.0%
Sema.RootAndResultTypeOfKeypathDynamicMemberRequest 0 0 0 0.0%
Sema.RootTypeOfKeypathDynamicMemberRequest 0 0 0 0.0%
Sema.SelfAccessKindRequest 5,426,825 5,434,172 7,347 0.14%
Sema.SelfBoundsFromWhereClauseRequest 7,554,123 7,606,017 51,894 0.69%
Sema.SetterAccessLevelRequest 144,457 144,457 0 0.0%
Sema.StorageImplInfoRequest 1,167,624 1,167,624 0 0.0%
Sema.StoredPropertiesAndMissingMembersRequest 31,114 31,114 0 0.0%
Sema.StoredPropertiesRequest 315,149 315,149 0 0.0%
Sema.StructuralTypeRequest 0 0 0 0.0%
Sema.SuperclassDeclRequest 442,212 442,860 648 0.15%
Sema.SuperclassTypeRequest 52,228 52,228 0 0.0%
Sema.SynthesizeAccessorRequest 289,323 289,323 0 0.0%
Sema.TypeCheckFunctionBodyUntilRequest 528,560 528,560 0 0.0%
Sema.TypeDeclsFromWhereClauseRequest 29,326 29,326 0 0.0%
Sema.TypeRelationCheckRequest 0 0 0 0.0%
Sema.UnderlyingTypeDeclsReferencedRequest 248,188 248,859 671 0.27%
Sema.UnderlyingTypeRequest 34,487 34,487 0 0.0%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 47,863,820,611,476 47,509,065,225,029 -354,755,386,447 -0.74%
LLVM.NumLLVMBytesOutput 1,535,314,418 1,528,530,034 -6,784,384 -0.44%
time.swift-driver.wall 7702.6s 7631.1s -71.6s -0.93%

release detailed

Regressed (0)
name old new delta delta_pct
Improved (2)
name old new delta delta_pct
IRModule.NumIRBasicBlocks 5,616,325 5,524,505 -91,820 -1.63% ✅
IRModule.NumIRInsts 52,747,322 52,213,949 -533,373 -1.01% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (17)
name old new delta delta_pct
AST.NumLoadedModules 30,351 30,351 0 0.0%
AST.NumTotalClangImportedEntities 1,217,974 1,217,922 -52 -0.0%
IRModule.NumIRFunctions 2,855,525 2,842,442 -13,083 -0.46%
IRModule.NumIRGlobals 3,040,434 3,039,263 -1,171 -0.04%
IRModule.NumIRValueSymbols 5,539,380 5,525,971 -13,409 -0.24%
LLVM.NumLLVMBytesOutput 1,535,314,418 1,528,530,034 -6,784,384 -0.44%
SILModule.NumSILGenFunctions 1,194,764 1,194,764 0 0.0%
SILModule.NumSILOptFunctions 1,714,140 1,729,150 15,010 0.88%
Sema.NumConformancesDeserialized 3,800,853 3,800,853 0 0.0%
Sema.NumConstraintScopes 26,594,750 26,594,750 0 0.0%
Sema.NumDeclsDeserialized 10,417,625 10,408,344 -9,281 -0.09%
Sema.NumDeclsValidated 1,820,076 1,820,076 0 0.0%
Sema.NumFunctionsTypechecked 532,182 532,182 0 0.0%
Sema.NumGenericSignatureBuilders 276,101 276,085 -16 -0.01%
Sema.NumLazyIterableDeclContexts 1,326,855 1,326,803 -52 -0.0%
Sema.NumTypesDeserialized 5,240,594 5,239,859 -735 -0.01%
Sema.NumTypesValidated 950,795 950,795 0 0.0%

@nate-chandler
Contributor Author

nate-chandler commented Sep 20, 2019

Debug:

LLVM.NumLLVMBytesOutput | 1,797,132,560 | 1,797,052,058 | -80,502 | -0.0%

from https://ci.swift.org/job/swift-PR-compiler-performance-macOS/539/artifact/comment.md

@nate-chandler
Contributor Author

nate-chandler commented Sep 20, 2019

Release:

LLVM.NumLLVMBytesOutput | 1,535,314,418 | 1,528,530,034 | -6,784,384 | -0.44%

@nate-chandler nate-chandler force-pushed the 39957093-argument-explosion-heuristic branch 3 times, most recently from d99fadd to d1f2517 Compare September 21, 2019 01:50
@nate-chandler
Contributor Author

@swift-ci please test

@swift-ci
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 2fbb9f08e63a787939147d93230aacf4f2ca11f0

@swift-ci
Contributor

Build failed
Swift Test Linux Platform
Git Sha - b67709a0fd40af0ecd500f3cf2dbf5443c971e96

@nate-chandler nate-chandler force-pushed the 39957093-argument-explosion-heuristic branch from 9318377 to 3cb413c Compare September 22, 2019 05:13
@nate-chandler
Contributor Author

@swift-ci please test

@swift-ci
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 3cb413c593ddf96b62a2a2dab9bfbcce829decfa

@swift-ci
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 3cb413c593ddf96b62a2a2dab9bfbcce829decfa

@nate-chandler nate-chandler force-pushed the 39957093-argument-explosion-heuristic branch 2 times, most recently from 49da12f to 87dd1b0 Compare September 23, 2019 16:02
@nate-chandler
Contributor Author

@swift-ci please test

Previously, CallerAnalysis::FunctionInfo.foundAllCallers(), which is
documented to return true only when specialization of a function will
not require a thunk, returned true for functions which are possibly used
externally.  Now, that member function returns false for functions
which may be used externally, since dead code elimination will not be
able to remove them.

The new rule is that an argument will be exploded if one of the
following sets of conditions holds:

(1) (a) Specializing the function will result in a thunk.  That is, the
        thunk that is generated cannot be inlined everywhere.
    (b) The argument has dead non-trivial leaves.
    (c) The argument has fewer than three live leaves.

(2) (a) Specializing the function will not result in a thunk.  That is,
        the thunk that is generated will be inlined everywhere and
        eliminated as dead code.
    (b) The argument has dead potentially trivial leaves.
    (c) The argument has fewer than six live leaves.
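
The two sets of conditions above can be sketched as a small predicate. This is only an illustration, not the code in the SILOptimizer: `ArgumentLeaves`, its fields, and `should_explode` are hypothetical names, and "dead potentially trivial leaves" is read here as "at least one dead leaf, whether or not it is trivial", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ArgumentLeaves:
    """Hypothetical summary of one argument's leaves (not a compiler type)."""
    live_leaves: int                # number of leaves that are still used
    has_dead_nontrivial_leaf: bool  # some unused leaf is non-trivial
    has_dead_leaf: bool             # some unused leaf, trivial or not (assumed reading)


def should_explode(arg: ArgumentLeaves, needs_thunk: bool) -> bool:
    """Return True if the argument would be exploded under rules (1) and (2)."""
    if needs_thunk:
        # Rule (1): a thunk remains, so require a dead non-trivial leaf
        # and fewer than three live leaves.
        return arg.has_dead_nontrivial_leaf and arg.live_leaves < 3
    # Rule (2): the thunk is inlined everywhere and eliminated, so any dead
    # leaf suffices and up to five live leaves are allowed.
    return arg.has_dead_leaf and arg.live_leaves < 6


if __name__ == "__main__":
    arg = ArgumentLeaves(live_leaves=4, has_dead_nontrivial_leaf=True, has_dead_leaf=True)
    print(should_explode(arg, needs_thunk=True))   # four live leaves is too many for rule (1)
    print(should_explode(arg, needs_thunk=False))  # but allowed under rule (2)
```

The sketch makes the asymmetry visible: when no thunk survives, both the liveness bound and the kind of dead leaf required are relaxed.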

This change is based heavily on @GottesM's swiftlang#16756.

rdar://problem/39957093
@nate-chandler nate-chandler force-pushed the 39957093-argument-explosion-heuristic branch from 87dd1b0 to 9567bd4 Compare September 24, 2019 23:00
@nate-chandler
Contributor Author

@swift-ci please benchmark

@nate-chandler
Contributor Author

@swift-ci please test

@nate-chandler
Contributor Author

@swift-ci please test compiler performance

@swift-ci
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 87dd1b09acb25cf531bb0f4a5612d62f4427ed1f

@swift-ci
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 87dd1b09acb25cf531bb0f4a5612d62f4427ed1f

@swift-ci
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
Phonebook 1645 1939 +17.9% 0.85x
CSVParsingAltIndices2 847 968 +14.3% 0.88x
CharIteration_russian_unicodeScalars 3000 3400 +13.3% 0.88x (?)
ObjectiveCBridgeStubFromNSDateRef 3820 4250 +11.3% 0.90x (?)
CharIndexing_tweet_unicodeScalars 10160 11280 +11.0% 0.90x
CharIndexing_ascii_unicodeScalars 5200 5760 +10.8% 0.90x
ObjectiveCBridgeStubFromNSStringRef 158 173 +9.5% 0.91x (?)
Data.hash.Empty 68 74 +8.8% 0.92x
 
Improvement OLD NEW DELTA RATIO
Data.append.Sequence.64kB.Count0.I 406 341 -16.0% 1.19x
Data.init.Sequence.64kB.Count0 406 342 -15.8% 1.19x
Data.init.Sequence.64kB.Count0.I 405 342 -15.6% 1.18x
Data.append.Sequence.64kB.Count0 403 341 -15.4% 1.18x (?)
Data.append.Sequence.809B.Count0.I 592 504 -14.9% 1.17x
ObjectiveCBridgeStubToNSDate2 1410 1210 -14.2% 1.17x (?)
Data.init.Sequence.2049B.Count0.I 724 622 -14.1% 1.16x
Data.append.Sequence.809B.Count0 579 499 -13.8% 1.16x
Data.init.Sequence.2047B.Count0.I 727 627 -13.8% 1.16x
Data.append.Sequence.809B.Count0.RE.I 588 511 -13.1% 1.15x (?)
Data.init.Sequence.64kB.Count0.RE.I 406 353 -13.1% 1.15x
Data.append.Sequence.64kB.Count0.RE 405 353 -12.8% 1.15x
Data.init.Sequence.64kB.Count0.RE 406 354 -12.8% 1.15x
Data.append.Sequence.64kB.Count0.RE.I 404 353 -12.6% 1.14x
Data.init.Sequence.809B.Count0 645 566 -12.2% 1.14x
Data.append.Sequence.809B.Count0.RE 580 510 -12.1% 1.14x
DataSetCountSmall 142 125 -12.0% 1.14x (?)
Data.init.Sequence.809B.Count0.I 648 571 -11.9% 1.13x (?)
Data.init.Sequence.809B.Count0.RE.I 654 587 -10.2% 1.11x (?)
Data.init.Sequence.513B.Count0.I 681 612 -10.1% 1.11x
Data.init.Sequence.511B.Count0.I 681 613 -10.0% 1.11x (?)
Data.init.Sequence.809B.Count0.RE 646 587 -9.1% 1.10x
ObjectiveCBridgeStubDataAppend 5920 5380 -9.1% 1.10x (?)
SubstringEqualString 456 423 -7.2% 1.08x (?)
ObjectiveCBridgeStubFromNSString 808 755 -6.6% 1.07x (?)

Code size: -O

Improvement OLD NEW DELTA RATIO
ObjectiveCBridgingStubs.o 22756 19092 -16.1% 1.19x
LazyFilter.o 10771 9171 -14.9% 1.17x
StringInterpolation.o 7377 6823 -7.5% 1.08x
DataBenchmarks.o 84534 79655 -5.8% 1.06x
UTF8Decode.o 12422 11958 -3.7% 1.04x
RC4.o 3936 3792 -3.7% 1.04x
FindStringNaive.o 11136 10952 -1.7% 1.02x
StringRemoveDupes.o 7403 7291 -1.5% 1.02x
DictTest4Legacy.o 23731 23395 -1.4% 1.01x
DictTest4.o 22884 22612 -1.2% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
ObjectiveCBridgeStubFromNSDateRef 3720 4140 +11.3% 0.90x (?)
String.data.Medium 87 96 +10.3% 0.91x (?)
Data.hash.Empty 68 74 +8.8% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
DataSetCountSmall 142 125 -12.0% 1.14x

Code size: -Osize

Performance: -Onone

Regression OLD NEW DELTA RATIO
FloatingPointPrinting_Double_interpolated 71000 79600 +12.1% 0.89x (?)
FloatingPointPrinting_Float80_interpolated 93200 102200 +9.7% 0.91x (?)
 
Improvement OLD NEW DELTA RATIO
Dictionary2 1380 1210 -12.3% 1.14x (?)
StringInterpolationSmall 3360 2960 -11.9% 1.14x
DataSetCountSmall 200 180 -10.0% 1.11x (?)
Histogram 8235 7564 -8.1% 1.09x (?)
ArrayOfPOD 1146 1063 -7.2% 1.08x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftsimd.dylib 139264 135168 -2.9% 1.03x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
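
As a reading aid, the DELTA and RATIO columns can be reproduced from OLD and NEW. The formulas below are inferred from the rows in the tables above (DELTA as the relative change from OLD to NEW, RATIO as OLD divided by NEW) and are not taken from the benchmark driver itself; `delta_and_ratio` is a hypothetical helper name.

```python
def delta_and_ratio(old: float, new: float) -> tuple[float, float]:
    """Return (delta in percent, ratio) for one benchmark row.

    Assumed formulas, inferred from the tables:
      delta = (new - old) / old * 100
      ratio = old / new   (below 1.0 means the new build is slower or larger)
    """
    delta_pct = (new - old) / old * 100.0
    ratio = old / new
    return delta_pct, ratio


if __name__ == "__main__":
    # Phonebook row from the -O regression table: OLD 1645, NEW 1939.
    delta, ratio = delta_and_ratio(1645, 1939)
    print(f"{delta:+.1f}% {ratio:.2f}x")  # prints "+17.9% 0.85x"
```

A positive delta with a ratio below 1.0 is a regression; a negative delta with a ratio above 1.0 is an improvement.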

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@nate-chandler
Contributor Author

@swift-ci please test macos platform

@swift-ci
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 9567bd4

@nate-chandler
Contributor Author

@swift-ci please clean smoke test macos platform

@nate-chandler
Contributor Author

Release:

LLVM.NumLLVMBytesOutput | 1,924,598,716 | 1,920,184,346 | -4,414,370 | -0.23%

Debug:

LLVM.NumLLVMBytesOutput | 1,796,502,774 | 1,796,441,004 | -61,770 | -0.0%

Source:

https://ci.swift.org/job/swift-PR-compiler-performance-macOS/544/artifact/

@nate-chandler
Contributor Author

@swift-ci please clean test macos platform

@nate-chandler nate-chandler merged commit b9b4196 into swiftlang:master Sep 26, 2019
@nate-chandler nate-chandler deleted the 39957093-argument-explosion-heuristic branch September 26, 2019 16:19