[stdlib] use @inlinable @inline(always) for Optional equality #59519

glessard · 2022-06-17T00:31:00Z

Optional's various == operators were inconsistent in their use of @inlinable or @_transparent. As a consequence, the detailed diagnostics surrounding Optional with == and != differ strangely based on minute details. Another PR tried setting them all to @_transparent (#59327). This one tries setting them all to @inlinable @inline(__always). Benchmarking may have different results.

Resolves rdar://46444561

glessard · 2022-06-17T00:31:08Z

@swift-ci please test

glessard · 2022-06-17T00:31:17Z

@swift-ci please benchmark

glessard · 2022-06-17T04:54:51Z

Performance (x86_64): -O

Regression	OLD	NEW	DELTA	RATIO
LessSubstringSubstring	23	30	+30.4%	0.77x
EqualSubstringSubstring	23	30	+30.4%	0.77x
EqualStringSubstring	23	30	+30.4%	0.77x (?)
EqualSubstringString	23	30	+30.4%	0.77x (?)
LessSubstringSubstringGenericComparable	23	30	+30.4%	0.77x (?)
EqualSubstringSubstringGenericEquatable	23	29	+26.1%	0.79x
StringUTF16Builder	190	220	+15.8%	0.86x (?)
StringBuilder	180	206	+14.4%	0.87x (?)
FindString.Loop1.Substring	264	302	+14.4%	0.87x (?)
StringBuilderSmallReservingCapacity	187	213	+13.9%	0.88x (?)
StringComparison_longSharedPrefix	293	331	+13.0%	0.89x (?)
StringAdder	237	266	+12.2%	0.89x (?)
StringInterpolationSmall	1090	1200	+10.1%	0.91x (?)
SortStringsUnicode	1945	2110	+8.5%	0.92x
Set.isDisjoint.Box.Empty	52	56	+7.7%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
StringFromLongWholeSubstring	3	2	-33.3%	1.50x
Dictionary4	218	152	-30.3%	1.43x
UTF8Decode_InitDecoding	163	124	-23.9%	1.31x (?)
Dictionary4OfObjects	264	207	-21.6%	1.28x
UTF8Decode_InitFromCustom_contiguous	158	126	-20.3%	1.25x (?)
UTF8Decode_InitFromCustom_noncontiguous	277	241	-13.0%	1.15x (?)
Set.isStrictSubset.Int0	82	75	-8.5%	1.09x (?)
Set.isStrictSubset.Box0	84	77	-8.3%	1.09x (?)
Set.isSuperset.Seq.Int0	124	115	-7.3%	1.08x (?)
DictionaryKeysContainsNative	15	14	-6.7%	1.07x (?)

Code size: -O

Performance (x86_64): -Osize

Regression	OLD	NEW	DELTA	RATIO
LessSubstringSubstring	23	30	+30.4%	0.77x (?)
EqualSubstringString	23	30	+30.4%	0.77x (?)
LessSubstringSubstringGenericComparable	23	30	+30.4%	0.77x (?)
EqualSubstringSubstring	23	29	+26.1%	0.79x (?)
EqualStringSubstring	23	29	+26.1%	0.79x (?)
EqualSubstringSubstringGenericEquatable	23	29	+26.1%	0.79x (?)
StringWalk	1240	1520	+22.6%	0.82x (?)
FlattenListFlatMap	2548	3107	+21.9%	0.82x (?)
StringBuilderSmallReservingCapacity	186	221	+18.8%	0.84x (?)
StringBuilder	178	211	+18.5%	0.84x (?)
FindString.Loop1.Substring	253	299	+18.2%	0.85x (?)
StringAdder	223	256	+14.8%	0.87x (?)
DataCreateEmptyArray	1400	1600	+14.3%	0.88x (?)
StrComplexWalk	2760	3140	+13.8%	0.88x (?)
NSStringConversion.InlineBuffer.ASCII	220	250	+13.6%	0.88x (?)
StringComparison_longSharedPrefix	294	332	+12.9%	0.89x (?)
Chars2	3100	3450	+11.3%	0.90x (?)
Data.hash.Empty	47	52	+10.6%	0.90x (?)
StringUTF16Builder	190	210	+10.5%	0.90x (?)
SortStringsUnicode	1970	2170	+10.2%	0.91x (?)
CharIteration_ascii_unicodeScalars	4840	5280	+9.1%	0.92x (?)
Set.isDisjoint.Seq.Int.Empty	45	49	+8.9%	0.92x (?)
CharIteration_tweet_unicodeScalars	9480	10320	+8.9%	0.92x (?)
CharIteration_korean_unicodeScalars	6040	6560	+8.6%	0.92x (?)
StringMatch	4900	5300	+8.2%	0.92x (?)
SubstringEqualString	166	179	+7.8%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
UTF8Decode_InitFromCustom_contiguous	160	124	-22.5%	1.29x (?)
UTF8Decode_InitDecoding	160	125	-21.9%	1.28x (?)
Diffing.Same	5	4	-20.0%	1.25x (?)
UTF8Decode_InitFromCustom_noncontiguous	275	247	-10.2%	1.11x (?)
Array2D	4640	4176	-10.0%	1.11x (?)
PrefixWhileAnySequence	192	173	-9.9%	1.11x (?)
Set.isDisjoint.Empty.Int	89	81	-9.0%	1.10x (?)
Set.isDisjoint.Empty.Box	90	82	-8.9%	1.10x (?)
Set.isDisjoint.Seq.Empty.Box	91	83	-8.8%	1.10x (?)
Set.isSuperset.Seq.Int.Empty	92	84	-8.7%	1.10x (?)
Set.isDisjoint.Seq.Empty.Int	88	81	-8.0%	1.09x (?)
Set.isDisjoint.Seq.Int100	126	116	-7.9%	1.09x (?)
DistinctClassFieldAccesses	245	226	-7.8%	1.08x (?)
RemoveWhereFilterInts	27	25	-7.4%	1.08x (?)
ArrayInClass	1195	1110	-7.1%	1.08x (?)
Set.subtracting.Seq.Empty.Box	149	139	-6.7%	1.07x (?)

Code size: -Osize

Performance (x86_64): -Onone

Regression	OLD	NEW	DELTA	RATIO
EqualSubstringSubstringGenericEquatable	26	33	+26.9%	0.79x
EqualStringSubstring	28	35	+25.0%	0.80x (?)
LessSubstringSubstringGenericComparable	26	32	+23.1%	0.81x (?)
LessSubstringSubstring	28	34	+21.4%	0.82x (?)
EqualSubstringSubstring	28	34	+21.4%	0.82x (?)
EqualSubstringString	29	35	+20.7%	0.83x
SIMDReduce.Int8	6184	7176	+16.0%	0.86x (?)
StringRemoveDupes	498	572	+14.9%	0.87x (?)
SetSymmetricDifferenceInt100	395	452	+14.4%	0.87x
ObjectiveCBridgeStubToNSStringRef	110	125	+13.6%	0.88x (?)
RC4	11231	12391	+10.3%	0.91x (?)

Improvement	OLD	NEW	DELTA	RATIO
UTF8Decode_InitDecoding	166	131	-21.1%	1.27x (?)
UTF8Decode_InitFromCustom_contiguous	168	133	-20.8%	1.26x (?)
StackPromo	34600	29000	-16.2%	1.19x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 32 GB

glessard · 2022-06-17T04:58:54Z

Benchmarks have similar noise, wins and losses as #59327 (though maybe a bit worse.)
The tests depend on the behaviour of @_transparent, though, so they'd need to be adjusted.
@lorentey Can you think of a reason to take this version over the other?

glessard · 2022-06-22T14:34:18Z

Closing this after discussing the issue with performance specialists.

[stdlib] use @inlinable @inline(always) for Optional equality

429641c

glessard closed this Jun 22, 2022

glessard deleted the rdar46444561-always-inline-optional-equality-operator branch June 22, 2022 14:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[stdlib] use @inlinable @inline(always) for Optional equality #59519

[stdlib] use @inlinable @inline(always) for Optional equality #59519

Uh oh!

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022 •

edited

Loading

Uh oh!

glessard commented Jun 22, 2022

Uh oh!

Uh oh!

[stdlib] use @inlinable @inline(always) for Optional equality #59519

[stdlib] use @inlinable @inline(always) for Optional equality #59519

Uh oh!

Conversation

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022

Uh oh!

glessard commented Jun 17, 2022

Performance (x86_64): -O

Code size: -O

Performance (x86_64): -Osize

Code size: -Osize

Performance (x86_64): -Onone

Code size: -swiftlibs

Uh oh!

glessard commented Jun 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

glessard commented Jun 22, 2022

Uh oh!

Uh oh!

glessard commented Jun 17, 2022 •

edited

Loading