[stdlib] Make some trivial Substring methods inlinable #22616

karwa · 2019-02-14T14:58:32Z

These look like simple oversights.

slavapestov · 2019-02-14T15:50:18Z

This limits future evolution of the standard library. Is there a measurable performance benefit from these changes that justifies any loss in flexibility?

karwa · 2019-02-14T17:22:47Z

@slavapestov

Substring.(subscript|replaceSubrange): trivial forwarding to Slice, like everything else. No way the stdlib could evolve differently given everything else that is inlinable. Almost everything else in Substring is already inlinable.
Substring.UTF8View: the equivalent methods in .UTF16View are inlinable.

@milseman should be able to say more about if these were intentionally withheld from inlining (if so, maybe it's worth a comment).

I don't know if we actually benchmark these code-paths. I think it's likely going to be worth it to inline the subscript, at least, and replaceSubrange can now be specialised.

milseman · 2019-02-14T22:24:53Z

@karwa the reasoning is usually the other way around: we prefer to not mark something inlinable unless there is sufficient justification. Marking a method inlinable can significantly harm performance, as it will block inlining of non-inlinable code that the method calls.

String.replaceSubrange is not inlinable, so making Substring.replaceSubrange is more likely to hurt performance than not. If Substring.replaceSubrange is not inlinable, then String.replaceSubrange can be inlined fully into it.

Substring.subscript could be a candidate, as well as the Substring.UTF8View methods. Do you have a benchmark or scenario demonstrating the performance improvement?

karwa · 2019-02-16T18:50:15Z

@milseman Nope, I just noticed these while adding withContiguousStorageIfAvailable forwarding (String.UTF8View implements it, Substring.UTF8View does not). Could you request some from CI to start with? It can take a solid half-day to rebuild a branch and run benchmarks locally for me.

I didn't notice that String.replaceSubrange wasn't inlinable. Reverted that change.

milseman · 2019-02-18T18:18:33Z

@swift-ci please benchmark

karwa · 2019-02-19T16:53:47Z

@milseman Hmm... that doesn't seem to have worked. Mind trying again?
EDIT: Oh, okay.

swift-ci · 2019-02-19T17:41:44Z

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
Data.hash.Empty	65	74	+13.8%	0.88x (?)
LessSubstringSubstring	39	44	+12.8%	0.89x
EqualStringSubstring	39	44	+12.8%	0.89x (?)
EqualSubstringSubstringGenericEquatable	39	44	+12.8%	0.89x
EqualSubstringString	39	44	+12.8%	0.89x
LessSubstringSubstringGenericComparable	39	44	+12.8%	0.89x (?)
SortStringsUnicode	3255	3640	+11.8%	0.89x
EqualSubstringSubstring	40	44	+10.0%	0.91x
DataSetCountSmall	131	142	+8.4%	0.92x (?)
StringComparison_latin1	720	776	+7.8%	0.93x (?)
Improvement
StringUTF16Builder	390	350	-10.3%	1.11x
StringBuilder	369	334	-9.5%	1.10x (?)
FlattenListLoop	4331	3970	-8.3%	1.09x (?)
Array2D	7520	6912	-8.1%	1.09x (?)
StringAdder	467	430	-7.9%	1.09x
Breadcrumbs.UTF16ToIdx.longMixed	398	367	-7.8%	1.08x (?)
MapReduce	397	368	-7.3%	1.08x (?)
ObjectiveCBridgeStubNSDateRefAccess	400	371	-7.2%	1.08x (?)
MapReduceAnyCollection	398	370	-7.0%	1.08x
FlattenListFlatMap	6871	6408	-6.7%	1.07x (?)

Code size: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
StringMatch.o	4606	5358	+16.3%	0.86x
StringEdits.o	11983	12735	+6.3%	0.94x
RomanNumbers.o	8507	8795	+3.4%	0.97x
CSVParsing.o	31265	32017	+2.4%	0.98x

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
DataAppendDataLargeToLarge	36800	51200	+39.1%	0.72x (?)
Data.hash.Empty	68	77	+13.2%	0.88x
EqualSubstringSubstring	39	44	+12.8%	0.89x
EqualStringSubstring	39	44	+12.8%	0.89x
EqualSubstringString	39	44	+12.8%	0.89x (?)
LessSubstringSubstringGenericComparable	39	44	+12.8%	0.89x
SortStringsUnicode	3265	3600	+10.3%	0.91x
LessSubstringSubstring	40	44	+10.0%	0.91x (?)
EqualSubstringSubstringGenericEquatable	40	44	+10.0%	0.91x
StringComparison_latin1	722	786	+8.9%	0.92x (?)
StringMatch	8500	9200	+8.2%	0.92x
DataSetCountSmall	134	145	+8.2%	0.92x (?)
Improvement
StringBuilder	361	327	-9.4%	1.10x (?)
RandomIntegersLCG	828	757	-8.6%	1.09x (?)
SortLettersInPlace	561	514	-8.4%	1.09x (?)
StringBuilderSmallReservingCapacity	385	353	-8.3%	1.09x (?)

Code size: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
StringMatch.o	4513	5353	+18.6%	0.84x
StringEdits.o	11958	12694	+6.2%	0.94x
RomanNumbers.o	8842	9282	+5.0%	0.95x
CSVParsing.o	31009	31833	+2.7%	0.97x

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
Data.hash.Empty	148	165	+11.5%	0.90x (?)
ArrayOfPOD	776	855	+10.2%	0.91x (?)
StrComplexWalk	6700	7330	+9.4%	0.91x (?)
Improvement
Dictionary3	707	613	-13.3%	1.15x
Dictionary4	1056	970	-8.1%	1.09x (?)

Code size: -swiftlibs

TEST	OLD	NEW	DELTA	RATIO
Regression
libswiftStdlibUnittest.dylib	360448	364544	+1.1%	0.99x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

milseman · 2019-02-19T18:49:12Z

(Just to double check the trend)

@swift-ci please benchmark

CodaFi · 2019-11-22T19:11:06Z

@swift-ci please benchmark

swift-ci · 2019-11-22T20:00:17Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
ArrayAppendGenericStructs	1340	2140	+59.7%	0.63x (?)
Data.hash.Empty	74	80	+8.1%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
CSVParsing.UTF8	61	44	-27.9%	1.39x
LazilyFilteredArrayContains	34800	29500	-15.2%	1.18x
RandomShuffleLCG2	768	704	-8.3%	1.09x
RemoveWhereMoveInts	36	33	-8.3%	1.09x (?)
FlattenListLoop	4453	4092	-8.1%	1.09x (?)
RemoveWhereSwapInts	65	60	-7.7%	1.08x
ObjectiveCBridgeStubFromNSDateRef	4200	3890	-7.4%	1.08x (?)
MapReduce	400	371	-7.2%	1.08x (?)
Set.isSubset.Seq.Int.Empty	210	195	-7.1%	1.08x (?)
MapReduceAnyCollection	397	369	-7.1%	1.08x (?)
Set.isStrictSuperset.Seq.Empty.Int	290	270	-6.9%	1.07x (?)
Set.isSuperset.Seq.Empty.Int	88	82	-6.8%	1.07x (?)
Set.isSubset.Int.Empty	88	82	-6.8%	1.07x (?)
Set.isStrictSubset.Int.Empty	88	82	-6.8%	1.07x (?)
Set.isStrictSubset.Seq.Int.Empty	208	194	-6.7%	1.07x (?)
ArrayLiteral2	137	128	-6.6%	1.07x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
StringMatch.o	4752	5248	+10.4%	0.91x
FindStringNaive.o	10979	11475	+4.5%	0.96x
StringEdits.o	12143	12639	+4.1%	0.96x

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
FindString.Loop1.Substring	817	916	+12.1%	0.89x (?)
StringMatch	8200	8900	+8.5%	0.92x
Data.hash.Empty	74	80	+8.1%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
CSVParsing.UTF8	61	45	-26.2%	1.36x
Set.isSubset.Int.Empty	96	88	-8.3%	1.09x
ObjectiveCBridgeStubNSDateRefAccess	400	371	-7.2%	1.08x (?)
Set.isDisjoint.Seq.Int.Empty	90	84	-6.7%	1.07x (?)
Set.isSuperset.Seq.Empty.Int	90	84	-6.7%	1.07x
Set.isStrictSubset.Int.Empty	90	84	-6.7%	1.07x

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
StringMatch.o	4572	5055	+10.6%	0.90x
FindStringNaive.o	10491	11075	+5.6%	0.95x
StringEdits.o	12032	12528	+4.1%	0.96x
RomanNumbers.o	8149	8333	+2.3%	0.98x

Performance: -Onone

Improvement	OLD	NEW	DELTA	RATIO
ObjectiveCBridgeStubFromNSDateRef	4850	4370	-9.9%	1.11x (?)
DataCreateSmallArray	94200	87650	-7.0%	1.07x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

milseman · 2019-12-03T22:09:27Z

Some of those seem spurious, but the CSVParser.UTF8 one doesn't.

@swift-ci please benchmark

swift-ci · 2019-12-03T22:36:10Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
NSStringConversion.Long	514	584	+13.6%	0.88x (?)
IterateData	889	1004	+12.9%	0.89x (?)
NSStringConversion.Medium	246	265	+7.7%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
CSVParsing.UTF8	45	31	-31.1%	1.45x
EqualSubstringSubstring	28	22	-21.4%	1.27x
LessSubstringSubstring	28	22	-21.4%	1.27x
EqualStringSubstring	28	22	-21.4%	1.27x
EqualSubstringSubstringGenericEquatable	28	22	-21.4%	1.27x
EqualSubstringString	28	22	-21.4%	1.27x
LessSubstringSubstringGenericComparable	28	22	-21.4%	1.27x
StringComparison_longSharedPrefix	364	327	-10.2%	1.11x (?)
StringHasPrefixAscii	1370	1280	-6.6%	1.07x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
StringMatch.o	4752	5248	+10.4%	0.91x
FindStringNaive.o	10979	11475	+4.5%	0.96x
StringEdits.o	12143	12639	+4.1%	0.96x

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
NSStringConversion.Long	488	560	+14.8%	0.87x (?)
Set.subtracting.Empty.Box	7	8	+14.3%	0.88x (?)

Improvement	OLD	NEW	DELTA	RATIO
CSVParsing.UTF8	45	31	-31.1%	1.45x
LessSubstringSubstring	27	21	-22.2%	1.29x
EqualSubstringString	27	21	-22.2%	1.29x
LessSubstringSubstringGenericComparable	27	21	-22.2%	1.29x
EqualSubstringSubstring	28	22	-21.4%	1.27x (?)
EqualStringSubstring	28	22	-21.4%	1.27x
EqualSubstringSubstringGenericEquatable	28	22	-21.4%	1.27x (?)
FlattenListLoop	3217	2734	-15.0%	1.18x (?)
ObjectiveCBridgeStubNSDateRefAccess	196	174	-11.2%	1.13x (?)
StringComparison_longSharedPrefix	357	320	-10.4%	1.12x (?)
OpenClose	63	58	-7.9%	1.09x (?)
SortIntPyramid	475	440	-7.4%	1.08x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
StringMatch.o	4572	5055	+10.6%	0.90x
FindStringNaive.o	10491	11075	+5.6%	0.95x
StringEdits.o	12032	12528	+4.1%	0.96x
RomanNumbers.o	8149	8333	+2.3%	0.98x

Performance: -Onone

Improvement	OLD	NEW	DELTA	RATIO
EqualSubstringSubstringGenericEquatable	32	25	-21.9%	1.28x (?)
LessSubstringSubstringGenericComparable	32	25	-21.9%	1.28x
EqualSubstringSubstring	32	26	-18.7%	1.23x
LessSubstringSubstring	32	26	-18.7%	1.23x
EqualStringSubstring	32	27	-15.6%	1.19x
EqualSubstringString	32	27	-15.6%	1.19x (?)
ArrayOfGenericPOD2	793	712	-10.2%	1.11x

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

karwa · 2019-12-05T15:34:58Z

@milseman what do you think?

milseman · 2019-12-09T21:54:15Z

I'm in favor

@swift-ci please test

milseman · 2020-02-10T21:36:19Z

@swift-ci please test

swift-ci · 2020-02-10T23:21:43Z

Build failed
Swift Test OS X Platform
Git Sha - ab36b51

karwa · 2020-02-11T13:34:01Z

@milseman It seems the test was manually aborted:

00:21:39 ninja: build stopped: interrupted by user.
00:21:39 Build was aborted
00:21:39 Aborted by [email protected]

CodaFi · 2020-02-13T15:13:38Z

@swift-ci test

karwa force-pushed the substring_inline branch 2 times, most recently from 451edf6 to 2c00a23 Compare February 14, 2019 15:02

[stdlib] Make some trivial Substring methods inlinable

ab36b51

karwa force-pushed the substring_inline branch from 2c00a23 to ab36b51 Compare February 16, 2019 18:52

milseman merged commit 2897daa into swiftlang:master Feb 20, 2020

[stdlib] Make some trivial Substring methods inlinable #22616

[stdlib] Make some trivial Substring methods inlinable #22616

Uh oh!

Conversation

karwa commented Feb 14, 2019

Uh oh!

slavapestov commented Feb 14, 2019

Uh oh!

karwa commented Feb 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

milseman commented Feb 14, 2019

Uh oh!

karwa commented Feb 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

milseman commented Feb 18, 2019

Uh oh!

karwa commented Feb 19, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

swift-ci commented Feb 19, 2019

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

milseman commented Feb 19, 2019

Uh oh!

CodaFi commented Nov 22, 2019

Uh oh!

swift-ci commented Nov 22, 2019

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

milseman commented Dec 3, 2019

Uh oh!

swift-ci commented Dec 3, 2019

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

karwa commented Dec 5, 2019

Uh oh!

milseman commented Dec 9, 2019

Uh oh!

milseman commented Feb 10, 2020

Uh oh!

swift-ci commented Feb 10, 2020

Uh oh!

karwa commented Feb 11, 2020

Uh oh!

CodaFi commented Feb 13, 2020

Uh oh!

Uh oh!

karwa commented Feb 14, 2019 •

edited

Loading

karwa commented Feb 16, 2019 •

edited

Loading

karwa commented Feb 19, 2019 •

edited

Loading