[stdlib] Refactor Range Overlaps for Performance #23293

PatrickPijnappel · 2019-03-14T09:37:21Z

Refactors the overlaps(_:) method for Range & ClosedRange to use fewer comparisons, and adds a benchmark.

Speed-ups for Int ranges based on local running of the benchmark:

Range x Range: 22%
Range x ClosedRange: 14%
ClosedRange x ClosedRange: 10%

Speed-ups for String ranges based on a different local test:

Range x Range: 22%
Range x ClosedRange: 48%
ClosedRange x ClosedRange: 44%

Benchmark available separately in #23294.

PatrickPijnappel · 2019-03-14T09:38:46Z

@swift-ci Please test

swift-ci · 2019-03-14T10:20:45Z

Build failed
Swift Test OS X Platform
Git Sha - 6ac770b8f65b0eefb795916aa6f4f1d915f15a8a

PatrickPijnappel · 2019-03-15T01:57:31Z

@swift-ci Please test OS X platform

airspeedswift · 2019-03-18T14:51:28Z

@swift-ci please benchmark

swift-ci · 2019-03-18T15:41:55Z

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
ObjectiveCBridgeStubFromNSDateRef	4060	4630	+14.0%	0.88x (?)
ObjectiveCBridgeStubToNSDate2	1380	1530	+10.9%	0.90x (?)
FlattenListLoop	3976	4334	+9.0%	0.92x
Array2D	6928	7520	+8.5%	0.92x (?)
MapReduceAnyCollection	369	398	+7.9%	0.93x (?)
FlattenListFlatMap	6076	6540	+7.6%	0.93x (?)
MapReduce	369	397	+7.6%	0.93x
Improvement
ClosedRangeOverlapsClosedRange	126	90	-28.6%	1.40x
RangeOverlapsClosedRange	127	109	-14.2%	1.17x
LessSubstringSubstring	44	39	-11.4%	1.13x
RangeOverlapsRange	154	138	-10.4%	1.12x
EqualStringSubstring	43	39	-9.3%	1.10x
EqualSubstringString	43	39	-9.3%	1.10x
LessSubstringSubstringGenericComparable	43	39	-9.3%	1.10x
StringHasSuffixAscii	2400	2220	-7.5%	1.08x (?)
EqualSubstringSubstring	42	39	-7.1%	1.08x (?)
EqualSubstringSubstringGenericEquatable	42	39	-7.1%	1.08x
SortStringsUnicode	3540	3305	-6.6%	1.07x

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
FlattenListLoop	4069	4431	+8.9%	0.92x (?)
Array2D	6912	7520	+8.8%	0.92x
Improvement
ClosedRangeOverlapsClosedRange	126	89	-29.4%	1.42x
RangeOverlapsClosedRange	136	108	-20.6%	1.26x
LessSubstringSubstringGenericComparable	44	39	-11.4%	1.13x
EqualSubstringSubstring	43	39	-9.3%	1.10x (?)
EqualStringSubstring	43	39	-9.3%	1.10x (?)
LessSubstringSubstring	44	40	-9.1%	1.10x
EqualSubstringSubstringGenericEquatable	44	40	-9.1%	1.10x
EqualSubstringString	44	40	-9.1%	1.10x (?)
RangeOverlapsRange	150	139	-7.3%	1.08x
StringComparison_fastPrenormal	1050	980	-6.7%	1.07x (?)

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
StringToDataMedium	13400	14900	+11.2%	0.90x (?)
StrComplexWalk	8330	9110	+9.4%	0.91x (?)
StringToDataLargeUnicode	13950	15100	+8.2%	0.92x (?)
Improvement
ArrayOfPOD	856	775	-9.5%	1.10x (?)
DictionaryKeysContainsCocoa	125	114	-8.8%	1.10x (?)

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

PatrickPijnappel · 2019-03-18T22:08:30Z

@swift-ci Please benchmark

swift-ci · 2019-03-19T01:02:22Z

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
FlattenListLoop	3976	4335	+9.0%	0.92x
StringUTF16Builder	350	380	+8.6%	0.92x (?)
MapReduceAnyCollection	369	398	+7.9%	0.93x
FlattenListFlatMap	6066	6534	+7.7%	0.93x (?)
MapReduce	369	397	+7.6%	0.93x (?)
Improvement
ClosedRangeOverlapsClosedRange	126	90	-28.6%	1.40x
RangeOverlapsClosedRange	127	109	-14.2%	1.17x
LessSubstringSubstring	44	39	-11.4%	1.13x
RangeOverlapsRange	154	137	-11.0%	1.12x
EqualStringSubstring	43	39	-9.3%	1.10x (?)
EqualSubstringString	43	39	-9.3%	1.10x (?)
LessSubstringSubstringGenericComparable	43	39	-9.3%	1.10x
EqualSubstringSubstring	42	39	-7.1%	1.08x (?)
EqualSubstringSubstringGenericEquatable	42	39	-7.1%	1.08x (?)
SortStringsUnicode	3540	3300	-6.8%	1.07x (?)

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
FlattenListLoop	4069	4431	+8.9%	0.92x (?)
Array2D	6912	7520	+8.8%	0.92x
Improvement
ClosedRangeOverlapsClosedRange	126	89	-29.4%	1.42x
RangeOverlapsClosedRange	136	108	-20.6%	1.26x
LessSubstringSubstringGenericComparable	44	39	-11.4%	1.13x (?)
EqualSubstringSubstring	43	39	-9.3%	1.10x
EqualStringSubstring	43	39	-9.3%	1.10x (?)
LessSubstringSubstring	44	40	-9.1%	1.10x
EqualSubstringSubstringGenericEquatable	44	40	-9.1%	1.10x
EqualSubstringString	44	40	-9.1%	1.10x
RangeOverlapsRange	150	138	-8.0%	1.09x
StringComparison_fastPrenormal	1050	980	-6.7%	1.07x (?)

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
StrComplexWalk	8340	9070	+8.8%	0.92x (?)
Improvement
ArrayOfPOD	855	775	-9.4%	1.10x (?)

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

PatrickPijnappel · 2019-04-20T05:33:29Z

Just checked these out, didn't have time to investigate before. I think these regression must be noise, even though they are consistent – they don't use overlaps and overlaps is not used within the standard library itself so they can't use it indirectly either.

The benchmarks that actually use overlaps are consistently better.

@airspeedswift Thoughts?

PatrickPijnappel · 2019-05-03T06:59:06Z

@swift-ci Please benchmark

swift-ci · 2019-05-03T07:21:27Z

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
DistinctClassFieldAccesses	179	204	+14.0%	0.88x
DataCreateEmptyArray	1650	1800	+9.1%	0.92x (?)
Improvement
ClosedRangeOverlapsClosedRange	56	41	-26.8%	1.37x
EqualStringSubstring	28	21	-25.0%	1.33x
RangeOverlapsClosedRange	64	49	-23.4%	1.31x (?)
EqualSubstringSubstring	27	21	-22.2%	1.29x
LessSubstringSubstring	27	21	-22.2%	1.29x
EqualSubstringSubstringGenericEquatable	27	21	-22.2%	1.29x
EqualSubstringString	27	21	-22.2%	1.29x
LessSubstringSubstringGenericComparable	27	21	-22.2%	1.29x
RangeOverlapsRange	74	65	-12.2%	1.14x (?)
StringComparison_longSharedPrefix	355	318	-10.4%	1.12x (?)

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
DistinctClassFieldAccesses	180	203	+12.8%	0.89x (?)
ArrayInClass	845	935	+10.7%	0.90x (?)
ArrayLiteral2	113	124	+9.7%	0.91x (?)
Breadcrumbs.UTF16ToIdxRange.longMixed	76	82	+7.9%	0.93x (?)
Improvement
LessSubstringSubstring	28	21	-25.0%	1.33x
LessSubstringSubstringGenericComparable	28	21	-25.0%	1.33x
RangeOverlapsClosedRange	68	52	-23.5%	1.31x
EqualSubstringSubstring	27	21	-22.2%	1.29x
EqualStringSubstring	27	21	-22.2%	1.29x
EqualSubstringSubstringGenericEquatable	27	21	-22.2%	1.29x
EqualSubstringString	27	21	-22.2%	1.29x (?)
ClosedRangeOverlapsClosedRange	55	48	-12.7%	1.15x
StringComparison_longSharedPrefix	356	320	-10.1%	1.11x (?)
DataToStringEmpty	500	450	-10.0%	1.11x (?)
RangeOverlapsRange	72	65	-9.7%	1.11x

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
ExclusivityGlobal	104	113	+8.7%	0.92x (?)
UTF8Decode_InitFromBytes	127	137	+7.9%	0.93x (?)
Improvement
EqualSubstringSubstringGenericEquatable	30	24	-20.0%	1.25x (?)
EqualSubstringSubstring	31	25	-19.4%	1.24x (?)
LessSubstringSubstring	31	25	-19.4%	1.24x (?)
EqualSubstringString	32	26	-18.7%	1.23x (?)
LessSubstringSubstringGenericComparable	29	24	-17.2%	1.21x (?)
EqualStringSubstring	31	26	-16.1%	1.19x (?)
ArrayOfPOD	542	469	-13.5%	1.16x (?)
ArrayOfGenericPOD2	617	575	-6.8%	1.07x (?)
StringToDataSmall	750	700	-6.7%	1.07x (?)

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

PatrickPijnappel · 2019-05-03T07:41:37Z

@swift-ci Please smoke test

PatrickPijnappel · 2019-05-03T08:48:33Z

@swift-ci Please smoke test

PatrickPijnappel · 2019-05-03T22:53:10Z

@swift-ci Please smoke test

PatrickPijnappel · 2019-05-03T23:15:49Z

@swift-ci Please smoke test

PatrickPijnappel · 2019-05-04T00:32:35Z

@airspeedswift Should be good to merge!

PatrickPijnappel mentioned this pull request Mar 14, 2019

[test] Add Range Overlaps Benchmark #23294

Merged

PatrickPijnappel force-pushed the range-overlaps-performance-impl branch 2 times, most recently from 25f469e to 3fa7e82 Compare March 18, 2019 05:19

PatrickPijnappel force-pushed the range-overlaps-performance-impl branch from 3fa7e82 to 1ce3792 Compare March 18, 2019 22:12

PatrickPijnappel requested a review from airspeedswift April 25, 2019 22:51

PatrickPijnappel force-pushed the range-overlaps-performance-impl branch from 1ce3792 to 4e1afce Compare May 3, 2019 08:47

[stdlib] Refactor range overlaps for performance

87f02b1

PatrickPijnappel force-pushed the range-overlaps-performance-impl branch from 4e1afce to 87f02b1 Compare May 3, 2019 23:15

PatrickPijnappel merged commit 70be266 into swiftlang:master May 13, 2019

PatrickPijnappel deleted the range-overlaps-performance-impl branch May 13, 2019 12:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[stdlib] Refactor Range Overlaps for Performance #23293

[stdlib] Refactor Range Overlaps for Performance #23293

Uh oh!

PatrickPijnappel commented Mar 14, 2019 •

edited

Loading

Uh oh!

PatrickPijnappel commented Mar 14, 2019

Uh oh!

swift-ci commented Mar 14, 2019

Uh oh!

PatrickPijnappel commented Mar 15, 2019

Uh oh!

airspeedswift commented Mar 18, 2019

Uh oh!

swift-ci commented Mar 18, 2019

Uh oh!

PatrickPijnappel commented Mar 18, 2019

Uh oh!

swift-ci commented Mar 19, 2019

Uh oh!

PatrickPijnappel commented Apr 20, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

swift-ci commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 4, 2019

Uh oh!

Uh oh!

[stdlib] Refactor Range Overlaps for Performance #23293

[stdlib] Refactor Range Overlaps for Performance #23293

Uh oh!

Conversation

PatrickPijnappel commented Mar 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

PatrickPijnappel commented Mar 14, 2019

Uh oh!

swift-ci commented Mar 14, 2019

Uh oh!

PatrickPijnappel commented Mar 15, 2019

Uh oh!

airspeedswift commented Mar 18, 2019

Uh oh!

swift-ci commented Mar 18, 2019

Performance: -O

Performance: -Osize

Performance: -Onone

Uh oh!

PatrickPijnappel commented Mar 18, 2019

Uh oh!

swift-ci commented Mar 19, 2019

Performance: -O

Performance: -Osize

Performance: -Onone

Uh oh!

PatrickPijnappel commented Apr 20, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

swift-ci commented May 3, 2019

Performance: -O

Performance: -Osize

Performance: -Onone

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 3, 2019

Uh oh!

PatrickPijnappel commented May 4, 2019

Uh oh!

Uh oh!

PatrickPijnappel commented Mar 14, 2019 •

edited

Loading