[benchmark] Add benchmarks for IndexPath's subscripts, max, min #34535

Merged: 9 commits merged into swiftlang:main on Nov 10, 2020

Conversation

Contributor

@micahbenn micahbenn commented Nov 1, 2020

Adds IndexPath benchmarks for the following APIs, as part of SR-6789:

  • subscript(_:IndexPath.Index)
  • subscript(_:Range<IndexPath.Index>)
  • max()
  • min()

These are the added benchmarks:

  • IndexPathSubscriptMutation: Increments an element at the given index by 1.
  • IndexPathSubscriptRangeMutation: Adds an IndexPath to the given index range.
  • IndexPathMaxBeginning: Checks max(), with the max value at the beginning.
  • IndexPathMaxMiddle: Checks max(), with the max value at the middle.
  • IndexPathMaxEnd: Checks max(), with the max value at the end.
  • IndexPathMinBeginning: Checks min(), with the min value at the beginning.
  • IndexPathMinMiddle: Checks min(), with the min value at the middle.
  • IndexPathMinEnd: Checks min(), with the min value at the end.
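
For readers unfamiliar with the APIs being measured, here is a minimal sketch of the four operations on a small IndexPath. It is not the PR's benchmark code; the helper `makeIndexPath(count:extreme:at:)` and the sample values are assumptions made for illustration.

```swift
import Foundation

// Hypothetical helper (not from the PR): builds an IndexPath of `count`
// elements, all 1 except for `extreme` stored at `position`.
func makeIndexPath(count: Int, extreme: Int, at position: Int) -> IndexPath {
  var indexes = Array(repeating: 1, count: count)
  indexes[position] = extreme
  return IndexPath(indexes: indexes)
}

// subscript(_: IndexPath.Index): increment the element at one index by 1.
var path = IndexPath(indexes: [1, 2, 3, 4])
path[2] += 1

// subscript(_: Range<IndexPath.Index>): read a sub-path, then assign an
// IndexPath of the same length to that index range.
let middle: IndexPath = path[1..<3]
path[1..<3] = IndexPath(indexes: [9, 9])

// max() and min(): both scan every element, wherever the extreme value sits.
let maxAtEnd = makeIndexPath(count: 5, extreme: 100, at: 4)
let minAtBeginning = makeIndexPath(count: 5, extreme: -100, at: 0)
let largest = maxAtEnd.max()
let smallest = minAtBeginning.min()
```

Placing the extreme value at the beginning, middle, or end, as the benchmark names describe, exercises the same linear scan from three starting conditions.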

This is my first contribution; open to any comments!

Resolves SR-13801

Add benchmarks for subscripts, max, min
Collaborator

@xwu xwu left a comment

Just some additional clarifications on formatting. If you configure swift-format with "lineLength": 80, "indentation": { "spaces": 2 }, and "prioritizeKeepingFunctionOutputTogether": true, that should get you pretty close.
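
As a rough illustration, the three settings named above might sit in a swift-format configuration file like this. The `"version"` field and the overall file layout are assumptions about how the configuration is usually written; only the other three keys come from the comment.

```json
{
  "version": 1,
  "lineLength": 80,
  "indentation": {
    "spaces": 2
  },
  "prioritizeKeepingFunctionOutputTogether": true
}
```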

@xwu
Collaborator

xwu commented Nov 2, 2020

@swift-ci Please benchmark

@swift-ci
Contributor

swift-ci commented Nov 2, 2020

Performance: -O

Regression OLD NEW DELTA RATIO
DataCreateEmpty 80 100 +25.0% 0.80x
 
Improvement OLD NEW DELTA RATIO
ObjectiveCBridgeStubDateAccess 152 130 -14.5% 1.17x
FlattenListFlatMap 4745 4181 -11.9% 1.13x (?)
DataSubscriptMedium 41 37 -9.8% 1.11x (?)
 
Added MIN MAX MEAN MAX_RSS
IndexPathMaxBeginning 13995 14554 14302
IndexPathMaxEnd 13003 13673 13352
IndexPathMaxMiddle 13606 13662 13625
IndexPathMinBeginning 13083 13552 13361
IndexPathMinEnd 13958 14120 14033
IndexPathMinMiddle 13643 14120 13849
IndexPathSubscriptMutation 32706 32935 32806
IndexPathSubscriptRangeMutation 36609 37065 36908

Code size: -O

Regression OLD NEW DELTA RATIO
TestsUtils.o 27832 28270 +1.6% 0.98x
 
Improvement OLD NEW DELTA RATIO
DriverUtils.o 128895 127343 -1.2% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
Array2D 4704 5248 +11.6% 0.90x (?)
 
Improvement OLD NEW DELTA RATIO
ReversedArray2 109 96 -11.9% 1.14x (?)
PointerArithmetics 19500 17400 -10.8% 1.12x (?)
DataCountMedium 19 17 -10.5% 1.12x (?)
ProtocolDispatch 239 217 -9.2% 1.10x (?)
 
Added MIN MAX MEAN MAX_RSS
IndexPathMaxBeginning 13971 14066 14018
IndexPathMaxEnd 13034 13531 13280
IndexPathMaxMiddle 13458 13692 13585
IndexPathMinBeginning 13179 13289 13237
IndexPathMinEnd 13758 14166 14012
IndexPathMinMiddle 13494 13949 13672
IndexPathSubscriptMutation 33912 34116 34003
IndexPathSubscriptRangeMutation 35662 35994 35804

Code size: -Osize

Performance: -Onone

Regression OLD NEW DELTA RATIO
DataSubscriptMedium 91 101 +11.0% 0.90x (?)
 
Added MIN MAX MEAN MAX_RSS
IndexPathMaxBeginning 156051 156366 156199
IndexPathMaxEnd 144466 145443 145112
IndexPathMaxMiddle 151959 152526 152336
IndexPathMinBeginning 145086 145704 145449
IndexPathMinEnd 159997 160944 160562
IndexPathMinMiddle 152342 152715 152590
IndexPathSubscriptMutation 170362 170675 170530
IndexPathSubscriptRangeMutation 69211 69345 69294

Code size: -swiftlibs

Benchmark Check Report
⛔️⏱ IndexPathMinMiddle execution took at least 12425 μs.
Decrease the workload of IndexPathMinMiddle by a factor of 16 (100), to be less than 1000 μs.
⛔️⏱ IndexPathMaxBeginning execution took at least 12958 μs.
Decrease the workload of IndexPathMaxBeginning by a factor of 16 (100), to be less than 1000 μs.
⛔️⏱ IndexPathMaxMiddle execution took at least 12424 μs.
Decrease the workload of IndexPathMaxMiddle by a factor of 16 (100), to be less than 1000 μs.
⛔️⏱ IndexPathMinEnd has setup overhead of 764 μs (5.8%).
Move initialization of benchmark data to the setUpFunction registered in BenchmarkInfo.
⛔️⏱ IndexPathMinEnd execution took at least 12430 μs (excluding the setup overhead).
Decrease the workload of IndexPathMinEnd by a factor of 16 (100), to be less than 1000 μs.
⛔️⏱ IndexPathMaxEnd execution took at least 12354 μs.
Decrease the workload of IndexPathMaxEnd by a factor of 16 (100), to be less than 1000 μs.
⚠️🔤 IndexPathSubscriptRangeMutation name is composed of 5 words.
Split IndexPathSubscriptRangeMutation name into dot-separated groups and variants. See http://bit.ly/BenchmarkNaming
⛔️⏱ IndexPathSubscriptRangeMutation has setup overhead of 36224 μs (105.5%).
Move initialization of benchmark data to the setUpFunction registered in BenchmarkInfo.
⚠️ IndexPathSubscriptRangeMutation execution took -1887 μs.
Increase the workload of IndexPathSubscriptRangeMutation to be more than 20 μs.
⚠️Ⓜ️ IndexPathSubscriptRangeMutation has very wide range of memory used between independent, repeated measurements.
IndexPathSubscriptRangeMutation mem_pages [i1, i2]: min=[24, 24] 𝚫=0 R=[23, 18]
⛔️⏱ IndexPathMinBeginning execution took at least 12353 μs.
Decrease the workload of IndexPathMinBeginning by a factor of 16 (100), to be less than 1000 μs.
⛔️⏱ IndexPathSubscriptMutation execution took at least 30289 μs.
Decrease the workload of IndexPathSubscriptMutation by a factor of 32 (100), to be less than 1000 μs.
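
The two recurring complaints in this report are setup overhead counted inside the timed region and a workload that runs far longer than 1000 μs. The sketch below shows the general shape of the fix: build the data in a separate setup step and keep the timed function small. The class `SubscriptMutationSketch` and its method names are hypothetical stand-ins; in the real suite the equivalent functions would be registered as the runFunction and setUpFunction of a BenchmarkInfo.

```swift
import Foundation

// Hypothetical stand-in for one benchmark; not the suite's actual API.
final class SubscriptMutationSketch {
  private(set) var subject = IndexPath()

  // Plays the role of a setUpFunction: builds the data once, outside the
  // timed region, so it does not show up as setup overhead.
  func setUp() {
    // A small element count keeps a single iteration short.
    subject = IndexPath(indexes: Array(0..<100))
  }

  // Plays the role of a runFunction: `n` is assumed to be the scaling
  // factor supplied by the harness. Only the measured work happens here.
  func run(_ n: Int) {
    for _ in 0..<n {
      for i in 0..<subject.count {
        subject[i] += 1
      }
    }
  }
}

// A harness would call setUp() once and then time only run(_:).
let sketch = SubscriptMutationSketch()
sketch.setUp()
sketch.run(2)
```

Shrinking the element count (100 here, an arbitrary choice) is one way to reduce the workload by the factor the report asks for.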
How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
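
The DELTA and RATIO columns in the tables above are consistent with a relative change of NEW against OLD and the quotient OLD/NEW, respectively. The sketch below reproduces two rows from the -O report under that assumption; the function names are made up for illustration and are not part of the benchmark tooling.

```swift
import Foundation

// Relative change from `old` to `new`, in percent (assumed meaning of DELTA).
func delta(old: Double, new: Double) -> Double {
  return (new - old) / old * 100
}

// Speed ratio OLD/NEW (assumed meaning of RATIO); below 1 is a regression.
func ratio(old: Double, new: Double) -> Double {
  return old / new
}

// Formats one row the way the report prints it.
func describe(old: Double, new: Double) -> String {
  return String(
    format: "%+.1f%% %.2fx",
    delta(old: old, new: new), ratio(old: old, new: new))
}

// DataCreateEmpty: OLD 80, NEW 100.
let regression = describe(old: 80, new: 100)
// ObjectiveCBridgeStubDateAccess: OLD 152, NEW 130.
let improvement = describe(old: 152, new: 130)
```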

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

@micahbenn
Contributor Author

Going to address these results.

@micahbenn
Contributor Author

Hopefully fixed those issues now. I ran my changes locally and got improved numbers.

@eeckstein
Contributor

As a general comment: we should try to keep the number of benchmarks to a minimum (while still testing what we want to test). Adding a full set of permutations for each language feature would just explode the benchmark run time.

You are not adding too many benchmarks here, but maybe you can still extract a representative set out of these 8 benchmarks - or test several variations in a single benchmark.
Note that it's still possible to keep all benchmarks around by adding the .skip tag. If someone wants to test the complete set of benchmarks, it can be done locally by specifying the IndexPath tag that you added.
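
One way to "test several variations in a single benchmark" is to loop over the variations inside one run function. The following is only a sketch of that idea, not the code that was merged; `pathWithExtreme(_:at:count:)`, `runMax(_:over:)`, and the sizes used are hypothetical.

```swift
import Foundation

// Hypothetical helper: an IndexPath of zeros with `extreme` at `position`.
func pathWithExtreme(_ extreme: Int, at position: Int, count: Int) -> IndexPath {
  var indexes = Array(repeating: 0, count: count)
  indexes[position] = extreme
  return IndexPath(indexes: indexes)
}

// Three variations: the maximum at the beginning, middle, and end.
let count = 100
let variations = [0, count / 2, count - 1].map {
  pathWithExtreme(1_000, at: $0, count: count)
}

// A single run function covers all variations. Accumulating the results
// into a checksum keeps the calls from being optimized away.
func runMax(_ n: Int, over paths: [IndexPath]) -> Int {
  var checksum = 0
  for _ in 0..<n {
    for path in paths {
      checksum &+= path.max() ?? 0
    }
  }
  return checksum
}

let checksum = runMax(2, over: variations)
```

This replaces three separately registered benchmarks with one, at the cost of no longer reporting a separate number per position.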

@micahbenn
Contributor Author

Thanks for the pointers. I opted to test several variations in a single benchmark. Let me know what you think :)

Contributor

@eeckstein eeckstein left a comment

Basically LGTM.
Just a few minor comments.

@micahbenn
Contributor Author

Cool. Fixed!

@micahbenn micahbenn requested a review from eeckstein November 9, 2020 23:08
Contributor

@eeckstein eeckstein left a comment

LGTM, thanks!

@eeckstein
Contributor

@swift-ci benchmark

@eeckstein
Contributor

@swift-ci smoke test

@swift-ci
Contributor

Performance: -O

Added MIN MAX MEAN MAX_RSS
IndexPath.Max 514 527 521
IndexPath.Min 514 514 514
IndexPath.Subscript.Mutation 475 476 476
IndexPath.Subscript.Range.Mutation 247 253 249

Code size: -O

Regression OLD NEW DELTA RATIO
TestsUtils.o 27832 28270 +1.6% 0.98x
 
Improvement OLD NEW DELTA RATIO
DriverUtils.o 129007 127455 -1.2% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
ObjectiveCBridgeStubDateMutation 257 285 +10.9% 0.90x (?)
Array2D 6928 7520 +8.5% 0.92x (?)
RandomShuffleLCG2 416 448 +7.7% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
String.data.LargeUnicode 138 112 -18.8% 1.23x (?)
ProtocolDispatch 371 342 -7.8% 1.08x
 
Added MIN MAX MEAN MAX_RSS
IndexPath.Max 518 526 521
IndexPath.Min 515 516 516
IndexPath.Subscript.Mutation 477 478 477
IndexPath.Subscript.Range.Mutation 233 236 234

Code size: -Osize

Performance: -Onone

Regression OLD NEW DELTA RATIO
NSStringConversion.Rebridge.Mutable 1827 2082 +14.0% 0.88x (?)
 
Improvement OLD NEW DELTA RATIO
DataAppendDataMediumToMedium 7360 6480 -12.0% 1.14x (?)
 
Added MIN MAX MEAN MAX_RSS
IndexPath.Max 672 684 676
IndexPath.Min 669 669 669
IndexPath.Subscript.Mutation 520 520 520
IndexPath.Subscript.Range.Mutation 251 257 253

Code size: -swiftlibs

Benchmark Check Report
How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@eeckstein eeckstein merged commit a723cf7 into swiftlang:main Nov 10, 2020
@swift-ci
Contributor

Build failed before running benchmark.
