
[DO NOT MERGE][benchmark] Modernize benchmark definitions #39331


Draft · wants to merge 3 commits into base: main

Conversation

lorentey
Member

@lorentey lorentey commented Sep 15, 2021

This is an experiment in writing benchmark definitions in a way that's friendlier to actual Swift programmers.

  • Replace struct BenchmarkInfo with a more helpful protocol Benchmark
  • Stop Capitalizing Function And Parameter Names, That's Not How Swift Code Looks

The potential benefit is that it becomes easier for contributors to develop new benchmarks -- no more weird language dialect.

The risk, as I see it, is either

  • perturbing existing benchmarks (if they get adapted to use protocol Benchmark), or
  • diverging benchmarking definitions into an old and a new syntax, adding to confusion (if existing benchmarks are kept as is).

The basic idea with protocol Benchmark is that instead of creating a bunch of top-level functions and static lets / vars, then registering things using:

  BenchmarkInfo(
    name: "AngryPhonebook.ASCII2.Small",
    runFunction: { angryPhonebook($0*10, ascii) },
    tags: t,
    setUpFunction: { blackHole(ascii) }),
  BenchmarkInfo(
    name: "AngryPhonebook.Strasse.Small",
    runFunction: { angryPhonebook($0, strasse) },
    tags: t,
    setUpFunction: { blackHole(strasse) }),

we'd define a custom struct:

struct AngryPhonebook: Benchmark {
  let name: String
  let input: [String]
  let factor: Int

  init(name: String, input: [String], factor: Int = 1) {
    self.name = name
    self.input = input
    self.factor = factor
  }

  var tags: Tags { [.validation, .api, .String] }

  func run(iterations: Int) {
    ...
  }
}

and simply register instances of it as individual benchmarks:

  AngryPhonebook(
    name: "AngryPhonebook.ASCII2.Small",
    input: ascii,
    factor: 10),
  AngryPhonebook(
    name: "AngryPhonebook.Strasse.Small",
    input: strasse),
  AngryPhonebook(
    name: "AngryPhonebook.Armenian.Small",
    input: armenian),
  AngryPhonebook(
    name: "AngryPhonebook.Cyrillic.Small",
    input: cyrillic),

This gets rid of the closure stuff, and encourages encapsulating the preprocessed input data in the struct itself, preventing the preprocessing from leaking into the measurement. Currently we need to remember to do this by manually triggering swift_onces in the setUpFunction -- this is much too tricky.
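For concreteness, the protocol driving this could look something like the sketch below. Only `name`, `tags`, and `run(iterations:)` appear in the struct example above; everything else here (the doc comments, the exact requirement shapes) is an assumption, not the PR's actual definition.

```swift
// Hypothetical sketch of the proposed protocol; this is not taken
// verbatim from the PR and the real definition may differ.
protocol Benchmark {
  /// The benchmark's display name, e.g. "AngryPhonebook.ASCII2.Small".
  var name: String { get }

  /// Classification tags used to filter benchmark runs.
  var tags: Tags { get }

  /// Runs the measured workload `iterations` times.
  func run(iterations: Int)
}
```

Under this design, setup naturally happens in the conforming type's initializer: the preprocessed input is stored in the instance before measurement begins, so there is no separate `setUpFunction` to forget.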

@lorentey
Member Author

Cc @glessard @eeckstein @gottesmm What do you think?

@lorentey
Member Author

I'm particularly worried that this would disturb -Onone performance.

@swift-ci benchmark

@glessard
Contributor

I like it. I look forward to the first benchmark output.

@swift-ci
Contributor

Performance (x86_64): -O

Regression OLD NEW DELTA RATIO
Ackermann 299 992 +231.8% 0.30x
DictionaryOfAnyHashableStrings_insert 2940 5572 +89.5% 0.53x (?)
DictionaryKeysContainsNative 21 26 +23.8% 0.81x (?)
DictionaryKeysContainsCocoa 27 31 +14.8% 0.87x (?)
 
Improvement OLD NEW DELTA RATIO
FlattenListFlatMap 6871 4745 -30.9% 1.45x (?)
String.data.Medium 113 98 -13.3% 1.15x (?)
String.data.LargeUnicode 118 108 -8.5% 1.09x (?)
RandomShuffleLCG2 480 448 -6.7% 1.07x (?)
OpenClose 45 42 -6.7% 1.07x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
Ackermann.o 1828 2603 +42.4% 0.70x
AngryPhonebook.o 8951 11941 +33.4% 0.75x
main.o 52761 60416 +14.5% 0.87x
DriverUtils.o 117639 120287 +2.3% 0.98x
 
Improvement OLD NEW DELTA RATIO
TestsUtils.o 25435 24789 -2.5% 1.03x

Performance (x86_64): -Osize

Regression OLD NEW DELTA RATIO
Ackermann 251 1166 +364.5% 0.22x
FlattenListLoop 1629 2552 +56.7% 0.64x (?)
FlattenListFlatMap 4355 6135 +40.9% 0.71x (?)
PointerArithmetics 31400 37100 +18.2% 0.85x
ReversedDictionary2 315 371 +17.8% 0.85x
ReversedArray2 200 228 +14.0% 0.88x
ConvertFloatingPoint.MockFloat64Exactly 8 9 +12.5% 0.89x
DataCountSmall 28 31 +10.7% 0.90x (?)
BinaryFloatingPointPropertiesBinade 28 31 +10.7% 0.90x (?)
DataCreateEmpty 200 220 +10.0% 0.91x
CreateObjects 20 22 +10.0% 0.91x
DataSubscriptSmall 31 34 +9.7% 0.91x
DataCountMedium 31 34 +9.7% 0.91x (?)
FloatingPointPrinting_Float_description_uniform 4800 5200 +8.3% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
ProtocolDispatch 371 342 -7.8% 1.08x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
Ackermann.o 1662 2622 +57.8% 0.63x
AngryPhonebook.o 8227 11696 +42.2% 0.70x
DriverUtils.o 111849 115377 +3.2% 0.97x
 
Improvement OLD NEW DELTA RATIO
main.o 48064 10504 -78.1% 4.58x
TestsUtils.o 19338 18500 -4.3% 1.05x

Performance (x86_64): -Onone

Regression OLD NEW DELTA RATIO
Ackermann 2479 3570 +44.0% 0.69x
LineSink.scalars.alpha 321 348 +8.4% 0.92x (?)
ObjectiveCBridgeStubToNSDateRef 4700 5080 +8.1% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
StringToDataLargeUnicode 7200 6100 -15.3% 1.18x (?)
StringToDataMedium 6700 6100 -9.0% 1.10x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@lorentey
Member Author

lorentey commented Sep 16, 2021

Ah, too bad. AngryPhonebook results weren't perturbed at all, which is encouraging, but Ackermann was, which is bad. On top of it all, there are huge code size changes that likely prevent us from adapting existing benchmarks.

@lorentey
Member Author

lorentey commented Sep 16, 2021

(None of this would matter if we ran the same benchmark code across multiple compiler versions, but that's not how we do things yet.)

@lorentey
Member Author

Applying Swift naming conventions seems like a good idea whether or not we replace/augment BenchmarkInfo.

In #39336, I'm updating the entire benchmark suite to read a little bit more like regular Swift code.
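As an illustration of the kind of renaming involved (the identifiers below are invented for this sketch, not taken from #39336):

```swift
// Illustrative sketch only; `work`, `run_MyBenchmark`, and
// `runMyBenchmark` are made-up names, not actual suite symbols.

func work() -> Int { (0..<10).reduce(0, +) }

// Before: the legacy benchmark dialect, with an underscored function
// name and a single-letter, unlabeled parameter.
public func run_MyBenchmark(_ N: Int) {
  for _ in 0..<N { _ = work() }
}

// After: Swift API naming conventions, with a lowerCamelCase name
// and a descriptive parameter label.
public func runMyBenchmark(iterations: Int) {
  for _ in 0..<iterations { _ = work() }
}
```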

@eeckstein
Contributor

@lorentey That looks great!
I'm in favor of converting all benchmarks to this new scheme, so that we can get rid of the old BenchmarkInfo.
The one-time code size changes don't matter.
