[benchmark] Add HTTP2StateMachine benchmark #19954

Lukasa · 2018-10-19T15:23:53Z

This pull request adds a new benchmark to the benchmark suite: HTTP2StateMachine. This benchmark was extracted from work-in-progress code from the SwiftNIO HTTP/2 implementation, and reproduces the core stream state machine that will be used for SwiftNIO's pure-Swift HTTP/2 stack.

This state machine ultimately consists of a moderately sized enumeration, almost all cases of which carry associated data. Some of this associated data is switched over in some functions; other portions of it are extracted and computed upon in the case bodies.
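To make the shape of such a state machine concrete, here is a minimal sketch. It is not the SwiftNIO code: the type name `StreamState`, its cases, the window fields, and both methods are hypothetical and chosen only to show one function that switches over the state and another that extracts associated data and computes on it.

```swift
// Hypothetical, simplified stream states; the real SwiftNIO types differ.
enum StreamState: Equatable {
    case idle(sendWindow: Int, receiveWindow: Int)
    case open(sendWindow: Int, receiveWindow: Int)
    case halfClosedLocal(receiveWindow: Int)
    case closed

    /// Switches over the state and carries the associated data forward.
    /// Returns false if the transition is not allowed from the current state.
    mutating func sendHeaders(endStream: Bool) -> Bool {
        switch self {
        case .idle(let send, let receive):
            if endStream {
                self = .halfClosedLocal(receiveWindow: receive)
            } else {
                self = .open(sendWindow: send, receiveWindow: receive)
            }
            return true
        case .open, .halfClosedLocal, .closed:
            return false
        }
    }

    /// Extracts the send window from the case payload and computes on it.
    /// Returns false if the stream is not open or the window is too small.
    mutating func sendData(byteCount: Int) -> Bool {
        switch self {
        case .open(let send, let receive) where byteCount <= send:
            self = .open(sendWindow: send - byteCount, receiveWindow: receive)
            return true
        case .idle, .open, .halfClosedLocal, .closed:
            return false
        }
    }
}

var state = StreamState.idle(sendWindow: 65_535, receiveWindow: 65_535)
let headersAccepted = state.sendHeaders(endStream: false)
let dataAccepted = state.sendData(byteCount: 1_024)
// state is now .open(sendWindow: 64_511, receiveWindow: 65_535)
```

Because every piece of per-state data lives in the case payload, a transition that forgets to handle a state fails to compile rather than misbehaving at runtime.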

Why benchmark this?

For SwiftNIO, and network protocol programming in general, it is highly desirable to write protocol implementations that store almost all of their state in the associated data of enumerations. This allows the type system to give as much assistance as possible in guaranteeing the correctness of the program.

For this reason, it's important that Swift generates good code for switching over these kinds of enumerations, to avoid providing developers with an incentive to use a less-powerful but faster pattern. Developers are infamous for choosing to do the fast thing instead of the correct thing.

Some quick analysis of the generated assembly for this example reveals that the Swift compiler currently generates pretty good code here, so I think this benchmark is more of a regression test than a target for improvement.

Differences from product code

Some HTTP/2-specific types have been removed and replaced with Int. These types are ultimately structs backed by Int32, so the core layout of the enumeration is not meaningfully different, and I don't think this change will affect the benchmark.
In two of these functions in the product code, the switch statement is contained within a do { } catch block, which has been removed here. That block does meaningfully affect the complexity of the functions in question, so it may be worth adding it back in: please let me know if that's worthwhile.
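For readers unfamiliar with the pattern, the following is a rough sketch of a switch wrapped in a do { } catch block. The product code is not shown in this pull request, so everything here is an assumption made for illustration: the `StreamError` and `ReceiveState` types, the `receiveData(byteCount:)` method, and the choice to move to a failed state in the catch clauses.

```swift
// Hypothetical error and state types, for illustration only.
enum StreamError: Error, Equatable {
    case flowControlViolation
    case streamClosed
}

enum ReceiveState: Equatable {
    case open(receiveWindow: Int)
    case failed(StreamError)

    /// The switch sits inside do/catch, so any case body may throw and
    /// the error handling lives in a single place.
    mutating func receiveData(byteCount: Int) {
        do {
            switch self {
            case .open(let window):
                guard byteCount <= window else {
                    throw StreamError.flowControlViolation
                }
                self = .open(receiveWindow: window - byteCount)
            case .failed:
                throw StreamError.streamClosed
            }
        } catch let error as StreamError {
            self = .failed(error)
        } catch {
            // Required for exhaustiveness in a non-throwing function.
            self = .failed(.streamClosed)
        }
    }
}

var receiver = ReceiveState.open(receiveWindow: 100)
receiver.receiveData(byteCount: 40)   // .open(receiveWindow: 60)
receiver.receiveData(byteCount: 200)  // .failed(.flowControlViolation)
```

The extra control flow edges from each throwing case body into the catch clauses are the kind of added complexity the description refers to.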

lorentey · 2018-10-19T15:28:37Z

@swift-ci please smoke test

lorentey · 2018-10-19T15:28:47Z

@swift-ci smoke benchmark

weissi · 2018-10-19T15:35:54Z

@swift-ci Please smoke test OS X platform

weissi · 2018-10-19T15:36:13Z

@swift-ci Please benchmark

weissi · 2018-10-19T15:36:42Z

oh sorry, Karoy had already kicked that off :)

swift-ci · 2018-10-19T16:08:03Z

Build comment file:

Build failed before running benchmark.

swift-ci · 2018-10-19T16:14:09Z

Build comment file:

Build failed before running benchmark.

Lukasa · 2018-10-19T16:15:51Z

@swift-ci please smoke test

Lukasa · 2018-10-19T16:16:13Z

Aww, clearly requires the commit bit. 😞

weissi · 2018-10-19T16:16:47Z

@swift-ci Please smoke benchmark

swift-ci · 2018-10-19T16:44:19Z

Build comment file:

Performance: -O

Improvement	OLD	NEW	DELTA	RATIO
DictionaryKeysContainsNative	44	30	-31.8%	1.47x
Array2D	7505	6907	-8.0%	1.09x
MapReduce	428	397	-7.2%	1.08x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	0	0	0	—

Performance: -Osize

Improvement	OLD	NEW	DELTA	RATIO
Array2D	7208	6611	-8.3%	1.09x
MapReduceAnyCollection	437	408	-6.6%	1.07x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	0	0	0	—

Performance: -Onone

TEST	MIN	MAX	MEAN	MAX_RSS
Added
HTTP2StateMachine	11	12	12	—

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).
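The DELTA and RATIO columns are not defined in the report itself. Judging from the rows in this thread, they appear to be the relative change from OLD to NEW and the quotient OLD / NEW. The sketch below checks that reading against the DictionaryKeysContainsNative row (44 to 30); the function names are hypothetical and not part of the benchmark tooling.

```swift
import Foundation

// Assumed definitions, inferred from the report rows rather than from the tooling.
func delta(old: Double, new: Double) -> Double {
    return (new - old) / old * 100   // percent change; negative means faster
}

func ratio(old: Double, new: Double) -> Double {
    return old / new                 // above 1.0 means the new build is faster
}

let d = delta(old: 44, new: 30)
let r = ratio(old: 44, new: 30)
print(String(format: "%+.1f%% %.2fx", d, r))  // prints "-31.8% 1.47x"
```

The same formulas reproduce the DataCopyBytes regression reported later (560 to 612 gives +9.3% and 0.92x).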

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

Lukasa · 2018-10-19T16:48:47Z

Added a few orders of magnitude.

weissi · 2018-10-19T16:49:51Z

@swift-ci Please benchmark

weissi · 2018-10-19T16:50:44Z

@swift-ci Please smoke benchmark

swift-ci · 2018-10-19T17:19:11Z

Build comment file:

Performance: -O

Improvement	OLD	NEW	DELTA	RATIO
Array2D	7505	6907	-8.0%	1.09x
OpenClose	77	71	-7.8%	1.08x
MapReduceAnyCollection	398	370	-7.0%	1.08x
MapReduce	427	397	-7.0%	1.08x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	5	6	5	—

Performance: -Osize

Improvement	OLD	NEW	DELTA	RATIO
Array2D	7208	6611	-8.3%	1.09x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	5	5	5	—

Performance: -Onone

TEST	MIN	MAX	MEAN	MAX_RSS
Added
HTTP2StateMachine	11000	11164	11055	—

swift-ci · 2018-10-19T17:36:28Z

Build comment file:

Performance: -O

Improvement	OLD	NEW	DELTA	RATIO
Array2D	7505	6907	-8.0%	1.09x
MapReduceAnyCollection	398	370	-7.0%	1.08x
MapReduce	425	397	-6.6%	1.07x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	5	6	5	—

Performance: -Osize

Improvement	OLD	NEW	DELTA	RATIO
Array2D	7208	6611	-8.3%	1.09x
MapReduceAnyCollection	437	408	-6.6%	1.07x (?)

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	5	5	5	—

Performance: -Onone

TEST	MIN	MAX	MEAN	MAX_RSS
Added
HTTP2StateMachine	11518	11685	11574	—

Lukasa · 2018-10-22T21:10:10Z

Ok, added two more orders of magnitude.

weissi · 2018-10-24T04:31:26Z

@swift-ci Please smoke benchmark

swift-ci · 2018-10-24T05:18:14Z

Build comment file:

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
DataCopyBytes	560	612	+9.3%	0.92x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	571	571	571	—

Performance: -Osize

TEST	MIN	MAX	MEAN	MAX_RSS
Added
HTTP2StateMachine	571	571	571	—

Performance: -Onone

TEST	MIN	MAX	MEAN	MAX_RSS
Added
HTTP2StateMachine	1131384	1132150	1131804	—

CodaFi · 2019-11-18T20:48:10Z

@Lukasa This benchmark still seems valuable to have. Could you rebase this patch so we can get it merged?

Lukasa · 2019-11-19T21:10:59Z

@swift-ci please smoke test

Lukasa · 2019-11-19T21:11:05Z

@swift-ci please benchmark

Lukasa · 2019-11-19T21:11:40Z

@CodaFi Sure thing, rebased and resolved the conflicts, and kicked off some tests.

Lukasa · 2019-11-19T21:32:22Z

I don't think that smoke test failure is on me.

swift-ci · 2019-11-19T21:37:29Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
FlattenListFlatMap	7038	9437	+34.1%	0.75x (?)
CharacterLiteralsLarge	97	108	+11.3%	0.90x

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	571	571	571	—

Removed	MIN	MAX	MEAN	MAX_RSS
InsertCharacterEndIndex	159	160	160	—
InsertCharacterEndIndexNonASCII	56	57	56	—
InsertCharacterStartIndex	656	656	656	—
InsertCharacterStartIndexNonASCII	362	365	363	—
InsertCharacterTowardsEndIndex	199	201	200	—
InsertCharacterTowardsEndIndexNonASCII	216	218	217	—
ParseInt.Large.Binary	271	271	271	—
ParseInt.Large.Decimal	127	137	131	—
ParseInt.Large.Hex	195	197	196	—
ParseInt.Large.UncommonRadix	144	144	144	—
ParseInt.Small.Binary	552	552	552	—
ParseInt.Small.Decimal	334	334	334	—
ParseInt.Small.Hex	344	367	352	—
ParseInt.Small.UncommonRadix	358	358	358	—

Code size: -O

Performance: -Osize

Improvement	OLD	NEW	DELTA	RATIO
FlattenListLoop	4950	4079	-17.6%	1.21x (?)
NSStringConversion.LongUTF8	593	544	-8.3%	1.09x (?)
ObjectiveCBridgeStubNSDateRefAccess	400	371	-7.2%	1.08x (?)

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	571	571	571	—

Removed	MIN	MAX	MEAN	MAX_RSS
InsertCharacterEndIndex	159	174	164	—
InsertCharacterEndIndexNonASCII	56	60	58	—
InsertCharacterStartIndex	659	659	659	—
InsertCharacterStartIndexNonASCII	365	367	366	—
InsertCharacterTowardsEndIndex	213	218	215	—
InsertCharacterTowardsEndIndexNonASCII	222	225	223	—
ParseInt.Large.Binary	267	267	267	—
ParseInt.Large.Decimal	131	131	131	—
ParseInt.Large.Hex	199	201	200	—
ParseInt.Large.UncommonRadix	144	144	144	—
ParseInt.Small.Binary	566	575	569	—
ParseInt.Small.Decimal	334	334	334	—
ParseInt.Small.Hex	351	352	351	—
ParseInt.Small.UncommonRadix	354	354	354	—

Code size: -Osize

Performance: -Onone

Added	MIN	MAX	MEAN	MAX_RSS
HTTP2StateMachine	3419182	3505167	3476193	—

Removed	MIN	MAX	MEAN	MAX_RSS
InsertCharacterEndIndex	229	232	230	—
InsertCharacterEndIndexNonASCII	80	82	81	—
InsertCharacterStartIndex	754	758	757	—
InsertCharacterStartIndexNonASCII	413	413	413	—
InsertCharacterTowardsEndIndex	243	246	244	—
InsertCharacterTowardsEndIndexNonASCII	235	239	236	—
ParseInt.Large.Binary	11822	11857	11838	—
ParseInt.Large.Decimal	4090	4100	4095	—
ParseInt.Large.Hex	3737	3832	3772	—
ParseInt.Large.UncommonRadix	4679	4689	4685	—
ParseInt.Small.Binary	23380	23492	23433	—
ParseInt.Small.Decimal	13721	13761	13747	—
ParseInt.Small.Hex	13392	13446	13415	—
ParseInt.Small.UncommonRadix	15865	16021	15926	—

Code size: -swiftlibs

✅	Benchmark Check Report

CodaFi · 2020-04-14T23:19:44Z

@swift-ci test

CodaFi · 2020-04-15T03:03:53Z

⛵

Lukasa force-pushed the cb-bench-h2-enumeration branch from 87af504 to 906a9fc on October 19, 2018 15:24

Lukasa force-pushed the cb-bench-h2-enumeration branch from 906a9fc to 0caab21 on October 19, 2018 16:15

Lukasa force-pushed the cb-bench-h2-enumeration branch from 0caab21 to 794fea9 on October 19, 2018 16:48

Lukasa force-pushed the cb-bench-h2-enumeration branch from 794fea9 to 32c366e on October 22, 2018 21:10

[benchmark] Add HTTP2StateMachine benchmark

7e82b6f

Lukasa force-pushed the cb-bench-h2-enumeration branch from 32c366e to 7e82b6f on November 19, 2019 21:10

CodaFi merged commit 80cc726 into swiftlang:master Apr 15, 2020

PatrickPijnappel mentioned this pull request Apr 26, 2020

[benchmark] Revert removal of InsertCharacter & IntegerParsing #31326

Merged
