🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state #20185

lorentey · 2018-10-31T18:15:01Z

It seems too costly to make Hasher a resilient struct, so leave it @_fixed_layout.

To partially compensate for this, add 256 bits of extra state space, reserved for potential future use.

lorentey · 2018-10-31T18:15:30Z

C.f. #20180

lorentey · 2018-10-31T18:15:37Z

@swift-ci please benchmark

lorentey · 2018-10-31T18:15:45Z

@swift-ci please test

lorentey · 2018-10-31T19:32:31Z

stdlib/public/core/SipHash.swift

  private var _state: _SipHashState

-  @inline(__always)
+  @usableFromInline
+  @_effects(releasenone)


Gah, these should've remained force-inlined...

lorentey · 2018-10-31T19:45:52Z

Let's run a control benchmark without the padding. This should have minimal/no regressions from master.

@swift-ci benchmark

lorentey · 2018-10-31T19:47:44Z

@swift-ci smoke test

lorentey · 2018-10-31T21:10:43Z

Those code size increases... 🤔

@inline

These accidentally lost their @inline(__always) attributes during the reshuffle.

…nlined

lorentey · 2018-10-31T22:14:11Z

@swift-ci benchmark

lorentey · 2018-10-31T22:44:39Z

It looks like eliminating the _HasherCore protocol changes inlining behavior, dramatically increasing the code size of specialized Set/Dictionary methods. (By as much as a factor of 11.)

I spun off #20198 to isolate this.

swift-ci · 2018-10-31T23:03:48Z

Build comment file:

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
CountAlgoString	2749	3355	+22.0%	0.82x
StringHasPrefixUnicode	97650	116482	+19.3%	0.84x
StringHashing_ascii	33	39	+18.2%	0.85x
Dictionary	364	424	+16.5%	0.86x
StringMatch	7116	8265	+16.1%	0.86x (?)
FrequenciesUsingReduceInto	1167	1343	+15.1%	0.87x
Dictionary3	159	179	+12.6%	0.89x
Chars2	2618	2901	+10.8%	0.90x
CStringLongAscii	3296	3638	+10.4%	0.91x
WordCountUniqueASCII	1612	1763	+9.4%	0.91x
Improvement
IterateData	1787	1665	-6.8%	1.07x (?)

Code size: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
StringMatch.o	15991	41607	+160.2%	0.38x
CountAlgo.o	20893	45750	+119.0%	0.46x
StringRemoveDupes.o	15445	33514	+117.0%	0.46x
ReduceInto.o	24735	52873	+113.8%	0.47x
WordCount.o	65320	107570	+64.7%	0.61x
SuperChars.o	9350	13702	+46.5%	0.68x
RomanNumbers.o	11205	15253	+36.1%	0.73x
CSVParsing.o	45008	55635	+23.6%	0.81x
Substring.o	27833	31401	+12.8%	0.89x
DriverUtils.o	167991	188767	+12.4%	0.89x
Combos.o	16525	18237	+10.4%	0.91x
Chars.o	12212	13172	+7.9%	0.93x
StringComparison.o	51900	55468	+6.9%	0.94x
DictionaryRemove.o	15122	15730	+4.0%	0.96x
DictTest2.o	19692	20220	+2.7%	0.97x
DictionarySwap.o	27430	28038	+2.2%	0.98x
RGBHistogram.o	24645	25173	+2.1%	0.98x
DictionaryCopy.o	8929	9120	+2.1%	0.98x
DictTest3.o	28348	28932	+2.1%	0.98x
DictionaryCompactMapValues.o	22526	22958	+1.9%	0.98x
DictTest4Legacy.o	26936	27416	+1.8%	0.98x
DictTest4.o	26046	26318	+1.0%	0.99x
DictTest.o	51083	51611	+1.0%	0.99x
Improvement
DictionarySubscriptDefault.o	30271	29455	-2.7%	1.03x

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
FrequenciesUsingReduceInto	1202	1436	+19.5%	0.84x
StringHasPrefixUnicode	97515	116262	+19.2%	0.84x
StringMatch	7479	8821	+17.9%	0.85x (?)
CountAlgoString	2828	3319	+17.4%	0.85x
StringHashing_ascii	34	39	+14.7%	0.87x
Chars2	2572	2929	+13.9%	0.88x
Dictionary3	163	184	+12.9%	0.89x
StringHasSuffixUnicode	97936	109607	+11.9%	0.89x
WordCountUniqueASCII	1852	2060	+11.2%	0.90x
StringRemoveDupes	411	457	+11.2%	0.90x
CStringLongAscii	3274	3529	+7.8%	0.93x
Improvement
DictionarySubscriptDefaultMutation	295	262	-11.2%	1.13x (?)
IterateData	1842	1636	-11.2%	1.13x (?)
SuperChars	21380	19062	-10.8%	1.12x

Code size: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
StringMatch.o	7855	33575	+327.4%	0.23x
SuperChars.o	3407	11743	+244.7%	0.29x
StringRemoveDupes.o	8973	28378	+216.3%	0.32x
CountAlgo.o	14368	39277	+173.4%	0.37x
ReduceInto.o	17179	45728	+166.2%	0.38x
RomanNumbers.o	5806	13830	+138.2%	0.42x
Chars.o	5405	11469	+112.2%	0.47x
WordCount.o	54952	93530	+70.2%	0.59x
Combos.o	10749	16581	+54.3%	0.65x
Substring.o	20129	27619	+37.2%	0.73x
CSVParsing.o	37096	50483	+36.1%	0.73x
StringComparison.o	42388	49996	+17.9%	0.85x
DriverUtils.o	143811	161659	+12.4%	0.89x
DictionaryRemove.o	13983	14463	+3.4%	0.97x
DictTest2.o	15958	16454	+3.1%	0.97x
DictionaryCopy.o	8137	8377	+2.9%	0.97x
DictTest3.o	22486	22982	+2.2%	0.98x
RGBHistogram.o	22541	23021	+2.1%	0.98x
DictionaryCompactMapValues.o	21054	21470	+2.0%	0.98x
DictionarySwap.o	25163	25643	+1.9%	0.98x
RemoveWhere.o	24302	24606	+1.3%	0.99x
Improvement
DictionarySubscriptDefault.o	27079	26647	-1.6%	1.02x
DictTest4.o	21199	20911	-1.4%	1.01x
RangeAssignment.o	5229	5162	-1.3%	1.01x
DictionaryOfAnyHashableStrings.o	10109	9997	-1.1%	1.01x

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
StringHasPrefixUnicode	100681	120749	+19.9%	0.83x
StringHasSuffixUnicode	100930	112409	+11.4%	0.90x
Improvement
Histogram	6713	5768	-14.1%	1.16x
CaptureProp	334906	290237	-13.3%	1.15x
RGBHistogram	23678	21397	-9.6%	1.11x
ArrayOfPOD	856	782	-8.6%	1.09x (?)

Code size: Swift libraries

TEST	OLD	NEW	DELTA	RATIO
Regression
libswiftCore.dylib	3944448	4014080	+1.8%	0.98x
Improvement
libswiftSwiftPrivateLibcExtras.dylib	24576	20480	-16.7%	1.20x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

…ations

lorentey · 2018-11-01T01:36:36Z

@swift-ci please benchmark

swift-ci · 2018-11-01T02:21:29Z

Build comment file:

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
StringHashing_ascii	33	39	+18.2%	0.85x
Dictionary	368	422	+14.7%	0.87x

Code size: -O

TEST	OLD	NEW	DELTA	RATIO
Improvement
DictionarySwap.o	28502	28038	-1.6%	1.02x
DictTest4Legacy.o	27864	27416	-1.6%	1.02x
DictTest4.o	26702	26318	-1.4%	1.01x
DictionarySubscriptDefault.o	29759	29455	-1.0%	1.01x

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
StringHashing_ascii	33	39	+18.2%	0.85x
Dictionary	371	429	+15.6%	0.86x
Dictionary3	162	181	+11.7%	0.90x
Improvement
StringHashing_abnormal	1464	1344	-8.2%	1.09x

Code size: -Osize

TEST	OLD	NEW	DELTA	RATIO
Improvement
DictTest4.o	21247	20911	-1.6%	1.02x
DictTest4Legacy.o	23033	22713	-1.4%	1.01x
StringRemoveDupes.o	28746	28378	-1.3%	1.01x

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
ArrayOfPOD	782	859	+9.8%	0.91x (?)
Improvement
ObjectiveCBridgeStubFromArrayOfNSString2	4169	3520	-15.6%	1.18x (?)

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

lorentey · 2018-11-06T15:38:39Z

Closing. The last remaining piece of this is #20198.

lorentey requested a review from airspeedswift October 31, 2018 18:20

This comment has been minimized.

Sign in to view

lorentey commented Oct 31, 2018

View reviewed changes

This comment has been minimized.

Sign in to view

lorentey added 5 commits October 31, 2018 22:04

[stdlib] Finalize Hasher layout by adding 256 bits of extra state

d80c46b

[stdlib] Remove SipHash from ABI names

7b7b2e9

[stdlib] Hasher._Core: Force-inline compress and finalize

d777a9b

These accidentally lost their @inline(__always) attributes during the reshuffle.

[stdlib] Temporarily disable Hasher padding

9c617ab

[stdlib] Hasher._combine(bytes:,count:): Make internal and forcibly i…

297c82d

…nlined

lorentey force-pushed the hasher-padding-redux branch from 493a634 to 297c82d Compare October 31, 2018 22:13

lorentey mentioned this pull request Oct 31, 2018

[stdlib] Eliminate _HasherCore protocol #20198

Merged

lorentey mentioned this pull request Oct 31, 2018

[stdlib] Add 256 bits of extra state to Hasher #20202

Merged

[stdlib] Character: Don’t force-inline Equatable/Comparable implement…

7aabf88

…ations

lorentey changed the title ~~[stdlib] Finalize Hasher layout by adding 256 bits of extra state~~ 🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state Nov 1, 2018

lorentey mentioned this pull request Nov 1, 2018

[WIP][stdlib] Finalize Hasher ABI #19685

Closed

lorentey closed this Nov 6, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state #20185

🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state #20185

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

This comment has been minimized.

This comment has been minimized.

lorentey Oct 31, 2018 •

edited

Loading

Uh oh!

lorentey commented Oct 31, 2018 •

edited

Loading

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

This comment has been minimized.

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018 •

edited

Loading

Uh oh!

swift-ci commented Oct 31, 2018

Uh oh!

lorentey commented Nov 1, 2018

Uh oh!

swift-ci commented Nov 1, 2018

Uh oh!

lorentey commented Nov 6, 2018

Uh oh!

Uh oh!

🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state #20185

🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state #20185

Uh oh!

Conversation

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

This comment has been minimized.

This comment has been minimized.

lorentey Oct 31, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lorentey commented Oct 31, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

This comment has been minimized.

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018

Uh oh!

lorentey commented Oct 31, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

swift-ci commented Oct 31, 2018

Build comment file:

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: Swift libraries

Uh oh!

lorentey commented Nov 1, 2018

Uh oh!

swift-ci commented Nov 1, 2018

Build comment file:

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Uh oh!

lorentey commented Nov 6, 2018

Uh oh!

Uh oh!

lorentey Oct 31, 2018 •

edited

Loading

lorentey commented Oct 31, 2018 •

edited

Loading

lorentey commented Oct 31, 2018 •

edited

Loading