Skip to content

🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state #20185

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

lorentey
Copy link
Member

It seems too costly to make Hasher a resilient struct, so leave it @_fixed_layout.

To partially compensate for this, add 256 bits of extra state space, reserved for potential future use.

@lorentey
Copy link
Member Author

C.f. #20180

@lorentey
Copy link
Member Author

@swift-ci please benchmark

@lorentey
Copy link
Member Author

@swift-ci please test

@swift-ci

This comment has been minimized.

@swift-ci

This comment has been minimized.

private var _state: _SipHashState

@inline(__always)
@usableFromInline
@_effects(releasenone)
Copy link
Member Author

@lorentey lorentey Oct 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah, these should've remained force-inlined...

@lorentey
Copy link
Member Author

lorentey commented Oct 31, 2018

Let's run a control benchmark without the padding. This should have minimal/no regressions from master.

@swift-ci benchmark

@lorentey
Copy link
Member Author

@swift-ci smoke test

@swift-ci

This comment has been minimized.

@lorentey
Copy link
Member Author

Those code size increases... 🤔

@lorentey lorentey force-pushed the hasher-padding-redux branch from 493a634 to 297c82d Compare October 31, 2018 22:13
@lorentey
Copy link
Member Author

@swift-ci benchmark

@lorentey
Copy link
Member Author

lorentey commented Oct 31, 2018

It looks like eliminating the _HasherCore protocol changes inlining behavior, dramatically increasing the code size of specialized Set/Dictionary methods. (By as much as a factor of 11.)

I spun off #20198 to isolate this.

@swift-ci
Copy link
Contributor

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
CountAlgoString 2749 3355 +22.0% 0.82x
StringHasPrefixUnicode 97650 116482 +19.3% 0.84x
StringHashing_ascii 33 39 +18.2% 0.85x
Dictionary 364 424 +16.5% 0.86x
StringMatch 7116 8265 +16.1% 0.86x (?)
FrequenciesUsingReduceInto 1167 1343 +15.1% 0.87x
Dictionary3 159 179 +12.6% 0.89x
Chars2 2618 2901 +10.8% 0.90x
CStringLongAscii 3296 3638 +10.4% 0.91x
WordCountUniqueASCII 1612 1763 +9.4% 0.91x
Improvement
IterateData 1787 1665 -6.8% 1.07x (?)

Code size: -O

TEST OLD NEW DELTA RATIO
Regression
StringMatch.o 15991 41607 +160.2% 0.38x
CountAlgo.o 20893 45750 +119.0% 0.46x
StringRemoveDupes.o 15445 33514 +117.0% 0.46x
ReduceInto.o 24735 52873 +113.8% 0.47x
WordCount.o 65320 107570 +64.7% 0.61x
SuperChars.o 9350 13702 +46.5% 0.68x
RomanNumbers.o 11205 15253 +36.1% 0.73x
CSVParsing.o 45008 55635 +23.6% 0.81x
Substring.o 27833 31401 +12.8% 0.89x
DriverUtils.o 167991 188767 +12.4% 0.89x
Combos.o 16525 18237 +10.4% 0.91x
Chars.o 12212 13172 +7.9% 0.93x
StringComparison.o 51900 55468 +6.9% 0.94x
DictionaryRemove.o 15122 15730 +4.0% 0.96x
DictTest2.o 19692 20220 +2.7% 0.97x
DictionarySwap.o 27430 28038 +2.2% 0.98x
RGBHistogram.o 24645 25173 +2.1% 0.98x
DictionaryCopy.o 8929 9120 +2.1% 0.98x
DictTest3.o 28348 28932 +2.1% 0.98x
DictionaryCompactMapValues.o 22526 22958 +1.9% 0.98x
DictTest4Legacy.o 26936 27416 +1.8% 0.98x
DictTest4.o 26046 26318 +1.0% 0.99x
DictTest.o 51083 51611 +1.0% 0.99x
Improvement
DictionarySubscriptDefault.o 30271 29455 -2.7% 1.03x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
FrequenciesUsingReduceInto 1202 1436 +19.5% 0.84x
StringHasPrefixUnicode 97515 116262 +19.2% 0.84x
StringMatch 7479 8821 +17.9% 0.85x (?)
CountAlgoString 2828 3319 +17.4% 0.85x
StringHashing_ascii 34 39 +14.7% 0.87x
Chars2 2572 2929 +13.9% 0.88x
Dictionary3 163 184 +12.9% 0.89x
StringHasSuffixUnicode 97936 109607 +11.9% 0.89x
WordCountUniqueASCII 1852 2060 +11.2% 0.90x
StringRemoveDupes 411 457 +11.2% 0.90x
CStringLongAscii 3274 3529 +7.8% 0.93x
Improvement
DictionarySubscriptDefaultMutation 295 262 -11.2% 1.13x (?)
IterateData 1842 1636 -11.2% 1.13x (?)
SuperChars 21380 19062 -10.8% 1.12x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringMatch.o 7855 33575 +327.4% 0.23x
SuperChars.o 3407 11743 +244.7% 0.29x
StringRemoveDupes.o 8973 28378 +216.3% 0.32x
CountAlgo.o 14368 39277 +173.4% 0.37x
ReduceInto.o 17179 45728 +166.2% 0.38x
RomanNumbers.o 5806 13830 +138.2% 0.42x
Chars.o 5405 11469 +112.2% 0.47x
WordCount.o 54952 93530 +70.2% 0.59x
Combos.o 10749 16581 +54.3% 0.65x
Substring.o 20129 27619 +37.2% 0.73x
CSVParsing.o 37096 50483 +36.1% 0.73x
StringComparison.o 42388 49996 +17.9% 0.85x
DriverUtils.o 143811 161659 +12.4% 0.89x
DictionaryRemove.o 13983 14463 +3.4% 0.97x
DictTest2.o 15958 16454 +3.1% 0.97x
DictionaryCopy.o 8137 8377 +2.9% 0.97x
DictTest3.o 22486 22982 +2.2% 0.98x
RGBHistogram.o 22541 23021 +2.1% 0.98x
DictionaryCompactMapValues.o 21054 21470 +2.0% 0.98x
DictionarySwap.o 25163 25643 +1.9% 0.98x
RemoveWhere.o 24302 24606 +1.3% 0.99x
Improvement
DictionarySubscriptDefault.o 27079 26647 -1.6% 1.02x
DictTest4.o 21199 20911 -1.4% 1.01x
RangeAssignment.o 5229 5162 -1.3% 1.01x
DictionaryOfAnyHashableStrings.o 10109 9997 -1.1% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
StringHasPrefixUnicode 100681 120749 +19.9% 0.83x
StringHasSuffixUnicode 100930 112409 +11.4% 0.90x
Improvement
Histogram 6713 5768 -14.1% 1.16x
CaptureProp 334906 290237 -13.3% 1.15x
RGBHistogram 23678 21397 -9.6% 1.11x
ArrayOfPOD 856 782 -8.6% 1.09x (?)

Code size: Swift libraries

TEST OLD NEW DELTA RATIO
Regression
libswiftCore.dylib 3944448 4014080 +1.8% 0.98x
Improvement
libswiftSwiftPrivateLibcExtras.dylib 24576 20480 -16.7% 1.20x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@lorentey
Copy link
Member Author

lorentey commented Nov 1, 2018

@swift-ci please benchmark

@lorentey lorentey changed the title [stdlib] Finalize Hasher layout by adding 256 bits of extra state 🚱[DNM][stdlib] Finalize Hasher layout by adding 256 bits of extra state Nov 1, 2018
@swift-ci
Copy link
Contributor

swift-ci commented Nov 1, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
StringHashing_ascii 33 39 +18.2% 0.85x
Dictionary 368 422 +14.7% 0.87x

Code size: -O

TEST OLD NEW DELTA RATIO
Improvement
DictionarySwap.o 28502 28038 -1.6% 1.02x
DictTest4Legacy.o 27864 27416 -1.6% 1.02x
DictTest4.o 26702 26318 -1.4% 1.01x
DictionarySubscriptDefault.o 29759 29455 -1.0% 1.01x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
StringHashing_ascii 33 39 +18.2% 0.85x
Dictionary 371 429 +15.6% 0.86x
Dictionary3 162 181 +11.7% 0.90x
Improvement
StringHashing_abnormal 1464 1344 -8.2% 1.09x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Improvement
DictTest4.o 21247 20911 -1.6% 1.02x
DictTest4Legacy.o 23033 22713 -1.4% 1.01x
StringRemoveDupes.o 28746 28378 -1.3% 1.01x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
ArrayOfPOD 782 859 +9.8% 0.91x (?)
Improvement
ObjectiveCBridgeStubFromArrayOfNSString2 4169 3520 -15.6% 1.18x (?)
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@lorentey
Copy link
Member Author

lorentey commented Nov 6, 2018

Closing. The last remaining piece of this is #20198.

@lorentey lorentey closed this Nov 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants