-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[stdlib/Sema/AST] Adopt a commonly used implementation for combining hashes #13056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/cc @allevato As we'd discussed. |
// result = _mixForSynthesizedHashValue(result, a1.hashValue) | ||
// result = _combineHashValues(result, 2) | ||
// result = _combineHashValues(result, a0.hashValue) | ||
// result = _combineHashValues(result, a1.hashValue) | ||
// } | ||
// result = _mixInt(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it would be worthwhile to eliminate these final calls to _mixInt
now? They might be redundant since the seed in your new algorithm starts out well-randomized already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They might be redundant now, but I haven't thought it through and it can't hurt, so we can wait on that, maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine—it might also be good (in a future PR) to have benchmarks against the synthesized hashValue
so that future enhancements can be assured to not have performance regressions. (Not that I believe that would be the case here.)
This looks good to me—and I agree that the new name is clearer. |
@swift-ci please smoke benchmark |
Build comment file:Optimized (O)Regression (1)
Improvement (1)
No Changes (332)
Unoptimized (Onone)Regression (2)
Improvement (3)
No Changes (329)
Hardware Overview
|
@CodaFi Can you think of others who might be interested in reviewing this before potential merging? |
This patch LGTM, but I'll tag Slava too. |
@swift-ci please test |
Rebased to resolve documentation update conflicts. @slavapestov, LGTY? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new _combineHashValues
looks great! LGTM
(It would be nice if we moved the choice of a hash function out of Hashable, so that we would not need to hard-wire any particular combinator here. However, that's clearly out of scope for this PR, and we definitely want to have better hashing until that's done.)
@swift-ci please test |
Shall we go ahead? |
LGTM! |
In #9619, @allevato left room for adopting a more sophisticated algorithm for combining hashes. This PR adopts a commonly used implementation first described by Hoad and Zobel and since adopted by others, including the Boost C++ library.
While working with the code, I've also renamed the function to
_combineHashValues
. As I understand it, the term mix is generally used for(T) -> T
operations that are intended to improve the distribution without creating new collisions, whereas the term combine is more descriptive of(T, T) -> T
operations that inevitably create new collisions (though it may perform some mixing).