-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Use a better hash seed when doing string hashing in the compiler #26685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please test |
/// | ||
/// This is the same as the default used by llvm::djbHash, just provided | ||
/// explicitly here to note that it's part of the format. | ||
const uint32_t SWIFTMODULE_HASH_SEED = 5381; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: Is the fact that the format depends on the hash seed a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's an OnDiskHashTable, so the hash used is inherently part of the format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
@@ -52,7 +52,13 @@ const uint16_t SWIFTMODULE_VERSION_MAJOR = 0; | |||
/// describe what change you made. The content of this comment isn't important; | |||
/// it just ensures a conflict if two people change the module format. | |||
/// Don't worry about adhering to the 80-column limit for this line. | |||
const uint16_t SWIFTMODULE_VERSION_MINOR = 509; // [dynamic_lifetime] sil flag | |||
const uint16_t SWIFTMODULE_VERSION_MINOR = 510; // better string hash seed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's up with the end-of-line comment both on the LHS and the RHS of the patch? Seems like neither one belongs on this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah.. it's the last change that causes us to bump the version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, see the comment above.
@@ -388,8 +388,7 @@ class ModuleFile::DeclTableInfo { | |||
|
|||
hash_value_type ComputeHash(internal_key_type key) { | |||
if (key.first == DeclBaseName::Kind::Normal) { | |||
// FIXME: DJB seed=0, audit whether the default seed could be used. | |||
return llvm::djbHash(key.second, 0); | |||
return llvm::djbHash(key.second, SWIFTMODULE_HASH_SEED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps factor this out into a ModuleFile::hashFunction(). I'm not sure whether this is much better, I'll leave it up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I considered that, but figured this was slightly more likely to keep people from doing the wrong thing in the future. I'm not 100% sure of that though.
Build failed |
3142ce8
to
f51b775
Compare
@swift-ci Please test |
Build failed |
Build failed |
f51b775
to
2273fdc
Compare
Updated to make the swiftdoc hash seed more explicit. @swift-ci Please test |
Build failed |
Build failed |
Hmm. @swift-ci Please clean test |
Build failed |
Build failed |
Definitely concerning. What changed between now and last time? Maybe someone added another hash? |
Oops, yes, they did. Or I resolved a conflict incorrectly. |
2273fdc
to
d2826e5
Compare
@swift-ci Please test |
Build failed |
Build failed |
d2826e5
to
03d2ba7
Compare
@swift-ci Please test |
Build failed |
Build failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping the hash seed for swiftdoc files makes sense to me. Thank you for the heads-up!
03d2ba7
to
c06e105
Compare
@swift-ci Please smoke test |
The one for enum raw value checking isn't serialized anywhere, so we can use normal in-process hashing. The ones in the Clang importer's lookup tables are serialized, but the lookup table is already only valid for a particular compiler build anyway (it goes into a Clang PCM, which has the full compiler version in its cache key).
Unlike compiled modules, swiftdoc files are considered a stable format, so we can't change how information is stored in them. If we add any more string-hashed tables to swiftdoc files, we should consider using a new hash seed for those.
...fulfilling the promised audit from 0747d9a. No intended functionality change /other/ than the order of already-unsorted lists. This affected a number of SIL tests that relied on deserialization order matching the original source order; I have no idea why the old hash logic would make that the case. If we think that's a valuable property, we should serialize a list of functions in addition to the iterable table. (Maybe just in SIB mode?)
Remove explicit seeds from string hashes that are compiler-version-dependent already.
Comment that we can't change the seed for swiftdoc files (@nkcsgexi).
Switch to a "better" explicit seed for serialization lookup tables. This last one changed the order of SIL entity deserialization (as @adrian-prantl noticed in the original Force the DJB hash seed to 0 for compatibility with HashString. #14890), but we shouldn't have been relying on that anyway (@gottesmm). It's actually rather scary that that was matching source order at the time.