[llvm-profdata] Remove MD5 collision check in D147740
This is the patch from https://reviews.llvm.org/D153692, migrated to GitHub.
After testing D147740 with multiple industrial projects with ~10 million FunctionSamples, no MD5 collision has been found.
Assuming an ideal hash that distributes values uniformly, the probability of at least one collision among N symbols over K possible hash values is 1 - K!/((K-N)! * K^N). With K = 2^64, the probability is about 3*10^-8 for N = 1 million and about 3*10^-6 for N = 10 million, so we are unlikely to hit an actual collision in a real-world application. (If K is 2^32, however, the probability of a collision is almost 1. This is a real problem if anyone still uses a large profile on a 32-bit machine, because hash_code is tied to size_t.)
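The figures above can be checked with a short script. This is a minimal sketch, not part of the patch: the function name `collision_probability` is made up for illustration, and it uses the standard birthday approximation 1 - exp(-N(N-1)/(2K)) in place of the exact factorial formula, which is impractical to evaluate directly for K = 2^64.

```python
import math


def collision_probability(n: int, k: int) -> float:
    """Approximate probability of at least one collision among n symbols
    hashed uniformly into k possible values.

    Uses the birthday approximation 1 - exp(-n(n-1)/(2k)) rather than the
    exact 1 - K!/((K-N)! * K^N). expm1 keeps precision when the result is
    very close to zero.
    """
    return -math.expm1(-n * (n - 1) / (2 * k))


if __name__ == "__main__":
    # Hypothetical sample sizes matching the ones discussed in the message.
    for bits in (64, 32):
        for n in (10**6, 10**7):
            p = collision_probability(n, 2**bits)
            print(f"K = 2^{bits}, N = {n}: p = {p:.3g}")
```

The approximation is accurate when N is much smaller than K, which holds for the 64-bit cases; for the 32-bit case it saturates to a value indistinguishable from 1.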
Furthermore, when a collision happens we can't do anything to recover from it, short of using a multimap, which is significantly slower and defeats the purpose of optimizing the profile reader.
One more thing: the profiles with MD5 names that we have been using must have come from non-MD5 sources. If a hash collision were to happen, it would already have happened when the non-MD5 profile was converted to an MD5 one, so there is no point in checking for it in the reader, and this feature can be removed.