-
Notifications
You must be signed in to change notification settings - Fork 933
Generate stable sql objects names #1802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate stable sql objects names #1802
Conversation
579dcb0
to
464cb10
Compare
using NHibernate.Mapping; | ||
using NUnit.Framework; | ||
|
||
namespace NHibernate.Test.NHSpecificTest.NH1399 | ||
{ | ||
[TestFixture] | ||
[TestFixture, Obsolete] | ||
public class Fixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixture is testing Table.UniqueColumnString
with a non-null referencedTable parameter, case which was no more having any usage in NHibernate. So indeed these tests are obsolete since some times.
if (string.IsNullOrEmpty(fk.Name)) | ||
{ | ||
fk.Name = Constraint.GenerateName( | ||
fk.GeneratedConstraintNamePrefix, table, fk.ReferencedTable, fk.Columns); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a heavily simplified port of the Hibernate code, which instead uses name policy generation and bothers with quoting/unquoting (like the things removed in #1703).
src/NHibernate/Mapping/Constraint.cs
Outdated
// Hash the generated name for avoiding collisions with user choosen names. | ||
// This is not 100% reliable, as hashing may still have a chance of generating | ||
// collisions. | ||
var name = prefix + HashName(sb.ToString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite dislike the idea of "uniqueness through hashcode", since it is fundamentally flawed: collisions may occur, even if very unlikely. But I have no better idea here.
src/NHibernate/Mapping/Constraint.cs
Outdated
if (name.Length > 30) | ||
{ | ||
// This, of course, increases the collision risk. | ||
name = name.Substring(0, 30); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I was tempted to use Dialect.MaxAliasLength
. But that would require to partially undo #1703... Maybe instead should we add some setting for the max length here. After all we are not generating an alias. And a setting would be quite less heavy to resolve than the dialect.
@hazzik, any thoughts here?
src/NHibernate/Mapping/Constraint.cs
Outdated
/// <returns>The hased name.</returns> | ||
private static string HashName(string name) | ||
{ | ||
// Hibernate uses MD5, but with .Net this would throw on FIPS enabled machine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thing is from my point of view a very unfortunate design decision of the .Net Framework: if FIPS is enabled on the machine, any usage of non FIPS compliant algorithm is forbidden. But FIPS does not forbid them, it just forbids to use them for handling a security related task. And here we are not handling a security related task. But there is no way to tell it for allowing using MD5 here even on FIPS enabled machine. See this Microsoft blog (may be very long to load) for more information.
So if we want to use MD5 here without causing NHibernate to be unusable on a FIPS enabled machine, we would have to use a custom implementation, like this one.
src/NHibernate/Mapping/Constraint.cs
Outdated
{ | ||
// Hibernate uses MD5, but with .Net this would throw on FIPS enabled machine. | ||
// As a consequence generated names will be quite longer. | ||
using (var hasher = SHA256.Create()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should pass on a FIPS enabled machine, SHA256.Create
being documented as choosing a FIPS compliant one if needed. At first I was willing to use HashAlgorithm.Create
instead (which default to SHA-1, a bit shorter), but it throws PlatformNotSupported
under Core.
src/NHibernate/Mapping/Constraint.cs
Outdated
// Converting to base 32 for shortening the name. | ||
// Hibernate uses base 35, but we do not have a native implementation | ||
// in .Net, and base 32 is easier to implement. | ||
return ToBase32String(hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hibernate creates a BigInteger
from the byte array (.Net can do that too) then asks its base 35 representation: .Net does not have this natively, and implementing it was a bit heavy in my opinion.
I think using a cryptographic hash for this trivial task is an overkill. I was researching using lightweight non-cryptographic hashes to perform the operation, like MurmurHash2 and xxHash. xx claims to be the fastest. They both generate 32 bit integer as a result (they have 64 bit versions, but I think this is an overkill again) and provide 10/10 results on SMHasher. I think it should be enough for us. Both algorithms are open-sourced with compatible licenses |
For avoiding adding a new dependency, we could take xxHash version from Seok-Ju Yun by taking just the part of the source we need. The speed claim is about C version, so it may not be faster than MurMurHash2 unsafe or hackish versions from David Landmann (which we could also incorporate as source code). But for the usage it will have, mapping compilation and only on some foreign keys and unique constraints, I do not think it will make much differences. |
I benchmarked the implementations: BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134
Intel Core i7-2670QM CPU 2.20GHz (Sandy Bridge), 1 CPU, 8 logical and 4 physical cores
Frequency=2143568 Hz, Resolution=466.5119 ns, Timer=TSC
.NET Core SDK=2.1.302
[Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Clr : .NET Framework 4.7 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3131.0
Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
|
Let's stick with MurMurHash2Simple |
Remove cryptographic hash
464cb10
to
3323db7
Compare
src/NHibernate/Mapping/Constraint.cs
Outdated
{ | ||
// Use a concatenation that guarantees uniqueness, even if identical names | ||
// exist between all table and column identifiers. | ||
var sb = new StringBuilder("table`" + table.Name + "`"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use StringBuilder.Append instead of concatenation of the string. Eg:
var sb = new StringBuilder("table`").Append(table.Name).Append("`");
src/NHibernate/Mapping/Constraint.cs
Outdated
// exist between all table and column identifiers. | ||
var sb = new StringBuilder("table`" + table.Name + "`"); | ||
if (referencedTable != null) | ||
sb.Append("references`" + referencedTable.Name + "`"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
src/NHibernate/Mapping/Constraint.cs
Outdated
public static readonly ColumnComparator Instance = new ColumnComparator(); | ||
|
||
public int Compare(Column col1, Column col2) { | ||
return StringComparer.Ordinal.Compare(col1?.Name, col2?.Name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should compare on Column.CanonicalName
. The Name
of a Column class is case-sensitive when quoted and case-insensitive when not quoted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it depends on database. SQL Server 2017 keeps respecting the collation of the database (in fact, of its metadata tables), quoted or not.
It may also be case sensitive when unquoted, not only when quoted. And this varies from databases to databases.
Quoted identifier being case sensitive is an Oracle feature. So I think it is better taking the name "as is", relying on the user to map its objects in a consistent way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not talking about databases here, but about rules of NHibernate itself. Column has overridden Equals and GetHashCode operators which follow this rule. So we need to be consistent.
src/NHibernate/Mapping/Constraint.cs
Outdated
foreach (var column in alphabeticalColumns) | ||
{ | ||
var columnName = column == null ? "" : column.Name; | ||
sb.Append("column`").Append(columnName).Append("`"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, here should be CanonicalName prob.
Use StringBuilder fully
src/NHibernate/Mapping/Constraint.cs
Outdated
// bindings are given. | ||
var alphabeticalColumns = new List<Column>(columns); | ||
alphabeticalColumns.Sort(ColumnComparator.Instance); | ||
foreach (var column in alphabeticalColumns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use foreach (var column in columns.OrderBy(c=>c.Name))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I have ported the Java code without thinking about it.
Simplify code and use canonical name
Maybe we should include in generated key some truncated real parts like table and/or column names to minimize collisions risks on large databases and make generated keys more "user friendly" |
User has an ability to specify a name, so I would not bother |
Fix #1769.
Possible breaking change: