Skip to content

NH-3931 - Invalid order of child inserts when using TPH inheritance #577

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

fredericDelaporte
Copy link
Member

Test cases and fix for NH-3931.

Ported from HHH-9864 and HHH-11216.

Some additional changes were required on NHibernate side, see comments on file ActionQueue.

@@ -107,17 +109,17 @@ public void AddAction(BulkOperationCleanupAction cleanupAction)
{
RegisterCleanupActions(cleanupAction);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While working, I have ended up CTRL+K CTRL+D due to wrecked indentation in Java ported code for the fix, without noticing the amount of bad spaces already littering this file. Actual changes start at ligne 507.

{
log.WarnFormat("Entity {0} persister does not provide meta-data, its batch ordering may be wrong and cause a failure.",
batchIdentifier.EntityName);
return;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See code comment above.
Both on Hibernate side and NHibernate side, the ClassMetadata has following reference code comment:

Get the user-visible metadata for the class (optional operation)

Does that mean we have to support null when calling this property? Not "supporting" it was causing a test, Legacy.MasterDetailTest, to fail.

The "support" I have implemented just give up ordering. Obviously a project using some custom provider without meta-data will very likely run into troubles with this change. Must we support this case, custom provider without meta-data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you trace what has happened to Legacy.MasterDetailTest in Hibernate?

Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still there. There are some changes but getClassMetadata still yield null.

Hibernate does not have insert ordering and batching enabled by default. This test on their side may run unbatched. It would explain why they do not see this failure, neither maybe the one-to-one failure we have below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left them some comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have "solved" it the same way: no ordering in case the persister does not provide class meta data. But I have noticed that on their side, their custom provider does provide an entity meta model while ours does not. Maybe syncing our custom provider with their and fallback-ing to the entity meta model could work.

var entityType = (EntityType)type;
// NHibernate has non constrained one-to-one among its state while it seems Hibernate does not.
// We have to filter them out, since they are indeed children, not parents.
if (!(entityType.IsOneToOne && entityType.IsNullable))
Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See code comment, either Hibernate has still a bug with those changes, or Hibernate does not fetch the same data in action.State.

Without that, I get both one-to-one sides referencing each other as parent. This causes the sorting algorithm to swap them, even if the actual parent was the first. 14 tests were failing before I changed that.

var entityName = batchIdentifier.EntityName;

//Make sure that child entries are not before parents
for (var j = i - 1; j >= 0; j--)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While everything is great with your work, I think that this 2 (3) for loops are complete crap.

Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but well, sorting algorithms never gets done without some looping... And I am not very knowledgeable about efficient sorting algorithms. At least the lookup are done as O(1) with set. (By the way, Hibernate was doing a lookup in a list, I have change that for a dictionary while porting.)

Throwing in some usual sorting algorithm will probably not be easy, since there is no actual comparison between elements. Many of them could be "neutral" with most others, while requiring an order with some. So those "neutral" elements may not be contiguous in a correctly sorted result, which may cause usual sorting algorithms to fail.

Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still thinking about it, this sort is O(N²). N is the number of distinct entity classes involved in the flush. If this gets merged, we may document that session having a lot of distinct entity classes to flush at once would trigger a costly queries sort if ordering is enabled.

In principle a comparison based sort is Ω(N log N) (cannot be better than that), which already does not scale well. So with our added constraint of not being able of comparing all elements, I do not think we can do better than this O(N²).

About comparing all elements, we may have two unrelated set of entity classes, 1: A < B < C and 2: D < E < F, requiring ordering among themselves but not caring about the other set. When comparing one entity from a set (B by example) to an entity class of the other set (let's take F), they should not be considered equals.
Otherwise a comparison based sorting algorithm may encounter B == F == A while A must be < B, which would wreck the algorithm.

So for using a comparison based algorithm, we would have to detect each set of comparable entity classes in the batch, assign them a global ranking arbitrarily, and use it when comparing entity classes across sets: could be worth it on big sorts, but normally a session is not supposed to have a big list of entity classes to handle in the same flush. (Note that ado.batch_size does not limit the number of classes to order, as of my understanding. It only causes the resulting set of queries for a class to get split after ordering.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was almost able to achieve the sort by doing OrderBy+ThenBy, but it didn't completely work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By merging ChildrenNames and ParentNames into single collection I was able to implement just a bubble sort algo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we have even worse performance. It's close to N^6 (but Hibernate has N^2)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List<> are that bad about removals and insertions? I should have checked that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, O(n) ops, this causes the global sort to be O(n^3) on NHibernate side. How do you come to O(n^6)? ( O(2n^3) is indeed equivalent to O(n^3), such factor is usually just dropped in O notation.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double check everything I'm saying - it's 2:45 AM here:)

Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And well, there is a LinkedList<>, I just have to change for it and take care about that next time I have to deal with inserts and removals.

@fredericDelaporte
Copy link
Member Author

Fixed NH-3931 test cases restriction: it can not run with ODBC. Still some other tests are failing: an optimistic concurrency test with ODBC; and a bunch of tests under Firebird seems to have ordering issues. I will look at that later.

@fredericDelaporte
Copy link
Member Author

It was not very logic to ignore one-to-one detected as children, not parents. Maybe this was the Firebird failure cause. Now I add them in the children set.

hazzik
hazzik previously requested changes Mar 22, 2017
// _orderedBatches and _entityNameBatchIdentifier replace _latestBatch from Hibernate implementation.
private readonly List<BatchIdentifier> _orderedBatches = new List<BatchIdentifier>();
private readonly Dictionary<string, BatchIdentifier> _entityNameBatchIdentifier = new Dictionary<string, BatchIdentifier>();
private readonly Dictionary<object, BatchIdentifier> _entityBatchIdentifier;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Content of _entityBatchIdentifier is not used, so I think we need to remove it.


// _orderedBatches and _entityNameBatchIdentifier replace _latestBatch from Hibernate implementation.
private readonly List<BatchIdentifier> _orderedBatches = new List<BatchIdentifier>();
private readonly Dictionary<string, BatchIdentifier> _entityNameBatchIdentifier = new Dictionary<string, BatchIdentifier>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_orderedBatches and _entityNameBatchIdentifier are used only within Sort method, please demote them to local variables.

_entityNameBatchIdentifier[action.EntityName] = batchIdentifier;
_orderedBatches.Add(batchIdentifier);
}
AddParentChildEntityNames(action, batchIdentifier);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be inside if block above as this need to be preformed only once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is to "optimize" dynamic inserts.

using System;
using System.Collections.Generic;

namespace NHibernate.Test.NHSpecificTest.NH3931
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please move tests to NHibernate.Test.Insertordering where the old one lives?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please move tests to NHibernate.Test.Insertordering where the old one lives?

Done, completed with HHH-11585 additional tests. They have added an additional bug on their side by badly simplifying my fix for one-to-one case. So I have not ported that. I have removed the useless dictionary though.

I have not worked on the flawed sorting algorithm today, I will do maybe tomorrow.

fredericDelaporte referenced this pull request in hibernate/hibernate-orm Mar 22, 2017
@fredericDelaporte
Copy link
Member Author

fredericDelaporte commented Mar 22, 2017

On Hibernate side, they have committed an additional patch for handling one-to-one (HHH-11585) and the CustomProvider.
It is almost same as what I have done. I will adjust this PR to be closer (remove an useless condition) to Hibernate resulting implementation.

var prevBatchIdentifier = _orderedBatches[j];
if (prevBatchIdentifier.ParentEntityNames.Contains(entityName))
{
_orderedBatches.RemoveAt(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IF we'll have 2 prevBatchIdentifiers matching the criteria, we'll throw out to the void the second element which'll get to the i place and will add the second batchIdentifier to the list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it looks like, well spotted... All this change seems a hell buggy on Hibernate side...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just loud thinking (I did not check though and don't have time to check now): This is, probably, what happens on Oracle and Firebird. Most likely it got exposed because these tests use native generator which is identity on MSSQL (identity prevents inserts batching and so sorting), but on Oracle and Firebird native generator is sequence, which allows batching and sorting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these = failing*

Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, using a LinkedList could bring us back to O(n²) and avoid this kind of issues, since we will be able to work with nodes instead of indexes. It should be way more readable, and so easier to write correctly.

I will give it a try.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, once I had understood your point, I just told me "hopefully that is the cause of those Teamcity build failures". We should copy those tests to insertorder and use something like guid.comb.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And maybe is it time for you to go have a good night. I have stayed till 4am "yesterday" (technically this morning indeed) but I did not have to go to work today. You are in the -11 time zone? That is what I compute when looking at hours. I am in the +1.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's +12 here in NZ, I live in the future:)

@hazzik
Copy link
Member

hazzik commented Mar 22, 2017

I've briefly checked Can_discriminate_subclass_on_list_with_lazy_loading_when_used_get test with the "hilo" generator which permits batching. As I thought, Sort method just corrupts the ordering: duplicating some batches and dropping others.

NHibernate.Test.NHSpecificTest.NH1391.Animal
NHibernate.Test.NHSpecificTest.NH1391.Dog
NHibernate.Test.NHSpecificTest.NH1391.Cat
NHibernate.Test.NHSpecificTest.NH1391.PersonWithAnimals
NHibernate.Test.NHSpecificTest.NH1391.SivasKangal
NHibernate.Test.NHSpecificTest.NH1391.PersonWithCats
NHibernate.Test.NHSpecificTest.NH1391.PersonWithDogs
NHibernate.Test.NHSpecificTest.NH1391.PersonWithSivasKangals

"Sorts" to:

NHibernate.Test.NHSpecificTest.NH1391.Animal
NHibernate.Test.NHSpecificTest.NH1391.PersonWithCats
NHibernate.Test.NHSpecificTest.NH1391.Animal
NHibernate.Test.NHSpecificTest.NH1391.PersonWithSivasKangals
NHibernate.Test.NHSpecificTest.NH1391.Animal
NHibernate.Test.NHSpecificTest.NH1391.Animal
NHibernate.Test.NHSpecificTest.NH1391.SivasKangal
NHibernate.Test.NHSpecificTest.NH1391.Animal

@fredericDelaporte
Copy link
Member Author

fredericDelaporte commented Mar 24, 2017

Animal test case ported into Insertordering tests. It confirms it fails with NH-3931 current state, but works with previous code of ActionQueue (locally tested).

I have also added a circular references test: it works with previous code, but not with current. Current code is completely unaware of circular references.

I wonder if we should actually fix ported Hibernate code, which solve some unsupported cases by causing a lot of older supported cases to fail, or come back to older code and fix older code instead.

@hazzik
Copy link
Member

hazzik commented Mar 24, 2017

I prefer this:

come back to older code and fix older code instead

@hazzik hazzik dismissed their stale review March 24, 2017 01:15

To review when it's "green"

@hazzik
Copy link
Member

hazzik commented Mar 26, 2017

Closing in favor of #582

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants