Skip to content

NH-3850 - Polymorphic Linq query results aggregators failures #584

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 23, 2017

Conversation

fredericDelaporte
Copy link
Member

@fredericDelaporte fredericDelaporte commented Mar 26, 2017

Fix and test cases for NH-3850.

Currently, LINQ queries such as query.Count(), query.Any() do not support polymorphic queries: they take into account only the results of the first entity class matching the query and ignores the other classes results.

This fix implements polymorphic support for Aggregate, Any, Count, LongCount, Max, Min and Sum.
Average is not fixed: supporting it requires to select additional columns in the query (Sum and Count) and compute the average on client side. I do not think it is worth the hassle.

This fix introduces a possible breaking change: Sum on nullables will no more yield null when there is no rows to compute from, but will yield 0. This change is due to usage of Enumerable.Sum for aggregating polymorphic results, and this is its documented behavior. This happens even on non polymorphic queries: when adding the Enumerable.Sum post result transformer, we do not already knows if the query will be polymorphic, so we add it to non polymorphic queries too. (Determining if the query is polymorphic would require some computation currently done in query plan computation.)
Avoiding this change should be doable by building a more elaborated expression as post result transformer. But this changes looks to me very minor and not contradictory with Linq contracts. It causes the loss of the ability to tell from the result there was nothing to Sum, but that does not look to me as being part of Sum contract. So I have not implemented a workaround for avoiding this change.

@fredericDelaporte fredericDelaporte force-pushed the NH-3850 branch 3 times, most recently from 08abfa6 to bb5a9fb Compare March 27, 2017 01:54
@fredericDelaporte
Copy link
Member Author

@gliljas, @hazzik, does the change on Sum behavior looks ok for you?

This fix introduces a possible breaking change: Sum on nullables will no more yield null when there is no rows to compute from, but will yield 0. This change is due to usage of Enumerable.Sum for aggregating polymorphic results, and this is its documented behavior. This happens even on non polymorphic queries: when adding the Enumerable.Sum post result transformer, we do not already knows if the query will be polymorphic, so we add it to non polymorphic queries too. (Determining if the query is polymorphic would require some computation currently done in query plan computation.)
Avoiding this change should be doable by building a more elaborated expression as post result transformer. But this changes looks to me very minor and not contradictory with Linq contracts. It causes the loss of the ability to tell from the result there was nothing to Sum, but that does not look to me as being part of Sum contract. So I have not implemented a workaround for avoiding this change.

@gliljas
Copy link
Member

gliljas commented Mar 29, 2017

I'm not sure. It satisfies the LINQ contract (which is probably paramount) but sacrifices SQL compatibility. Not a big problem, but it becomes a bit unpredictable if this only happens on results.

customersQueryable.Where(x=>x.Orders.Sum(y=>y.Amount)>=0)

will not return customers with no orders. In a LINQ to Objects query, it would.

I've handled this is my current project by introducing an expression visitor which rewrites all such queries, adding a Coalesce to the expression tree. It adds query cost, so it's not used everywhere.

@fredericDelaporte
Copy link
Member Author

Good point. Well I will have a look to check if that is that hard to preserve this behavior.

@fredericDelaporte
Copy link
Member Author

Ok, done. At first it was looking easy, but I had discovered another polymorphic case still not supported : aggregation of non-nullable values when some concrete class have no rows when other was. I have fixed it.

Now there is still a breaking change: when aggregating non nullable values while there is no rows for all concrete classes, it yields an exception, as expected. But previously, it was a GenericADOException having an ArgumentNullException as inner exception. Now it is an InvalidOperationException "Value cannot be null".

// Leaving it untouched for allowing non polymorphic cases to work.
break;
case NhExpressionType.Max:
AddPostExecuteTransformerForResultAggregate(ReflectionCache.QueryableMethods.MaxDefinition);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use IQueryable<> for aggregate methods? It does not make sense for me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IntermediateHqlTree._postExecuteTransformers is a list of LambdaExpression which get merged together then compiled and called on IQueryable<T>. As if they should be chainable, as if we it were useful and possible to add many of them, though the usages I have currently seen for them were all incompatible with chaining (they all yield another type than their input).

Yes that does not make sens to me either, that overly complicates things for defining the post execute transform (messing with expressions for runtime execution instead of just supplying directly an executable delegate), but there are no comments as for why such a design decision, and I have not decided to redesign this.

Redesign those transformers should imply considering the design of the list transformer, which works the same way and is way more used. But maybe this one has some cases where it could be translated to DB, though I doubt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IEnumerable<> just works here.


if (elementType.IsNullableOrReference())
{
var inputListType = typeof(IQueryable<>).MakeGenericType(elementType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, why IQueryable<>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my previous answer in mind, it would work with the IEnumerable version too for current usage of the post execute transformer. I have made that choice for aligning with current input type of post execute transformers. True, there is an AsQueryable call just for supplying that input type to it. I really do not understand what were the design rationals behind the way this post execute transformer currently works. It looks to me as lot of overhead for no added value. (Enable "merge" of execute transformer, but none of existing one seems mergeable to me. Maybe some serialization needs again. All that ends up in the query plan, which is serializable.)

With the current state of usage of this post execute transformer, it should works with IEnumerable methods too.

Copy link
Member

@hazzik hazzik Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It requires .AsQueryable()/IQueryable<> for some cases where the method is from the original query expression (See ProcessFirstOrSingleBase, so for First, FirstOrDefault, Single, SingleOrDefault). Hm... there we also specify the method explicitly.

Copy link
Member

@hazzik hazzik Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really do not understand what were the design rationale behind the way this post execute transformer currently works

I think it's just a copy of ListTransformer implementation, which is, indeed, chainable.

}

// Non-reg case
[Test, Ignore("Candidate potential breaking changes: Sum would now yield 0 instead of null on empty result sets.")]
Copy link
Member

@hazzik hazzik Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I removed Ignore, test did pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because the query isn't polymorphic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase & conflict resolution miss, I had removed it prior to rebase. Going to remove it again.

var allCall = Expression.Call(allMethod, inputList, Expression.Lambda(noSumCondition, element));

Expression conditionalAggregateCall = Expression.Condition(allCall,
Expression.Convert(Expression.Constant(null), concreteQueryElementType),
Copy link
Member

@hazzik hazzik Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we do not need a cast expression. Expression.Constant has an overload which accepts a type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing it on the above constant too, better to compare both side having the same type.

@fredericDelaporte
Copy link
Member Author

Changes done. Added some more comments in code too.

// NH-3850: changed from list transformer (working on IEnumerable<object>) to post execute
// transformer (working on IEnumerable<inputType>) for globally aggregating polymorphic results
// instead of aggregating results for each class separately and yielding only the first.
// If the aggregation relies on ordering, final result will still be wrong due to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the aggregation relies on ordering, final result will still be wrong

Can you demonstrate this with a test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test added. By the way, hardened the non mutable seed test for it to be no more sensible to ordering. I prefer avoid relying on current polymorphic classes ordering whenever possible for tests, since it does look undocumented.

// a GenericADOException with an ArgumentNullException inner exception.

var nullableElementType = typeof(Nullable<>).MakeGenericType(new[] { elementType });
var nullableInputListType = typeof(IQueryable<>).MakeGenericType(nullableElementType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to use IEnumerable<> here (and everywhere else).

Copy link
Member Author

@fredericDelaporte fredericDelaporte Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing again some other parts of the fix, it appears to me I was not consistent with that. This fix has spanned a long period, I have ended up not remembering all of it. (That is the first NHibernate Jira I have started to work on.)

// This still causes a change: the expected failure case will yield an InvalidOperationException instead of
// a GenericADOException with an ArgumentNullException inner exception.

var nullableElementType = typeof(Nullable<>).MakeGenericType(new[] { elementType });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not use array creation in MakeGenericType

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for having not spotted the params modifier. Added to the new ReflectionHelper.GetMethodOverload by the way.

@fredericDelaporte
Copy link
Member Author

Newer changes done. And removed some unneeded private setter, internal => private.

@fredericDelaporte
Copy link
Member Author

Some additional considerations about those post result transformer: I have contemplated adding a PolymorphicResultTransformer into the mix, which would be call before the post result transformer, only if the query yielded many polymorphic results.

This would avoid running this transformer on non polymorphic cases. But for handling polymorphic cases requiring a type override, this would require an additional PolymorphicTypeOverride for applying it if the query is polymorphic, while keeping the normal type otherwise. (And the aggregate case would require even more tweaking, such as an option for discarding the list transformer if the query is polymorphic, ...)

Moreover, the main overhead is probably currently more in constructing this transformer than in using it.

Now if we wanted to fix the ordering issue of polymorphic issue, we could try to extract the order by from the Linq query for reapplying it on union-ed polymorphic results. But then, re-ordering non polymorphic results would be a significant overhead without any value. There a PolymorphicResultTransformer for re-ordering only polymorphic cases could be useful.

@hazzik
Copy link
Member

hazzik commented Mar 30, 2017

After all these efforts I'm thinking about banning all implicit polymorphism with LINQ.

Otherwise the cases needs to be supported as well (thru other JIRA & PR):

  • ordering
  • paging

@fredericDelaporte
Copy link
Member Author

fredericDelaporte commented Mar 30, 2017

While ordering alone looks achievable, though already bad for performances (in-memory ordering required), paging support with ordering would be quite a monster for performances, especially for reaching far away pages. Depending on required order, the polymorphic page could have to pick elements ranging from row 0 up to "end of required page row number" in each concrete table. This would cause paging to be pointless.

Polymorphism with other API are not in a better state, they do not support paging or ordering better. Indeed, polymorphic aggregates with HQL yields as many rows as concrete classes. So taking the first achieve the same result than current Linq. Trying UniqueResult blows away.

Concrete results sets are just union-ed without ordering with HQL too. Linq an HQL run the same code for that. There is some code for trying a paging accounting polymorphism, but without handling the ordering issue, and as far as I remember, quite flawed (was not yielding pages of expected size I believe).

(I have not checked the criteria/queryover cases.)

So well, why not disabling it for all API if we decide to disable it for Linq?
Apart for test teardown (delete * from System.Object), are there many actual usages of implicit polymorphism? (In the apps I had to work with, it never served me.)

@fredericDelaporte
Copy link
Member Author

Rebased for resolving conflicts with reflection refactoring.

@fredericDelaporte
Copy link
Member Author

fredericDelaporte commented Apr 7, 2017

Fix some bad white-spaces I had inadvertently introduced, fix some comments and adjusted some reflection cache parameter specification syntax (using default instead of cast or instanciation of structs).
(And rebased.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants