-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[ConstraintSystem] Try to be more efficient solving linked operator expressions #26140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please smoke test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have attempted similar solution already since I'm tracking multiple dupes of this issue, the problem is that this makes some expressions too complex in our internal suites and I think it used to happen in public source compat suite as well :(
I haven't had time since but if you really want to fix this, LinkedExprAnalyzer
has to get removed from CSGen and its logic has to get ported over to the solver itself, this way we wouldn't have a problem with eagerness (and assumption that all of the overloads are homogeneous) and the same performance in the common case.
I think you can add a custom overload to |
The couple days ago commits here essentially do as you (@xedin) suggest and moves the linked operator logic into the solver itself. During overload selection on a disjunction, any overload for an operator which is either already bound in the constraint system (e.g. we're already using this exact Also during selection of which disjunction gets picked next, it sorts on number of favored possibilities (if any) as well, so disjunctions with many possibilities but 1 (or a few) favored get explored before disjunctions with few possibilities over all but none favored. With these changes the solver tends to search (approximately) depth-first down branches of matching operators. Finally, when a solution is found with a given operator disjunction choice, all the typevars for any bindings in that solution to that exact overload are then merged (for examining any other possibilities). So the result is that the typevars aren't eagerly bound together, but the solver looks at same-overloads in operator expressions first, and if it is able to form a solution, the typevars are bound together afterwards to greatly cut down the remaining search space (identical to the original eager optimization, but delayed, and potentially piece-wise). This works to fix sr-10324, the original goal here, but maintains almost all of the speed of the original optimization in the 'normal' arithmetical cases. |
It turns out that "almost all of the speed" wasn't enough for Today's commits here repurposes If we're able to do that (form a closed domain of types), then we can disable all of the potential overloads for the operators that don't involve types in the domain and not explore them at all, because they are impossible. This is a very productive optimization when SIMD is involved. E.g. SR-10461:
vs
It's still a bit buggy with this code not correctly determining in a few circumstances that it can't prove the domain is actually closed. (And thus it disables overloads that it should leave alone.) Should work those out soon. |
@swift-ci Please test compiler performance |
This all works now. There are 4 validation tests that fail. They all include
|
@swift-ci Please test compiler performance |
@swift-ci Please Test Source Compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly after trying so many different things I'm starting to think that the only way to properly fix this would be attempting literal bindings earlier and backtracking if they don't match keeping a set of already attempted types. The first step is to remove LinkedExprAnalyzer
completely and go from there... Another reason why we need to get rid of all of the perf hacks in CSGen is diagnostics, in "diagnostic mode" solver should be able to attempt all possible overloads and bindings to be able to find a solution.
The reason why we can't solve operators efficiently is related to the fact that overloads like (Self, Self) -> Self
when opened create a single type variable for parameters and a result type, which makes the constraint system un-splittable into separate components. Possible ways to improve that I see is to either open Self
as separate type variables in different positions and then have a separate solver step to Type::join
them together (and possibly re-attempt some components with correct contextual type), or attempt bindings for arguments earlier, especially for the literals, because literal expressions have a type variable type with a literal protocol conformance which then forms <T_param> arg conv <T_arg>
constraints so the graph could only be disconnected only if one of the sides gets a type.
lib/Sema/CSGen.cpp
Outdated
|
||
|
||
bool expandCollectedTypesFromOperatorOverloads(ConstraintSystem &CS, | ||
LinkedTypeInfo <i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually hoping that we could remove all of this code from CSGen instead of adding more :)
@xedin I understand your hesitance about doing more pre-checking but I think narrowing the number of possibilities in large overload disjunctions is a productive way forward. I'm traveling the next few days, but I'd like to investigate the source compatibility failures here when I have the chance, and then try to convince you that this is a worthwhile thing to do once they are taken care of. Meanwhile, if you think the commits on delaying operator tyvar merging (the first 5 up through Finally, I think a good future step would be to do this sort of thing in the solver itself rather than as a pre-step in CSGen. I imagine a separate possible-overloads disjunction constraint type, which references the tyvars for the operator arguments and attempts simplification by reducing the set of valid overloads as the argument tyvars get more constrained. In these arithmetic expressions it really seems like the problem is the large branching factor in constraint scopes, and reducing the fanout helps a lot. |
That's what we keep trying to do and keep finding edge cases with. I think it's time for us to take a step back and try to see if it could be done in the solver instead. @DougGregor was working on similar ideas, he might be able to share some of his thinking. @rudkx was investigating a different approach with designated operators. |
@swift-ci Please Test Source Compatibility |
@swift-ci Please Test Source Compatibility |
Hmm. Getting identical failures when tried on the nightly toolchain (
|
@swift-ci Please smoke test |
@swift-ci Please test source compatibility. |
1 similar comment
@swift-ci Please test source compatibility. |
@swift-ci Please smoke test. |
Resolves: SR-10324, SR-10461, SR-10601, SR-10606. So with this PR we'd no longer immediately bind arithmetic operators to be identical (SR-10324), but at the same time it greatly improves type checking time for expressions with lots of operators, especially those involving simd. (And passes source compatibility except for the current RxSwift |
@swift-ci Please smoke test. |
I will put some time aside to take a look at this upcoming week. |
Just a reminder to review when you have a chance. |
Sorry @gregomni I have this open and ready just need to finish up some other stuff first before I dig in again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the benefit but I'm still skeptical about this approach. The problem for me, as I mentioned before, is the fact that merging of type variables associated with literal arguments shouldn't happen during constraint generation. If we can avoid that it would spare us a lot of work which is currently done upfront e.g. trying to collect all possible types for literal arguments, disabling overloads.
As an alternative approach - while solving, we could try to re-arrange choices in other disjunctions to much the previous choice taken for the same operator e.g. 1 + 2 + 3
binding first +
to (Int, Int) -> Int
would bubble up (Int, Int) -> Int
to the top so it could be attempted first for the second +
as well.
It seems such behavior could imitate this (LinkedExprAnalyzer) optimization almost exactly for "concrete" overloads because there is another "hack" in the solver which doesn't attempt generic overloads when there is a concrete match already available.
lib/Sema/CSGen.cpp
Outdated
} | ||
|
||
for (auto convertTy : lti.collectedTypes) { | ||
if (CS.TC.isConvertibleTo(convertTy, type, CS.DC)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isConvertiableTo
(which creates new constraint system every time) and conformsToProtocol
are pretty heavy checks which, in my opinion, we are better off only performing while solving for each individual overload.
There is a long standing improvement opportunity where determineBestBindings
could actually check whether found type conforms to all of the protocols associated with its type variable via {Literal}ConformsTo
constraint and avoid that binding.
lib/Sema/CSGen.cpp
Outdated
|
||
bool possibleDecl = true; | ||
|
||
if (auto generic = resultTy->getAs<GenericTypeParamType>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note - overloads like (Self, Self) -> Self
or (T, T) -> T
are the biggest problem is at the moment - inability to filter out generic overloads which connect arguments to a result and prevent efficient solving of disconnected components, which means we have to explore a lot deeper into solution trees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be simplified down to
if (llvm::any_of(fnType->getParams(), [&resultTy](const AnyFunctionType::Param ¶m) {
return resultTy->isEqual(param.getPlainType());
})
continue;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note - overloads like
(Self, Self) -> Self
or(T, T) -> T
are the biggest problem is at the moment - inability to filter out generic overloads which connect arguments to a result and prevent efficient solving of disconnected components, which means we have to explore a lot deeper into solution trees.
These types of overloads can't be ruled out as possibilities up front, nor disconnected, but this PR (by favoring choosing duplicate bindings of existing bindings and by merging those type variables once a solution exists) makes it so that the solution tree gets pruned to be much less branchy once we find a solution. The depth is unaffected, but the breadth gets vastly narrower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but I don't understand what you are trying to say, the problem is that in the worse case, especially when expression is invalid e.g 1 + "hello"
, we'd still get to explore almost everything just because generic overloads like (Self, Self) -> Self
carry almost no context for the bindings and we can't verify conformances until generic parameter is actually bound to something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, with this PR, you'd disable every possible overload of +
except for the generic (Self, Self) -> Self
style, because all the non-generic ones are impossible. So you explore much less. I was trying to say that this is a big performance win, even when you can't do anything efficient with the generic style ones directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm trying to say is that presence of overloads with concrete types is not a problem if arguments are concrete as well or could be bound to something relatively easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like in the examples you have in the PR, all of the arguments are going to be concrete so it's pretty easy for the solver to avoid going deeper if argument conversions could be tested and rejected right away.
lib/Sema/CSGen.cpp
Outdated
continue; | ||
|
||
llvm::TinyPtrVector<Constraint *> disjunctions = | ||
CS.getConstraintGraph().gatherConstraints( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance wise - it would be better to invert that check and have outer loop be for every disjunction constraint, at least it's going to be linear over a number of constraints in the system.
Sorry, I'm not sure I understand this comment. Can you expand, please? There isn't any merging of type variables happening during constraint generation here. (The old -- sr-10324 bug causing -- merging of operator overload type variables is removed as part of this PR.) This does examine all the literals in order to determine which operator overloads can be disabled as impossible in this expression, but the literal type variables aren't touched and can still potentially be solved as different types from each other.
This is part of what is already in this PR. If we've chosen |
Sorry I was referring to the old behavior which was problematic. I understand that new logic is trying to disable some overloads upfront, but I just don't think that CSGen is a good place for that since it could be done while solving without the overhead of examining overload sets.
Favoring overloads but itself is not enough, I think to get the biggest win here and be 1-1 with previous behavior solver would have to - get newly favored overload to the top of the list, and adjust |
Well, there are two pieces, there is (1) determining the subset of all types that the expression could possibly involve, and then there is (2) avoiding taking paths in the solution tree that involve impossible types. (1) is a property of the expression as a whole, not any particular variable or constraint, and so I don't see a reasonable way to do it during solving. It seems clearly to belong during CSGen to me. On the other hand, it would be possible to have the constraint system hold the set of possible types and do (2), the disjunction path filtering, entirely during solving. But the code for determining which overloads are skippable given the set of possible types is very similar to determining the set of types in the first place, and so sharing code and containing it and the set of types in one place seems better to me. I think of it as generating a less bushy tree in the first place, which seems a reasonable job for CSGen to be doing.
I did effectively that, favoring the matching overload (which in these sorts of expressions is almost always the only favored choice, then), and sorting disjunctions by minimum-favored. The solver immediately picked all the same operators. It turns out that this was not enough to solve several of the SIMD expressions in tests once the old It also turns out that, once the new |
Here is how I see this - (1) is impossible to do properly in CSGen in presence of generic overloads, because it was possible we could use a classic algorithm with deterministic type variable ordering and solver wouldn't have any exponential behavior; (2) Solver is already pretty good at not taking any path where both sides of the argument conversion constraint or result are concrete, so pruning not a problem in this case, but when it comes to generic overloads all bets off at the moment.
This is exactly the main problem we are yet to solve which no hacks or tweaks to thereof can. I think we should concentrate on trying to solve a problem related to over-exploration in case of generic overloads, once that is done we could remove all of the existing hacks as obsolete. One of the ways to fix this problem for literal arguments - is to attempt default literal types earlier and backtrack if they don't match. |
…rload is (T,T)->T, with no (T,T)->U
…e types match existing binding choices
…n to speed up further searching
Did the cleanup you suggested, rebased to HEAD, fixed for a change in the
Determining subset of types (1) is actually trivial (but unhelpful) for generic overloads. If it takes As for (2): I think both the scope performance counter and the list of bugs which the solver previously failed to solve and with this PR successfully solves shows that the solver can still use some help here.
I understand that this PR does not solve what you see as the main problem. It greatly ameliorates the situation by improving performance and solving expressions that weren't getting solved otherwise. It also replaces a hack that had incorrect language semantics - erroring in the case of valid code (SR-10324) - with (what you see as) a different hack that doesn't have that problem. I don't think we know how to solve the main problem right now. Wouldn't it be better to successfully compile more valid code until we figure it out? |
@swift-ci Please smoke test. |
The reason why I'm hesitant to take these changes is as follows - every time we try to tweak CSGen/solver hacks we get either more "too complex" reports or source breaks. I can't prove to myself that this is not going to cause either. At this point I'm happy to maintain the status quo until we can up with principled solution which doesn't require any hacks even though some of the valid code is going to stay broken meanwhile. |
…disjunctions by testing last opened typevar.
@swift-ci Please test source compatibility. |
@gregomni I'm going to close this out due to age and inactivity. Thank you for taking a look here. |
Explicitly check that every possible overload of the binary operator whose two arguments are the same type also results in the same type. (That is, for any
(T,T)
args(T,T)->T
, with no(T,T)->U
.)Resolves SR-10324.
The problem in that issue is that
(x * y) * z
requires the two multiplications to use two different overloads(A,A)->B
in the parens and then(B,A)->B
outside. But during constraint optimization, we see a series of binops with all the same type underlying variable types(A,A,A)
and assume that we'll use the same overload for all of them (merging the type variables for both*
s). Once we've done that, the constraint system is unsolvable.This PR adds an additional check that none of the potential overloads has the form
(T,T)->U
, the existence of which is what breaks the optimization assumption.Not done here, but I think that with this additional check it becomes safe to remove
isArithmeticOperatorDecl
and resolve this comment: