[ConstraintSystem] Try to be more efficient solving linked operator expressions #26140

gregomni · 2019-07-14T17:22:31Z

Explicitly check that every possible overload of the binary operator whose two arguments are the same type also results in the same type. (That is, for any (T,T) args (T,T)->T, with no (T,T)->U.)

Resolves SR-10324.

The problem in that issue is that (x * y) * z requires the two multiplications to use two different overloads (A,A)->B in the parens and then (B,A)->B outside. But during constraint optimization, we see a series of binops with all the same type underlying variable types (A,A,A) and assume that we'll use the same overload for all of them (merging the type variables for both *s). Once we've done that, the constraint system is unsolvable.

This PR adds an additional check that none of the potential overloads has the form (T,T)->U, the existence of which is what breaks the optimization assumption.

Not done here, but I think that with this additional check it becomes safe to remove isArithmeticOperatorDecl and resolve this comment:

            // TODO: We currently limit this optimization to known arithmetic
            // operators, but we should be able to broaden this out to
            // logical operators as well.

gregomni · 2019-07-14T17:22:54Z

@swift-ci Please smoke test.

xedin

I have attempted similar solution already since I'm tracking multiple dupes of this issue, the problem is that this makes some expressions too complex in our internal suites and I think it used to happen in public source compat suite as well :(

I haven't had time since but if you really want to fix this, LinkedExprAnalyzer has to get removed from CSGen and its logic has to get ported over to the solver itself, this way we wouldn't have a problem with eagerness (and assumption that all of the overloads are homogeneous) and the same performance in the common case.

gregomni · 2019-07-15T19:15:16Z

@xedin Ah, makes sense. Trying the public source-compat here, because it would be convenient to have a real code example to look at that fails on complexity, but then I'll go take a look at LinkedExprAnalyzer and see what I can do.

@swift-ci Please Test Source Compatibility

xedin · 2019-07-15T20:25:28Z

I think you can add a custom overload to + and some of the tests in validation-test/Sema/type_checker_perf would fail. import Foundation and import simd make huge different too since they load a bunch of overloads to arithmetic operators.

xedin · 2019-07-16T18:43:05Z

@gregomni For inspiration, here is one of my attempts to get rid of LinkedExprAnalyzer and other hacks - #15246

gregomni · 2019-07-19T16:36:15Z

The couple days ago commits here essentially do as you (@xedin) suggest and moves the linked operator logic into the solver itself.

During overload selection on a disjunction, any overload for an operator which is either already bound in the constraint system (e.g. we're already using this exact +) or has identical types to an operator bound in the constraint system (e.g. we're using a * with these exact types) gets favored.

Also during selection of which disjunction gets picked next, it sorts on number of favored possibilities (if any) as well, so disjunctions with many possibilities but 1 (or a few) favored get explored before disjunctions with few possibilities over all but none favored.

With these changes the solver tends to search (approximately) depth-first down branches of matching operators.

Finally, when a solution is found with a given operator disjunction choice, all the typevars for any bindings in that solution to that exact overload are then merged (for examining any other possibilities).

So the result is that the typevars aren't eagerly bound together, but the solver looks at same-overloads in operator expressions first, and if it is able to form a solution, the typevars are bound together afterwards to greatly cut down the remaining search space (identical to the original eager optimization, but delayed, and potentially piece-wise).

This works to fix sr-10324, the original goal here, but maintains almost all of the speed of the original optimization in the 'normal' arithmetical cases.

gregomni · 2019-07-19T17:06:21Z

It turns out that "almost all of the speed" wasn't enough for validation-test/Sema/type_checker_perf/fast/simd_add.gyb, a long addition expression involving SIMD4<Float>s. Delaying the eager optimization was enough to break this test case because there are just so many overloads of + to look at before finding the solution and binding all occurrences together. So that got me looking at SIMD.

Today's commits here repurposes LinkedExprAnalyzer. The idea is to look at the known argument types and literals of the operators in the expression and to form a closed algebra containing all of the types that the intermediates and final result could be. (E.g. a simple example is if we're doing ordinary arithmetic involving integer literals, we could be dealing with any of the Int or UInt or Float types, but there's no way with any of those starting types and possible operators to get a String result, so String is outside of the domain & range of the algebra for the expression.)

If we're able to do that (form a closed domain of types), then we can disable all of the potential overloads for the operators that don't involve types in the domain and not explore them at all, because they are impossible. This is a very productive optimization when SIMD is involved.

E.g. SR-10461:

% time xcrun swiftc -typecheck sr10461.swift                                                  
sr10461.swift:4:28: error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions
    let i = 3*(-p0 + 3*p1) - (3*p2 + p3)
            ~~~~~~~~~~~~~~~^~~~~~~~~~~~~
xcrun swiftc -typecheck sr10461.swift  16.92s user 0.44s system 99% cpu 17.409 total

vs

% time xcrun build/Ninja-ReleaseAssert/swift-macosx-x86_64/bin/swiftc -typecheck sr10461.swift
xcrun build/Ninja-ReleaseAssert/swift-macosx-x86_64/bin/swiftc -typecheck   8.95s user 0.30s system 99% cpu 9.293 total

It's still a bit buggy with this code not correctly determining in a few circumstances that it can't prove the domain is actually closed. (And thus it disables overloads that it should leave alone.) Should work those out soon.

gregomni · 2019-07-19T22:17:18Z

@swift-ci Please test compiler performance

gregomni · 2019-07-19T22:37:00Z

This all works now. There are 4 validation tests that fail. They all include -solver-enable-operator-designated-types, and that setting overlaps and interferes somewhat with the disjunction handling added here. Removing the special -solver switches from those 4 tests cause them all to compile correctly, but since they are mostly scaling tests and I can't access the radars to tell what their purpose is, I'm not sure if or how I should modify those test files. They are:

validation-test/Sema/type_checker_perf/fast/rdar18360240.swift.gyb
validation-test/Sema/type_checker_perf/fast/rdar23327871.swift.gyb
validation-test/Sema/type_checker_perf/fast/rdar25866240.swift.gyb
validation-test/Sema/type_checker_perf/fast/rdar46687985.swift

Many changes, asking for new review.

gregomni · 2019-07-20T17:59:01Z

@swift-ci Please test compiler performance

swift-ci · 2019-07-21T05:42:01Z

Summary for master full

Unexpected test results, excluded stats for Tagged, NIO, Deferred, Kommander, Dollar, IBAnimatableApp, CoreStore, ProcedureKit, swiftlint, Alamofire, Wordy, SwifterSwift

Regressions found (see below)

Debug-batch
- debug-batch brief
- debug-batch detailed
Release
- release brief
- release detailed

Debug-batch

debug-batch brief

Regressed (0)

name	old	new	delta	delta_pct

Improved (3)

name	old	new	delta	delta_pct
Frontend.NumInstructionsExecuted	24,353,995,254,420	23,760,829,848,130	-593,165,406,290	-2.44% ✅
LLVM.NumLLVMBytesOutput	884,275,700	866,509,244	-17,766,456	-2.01% ✅
time.swift-driver.wall	2413.3s	2324.7s	-88.6s	-3.67% ✅

Unchanged (delta < 1.0% or delta < 100.0ms) (0)

name	old	new	delta	delta_pct

debug-batch detailed

Regressed (5)

name	old	new	delta	delta_pct
Sema.ExtendedNominalRequest	2,809,687	2,869,631	59,944	2.13% ⛔
Sema.InheritedDeclsReferencedRequest	3,217,162	3,464,454	247,292	7.69% ⛔
Sema.NumConformancesDeserialized	5,023,442	5,197,107	173,665	3.46% ⛔
Sema.NumUnloadedLazyIterableDeclContexts	3,394,621	3,435,185	40,564	1.19% ⛔
Sema.SelfBoundsFromWhereClauseRequest	4,167,500	4,563,830	396,330	9.51% ⛔

Improved (79)

name	old	new	delta	delta_pct
AST.NumDecls	71,968	70,346	-1,622	-2.25% ✅
AST.NumDependencies	194,316	190,198	-4,118	-2.12% ✅
AST.NumInfixOperators	25,810	25,223	-587	-2.27% ✅
AST.NumLoadedModules	238,616	233,374	-5,242	-2.2% ✅
AST.NumLocalTypeDecls	104	102	-2	-1.92% ✅
AST.NumPrecedenceGroups	13,071	12,848	-223	-1.71% ✅
AST.NumReferencedMemberNames	3,114,111	3,047,053	-67,058	-2.15% ✅
AST.NumReferencedTopLevelNames	232,133	227,135	-4,998	-2.15% ✅
AST.NumSourceLines	2,154,179	2,105,197	-48,982	-2.27% ✅
AST.NumSourceLinesPerSecond	1,896,636	1,834,648	-61,988	-3.27% ✅
AST.NumTotalClangImportedEntities	3,633,699	3,584,922	-48,777	-1.34% ✅
Driver.ChildrenMaxRSS	126,032,842,752	120,826,662,912	-5,206,179,840	-4.13% ✅
Driver.NumDriverJobsRun	15,335	15,072	-263	-1.72% ✅
Driver.NumDriverPipePolls	81,664	80,491	-1,173	-1.44% ✅
Driver.NumDriverPipeReads	79,500	78,598	-902	-1.13% ✅
Frontend.MaxMallocUsage	735,380,575,888	723,440,312,576	-11,940,263,312	-1.62% ✅
Frontend.NumInstructionsExecuted	24,353,995,254,420	23,760,829,848,130	-593,165,406,290	-2.44% ✅
IRModule.NumIRAliases	91,961	90,295	-1,666	-1.81% ✅
IRModule.NumIRBasicBlocks	3,439,178	3,358,341	-80,837	-2.35% ✅
IRModule.NumIRFunctions	1,599,879	1,564,689	-35,190	-2.2% ✅
IRModule.NumIRGlobals	1,766,744	1,735,873	-30,871	-1.75% ✅
IRModule.NumIRInsts	46,635,371	45,841,392	-793,979	-1.7% ✅
IRModule.NumIRNamedMetaData	74,090	73,110	-980	-1.32% ✅
IRModule.NumIRValueSymbols	2,973,529	2,912,864	-60,665	-2.04% ✅
LLVM.NumLLVMBytesOutput	884,275,700	866,509,244	-17,766,456	-2.01% ✅
Parse.NumFunctionsParsed	117,680	115,461	-2,219	-1.89% ✅
Parse.NumIterableDeclContextParsed	852,766	838,308	-14,458	-1.7% ✅
SILModule.NumSILGenFunctions	789,715	770,368	-19,347	-2.45% ✅
SILModule.NumSILGenGlobalVariables	30,553	30,115	-438	-1.43% ✅
SILModule.NumSILGenWitnessTables	34,768	33,498	-1,270	-3.65% ✅
SILModule.NumSILOptFunctions	1,132,762	1,104,550	-28,212	-2.49% ✅
SILModule.NumSILOptGlobalVariables	31,158	30,710	-448	-1.44% ✅
SILModule.NumSILOptVtables	16,598	16,421	-177	-1.07% ✅
SILModule.NumSILOptWitnessTables	73,636	70,925	-2,711	-3.68% ✅
Sema.AttachedPropertyWrapperTypeRequest	250,678	245,067	-5,611	-2.24% ✅
Sema.AttachedPropertyWrappersRequest	1,547,914	1,505,816	-42,098	-2.72% ✅
Sema.DefaultAndMaxAccessLevelRequest	39,275	38,489	-786	-2.0% ✅
Sema.DefaultDefinitionTypeRequest	4,012	3,874	-138	-3.44% ✅
Sema.DefaultTypeRequest	169,639	164,158	-5,481	-3.23% ✅
Sema.EnumRawTypeRequest	10,896	10,654	-242	-2.22% ✅
Sema.ExistentialConformsToSelfRequest	12,442	12,316	-126	-1.01% ✅
Sema.IsDynamicRequest	801,220	780,045	-21,175	-2.64% ✅
Sema.IsFinalRequest	2,192,229	2,154,325	-37,904	-1.73% ✅
Sema.IsGetterMutatingRequest	219,648	215,255	-4,393	-2.0% ✅
Sema.IsObjCRequest	762,521	747,806	-14,715	-1.93% ✅
Sema.IsSetterMutatingRequest	178,453	174,600	-3,853	-2.16% ✅
Sema.MangleLocalTypeDeclRequest	208	204	-4	-1.92% ✅
Sema.NamedLazyMemberLoadFailureCount	16,019	15,749	-270	-1.69% ✅
Sema.NamedLazyMemberLoadSuccessCount	15,849,402	15,508,546	-340,856	-2.15% ✅
Sema.NominalTypeLookupDirectCount	18,840,925	18,542,344	-298,581	-1.58% ✅
Sema.NumConstraintScopes	14,212,758	8,655,258	-5,557,500	-39.1% ✅
Sema.NumConstraintsConsideredForEdgeContraction	49,038,944	35,276,972	-13,761,972	-28.06% ✅
Sema.NumDeclsDeserialized	39,947,429	39,421,601	-525,828	-1.32% ✅
Sema.NumDeclsFinalized	317,643	310,764	-6,879	-2.17% ✅
Sema.NumDeclsTypechecked	682,073	666,737	-15,336	-2.25% ✅
Sema.NumDeclsValidated	1,212,445	1,184,675	-27,770	-2.29% ✅
Sema.NumFunctionsTypechecked	247,028	241,721	-5,307	-2.15% ✅
Sema.NumGenericSignatureBuilders	751,804	738,885	-12,919	-1.72% ✅
Sema.NumLazyGenericEnvironments	7,259,752	7,139,717	-120,035	-1.65% ✅
Sema.NumLazyGenericEnvironmentsLoaded	169,422	166,497	-2,925	-1.73% ✅
Sema.NumLazyRequirementSignatures	576,362	564,376	-11,986	-2.08% ✅
Sema.NumLazyRequirementSignaturesLoaded	355,976	346,551	-9,425	-2.65% ✅
Sema.NumLeafScopes	8,788,593	5,354,514	-3,434,079	-39.07% ✅
Sema.NumTypesDeserialized	11,853,855	11,635,625	-218,230	-1.84% ✅
Sema.NumTypesValidated	960,090	946,273	-13,817	-1.44% ✅
Sema.OpaqueReadOwnershipRequest	232,298	227,907	-4,391	-1.89% ✅
Sema.OverriddenDeclsRequest	1,188,294	1,173,177	-15,117	-1.27% ✅
Sema.PropertyWrapperBackingPropertyInfoRequest	247,889	242,372	-5,517	-2.23% ✅
Sema.PropertyWrapperBackingPropertyTypeRequest	250,678	245,067	-5,611	-2.24% ✅
Sema.ProtocolRequiresClassRequest	48,675	48,044	-631	-1.3% ✅
Sema.RequirementSignatureRequest	408,603	398,494	-10,109	-2.47% ✅
Sema.SelfAccessKindRequest	3,554,214	3,505,598	-48,616	-1.37% ✅
Sema.SetterAccessLevelRequest	64,263	62,173	-2,090	-3.25% ✅
Sema.StoredPropertiesAndMissingMembersRequest	17,147	16,850	-297	-1.73% ✅
Sema.StoredPropertiesRequest	197,999	191,341	-6,658	-3.36% ✅
Sema.SuperclassDeclRequest	257,180	252,025	-5,155	-2.0% ✅
Sema.SuperclassTypeRequest	29,905	29,468	-437	-1.46% ✅
Sema.TypeCheckFunctionBodyUntilRequest	247,028	241,721	-5,307	-2.15% ✅
Sema.UnderlyingTypeDeclsReferencedRequest	123,320	121,066	-2,254	-1.83% ✅

Unchanged (delta < 1.0% or delta < 100.0ms) (41)

name	old	new	delta	delta_pct
AST.NumASTBytesAllocated	35,873,006,810	35,530,282,500	-342,724,310	-0.96%
AST.NumLinkLibraries	0	0	0	0.0%
AST.NumObjCMethods	8,483	8,431	-52	-0.61%
AST.NumPostfixOperators	18	18	0	0.0%
AST.NumPrefixOperators	45	45	0	0.0%
AST.NumReferencedDynamicNames	84	84	0	0.0%
AST.NumSourceBuffers	312,077	309,418	-2,659	-0.85%
Driver.DriverDepCascadingDynamic	0	0	0	0.0%
Driver.DriverDepCascadingExternal	0	0	0	0.0%
Driver.DriverDepCascadingMember	0	0	0	0.0%
Driver.DriverDepCascadingNominal	0	0	0	0.0%
Driver.DriverDepCascadingTopLevel	0	0	0	0.0%
Driver.DriverDepDynamic	0	0	0	0.0%
Driver.DriverDepExternal	0	0	0	0.0%
Driver.DriverDepMember	0	0	0	0.0%
Driver.DriverDepNominal	0	0	0	0.0%
Driver.DriverDepTopLevel	0	0	0	0.0%
Driver.NumDriverJobsSkipped	0	0	0	0.0%
Driver.NumProcessFailures	0	0	0	0.0%
Frontend.NumProcessFailures	0	0	0	0.0%
IRModule.NumIRComdatSymbols	0	0	0	0.0%
IRModule.NumIRIFuncs	0	0	0	0.0%
SILModule.NumSILGenDefaultWitnessTables	0	0	0	0.0%
SILModule.NumSILGenVtables	10,360	10,259	-101	-0.97%
SILModule.NumSILOptDefaultWitnessTables	0	0	0	0.0%
Sema.AccessLevelRequest	6,396,310	6,384,573	-11,737	-0.18%
Sema.AttachedFunctionBuilderRequest	0	0	0	0.0%
Sema.CollectOverriddenDeclsRequest	4,036,778	4,037,402	624	0.02%
Sema.CursorInfoRequest	0	0	0	0.0%
Sema.CustomAttrNominalRequest	0	0	0	0.0%
Sema.FunctionBuilderTypeRequest	0	0	0	0.0%
Sema.InheritedTypeRequest	156,246	154,695	-1,551	-0.99%
Sema.LazyStoragePropertyRequest	1,021	1,021	0	0.0%
Sema.NumLazyIterableDeclContexts	5,246,627	5,265,199	18,572	0.35%
Sema.PropertyWrapperTypeInfoRequest	0	0	0	0.0%
Sema.ProvideDefaultImplForRequest	4,036,778	4,037,402	624	0.02%
Sema.RangeInfoRequest	0	0	0	0.0%
Sema.RequirementRequest	56,087	55,582	-505	-0.9%
Sema.StructuralTypeRequest	0	0	0	0.0%
Sema.TypeDeclsFromWhereClauseRequest	24,328	24,176	-152	-0.62%
Sema.USRGenerationRequest	4,593,868	4,580,484	-13,384	-0.29%

Release

release brief

Regressed (0)

name	old	new	delta	delta_pct

Improved (3)

name	old	new	delta	delta_pct
Frontend.NumInstructionsExecuted	24,781,453,740,579	23,971,854,504,912	-809,599,235,667	-3.27% ✅
LLVM.NumLLVMBytesOutput	751,584,864	740,215,546	-11,369,318	-1.51% ✅
time.swift-driver.wall	4356.1s	4186.8s	-169.3s	-3.89% ✅

Unchanged (delta < 1.0% or delta < 100.0ms) (0)

name	old	new	delta	delta_pct

release detailed

Regressed (0)

name	old	new	delta	delta_pct

Improved (21)

name	old	new	delta	delta_pct
AST.NumLoadedModules	16,695	15,851	-844	-5.06% ✅
AST.NumTotalClangImportedEntities	677,254	659,369	-17,885	-2.64% ✅
IRModule.NumIRBasicBlocks	2,838,957	2,742,722	-96,235	-3.39% ✅
IRModule.NumIRFunctions	1,344,913	1,314,329	-30,584	-2.27% ✅
IRModule.NumIRGlobals	1,556,920	1,528,147	-28,773	-1.85% ✅
IRModule.NumIRInsts	27,254,444	26,567,898	-686,546	-2.52% ✅
IRModule.NumIRValueSymbols	2,684,445	2,629,158	-55,287	-2.06% ✅
LLVM.NumLLVMBytesOutput	751,584,864	740,215,546	-11,369,318	-1.51% ✅
SILModule.NumSILGenFunctions	554,213	538,761	-15,452	-2.79% ✅
SILModule.NumSILOptFunctions	780,723	750,116	-30,607	-3.92% ✅
Sema.NumConformancesDeserialized	1,991,132	1,917,151	-73,981	-3.72% ✅
Sema.NumConstraintScopes	13,988,821	8,346,614	-5,642,207	-40.33% ✅
Sema.NumDeclsDeserialized	5,656,320	5,333,343	-322,977	-5.71% ✅
Sema.NumDeclsValidated	843,337	821,401	-21,936	-2.6% ✅
Sema.NumFunctionsTypechecked	248,501	243,191	-5,310	-2.14% ✅
Sema.NumGenericSignatureBuilders	154,293	148,819	-5,474	-3.55% ✅
Sema.NumLazyGenericEnvironments	1,100,209	1,034,259	-65,950	-5.99% ✅
Sema.NumLazyGenericEnvironmentsLoaded	18,996	17,914	-1,082	-5.7% ✅
Sema.NumLazyIterableDeclContexts	702,215	683,358	-18,857	-2.69% ✅
Sema.NumTypesDeserialized	2,824,945	2,655,052	-169,893	-6.01% ✅
Sema.NumTypesValidated	523,620	515,302	-8,318	-1.59% ✅

Unchanged (delta < 1.0% or delta < 100.0ms) (0)

name	old	new	delta	delta_pct

gregomni · 2019-07-21T15:57:56Z

@swift-ci Please Test Source Compatibility

xedin

Honestly after trying so many different things I'm starting to think that the only way to properly fix this would be attempting literal bindings earlier and backtracking if they don't match keeping a set of already attempted types. The first step is to remove LinkedExprAnalyzer completely and go from there... Another reason why we need to get rid of all of the perf hacks in CSGen is diagnostics, in "diagnostic mode" solver should be able to attempt all possible overloads and bindings to be able to find a solution.

The reason why we can't solve operators efficiently is related to the fact that overloads like (Self, Self) -> Self when opened create a single type variable for parameters and a result type, which makes the constraint system un-splittable into separate components. Possible ways to improve that I see is to either open Self as separate type variables in different positions and then have a separate solver step to Type::join them together (and possibly re-attempt some components with correct contextual type), or attempt bindings for arguments earlier, especially for the literals, because literal expressions have a type variable type with a literal protocol conformance which then forms <T_param> arg conv <T_arg> constraints so the graph could only be disconnected only if one of the sides gets a type.

xedin · 2019-07-22T20:30:34Z

lib/Sema/CSGen.cpp

-
+
+  bool expandCollectedTypesFromOperatorOverloads(ConstraintSystem &CS,
+                                                 LinkedTypeInfo &lti) {


I'm actually hoping that we could remove all of this code from CSGen instead of adding more :)

gregomni · 2019-07-22T22:54:25Z

@xedin I understand your hesitance about doing more pre-checking but I think narrowing the number of possibilities in large overload disjunctions is a productive way forward. I'm traveling the next few days, but I'd like to investigate the source compatibility failures here when I have the chance, and then try to convince you that this is a worthwhile thing to do once they are taken care of.

Meanwhile, if you think the commits on delaying operator tyvar merging (the first 5 up through 4ef4342604d79e5e19398fc8bcd1dbf015335974) are worth contributing as a separate PR, I think those stand on their own.

Finally, I think a good future step would be to do this sort of thing in the solver itself rather than as a pre-step in CSGen. I imagine a separate possible-overloads disjunction constraint type, which references the tyvars for the operator arguments and attempts simplification by reducing the set of valid overloads as the argument tyvars get more constrained. In these arithmetic expressions it really seems like the problem is the large branching factor in constraint scopes, and reducing the fanout helps a lot.

xedin · 2019-07-22T23:03:37Z

I understand your hesitance about doing more pre-checking but I think narrowing the number of possibilities in large overload disjunctions is a productive way forward.

That's what we keep trying to do and keep finding edge cases with. I think it's time for us to take a step back and try to see if it could be done in the solver instead. @DougGregor was working on similar ideas, he might be able to share some of his thinking. @rudkx was investigating a different approach with designated operators.

gregomni · 2019-07-28T21:37:54Z

@swift-ci Please Test Source Compatibility

gregomni · 2019-07-30T22:37:08Z

@swift-ci Please Test Source Compatibility

gregomni · 2019-07-31T15:21:15Z

Hmm. Getting identical failures when tried on the nightly toolchain (swift-DEVELOPMENT-SNAPSHOT-2019-07-30-a.xctoolchain) for Sourcery that seems unrelated to my changes:

SIL verification failed: Unknown formal access pattern: storage
Verifying instruction:
   %7 = argument of bb1 : $*Optional<EJSTemplate.Error> // user: %95
->   %95 = begin_access [read] [dynamic] [no_nested_conflict] %7 : $*Optional<EJSTemplate.Error> // users: %96, %97
     %96 = load %95 : $*Optional<EJSTemplate.Error> // users: %131, %142, %105, %98
     end_access %95 : $*Optional<EJSTemplate.Error> // id: %97

gregomni · 2019-07-31T18:18:49Z

@swift-ci Please smoke test

gregomni · 2019-07-31T18:19:33Z

@swift-ci Please test source compatibility.

gregomni · 2019-08-02T16:35:51Z

@swift-ci Please test source compatibility.

gregomni · 2019-08-03T03:38:24Z

@swift-ci Please smoke test.

gregomni · 2019-08-03T03:49:20Z

Resolves: SR-10324, SR-10461, SR-10601, SR-10606.

So with this PR we'd no longer immediately bind arithmetic operators to be identical (SR-10324), but at the same time it greatly improves type checking time for expressions with lots of operators, especially those involving simd. (And passes source compatibility except for the current RxSwift incompatible with weak references error that ought to be an XFAIL.)

gregomni · 2019-08-03T06:27:25Z

@swift-ci Please smoke test.

xedin · 2019-08-18T18:11:35Z

I will put some time aside to take a look at this upcoming week.

gregomni · 2019-08-28T17:00:14Z

Just a reminder to review when you have a chance.

xedin · 2019-08-30T22:06:36Z

Sorry @gregomni I have this open and ready just need to finish up some other stuff first before I dig in again.

xedin

I see the benefit but I'm still skeptical about this approach. The problem for me, as I mentioned before, is the fact that merging of type variables associated with literal arguments shouldn't happen during constraint generation. If we can avoid that it would spare us a lot of work which is currently done upfront e.g. trying to collect all possible types for literal arguments, disabling overloads.

As an alternative approach - while solving, we could try to re-arrange choices in other disjunctions to much the previous choice taken for the same operator e.g. 1 + 2 + 3 binding first + to (Int, Int) -> Int would bubble up (Int, Int) -> Int to the top so it could be attempted first for the second + as well.

It seems such behavior could imitate this (LinkedExprAnalyzer) optimization almost exactly for "concrete" overloads because there is another "hack" in the solver which doesn't attempt generic overloads when there is a concrete match already available.

xedin · 2019-09-04T20:26:23Z

lib/Sema/CSGen.cpp

+    }
+
+    for (auto convertTy : lti.collectedTypes) {
+      if (CS.TC.isConvertibleTo(convertTy, type, CS.DC))


This isConvertiableTo (which creates new constraint system every time) and conformsToProtocol are pretty heavy checks which, in my opinion, we are better off only performing while solving for each individual overload.

There is a long standing improvement opportunity where determineBestBindings could actually check whether found type conforms to all of the protocols associated with its type variable via {Literal}ConformsTo constraint and avoid that binding.

xedin · 2019-09-04T20:35:22Z

lib/Sema/CSGen.cpp

+
+          bool possibleDecl = true;
+
+          if (auto generic = resultTy->getAs<GenericTypeParamType>()) {


Just a note - overloads like (Self, Self) -> Self or (T, T) -> T are the biggest problem is at the moment - inability to filter out generic overloads which connect arguments to a result and prevent efficient solving of disconnected components, which means we have to explore a lot deeper into solution trees.

This could be simplified down to

if (llvm::any_of(fnType->getParams(), [&resultTy](const AnyFunctionType::Param &param) { return resultTy->isEqual(param.getPlainType()); }) continue;

Just a note - overloads like (Self, Self) -> Self or (T, T) -> T are the biggest problem is at the moment - inability to filter out generic overloads which connect arguments to a result and prevent efficient solving of disconnected components, which means we have to explore a lot deeper into solution trees.

These types of overloads can't be ruled out as possibilities up front, nor disconnected, but this PR (by favoring choosing duplicate bindings of existing bindings and by merging those type variables once a solution exists) makes it so that the solution tree gets pruned to be much less branchy once we find a solution. The depth is unaffected, but the breadth gets vastly narrower.

Sorry but I don't understand what you are trying to say, the problem is that in the worse case, especially when expression is invalid e.g 1 + "hello", we'd still get to explore almost everything just because generic overloads like (Self, Self) -> Self carry almost no context for the bindings and we can't verify conformances until generic parameter is actually bound to something.

Well, with this PR, you'd disable every possible overload of + except for the generic (Self, Self) -> Self style, because all the non-generic ones are impossible. So you explore much less. I was trying to say that this is a big performance win, even when you can't do anything efficient with the generic style ones directly.

What I'm trying to say is that presence of overloads with concrete types is not a problem if arguments are concrete as well or could be bound to something relatively easily.

Like in the examples you have in the PR, all of the arguments are going to be concrete so it's pretty easy for the solver to avoid going deeper if argument conversions could be tested and rejected right away.

xedin · 2019-09-04T20:42:54Z

lib/Sema/CSGen.cpp

+        continue;
+
+      llvm::TinyPtrVector<Constraint *> disjunctions =
+          CS.getConstraintGraph().gatherConstraints(


Performance wise - it would be better to invert that check and have outer loop be for every disjunction constraint, at least it's going to be linear over a number of constraints in the system.

gregomni · 2019-09-05T18:17:42Z

@xedin

I see the benefit but I'm still skeptical about this approach. The problem for me, as I mentioned before, is the fact that merging of type variables associated with literal arguments shouldn't happen during constraint generation. If we can avoid that it would spare us a lot of work which is currently done upfront e.g. trying to collect all possible types for literal arguments, disabling overloads.

Sorry, I'm not sure I understand this comment. Can you expand, please? There isn't any merging of type variables happening during constraint generation here. (The old -- sr-10324 bug causing -- merging of operator overload type variables is removed as part of this PR.)

This does examine all the literals in order to determine which operator overloads can be disabled as impossible in this expression, but the literal type variables aren't touched and can still potentially be solved as different types from each other.

As an alternative approach - while solving, we could try to re-arrange choices in other disjunctions to much the previous choice taken for the same operator e.g. 1 + 2 + 3 binding first + to (Int, Int) -> Int would bubble up (Int, Int) -> Int to the top so it could be attempted first for the second + as well.

This is part of what is already in this PR. If we've chosen (Int, Int) -> Int for the first binding, then the (Int, Int) -> Int binding becomes preferred when looking at the second overload disjunction.

xedin · 2019-09-05T19:21:10Z

This does examine all the literals in order to determine which operator overloads can be disabled as impossible in this expression, but the literal type variables aren't touched and can still potentially be solved as different types from each other.

Sorry I was referring to the old behavior which was problematic. I understand that new logic is trying to disable some overloads upfront, but I just don't think that CSGen is a good place for that since it could be done while solving without the overhead of examining overload sets.

This is part of what is already in this PR. If we've chosen (Int, Int) -> Int for the first binding, then the (Int, Int) -> Int binding becomes preferred when looking at the second overload disjunction.

Favoring overloads but itself is not enough, I think to get the biggest win here and be 1-1 with previous behavior solver would have to - get newly favored overload to the top of the list, and adjust selectBestBindingDisjunction to pick the same kind of operator next if it's adjacent. That way we can quickly test the same overload for all adjacent operators in the chain and stop if it matches. I think with that strategy we can remove LinkedExprAnalyzer completely.

gregomni · 2019-09-05T22:08:04Z

Sorry I was referring to the old behavior which was problematic. I understand that new logic is trying to disable some overloads upfront, but I just don't think that CSGen is a good place for that since it could be done while solving without the overhead of examining overload sets.

Well, there are two pieces, there is (1) determining the subset of all types that the expression could possibly involve, and then there is (2) avoiding taking paths in the solution tree that involve impossible types.

(1) is a property of the expression as a whole, not any particular variable or constraint, and so I don't see a reasonable way to do it during solving. It seems clearly to belong during CSGen to me. On the other hand, it would be possible to have the constraint system hold the set of possible types and do (2), the disjunction path filtering, entirely during solving. But the code for determining which overloads are skippable given the set of possible types is very similar to determining the set of types in the first place, and so sharing code and containing it and the set of types in one place seems better to me.

I think of it as generating a less bushy tree in the first place, which seems a reasonable job for CSGen to be doing.

Favoring overloads but itself is not enough, I think to get the biggest win here and be 1-1 with previous behavior solver would have to - get newly favored overload to the top of the list, and adjust selectBestBindingDisjunction to pick the same kind of operator next if it's adjacent. That way we can quickly test the same overload for all adjacent operators in the chain and stop if it matches. I think with that strategy we can remove LinkedExprAnalyzer completely.

I did effectively that, favoring the matching overload (which in these sorts of expressions is almost always the only favored choice, then), and sorting disjunctions by minimum-favored. The solver immediately picked all the same operators.

It turns out that this was not enough to solve several of the SIMD expressions in tests once the old LinkedExprAnalyzer overload type-var merging was removed. So there were several test failures taking too much time to solve.

It also turns out that, once the new LinkedExprAnalyzer here was in, the extra work to pick an optimum disjunction order was a net performance loss, so I removed it in e18f1355f1c6267e53ab2e17b83180704ffdafd1.

xedin · 2019-09-05T22:23:43Z

Here is how I see this - (1) is impossible to do properly in CSGen in presence of generic overloads, because it was possible we could use a classic algorithm with deterministic type variable ordering and solver wouldn't have any exponential behavior; (2) Solver is already pretty good at not taking any path where both sides of the argument conversion constraint or result are concrete, so pruning not a problem in this case, but when it comes to generic overloads all bets off at the moment.

It turns out that this was not enough to solve several of the SIMD expressions in tests once the old LinkedExprAnalyzer overload type-var merging was removed. So there were several test failures taking too much time to solve.

This is exactly the main problem we are yet to solve which no hacks or tweaks to thereof can. I think we should concentrate on trying to solve a problem related to over-exploration in case of generic overloads, once that is done we could remove all of the existing hacks as obsolete.

One of the ways to fix this problem for literal arguments - is to attempt default literal types earlier and backtrack if they don't match.

…rload is (T,T)->T, with no (T,T)->U

…e types match existing binding choices

…n to speed up further searching

…eadability.

gregomni · 2019-09-06T16:51:58Z

Did the cleanup you suggested, rebased to HEAD, fixed for a change in the simplifyLocatorToAnchor() API.

Here is how I see this - (1) is impossible to do properly in CSGen in presence of generic overloads, because it was possible we could use a classic algorithm with deterministic type variable ordering and solver wouldn't have any exponential behavior; (2) Solver is already pretty good at not taking any path where both sides of the argument conversion constraint or result are concrete, so pruning not a problem in this case, but when it comes to generic overloads all bets off at the moment.

Determining subset of types (1) is actually trivial (but unhelpful) for generic overloads. If it takes Self and returns Self then the possible types are exactly the same as they would be in the absence of the operator altogether. So it's true that these overloads are never disabled, but they are easy to deal with in the algorithm.

As for (2): I think both the scope performance counter and the list of bugs which the solver previously failed to solve and with this PR successfully solves shows that the solver can still use some help here.

This is exactly the main problem we are yet to solve which no hacks or tweaks to thereof can. I think we should concentrate on trying to solve a problem related to over-exploration in case of generic overloads, once that is done we could remove all of the existing hacks as obsolete.

I understand that this PR does not solve what you see as the main problem. It greatly ameliorates the situation by improving performance and solving expressions that weren't getting solved otherwise. It also replaces a hack that had incorrect language semantics - erroring in the case of valid code (SR-10324) - with (what you see as) a different hack that doesn't have that problem.

I don't think we know how to solve the main problem right now. Wouldn't it be better to successfully compile more valid code until we figure it out?

gregomni · 2019-09-06T16:52:16Z

@swift-ci Please smoke test.

xedin · 2019-09-06T16:58:09Z

I don't think we know how to solve the main problem right now. Wouldn't it be better to successfully compile more valid code until we figure it out?

The reason why I'm hesitant to take these changes is as follows - every time we try to tweak CSGen/solver hacks we get either more "too complex" reports or source breaks. I can't prove to myself that this is not going to cause either. At this point I'm happy to maintain the status quo until we can up with principled solution which doesn't require any hacks even though some of the valid code is going to stay broken meanwhile.

…disjunctions by testing last opened typevar.

…at max size.

gregomni · 2019-09-14T00:42:32Z

@swift-ci Please test source compatibility.

CodaFi · 2020-04-15T00:11:21Z

@gregomni I'm going to close this out due to age and inactivity. Thank you for taking a look here.

gregomni requested a review from xedin July 14, 2019 17:24

xedin previously requested changes Jul 15, 2019

View reviewed changes

gregomni changed the title ~~[Constraints] Add check to be more cautious with optimizing series of arithmetic binops~~ [ConstraintSystem] Try to be more efficient solving linked operator expressions Jul 19, 2019

gregomni force-pushed the sr-10324 branch from 2c5c4fa to bcf455a Compare July 19, 2019 21:51

gregomni requested a review from xedin July 19, 2019 22:37

xedin reviewed Jul 22, 2019

View reviewed changes

gregomni force-pushed the sr-10324 branch from bcf455a to af0f97a Compare July 30, 2019 22:35

xedin reviewed Sep 4, 2019

View reviewed changes

gregomni added 6 commits September 6, 2019 08:52

Add condition for optimizing series of binops that every possible ove…

a4de4c1

…rload is (T,T)->T, with no (T,T)->U

Instead of chaining binops, favor disjunctions with op overloads whos…

1bf5744

…e types match existing binding choices

Favor operators based on existing binds via both basename and type

f56cbf3

Merge typevars of operators with shared decls after finding a solutio…

385eebb

…n to speed up further searching

Slight simplification of the operator merge logic

136b5de

Dump disjunction possibilities on multiple aligned lines for better r…

fcf742c

…eadability.

gregomni force-pushed the sr-10324 branch from 824e827 to 73eb486 Compare September 6, 2019 16:27

gregomni added 4 commits September 10, 2019 08:40

Extra work in picking disjunction not necessary.

9201731

Fix for API change in simplifyLocatorToAnchor()

217be63

Remove LinkedExprAnalyzer completely. Short-circuit generic overload …

6507563

…disjunctions by testing last opened typevar.

scale-test can't fit a curve to these now, converted to simple tests …

1584472

…at max size.

gregomni force-pushed the sr-10324 branch from 73eb486 to 1584472 Compare September 13, 2019 23:41

xedin mentioned this pull request Feb 7, 2020

[Sema] Convert LinkedExprAnalyzer if chain to ASTVisitor #29706

Closed

CodaFi closed this Apr 15, 2020

gregomni deleted the sr-10324 branch August 18, 2020 23:13

hborla mentioned this pull request Oct 22, 2020

[Constraint System] Implement heuristics for linked operator expressions in the solver proper. #34399

Merged

gregomni mentioned this pull request Jul 14, 2019

[SR-10324] Unexpected three-term "Ambiguous reference" #52724

Open



		bool expandCollectedTypesFromOperatorOverloads(ConstraintSystem &CS,
		LinkedTypeInfo &lti) {


		bool possibleDecl = true;

		if (auto generic = resultTy->getAs<GenericTypeParamType>()) {

[ConstraintSystem] Try to be more efficient solving linked operator expressions #26140

[ConstraintSystem] Try to be more efficient solving linked operator expressions #26140

Uh oh!

Conversation

gregomni commented Jul 14, 2019

Uh oh!

gregomni commented Jul 14, 2019

Uh oh!

xedin left a comment

Choose a reason for hiding this comment

Uh oh!

gregomni commented Jul 15, 2019

Uh oh!

xedin commented Jul 15, 2019

Uh oh!

xedin commented Jul 16, 2019

Uh oh!

gregomni commented Jul 19, 2019

Uh oh!

gregomni commented Jul 19, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gregomni commented Jul 19, 2019

Uh oh!

gregomni commented Jul 19, 2019

Uh oh!

gregomni commented Jul 20, 2019

Uh oh!

swift-ci commented Jul 21, 2019

Summary for master full

Debug-batch

debug-batch brief

debug-batch detailed

Release

release brief

release detailed

Uh oh!

gregomni commented Jul 21, 2019

Uh oh!

xedin left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gregomni commented Jul 22, 2019

Uh oh!

xedin commented Jul 22, 2019

Uh oh!

gregomni commented Jul 28, 2019

Uh oh!

gregomni commented Jul 30, 2019

Uh oh!

gregomni commented Jul 31, 2019

Uh oh!

gregomni commented Jul 31, 2019

Uh oh!

gregomni commented Jul 31, 2019

Uh oh!

gregomni commented Aug 2, 2019

Uh oh!

gregomni commented Aug 3, 2019

Uh oh!

gregomni commented Aug 3, 2019

Uh oh!

gregomni commented Aug 3, 2019

Uh oh!

xedin commented Aug 18, 2019

Uh oh!

gregomni commented Aug 28, 2019

Uh oh!

xedin commented Aug 30, 2019

Uh oh!

xedin left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

gregomni commented Jul 19, 2019 •

edited

Loading