Skip to content

[ConstraintSystem] Try to be more efficient solving linked operator expressions #26140

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

gregomni
Copy link
Contributor

Explicitly check that every possible overload of the binary operator whose two arguments are the same type also results in the same type. (That is, for any (T,T) args (T,T)->T, with no (T,T)->U.)

Resolves SR-10324.

The problem in that issue is that (x * y) * z requires the two multiplications to use two different overloads (A,A)->B in the parens and then (B,A)->B outside. But during constraint optimization, we see a series of binops with all the same type underlying variable types (A,A,A) and assume that we'll use the same overload for all of them (merging the type variables for both *s). Once we've done that, the constraint system is unsolvable.

This PR adds an additional check that none of the potential overloads has the form (T,T)->U, the existence of which is what breaks the optimization assumption.

Not done here, but I think that with this additional check it becomes safe to remove isArithmeticOperatorDecl and resolve this comment:

            // TODO: We currently limit this optimization to known arithmetic
            // operators, but we should be able to broaden this out to
            // logical operators as well.

@gregomni
Copy link
Contributor Author

@swift-ci Please smoke test.

@gregomni gregomni requested a review from xedin July 14, 2019 17:24
xedin
xedin previously requested changes Jul 15, 2019
Copy link
Contributor

@xedin xedin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have attempted similar solution already since I'm tracking multiple dupes of this issue, the problem is that this makes some expressions too complex in our internal suites and I think it used to happen in public source compat suite as well :(

I haven't had time since but if you really want to fix this, LinkedExprAnalyzer has to get removed from CSGen and its logic has to get ported over to the solver itself, this way we wouldn't have a problem with eagerness (and assumption that all of the overloads are homogeneous) and the same performance in the common case.

@gregomni
Copy link
Contributor Author

@xedin Ah, makes sense. Trying the public source-compat here, because it would be convenient to have a real code example to look at that fails on complexity, but then I'll go take a look at LinkedExprAnalyzer and see what I can do.

@swift-ci Please Test Source Compatibility

@xedin
Copy link
Contributor

xedin commented Jul 15, 2019

I think you can add a custom overload to + and some of the tests in validation-test/Sema/type_checker_perf would fail. import Foundation and import simd make huge different too since they load a bunch of overloads to arithmetic operators.

@xedin
Copy link
Contributor

xedin commented Jul 16, 2019

@gregomni For inspiration, here is one of my attempts to get rid of LinkedExprAnalyzer and other hacks - #15246

@gregomni gregomni changed the title [Constraints] Add check to be more cautious with optimizing series of arithmetic binops [ConstraintSystem] Try to be more efficient solving linked operator expressions Jul 19, 2019
@gregomni
Copy link
Contributor Author

The couple days ago commits here essentially do as you (@xedin) suggest and moves the linked operator logic into the solver itself.

During overload selection on a disjunction, any overload for an operator which is either already bound in the constraint system (e.g. we're already using this exact +) or has identical types to an operator bound in the constraint system (e.g. we're using a * with these exact types) gets favored.

Also during selection of which disjunction gets picked next, it sorts on number of favored possibilities (if any) as well, so disjunctions with many possibilities but 1 (or a few) favored get explored before disjunctions with few possibilities over all but none favored.

With these changes the solver tends to search (approximately) depth-first down branches of matching operators.

Finally, when a solution is found with a given operator disjunction choice, all the typevars for any bindings in that solution to that exact overload are then merged (for examining any other possibilities).

So the result is that the typevars aren't eagerly bound together, but the solver looks at same-overloads in operator expressions first, and if it is able to form a solution, the typevars are bound together afterwards to greatly cut down the remaining search space (identical to the original eager optimization, but delayed, and potentially piece-wise).

This works to fix sr-10324, the original goal here, but maintains almost all of the speed of the original optimization in the 'normal' arithmetical cases.

@gregomni
Copy link
Contributor Author

gregomni commented Jul 19, 2019

It turns out that "almost all of the speed" wasn't enough for validation-test/Sema/type_checker_perf/fast/simd_add.gyb, a long addition expression involving SIMD4<Float>s. Delaying the eager optimization was enough to break this test case because there are just so many overloads of + to look at before finding the solution and binding all occurrences together. So that got me looking at SIMD.

Today's commits here repurposes LinkedExprAnalyzer. The idea is to look at the known argument types and literals of the operators in the expression and to form a closed algebra containing all of the types that the intermediates and final result could be. (E.g. a simple example is if we're doing ordinary arithmetic involving integer literals, we could be dealing with any of the Int or UInt or Float types, but there's no way with any of those starting types and possible operators to get a String result, so String is outside of the domain & range of the algebra for the expression.)

If we're able to do that (form a closed domain of types), then we can disable all of the potential overloads for the operators that don't involve types in the domain and not explore them at all, because they are impossible. This is a very productive optimization when SIMD is involved.

E.g. SR-10461:

% time xcrun swiftc -typecheck sr10461.swift                                                  
sr10461.swift:4:28: error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions
    let i = 3*(-p0 + 3*p1) - (3*p2 + p3)
            ~~~~~~~~~~~~~~~^~~~~~~~~~~~~
xcrun swiftc -typecheck sr10461.swift  16.92s user 0.44s system 99% cpu 17.409 total

vs

% time xcrun build/Ninja-ReleaseAssert/swift-macosx-x86_64/bin/swiftc -typecheck sr10461.swift
xcrun build/Ninja-ReleaseAssert/swift-macosx-x86_64/bin/swiftc -typecheck   8.95s user 0.30s system 99% cpu 9.293 total

It's still a bit buggy with this code not correctly determining in a few circumstances that it can't prove the domain is actually closed. (And thus it disables overloads that it should leave alone.) Should work those out soon.

@gregomni
Copy link
Contributor Author

@swift-ci Please test compiler performance

@gregomni
Copy link
Contributor Author

This all works now. There are 4 validation tests that fail. They all include -solver-enable-operator-designated-types, and that setting overlaps and interferes somewhat with the disjunction handling added here. Removing the special -solver switches from those 4 tests cause them all to compile correctly, but since they are mostly scaling tests and I can't access the radars to tell what their purpose is, I'm not sure if or how I should modify those test files. They are:

validation-test/Sema/type_checker_perf/fast/rdar18360240.swift.gyb
validation-test/Sema/type_checker_perf/fast/rdar23327871.swift.gyb
validation-test/Sema/type_checker_perf/fast/rdar25866240.swift.gyb
validation-test/Sema/type_checker_perf/fast/rdar46687985.swift

@gregomni gregomni requested a review from xedin July 19, 2019 22:37
@gregomni gregomni dismissed xedin’s stale review July 19, 2019 22:38

Many changes, asking for new review.

@gregomni
Copy link
Contributor Author

@swift-ci Please test compiler performance

@swift-ci
Copy link
Contributor

Summary for master full

Unexpected test results, excluded stats for Tagged, NIO, Deferred, Kommander, Dollar, IBAnimatableApp, CoreStore, ProcedureKit, swiftlint, Alamofire, Wordy, SwifterSwift

Regressions found (see below)

Debug-batch

debug-batch brief

Regressed (0)
name old new delta delta_pct
Improved (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 24,353,995,254,420 23,760,829,848,130 -593,165,406,290 -2.44% ✅
LLVM.NumLLVMBytesOutput 884,275,700 866,509,244 -17,766,456 -2.01% ✅
time.swift-driver.wall 2413.3s 2324.7s -88.6s -3.67% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (0)
name old new delta delta_pct

debug-batch detailed

Regressed (5)
name old new delta delta_pct
Sema.ExtendedNominalRequest 2,809,687 2,869,631 59,944 2.13% ⛔
Sema.InheritedDeclsReferencedRequest 3,217,162 3,464,454 247,292 7.69% ⛔
Sema.NumConformancesDeserialized 5,023,442 5,197,107 173,665 3.46% ⛔
Sema.NumUnloadedLazyIterableDeclContexts 3,394,621 3,435,185 40,564 1.19% ⛔
Sema.SelfBoundsFromWhereClauseRequest 4,167,500 4,563,830 396,330 9.51% ⛔
Improved (79)
name old new delta delta_pct
AST.NumDecls 71,968 70,346 -1,622 -2.25% ✅
AST.NumDependencies 194,316 190,198 -4,118 -2.12% ✅
AST.NumInfixOperators 25,810 25,223 -587 -2.27% ✅
AST.NumLoadedModules 238,616 233,374 -5,242 -2.2% ✅
AST.NumLocalTypeDecls 104 102 -2 -1.92% ✅
AST.NumPrecedenceGroups 13,071 12,848 -223 -1.71% ✅
AST.NumReferencedMemberNames 3,114,111 3,047,053 -67,058 -2.15% ✅
AST.NumReferencedTopLevelNames 232,133 227,135 -4,998 -2.15% ✅
AST.NumSourceLines 2,154,179 2,105,197 -48,982 -2.27% ✅
AST.NumSourceLinesPerSecond 1,896,636 1,834,648 -61,988 -3.27% ✅
AST.NumTotalClangImportedEntities 3,633,699 3,584,922 -48,777 -1.34% ✅
Driver.ChildrenMaxRSS 126,032,842,752 120,826,662,912 -5,206,179,840 -4.13% ✅
Driver.NumDriverJobsRun 15,335 15,072 -263 -1.72% ✅
Driver.NumDriverPipePolls 81,664 80,491 -1,173 -1.44% ✅
Driver.NumDriverPipeReads 79,500 78,598 -902 -1.13% ✅
Frontend.MaxMallocUsage 735,380,575,888 723,440,312,576 -11,940,263,312 -1.62% ✅
Frontend.NumInstructionsExecuted 24,353,995,254,420 23,760,829,848,130 -593,165,406,290 -2.44% ✅
IRModule.NumIRAliases 91,961 90,295 -1,666 -1.81% ✅
IRModule.NumIRBasicBlocks 3,439,178 3,358,341 -80,837 -2.35% ✅
IRModule.NumIRFunctions 1,599,879 1,564,689 -35,190 -2.2% ✅
IRModule.NumIRGlobals 1,766,744 1,735,873 -30,871 -1.75% ✅
IRModule.NumIRInsts 46,635,371 45,841,392 -793,979 -1.7% ✅
IRModule.NumIRNamedMetaData 74,090 73,110 -980 -1.32% ✅
IRModule.NumIRValueSymbols 2,973,529 2,912,864 -60,665 -2.04% ✅
LLVM.NumLLVMBytesOutput 884,275,700 866,509,244 -17,766,456 -2.01% ✅
Parse.NumFunctionsParsed 117,680 115,461 -2,219 -1.89% ✅
Parse.NumIterableDeclContextParsed 852,766 838,308 -14,458 -1.7% ✅
SILModule.NumSILGenFunctions 789,715 770,368 -19,347 -2.45% ✅
SILModule.NumSILGenGlobalVariables 30,553 30,115 -438 -1.43% ✅
SILModule.NumSILGenWitnessTables 34,768 33,498 -1,270 -3.65% ✅
SILModule.NumSILOptFunctions 1,132,762 1,104,550 -28,212 -2.49% ✅
SILModule.NumSILOptGlobalVariables 31,158 30,710 -448 -1.44% ✅
SILModule.NumSILOptVtables 16,598 16,421 -177 -1.07% ✅
SILModule.NumSILOptWitnessTables 73,636 70,925 -2,711 -3.68% ✅
Sema.AttachedPropertyWrapperTypeRequest 250,678 245,067 -5,611 -2.24% ✅
Sema.AttachedPropertyWrappersRequest 1,547,914 1,505,816 -42,098 -2.72% ✅
Sema.DefaultAndMaxAccessLevelRequest 39,275 38,489 -786 -2.0% ✅
Sema.DefaultDefinitionTypeRequest 4,012 3,874 -138 -3.44% ✅
Sema.DefaultTypeRequest 169,639 164,158 -5,481 -3.23% ✅
Sema.EnumRawTypeRequest 10,896 10,654 -242 -2.22% ✅
Sema.ExistentialConformsToSelfRequest 12,442 12,316 -126 -1.01% ✅
Sema.IsDynamicRequest 801,220 780,045 -21,175 -2.64% ✅
Sema.IsFinalRequest 2,192,229 2,154,325 -37,904 -1.73% ✅
Sema.IsGetterMutatingRequest 219,648 215,255 -4,393 -2.0% ✅
Sema.IsObjCRequest 762,521 747,806 -14,715 -1.93% ✅
Sema.IsSetterMutatingRequest 178,453 174,600 -3,853 -2.16% ✅
Sema.MangleLocalTypeDeclRequest 208 204 -4 -1.92% ✅
Sema.NamedLazyMemberLoadFailureCount 16,019 15,749 -270 -1.69% ✅
Sema.NamedLazyMemberLoadSuccessCount 15,849,402 15,508,546 -340,856 -2.15% ✅
Sema.NominalTypeLookupDirectCount 18,840,925 18,542,344 -298,581 -1.58% ✅
Sema.NumConstraintScopes 14,212,758 8,655,258 -5,557,500 -39.1% ✅
Sema.NumConstraintsConsideredForEdgeContraction 49,038,944 35,276,972 -13,761,972 -28.06% ✅
Sema.NumDeclsDeserialized 39,947,429 39,421,601 -525,828 -1.32% ✅
Sema.NumDeclsFinalized 317,643 310,764 -6,879 -2.17% ✅
Sema.NumDeclsTypechecked 682,073 666,737 -15,336 -2.25% ✅
Sema.NumDeclsValidated 1,212,445 1,184,675 -27,770 -2.29% ✅
Sema.NumFunctionsTypechecked 247,028 241,721 -5,307 -2.15% ✅
Sema.NumGenericSignatureBuilders 751,804 738,885 -12,919 -1.72% ✅
Sema.NumLazyGenericEnvironments 7,259,752 7,139,717 -120,035 -1.65% ✅
Sema.NumLazyGenericEnvironmentsLoaded 169,422 166,497 -2,925 -1.73% ✅
Sema.NumLazyRequirementSignatures 576,362 564,376 -11,986 -2.08% ✅
Sema.NumLazyRequirementSignaturesLoaded 355,976 346,551 -9,425 -2.65% ✅
Sema.NumLeafScopes 8,788,593 5,354,514 -3,434,079 -39.07% ✅
Sema.NumTypesDeserialized 11,853,855 11,635,625 -218,230 -1.84% ✅
Sema.NumTypesValidated 960,090 946,273 -13,817 -1.44% ✅
Sema.OpaqueReadOwnershipRequest 232,298 227,907 -4,391 -1.89% ✅
Sema.OverriddenDeclsRequest 1,188,294 1,173,177 -15,117 -1.27% ✅
Sema.PropertyWrapperBackingPropertyInfoRequest 247,889 242,372 -5,517 -2.23% ✅
Sema.PropertyWrapperBackingPropertyTypeRequest 250,678 245,067 -5,611 -2.24% ✅
Sema.ProtocolRequiresClassRequest 48,675 48,044 -631 -1.3% ✅
Sema.RequirementSignatureRequest 408,603 398,494 -10,109 -2.47% ✅
Sema.SelfAccessKindRequest 3,554,214 3,505,598 -48,616 -1.37% ✅
Sema.SetterAccessLevelRequest 64,263 62,173 -2,090 -3.25% ✅
Sema.StoredPropertiesAndMissingMembersRequest 17,147 16,850 -297 -1.73% ✅
Sema.StoredPropertiesRequest 197,999 191,341 -6,658 -3.36% ✅
Sema.SuperclassDeclRequest 257,180 252,025 -5,155 -2.0% ✅
Sema.SuperclassTypeRequest 29,905 29,468 -437 -1.46% ✅
Sema.TypeCheckFunctionBodyUntilRequest 247,028 241,721 -5,307 -2.15% ✅
Sema.UnderlyingTypeDeclsReferencedRequest 123,320 121,066 -2,254 -1.83% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (41)
name old new delta delta_pct
AST.NumASTBytesAllocated 35,873,006,810 35,530,282,500 -342,724,310 -0.96%
AST.NumLinkLibraries 0 0 0 0.0%
AST.NumObjCMethods 8,483 8,431 -52 -0.61%
AST.NumPostfixOperators 18 18 0 0.0%
AST.NumPrefixOperators 45 45 0 0.0%
AST.NumReferencedDynamicNames 84 84 0 0.0%
AST.NumSourceBuffers 312,077 309,418 -2,659 -0.85%
Driver.DriverDepCascadingDynamic 0 0 0 0.0%
Driver.DriverDepCascadingExternal 0 0 0 0.0%
Driver.DriverDepCascadingMember 0 0 0 0.0%
Driver.DriverDepCascadingNominal 0 0 0 0.0%
Driver.DriverDepCascadingTopLevel 0 0 0 0.0%
Driver.DriverDepDynamic 0 0 0 0.0%
Driver.DriverDepExternal 0 0 0 0.0%
Driver.DriverDepMember 0 0 0 0.0%
Driver.DriverDepNominal 0 0 0 0.0%
Driver.DriverDepTopLevel 0 0 0 0.0%
Driver.NumDriverJobsSkipped 0 0 0 0.0%
Driver.NumProcessFailures 0 0 0 0.0%
Frontend.NumProcessFailures 0 0 0 0.0%
IRModule.NumIRComdatSymbols 0 0 0 0.0%
IRModule.NumIRIFuncs 0 0 0 0.0%
SILModule.NumSILGenDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILGenVtables 10,360 10,259 -101 -0.97%
SILModule.NumSILOptDefaultWitnessTables 0 0 0 0.0%
Sema.AccessLevelRequest 6,396,310 6,384,573 -11,737 -0.18%
Sema.AttachedFunctionBuilderRequest 0 0 0 0.0%
Sema.CollectOverriddenDeclsRequest 4,036,778 4,037,402 624 0.02%
Sema.CursorInfoRequest 0 0 0 0.0%
Sema.CustomAttrNominalRequest 0 0 0 0.0%
Sema.FunctionBuilderTypeRequest 0 0 0 0.0%
Sema.InheritedTypeRequest 156,246 154,695 -1,551 -0.99%
Sema.LazyStoragePropertyRequest 1,021 1,021 0 0.0%
Sema.NumLazyIterableDeclContexts 5,246,627 5,265,199 18,572 0.35%
Sema.PropertyWrapperTypeInfoRequest 0 0 0 0.0%
Sema.ProvideDefaultImplForRequest 4,036,778 4,037,402 624 0.02%
Sema.RangeInfoRequest 0 0 0 0.0%
Sema.RequirementRequest 56,087 55,582 -505 -0.9%
Sema.StructuralTypeRequest 0 0 0 0.0%
Sema.TypeDeclsFromWhereClauseRequest 24,328 24,176 -152 -0.62%
Sema.USRGenerationRequest 4,593,868 4,580,484 -13,384 -0.29%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (3)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 24,781,453,740,579 23,971,854,504,912 -809,599,235,667 -3.27% ✅
LLVM.NumLLVMBytesOutput 751,584,864 740,215,546 -11,369,318 -1.51% ✅
time.swift-driver.wall 4356.1s 4186.8s -169.3s -3.89% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (0)
name old new delta delta_pct

release detailed

Regressed (0)
name old new delta delta_pct
Improved (21)
name old new delta delta_pct
AST.NumLoadedModules 16,695 15,851 -844 -5.06% ✅
AST.NumTotalClangImportedEntities 677,254 659,369 -17,885 -2.64% ✅
IRModule.NumIRBasicBlocks 2,838,957 2,742,722 -96,235 -3.39% ✅
IRModule.NumIRFunctions 1,344,913 1,314,329 -30,584 -2.27% ✅
IRModule.NumIRGlobals 1,556,920 1,528,147 -28,773 -1.85% ✅
IRModule.NumIRInsts 27,254,444 26,567,898 -686,546 -2.52% ✅
IRModule.NumIRValueSymbols 2,684,445 2,629,158 -55,287 -2.06% ✅
LLVM.NumLLVMBytesOutput 751,584,864 740,215,546 -11,369,318 -1.51% ✅
SILModule.NumSILGenFunctions 554,213 538,761 -15,452 -2.79% ✅
SILModule.NumSILOptFunctions 780,723 750,116 -30,607 -3.92% ✅
Sema.NumConformancesDeserialized 1,991,132 1,917,151 -73,981 -3.72% ✅
Sema.NumConstraintScopes 13,988,821 8,346,614 -5,642,207 -40.33% ✅
Sema.NumDeclsDeserialized 5,656,320 5,333,343 -322,977 -5.71% ✅
Sema.NumDeclsValidated 843,337 821,401 -21,936 -2.6% ✅
Sema.NumFunctionsTypechecked 248,501 243,191 -5,310 -2.14% ✅
Sema.NumGenericSignatureBuilders 154,293 148,819 -5,474 -3.55% ✅
Sema.NumLazyGenericEnvironments 1,100,209 1,034,259 -65,950 -5.99% ✅
Sema.NumLazyGenericEnvironmentsLoaded 18,996 17,914 -1,082 -5.7% ✅
Sema.NumLazyIterableDeclContexts 702,215 683,358 -18,857 -2.69% ✅
Sema.NumTypesDeserialized 2,824,945 2,655,052 -169,893 -6.01% ✅
Sema.NumTypesValidated 523,620 515,302 -8,318 -1.59% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (0)
name old new delta delta_pct

@gregomni
Copy link
Contributor Author

@swift-ci Please Test Source Compatibility

Copy link
Contributor

@xedin xedin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly after trying so many different things I'm starting to think that the only way to properly fix this would be attempting literal bindings earlier and backtracking if they don't match keeping a set of already attempted types. The first step is to remove LinkedExprAnalyzer completely and go from there... Another reason why we need to get rid of all of the perf hacks in CSGen is diagnostics, in "diagnostic mode" solver should be able to attempt all possible overloads and bindings to be able to find a solution.

The reason why we can't solve operators efficiently is related to the fact that overloads like (Self, Self) -> Self when opened create a single type variable for parameters and a result type, which makes the constraint system un-splittable into separate components. Possible ways to improve that I see is to either open Self as separate type variables in different positions and then have a separate solver step to Type::join them together (and possibly re-attempt some components with correct contextual type), or attempt bindings for arguments earlier, especially for the literals, because literal expressions have a type variable type with a literal protocol conformance which then forms <T_param> arg conv <T_arg> constraints so the graph could only be disconnected only if one of the sides gets a type.



bool expandCollectedTypesFromOperatorOverloads(ConstraintSystem &CS,
LinkedTypeInfo &lti) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually hoping that we could remove all of this code from CSGen instead of adding more :)

@gregomni
Copy link
Contributor Author

@xedin I understand your hesitance about doing more pre-checking but I think narrowing the number of possibilities in large overload disjunctions is a productive way forward. I'm traveling the next few days, but I'd like to investigate the source compatibility failures here when I have the chance, and then try to convince you that this is a worthwhile thing to do once they are taken care of.

Meanwhile, if you think the commits on delaying operator tyvar merging (the first 5 up through 4ef4342604d79e5e19398fc8bcd1dbf015335974) are worth contributing as a separate PR, I think those stand on their own.

Finally, I think a good future step would be to do this sort of thing in the solver itself rather than as a pre-step in CSGen. I imagine a separate possible-overloads disjunction constraint type, which references the tyvars for the operator arguments and attempts simplification by reducing the set of valid overloads as the argument tyvars get more constrained. In these arithmetic expressions it really seems like the problem is the large branching factor in constraint scopes, and reducing the fanout helps a lot.

@xedin
Copy link
Contributor

xedin commented Jul 22, 2019

I understand your hesitance about doing more pre-checking but I think narrowing the number of possibilities in large overload disjunctions is a productive way forward.

That's what we keep trying to do and keep finding edge cases with. I think it's time for us to take a step back and try to see if it could be done in the solver instead. @DougGregor was working on similar ideas, he might be able to share some of his thinking. @rudkx was investigating a different approach with designated operators.

@gregomni
Copy link
Contributor Author

@swift-ci Please Test Source Compatibility

@gregomni
Copy link
Contributor Author

@swift-ci Please Test Source Compatibility

@gregomni
Copy link
Contributor Author

Hmm. Getting identical failures when tried on the nightly toolchain (swift-DEVELOPMENT-SNAPSHOT-2019-07-30-a.xctoolchain) for Sourcery that seems unrelated to my changes:

SIL verification failed: Unknown formal access pattern: storage
Verifying instruction:
   %7 = argument of bb1 : $*Optional<EJSTemplate.Error> // user: %95
->   %95 = begin_access [read] [dynamic] [no_nested_conflict] %7 : $*Optional<EJSTemplate.Error> // users: %96, %97
     %96 = load %95 : $*Optional<EJSTemplate.Error> // users: %131, %142, %105, %98
     end_access %95 : $*Optional<EJSTemplate.Error> // id: %97

@gregomni
Copy link
Contributor Author

@swift-ci Please smoke test

@gregomni
Copy link
Contributor Author

@swift-ci Please test source compatibility.

1 similar comment
@gregomni
Copy link
Contributor Author

gregomni commented Aug 2, 2019

@swift-ci Please test source compatibility.

@gregomni
Copy link
Contributor Author

gregomni commented Aug 3, 2019

@swift-ci Please smoke test.

@gregomni
Copy link
Contributor Author

gregomni commented Aug 3, 2019

Resolves: SR-10324, SR-10461, SR-10601, SR-10606.

So with this PR we'd no longer immediately bind arithmetic operators to be identical (SR-10324), but at the same time it greatly improves type checking time for expressions with lots of operators, especially those involving simd. (And passes source compatibility except for the current RxSwift incompatible with weak references error that ought to be an XFAIL.)

@gregomni
Copy link
Contributor Author

gregomni commented Aug 3, 2019

@swift-ci Please smoke test.

@xedin
Copy link
Contributor

xedin commented Aug 18, 2019

I will put some time aside to take a look at this upcoming week.

@gregomni
Copy link
Contributor Author

Just a reminder to review when you have a chance.

@xedin
Copy link
Contributor

xedin commented Aug 30, 2019

Sorry @gregomni I have this open and ready just need to finish up some other stuff first before I dig in again.

Copy link
Contributor

@xedin xedin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the benefit but I'm still skeptical about this approach. The problem for me, as I mentioned before, is the fact that merging of type variables associated with literal arguments shouldn't happen during constraint generation. If we can avoid that it would spare us a lot of work which is currently done upfront e.g. trying to collect all possible types for literal arguments, disabling overloads.

As an alternative approach - while solving, we could try to re-arrange choices in other disjunctions to much the previous choice taken for the same operator e.g. 1 + 2 + 3 binding first + to (Int, Int) -> Int would bubble up (Int, Int) -> Int to the top so it could be attempted first for the second + as well.

It seems such behavior could imitate this (LinkedExprAnalyzer) optimization almost exactly for "concrete" overloads because there is another "hack" in the solver which doesn't attempt generic overloads when there is a concrete match already available.

}

for (auto convertTy : lti.collectedTypes) {
if (CS.TC.isConvertibleTo(convertTy, type, CS.DC))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isConvertiableTo (which creates new constraint system every time) and conformsToProtocol are pretty heavy checks which, in my opinion, we are better off only performing while solving for each individual overload.

There is a long standing improvement opportunity where determineBestBindings could actually check whether found type conforms to all of the protocols associated with its type variable via {Literal}ConformsTo constraint and avoid that binding.


bool possibleDecl = true;

if (auto generic = resultTy->getAs<GenericTypeParamType>()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note - overloads like (Self, Self) -> Self or (T, T) -> T are the biggest problem is at the moment - inability to filter out generic overloads which connect arguments to a result and prevent efficient solving of disconnected components, which means we have to explore a lot deeper into solution trees.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified down to

if (llvm::any_of(fnType->getParams(), [&resultTy](const AnyFunctionType::Param &param) {
   return resultTy->isEqual(param.getPlainType());
})
  continue;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note - overloads like (Self, Self) -> Self or (T, T) -> T are the biggest problem is at the moment - inability to filter out generic overloads which connect arguments to a result and prevent efficient solving of disconnected components, which means we have to explore a lot deeper into solution trees.

These types of overloads can't be ruled out as possibilities up front, nor disconnected, but this PR (by favoring choosing duplicate bindings of existing bindings and by merging those type variables once a solution exists) makes it so that the solution tree gets pruned to be much less branchy once we find a solution. The depth is unaffected, but the breadth gets vastly narrower.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but I don't understand what you are trying to say, the problem is that in the worse case, especially when expression is invalid e.g 1 + "hello", we'd still get to explore almost everything just because generic overloads like (Self, Self) -> Self carry almost no context for the bindings and we can't verify conformances until generic parameter is actually bound to something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, with this PR, you'd disable every possible overload of + except for the generic (Self, Self) -> Self style, because all the non-generic ones are impossible. So you explore much less. I was trying to say that this is a big performance win, even when you can't do anything efficient with the generic style ones directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm trying to say is that presence of overloads with concrete types is not a problem if arguments are concrete as well or could be bound to something relatively easily.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like in the examples you have in the PR, all of the arguments are going to be concrete so it's pretty easy for the solver to avoid going deeper if argument conversions could be tested and rejected right away.

continue;

llvm::TinyPtrVector<Constraint *> disjunctions =
CS.getConstraintGraph().gatherConstraints(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance wise - it would be better to invert that check and have outer loop be for every disjunction constraint, at least it's going to be linear over a number of constraints in the system.

@gregomni
Copy link
Contributor Author

gregomni commented Sep 5, 2019

@xedin

I see the benefit but I'm still skeptical about this approach. The problem for me, as I mentioned before, is the fact that merging of type variables associated with literal arguments shouldn't happen during constraint generation. If we can avoid that it would spare us a lot of work which is currently done upfront e.g. trying to collect all possible types for literal arguments, disabling overloads.

Sorry, I'm not sure I understand this comment. Can you expand, please? There isn't any merging of type variables happening during constraint generation here. (The old -- sr-10324 bug causing -- merging of operator overload type variables is removed as part of this PR.)

This does examine all the literals in order to determine which operator overloads can be disabled as impossible in this expression, but the literal type variables aren't touched and can still potentially be solved as different types from each other.

As an alternative approach - while solving, we could try to re-arrange choices in other disjunctions to much the previous choice taken for the same operator e.g. 1 + 2 + 3 binding first + to (Int, Int) -> Int would bubble up (Int, Int) -> Int to the top so it could be attempted first for the second + as well.

This is part of what is already in this PR. If we've chosen (Int, Int) -> Int for the first binding, then the (Int, Int) -> Int binding becomes preferred when looking at the second overload disjunction.

@xedin
Copy link
Contributor

xedin commented Sep 5, 2019

This does examine all the literals in order to determine which operator overloads can be disabled as impossible in this expression, but the literal type variables aren't touched and can still potentially be solved as different types from each other.

Sorry I was referring to the old behavior which was problematic. I understand that new logic is trying to disable some overloads upfront, but I just don't think that CSGen is a good place for that since it could be done while solving without the overhead of examining overload sets.

This is part of what is already in this PR. If we've chosen (Int, Int) -> Int for the first binding, then the (Int, Int) -> Int binding becomes preferred when looking at the second overload disjunction.

Favoring overloads but itself is not enough, I think to get the biggest win here and be 1-1 with previous behavior solver would have to - get newly favored overload to the top of the list, and adjust selectBestBindingDisjunction to pick the same kind of operator next if it's adjacent. That way we can quickly test the same overload for all adjacent operators in the chain and stop if it matches. I think with that strategy we can remove LinkedExprAnalyzer completely.

@gregomni
Copy link
Contributor Author

gregomni commented Sep 5, 2019

Sorry I was referring to the old behavior which was problematic. I understand that new logic is trying to disable some overloads upfront, but I just don't think that CSGen is a good place for that since it could be done while solving without the overhead of examining overload sets.

Well, there are two pieces, there is (1) determining the subset of all types that the expression could possibly involve, and then there is (2) avoiding taking paths in the solution tree that involve impossible types.

(1) is a property of the expression as a whole, not any particular variable or constraint, and so I don't see a reasonable way to do it during solving. It seems clearly to belong during CSGen to me. On the other hand, it would be possible to have the constraint system hold the set of possible types and do (2), the disjunction path filtering, entirely during solving. But the code for determining which overloads are skippable given the set of possible types is very similar to determining the set of types in the first place, and so sharing code and containing it and the set of types in one place seems better to me.

I think of it as generating a less bushy tree in the first place, which seems a reasonable job for CSGen to be doing.

Favoring overloads but itself is not enough, I think to get the biggest win here and be 1-1 with previous behavior solver would have to - get newly favored overload to the top of the list, and adjust selectBestBindingDisjunction to pick the same kind of operator next if it's adjacent. That way we can quickly test the same overload for all adjacent operators in the chain and stop if it matches. I think with that strategy we can remove LinkedExprAnalyzer completely.

I did effectively that, favoring the matching overload (which in these sorts of expressions is almost always the only favored choice, then), and sorting disjunctions by minimum-favored. The solver immediately picked all the same operators.

It turns out that this was not enough to solve several of the SIMD expressions in tests once the old LinkedExprAnalyzer overload type-var merging was removed. So there were several test failures taking too much time to solve.

It also turns out that, once the new LinkedExprAnalyzer here was in, the extra work to pick an optimum disjunction order was a net performance loss, so I removed it in e18f1355f1c6267e53ab2e17b83180704ffdafd1.

@xedin
Copy link
Contributor

xedin commented Sep 5, 2019

Here is how I see this - (1) is impossible to do properly in CSGen in presence of generic overloads, because it was possible we could use a classic algorithm with deterministic type variable ordering and solver wouldn't have any exponential behavior; (2) Solver is already pretty good at not taking any path where both sides of the argument conversion constraint or result are concrete, so pruning not a problem in this case, but when it comes to generic overloads all bets off at the moment.

It turns out that this was not enough to solve several of the SIMD expressions in tests once the old LinkedExprAnalyzer overload type-var merging was removed. So there were several test failures taking too much time to solve.

This is exactly the main problem we are yet to solve which no hacks or tweaks to thereof can. I think we should concentrate on trying to solve a problem related to over-exploration in case of generic overloads, once that is done we could remove all of the existing hacks as obsolete.

One of the ways to fix this problem for literal arguments - is to attempt default literal types earlier and backtrack if they don't match.

@gregomni
Copy link
Contributor Author

gregomni commented Sep 6, 2019

Did the cleanup you suggested, rebased to HEAD, fixed for a change in the simplifyLocatorToAnchor() API.

Here is how I see this - (1) is impossible to do properly in CSGen in presence of generic overloads, because it was possible we could use a classic algorithm with deterministic type variable ordering and solver wouldn't have any exponential behavior; (2) Solver is already pretty good at not taking any path where both sides of the argument conversion constraint or result are concrete, so pruning not a problem in this case, but when it comes to generic overloads all bets off at the moment.

Determining subset of types (1) is actually trivial (but unhelpful) for generic overloads. If it takes Self and returns Self then the possible types are exactly the same as they would be in the absence of the operator altogether. So it's true that these overloads are never disabled, but they are easy to deal with in the algorithm.

As for (2): I think both the scope performance counter and the list of bugs which the solver previously failed to solve and with this PR successfully solves shows that the solver can still use some help here.

This is exactly the main problem we are yet to solve which no hacks or tweaks to thereof can. I think we should concentrate on trying to solve a problem related to over-exploration in case of generic overloads, once that is done we could remove all of the existing hacks as obsolete.

I understand that this PR does not solve what you see as the main problem. It greatly ameliorates the situation by improving performance and solving expressions that weren't getting solved otherwise. It also replaces a hack that had incorrect language semantics - erroring in the case of valid code (SR-10324) - with (what you see as) a different hack that doesn't have that problem.

I don't think we know how to solve the main problem right now. Wouldn't it be better to successfully compile more valid code until we figure it out?

@gregomni
Copy link
Contributor Author

gregomni commented Sep 6, 2019

@swift-ci Please smoke test.

@xedin
Copy link
Contributor

xedin commented Sep 6, 2019

I don't think we know how to solve the main problem right now. Wouldn't it be better to successfully compile more valid code until we figure it out?

The reason why I'm hesitant to take these changes is as follows - every time we try to tweak CSGen/solver hacks we get either more "too complex" reports or source breaks. I can't prove to myself that this is not going to cause either. At this point I'm happy to maintain the status quo until we can up with principled solution which doesn't require any hacks even though some of the valid code is going to stay broken meanwhile.

@gregomni
Copy link
Contributor Author

@swift-ci Please test source compatibility.

@CodaFi
Copy link
Contributor

CodaFi commented Apr 15, 2020

@gregomni I'm going to close this out due to age and inactivity. Thank you for taking a look here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants