Skip to content

Cut down on exponential growth in checking switch exhaustivity. #21999

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jrose-apple
Copy link
Contributor

The problem Assume 'A' and 'B' are enums with cases .a1, .a2, etc. and .b1, .b2, etc. If we try to typecheck this switch:

switch (a, b) {
case (.a1, .b1),
     (.a1, .b2):
  break
...

The compiler is going to try to perform the following series of operations:

> var s = (A, B)
> s -= (.a1, .b1)
((A - .a1, B) | (A, B - .b1))
> s -= (.a1, .b2)
(((A - .a1, B) | (A - .a1, B - .b2)) |
 ((A - .a1, B - .b1) | (A, B - .b1 - .b2)))
...

As you can see, the disjunction representing the uncovered space is growing exponentially. Eventually some of these will start disappearing (for instance, if B only has .b1 and .b2, that last term can go away), and if the switch is exhaustive they can all start disappearing. But several of them are also redundant: the second and third cases are fully covered by the first.

I exaggerated a little: the compiler is already smart enough to know that the second case is redundant with the first, because it already knows that (.a1, .b2) isn't a subset of (A - .a1, B). However, the third and fourth cases are generated separately from the first two, and so nothing ever checks that the third case is also redundant.

This patch changes the logic for subtracting from a disjunction so that

  1. any resulting disjunctions are flattened, and
  2. any later terms that are subspaces of earlier terms are dropped

This is a quadratic algorithm in the worst case (compare every space against every earlier space), but because it saves us from exponential growth (or at least brings down the exponent) it's worth it. For the test case now committed in the repository, we went from 40 seconds (using -switch-checking-invocation-threshold=20000000 to avoid cutting off early) to 0.2 seconds.

I'll admit I was only timing this one input, and it's possible that other complex switches will not see the same benefit, or may even see a slowdown. But I do think this kind of switch is common in both hand-written and auto-generated code, and therefore this is likely to be a benefit in many places.

rdar://problem/47365349

Saves a few copies of linked lists. No real impact expected,
certainly no user-facing changes.
The problem Assume 'A' and 'B' are enums with cases .a1, .a2, etc. and
.b1, .b2, etc. If we try to typecheck this switch:

    switch (a, b) {
    case (.a1, .b1),
         (.a1, .b2):
      break
    ...

The compiler is going to try to perform the following series of
operations:

    > var s = (A, B)
    > s -= (.a1, .b1)
    ((A - .a1, B) | (A, B - .b1))
    > s -= (.a1, .b2)
    (((A - .a1, B) | (A - .a1, B - .b2)) |
     ((A - .a1, B - .b1) | (A, B - .b1 - .b2)))
    ...

As you can see, the disjunction representing the uncovered space is
growing exponentially. Eventually some of these will start
disappearing (for instance, if B only has .b1 and .b2, that last term
can go away), and if the switch is exhaustive they can /all/ start
disappearing. But several of them are also redundant: the second and
third cases are fully covered by the first.

I exaggerated a little: the compiler is already smart enough to know
that the second case is redundant with the first, because it already
knows that (.a1, .b2) isn't a subset of (A - .a1, B). However, the
third and fourth cases are generated separately from the first two,
and so nothing ever checks that the third case is also redundant.

This patch changes the logic for subtracting from a disjunction so
that

1. any resulting disjunctions are flattened, and
2. any later terms that are subspaces of earlier terms are dropped

This is a quadratic algorithm in the worst case (compare every space
against every earlier space), but because it saves us from exponential
growth (or at least brings down the exponent) it's worth it. For the
test case now committed in the repository, we went from 40 seconds
(using -switch-checking-invocation-threshold=20000000 to avoid cutting
off early) to 0.2 seconds.

I'll admit I was only timing this one input, and it's possible that
other complex switches will not see the same benefit, or may even see
a slowdown. But I do think this kind of switch is common in both
hand-written and auto-generated code, and therefore this is likely to
be a benefit in many places.

rdar://problem/47365349
@jrose-apple jrose-apple force-pushed the from-exponential-expletives-to-quadratic-quality branch from 9234619 to 671944d Compare January 19, 2019 02:37
@jrose-apple
Copy link
Contributor Author

@swift-ci Please test

@jrose-apple
Copy link
Contributor Author

@swift-ci Please test compiler performance

@jrose-apple
Copy link
Contributor Author

@swift-ci Please test source compatibility

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh... I would guess that this change, considered in isolation, is probably good. But I can't help thinking that this whole file seems to be much too much to maintain relative to its utility. As I understand it, the benefit of this algorithm is good diagnostics about missing cases. Something much simpler could just say covered or not-covered, which is what I think we concluded last year when we discussed the topic.

I hate to say it, and maybe I'm wrong, but maintaining this periodically-troublesome and seemingly-complicated-to-me sophisticated diagnostic capability just may not be the best use of our precious cognitons.

@swift-ci
Copy link
Contributor

Build comment file:

Summary for master full

Unexpected test results, excluded stats for Tagged, Wordy, Routing, GRDB

Regressions found (see below)

Debug-batch

debug-batch brief

Regressed (1)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 21,714,015,444,005 75,592,067,597,648 53,878,052,153,643 248.13% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 865,623,482 865,623,252 -230 -0.0%
time.swift-driver.wall 2156.5s 2160.1s 3.6s 0.17%

debug-batch detailed

Regressed (6)
name old new delta delta_pct
AST.NumASTBytesAllocated 44,829,545,176 45,574,131,972 744,586,796 1.66% ⛔
Frontend.NumInstructionsExecuted 21,714,015,444,005 75,592,067,597,648 53,878,052,153,643 248.13% ⛔
Sema.NumConformancesDeserialized 4,316,431 4,371,749 55,318 1.28% ⛔
Sema.NumDeclsDeserialized 34,542,561 34,982,722 440,161 1.27% ⛔
Sema.OverriddenDeclsRequest 4,670,628 4,813,479 142,851 3.06% ⛔
Sema.USRGenerationRequest 7,069,729 7,376,255 306,526 4.34% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (89)
name old new delta delta_pct
AST.NumDecls 67,055 67,055 0 0.0%
AST.NumDependencies 155,308 155,294 -14 -0.01%
AST.NumImportedExternalDefinitions 991,341 991,341 0 0.0%
AST.NumInfixOperators 25,081 25,081 0 0.0%
AST.NumLinkLibraries 0 0 0 0.0%
AST.NumLoadedModules 184,316 184,316 0 0.0%
AST.NumLocalTypeDecls 113 113 0 0.0%
AST.NumObjCMethods 12,675 12,675 0 0.0%
AST.NumPostfixOperators 13 13 0 0.0%
AST.NumPrecedenceGroups 12,994 12,994 0 0.0%
AST.NumPrefixOperators 71 71 0 0.0%
AST.NumReferencedDynamicNames 102 102 0 0.0%
AST.NumReferencedMemberNames 3,083,408 3,083,408 0 0.0%
AST.NumReferencedTopLevelNames 213,714 213,714 0 0.0%
AST.NumSourceBuffers 295,839 295,839 0 0.0%
AST.NumSourceLines 2,144,353 2,144,353 0 0.0%
AST.NumSourceLinesPerSecond 1,353,100 1,353,650 550 0.04%
AST.NumTotalClangImportedEntities 3,650,250 3,666,590 16,340 0.45%
AST.NumUsedConformances 190,894 190,894 0 0.0%
Driver.ChildrenMaxRSS 72,629,645,312 72,779,403,264 149,757,952 0.21%
Driver.DriverDepCascadingDynamic 0 0 0 0.0%
Driver.DriverDepCascadingExternal 0 0 0 0.0%
Driver.DriverDepCascadingMember 0 0 0 0.0%
Driver.DriverDepCascadingNominal 0 0 0 0.0%
Driver.DriverDepCascadingTopLevel 0 0 0 0.0%
Driver.DriverDepDynamic 0 0 0 0.0%
Driver.DriverDepExternal 0 0 0 0.0%
Driver.DriverDepMember 0 0 0 0.0%
Driver.DriverDepNominal 0 0 0 0.0%
Driver.DriverDepTopLevel 0 0 0 0.0%
Driver.NumDriverJobsRun 13,878 13,878 0 0.0%
Driver.NumDriverJobsSkipped 0 0 0 0.0%
Driver.NumDriverPipePolls 250,269 251,532 1,263 0.5%
Driver.NumDriverPipeReads 289,715 287,315 -2,400 -0.83%
Driver.NumProcessFailures 0 0 0 0.0%
Frontend.MaxMallocUsage 385,201,178,720 388,256,138,720 3,054,960,000 0.79%
Frontend.NumProcessFailures 0 0 0 0.0%
IRModule.NumIRAliases 95,532 95,532 0 0.0%
IRModule.NumIRBasicBlocks 3,278,350 3,278,350 0 0.0%
IRModule.NumIRComdatSymbols 0 0 0 0.0%
IRModule.NumIRFunctions 1,542,687 1,542,687 0 0.0%
IRModule.NumIRGlobals 1,650,418 1,650,418 0 0.0%
IRModule.NumIRIFuncs 0 0 0 0.0%
IRModule.NumIRInsts 40,165,326 40,165,326 0 0.0%
IRModule.NumIRNamedMetaData 67,867 67,867 0 0.0%
IRModule.NumIRValueSymbols 2,887,932 2,887,932 0 0.0%
LLVM.NumLLVMBytesOutput 865,623,482 865,623,252 -230 -0.0%
Parse.NumFunctionsParsed 123,148 123,148 0 0.0%
Parse.NumIterableDeclContextParsed 889,457 889,457 0 0.0%
SILModule.NumSILGenDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILGenFunctions 791,744 791,744 0 0.0%
SILModule.NumSILGenGlobalVariables 27,186 27,186 0 0.0%
SILModule.NumSILGenVtables 10,231 10,231 0 0.0%
SILModule.NumSILGenWitnessTables 32,056 32,056 0 0.0%
SILModule.NumSILOptDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILOptFunctions 1,115,006 1,115,006 0 0.0%
SILModule.NumSILOptGlobalVariables 27,887 27,887 0 0.0%
SILModule.NumSILOptVtables 16,473 16,473 0 0.0%
SILModule.NumSILOptWitnessTables 68,885 68,885 0 0.0%
Sema.AccessLevelRequest 1,973,124 1,976,688 3,564 0.18%
Sema.DefaultAndMaxAccessLevelRequest 46,902 46,909 7 0.01%
Sema.EnumRawTypeRequest 14,410 14,410 0 0.0%
Sema.ExtendedNominalRequest 2,932,882 2,943,704 10,822 0.37%
Sema.InheritedDeclsReferencedRequest 3,522,981 3,555,829 32,848 0.93%
Sema.InheritedTypeRequest 427,253 427,609 356 0.08%
Sema.IsDynamicRequest 1,579,559 1,579,559 0 0.0%
Sema.IsObjCRequest 1,359,998 1,361,071 1,073 0.08%
Sema.NamedLazyMemberLoadFailureCount 18,704 18,797 93 0.5%
Sema.NamedLazyMemberLoadSuccessCount 13,464,642 13,463,766 -876 -0.01%
Sema.NominalTypeLookupDirectCount 25,467,821 25,533,826 66,005 0.26%
Sema.NumConstraintScopes 16,174,388 16,181,733 7,345 0.05%
Sema.NumConstraintsConsideredForEdgeContraction 44,392,610 44,394,734 2,124 0.0%
Sema.NumDeclsValidated 1,691,998 1,691,998 0 0.0%
Sema.NumFunctionsTypechecked 853,337 853,337 0 0.0%
Sema.NumGenericSignatureBuilders 935,453 939,950 4,497 0.48%
Sema.NumLazyGenericEnvironments 6,975,446 7,020,092 44,646 0.64%
Sema.NumLazyGenericEnvironmentsLoaded 171,011 171,111 100 0.06%
Sema.NumLazyIterableDeclContexts 5,295,936 5,311,968 16,032 0.3%
Sema.NumLeafScopes 11,036,952 11,043,113 6,161 0.06%
Sema.NumTypesDeserialized 12,224,408 12,293,912 69,504 0.57%
Sema.NumTypesValidated 1,097,653 1,097,653 0 0.0%
Sema.NumUnloadedLazyIterableDeclContexts 3,646,712 3,640,334 -6,378 -0.17%
Sema.RequirementRequest 57,460 57,460 0 0.0%
Sema.SelfBoundsFromWhereClauseRequest 4,861,575 4,904,129 42,554 0.88%
Sema.SetterAccessLevelRequest 112,442 112,442 0 0.0%
Sema.SuperclassDeclRequest 65,058 65,194 136 0.21%
Sema.SuperclassTypeRequest 30,358 30,358 0 0.0%
Sema.TypeDeclsFromWhereClauseRequest 26,845 26,852 7 0.03%
Sema.UnderlyingTypeDeclsReferencedRequest 177,512 178,400 888 0.5%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (1)
name old new delta delta_pct
Frontend.NumInstructionsExecuted 80,357,896,580,119 50,019,090,775,890 -30,338,805,804,229 -37.75% ✅
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 762,626,112 762,635,856 9,744 0.0%
time.swift-driver.wall 3621.7s 3624.0s 2.3s 0.06%

release detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 184,868 184,868 0 0.0%
AST.NumLoadedModules 11,580 11,580 0 0.0%
AST.NumTotalClangImportedEntities 624,476 624,476 0 0.0%
AST.NumUsedConformances 191,718 191,718 0 0.0%
IRModule.NumIRBasicBlocks 2,725,967 2,725,967 0 0.0%
IRModule.NumIRFunctions 1,321,909 1,321,909 0 0.0%
IRModule.NumIRGlobals 1,439,450 1,439,450 0 0.0%
IRModule.NumIRInsts 25,601,425 25,601,425 0 0.0%
IRModule.NumIRValueSymbols 2,577,655 2,577,655 0 0.0%
LLVM.NumLLVMBytesOutput 762,626,112 762,635,856 9,744 0.0%
SILModule.NumSILGenFunctions 547,486 547,486 0 0.0%
SILModule.NumSILOptFunctions 723,654 723,654 0 0.0%
Sema.NumConformancesDeserialized 1,656,657 1,656,657 0 0.0%
Sema.NumConstraintScopes 14,592,344 14,592,344 0 0.0%
Sema.NumDeclsDeserialized 4,394,337 4,394,337 0 0.0%
Sema.NumDeclsValidated 871,881 871,881 0 0.0%
Sema.NumFunctionsTypechecked 367,803 367,803 0 0.0%
Sema.NumGenericSignatureBuilders 152,973 152,973 0 0.0%
Sema.NumLazyGenericEnvironments 897,508 897,508 0 0.0%
Sema.NumLazyGenericEnvironmentsLoaded 16,423 16,423 0 0.0%
Sema.NumLazyIterableDeclContexts 569,943 569,943 0 0.0%
Sema.NumTypesDeserialized 2,331,680 2,331,680 0 0.0%
Sema.NumTypesValidated 407,582 407,582 0 0.0%

Copy link
Contributor

@xedin xedin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @davidungar, but I also think that this is a viable for the time being until one of us gets to implement a better algorithm here...

@xedin
Copy link
Contributor

xedin commented Jan 20, 2019

@jrose-apple for types A { .a1, .a2 } and B { .b1, .b2 } could a tuple of (A, B) be expanded into a Cartesian product (disjunction) of constructor pairs e.g. (.a1, .b1), (.a1, .b2), etc.?

Copy link
Contributor

@CodaFi CodaFi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The optimization is sound and warranted. There are probably a few more places where it applies as well.

@CodaFi
Copy link
Contributor

CodaFi commented Jan 20, 2019

@davidungar It's important to note that, while we're seeing space complexity blowups, the overall operations involved in the algorithm itself are relatively cheap to compute and scale well with the problem space. If you take a step back and look at where we're seeing these cases, it's consistently large products of large enums. If we were losing on straight-line switches I'd be more fervently looking into a replacement. Even then, it seems like the replacements might only postpone the inevitable.

The ideas we discussed before about optimizing the check seemed attractive at first (briefly - a hypothetical quasi-linear-time algorithm for a quick checks then the full monty if you fail that) but the inherently infinitely-recursive nature of the problem space means it resists the kinds of caching you have in mind. My experience tells me that we can make this check fast by focusing on memoizing or pruning (sub)-computations since that's the generalization of the solution @jrose-apple has identified here: we're duplicating a ton of work we've already performed down different subtrees.

@davidungar
Copy link
Contributor

@CodaFi Thanks for your response. I trust your judgement here; I've not studied this algorithm closely enough. Forgive the repetition, but I'm hoping to get to a point where we put no further effort into this area. I harbor this hope because I feel that we can more effectively improve the experience for our users by investing our time elsewhere, just a personal opinion. I had thought that the counter I had put into this code a while ago might accomplish that feat, and am dismayed that this code continues to divert us from other pursuits.

@jrose-apple
Copy link
Contributor Author

@jrose-apple for types A { .a1, .a2 } and B { .b1, .b2 } could a tuple of (A, B) be expanded into a Cartesian product (disjunction) of constructor pairs e.g. (.a1, .b1), (.a1, .b2), etc.?

That ends up not helping because of @unknown default. Since you are allowed but not required to match an unknown default for a non-resilient enum (or any enum in Swift 4.2 mode), the difference between the total space and the covered space is pretty much never 0 anymore. (We also tried it that one time and it produced worse diagnostics, matching every case of an ignored payload enum instead of just ignoring it.)

Also, recursive enums. :-(

Something much simpler could just say covered or not-covered, which is what I think we concluded last year when we discussed the topic.

This is a true statement, but it doesn't seem helpful. If you're missing a case from a switch, how would you ever find out which case it was? You can't even break apart the switch like you can with a complex expression.


By the way, I don't believe the compiler perf instruction counter there—it says we executed three times as many instructions but the overall run time didn't go up by more than a few percent. So I guess that wasn't useful after all.

@jrose-apple
Copy link
Contributor Author

By the way…

I had thought that the counter I had put into this code a while ago might accomplish that feat, and am dismayed that this code continues to divert us from other pursuits.

The impetus to look into this is pretty much directly from Robert removing the last source of unsoundness from the implementation, an early exit that approximated whether the switch was covered. Without that, an existing moderately complicated switch was suddenly being checked "properly", and thus ran afoul of the counter-based timeout. So we might be closer to that goal now, where we shouldn't regress anything for our users as long as enum capabilities don't change again. On the other hand, we might just not have encountered some other bad patterns that other languages handle fine and Swift does not.

@davidungar
Copy link
Contributor

People who use complex case statements will be well-served by these 1400 lines of code. I hope we don't have to put much more effort into them (the code, not the people!). Jordan- and Robert- cycles are rare and precious!

@jrose-apple jrose-apple merged commit 37c30b1 into swiftlang:master Jan 22, 2019
@jrose-apple
Copy link
Contributor Author

Thanks to all three of you for reviewing!

@jrose-apple jrose-apple deleted the from-exponential-expletives-to-quadratic-quality branch January 22, 2019 19:50
jrose-apple added a commit to jrose-apple/swift that referenced this pull request Jan 22, 2019
…expletives-to-quadratic-quality

Cut down on exponential growth in checking switch exhaustivity.

(cherry picked from commit 37c30b1)
@xedin
Copy link
Contributor

xedin commented Jan 22, 2019

@jrose-apple If we remove recursive enums from consideration (assuming that it's easy to detect enums like that) - wouldn't it be a better algorithm to eagerly expand types into their Cartesian products and match linearly and handle @unknown default separately after subtraction is done?

@jrose-apple
Copy link
Contributor Author

I don't thiiink so but I'll come back and think about it again later!

@davidungar
Copy link
Contributor

@xedin Your comment reminds me of the sort of algorithm I expected to see there before I looked at the code. My naive intuition felt that what you describe ought to be much faster for some cases.

jrose-apple added a commit that referenced this pull request Jan 24, 2019
…s-to-quadratic-quality (#22045)

Cut down on exponential growth in checking switch exhaustivity.

rdar://problem/47365349
(cherry picked from commit 37c30b1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants