Cut down on exponential growth in checking switch exhaustivity. #21999

jrose-apple · 2019-01-19T02:36:40Z

The problem Assume 'A' and 'B' are enums with cases .a1, .a2, etc. and .b1, .b2, etc. If we try to typecheck this switch:

switch (a, b) {
case (.a1, .b1),
     (.a1, .b2):
  break
...

The compiler is going to try to perform the following series of operations:

> var s = (A, B)
> s -= (.a1, .b1)
((A - .a1, B) | (A, B - .b1))
> s -= (.a1, .b2)
(((A - .a1, B) | (A - .a1, B - .b2)) |
 ((A - .a1, B - .b1) | (A, B - .b1 - .b2)))
...

As you can see, the disjunction representing the uncovered space is growing exponentially. Eventually some of these will start disappearing (for instance, if B only has .b1 and .b2, that last term can go away), and if the switch is exhaustive they can all start disappearing. But several of them are also redundant: the second and third cases are fully covered by the first.

I exaggerated a little: the compiler is already smart enough to know that the second case is redundant with the first, because it already knows that (.a1, .b2) isn't a subset of (A - .a1, B). However, the third and fourth cases are generated separately from the first two, and so nothing ever checks that the third case is also redundant.

This patch changes the logic for subtracting from a disjunction so that

any resulting disjunctions are flattened, and
any later terms that are subspaces of earlier terms are dropped

This is a quadratic algorithm in the worst case (compare every space against every earlier space), but because it saves us from exponential growth (or at least brings down the exponent) it's worth it. For the test case now committed in the repository, we went from 40 seconds (using -switch-checking-invocation-threshold=20000000 to avoid cutting off early) to 0.2 seconds.

I'll admit I was only timing this one input, and it's possible that other complex switches will not see the same benefit, or may even see a slowdown. But I do think this kind of switch is common in both hand-written and auto-generated code, and therefore this is likely to be a benefit in many places.

rdar://problem/47365349

Saves a few copies of linked lists. No real impact expected, certainly no user-facing changes.

The problem Assume 'A' and 'B' are enums with cases .a1, .a2, etc. and .b1, .b2, etc. If we try to typecheck this switch: switch (a, b) { case (.a1, .b1), (.a1, .b2): break ... The compiler is going to try to perform the following series of operations: > var s = (A, B) > s -= (.a1, .b1) ((A - .a1, B) | (A, B - .b1)) > s -= (.a1, .b2) (((A - .a1, B) | (A - .a1, B - .b2)) | ((A - .a1, B - .b1) | (A, B - .b1 - .b2))) ... As you can see, the disjunction representing the uncovered space is growing exponentially. Eventually some of these will start disappearing (for instance, if B only has .b1 and .b2, that last term can go away), and if the switch is exhaustive they can /all/ start disappearing. But several of them are also redundant: the second and third cases are fully covered by the first. I exaggerated a little: the compiler is already smart enough to know that the second case is redundant with the first, because it already knows that (.a1, .b2) isn't a subset of (A - .a1, B). However, the third and fourth cases are generated separately from the first two, and so nothing ever checks that the third case is also redundant. This patch changes the logic for subtracting from a disjunction so that 1. any resulting disjunctions are flattened, and 2. any later terms that are subspaces of earlier terms are dropped This is a quadratic algorithm in the worst case (compare every space against every earlier space), but because it saves us from exponential growth (or at least brings down the exponent) it's worth it. For the test case now committed in the repository, we went from 40 seconds (using -switch-checking-invocation-threshold=20000000 to avoid cutting off early) to 0.2 seconds. I'll admit I was only timing this one input, and it's possible that other complex switches will not see the same benefit, or may even see a slowdown. But I do think this kind of switch is common in both hand-written and auto-generated code, and therefore this is likely to be a benefit in many places. rdar://problem/47365349

jrose-apple · 2019-01-19T02:37:15Z

@swift-ci Please test

jrose-apple · 2019-01-19T02:37:22Z

@swift-ci Please test compiler performance

jrose-apple · 2019-01-19T02:39:29Z

@swift-ci Please test source compatibility

davidungar

Sigh... I would guess that this change, considered in isolation, is probably good. But I can't help thinking that this whole file seems to be much too much to maintain relative to its utility. As I understand it, the benefit of this algorithm is good diagnostics about missing cases. Something much simpler could just say covered or not-covered, which is what I think we concluded last year when we discussed the topic.

I hate to say it, and maybe I'm wrong, but maintaining this periodically-troublesome and seemingly-complicated-to-me sophisticated diagnostic capability just may not be the best use of our precious cognitons.

swift-ci · 2019-01-19T10:42:05Z

Build comment file:

Summary for master full

Unexpected test results, excluded stats for Tagged, Wordy, Routing, GRDB

Regressions found (see below)

Debug-batch
- debug-batch brief
- debug-batch detailed
Release
- release brief
- release detailed

Debug-batch

debug-batch brief

Regressed (1)

name	old	new	delta	delta_pct
Frontend.NumInstructionsExecuted	21,714,015,444,005	75,592,067,597,648	53,878,052,153,643	248.13% ⛔

Improved (0)

name	old	new	delta	delta_pct

Unchanged (delta < 1.0% or delta < 100.0ms) (2)

name	old	new	delta	delta_pct
LLVM.NumLLVMBytesOutput	865,623,482	865,623,252	-230	-0.0%
time.swift-driver.wall	2156.5s	2160.1s	3.6s	0.17%

debug-batch detailed

Regressed (6)

name	old	new	delta	delta_pct
AST.NumASTBytesAllocated	44,829,545,176	45,574,131,972	744,586,796	1.66% ⛔
Frontend.NumInstructionsExecuted	21,714,015,444,005	75,592,067,597,648	53,878,052,153,643	248.13% ⛔
Sema.NumConformancesDeserialized	4,316,431	4,371,749	55,318	1.28% ⛔
Sema.NumDeclsDeserialized	34,542,561	34,982,722	440,161	1.27% ⛔
Sema.OverriddenDeclsRequest	4,670,628	4,813,479	142,851	3.06% ⛔
Sema.USRGenerationRequest	7,069,729	7,376,255	306,526	4.34% ⛔

Improved (0)

name	old	new	delta	delta_pct

Unchanged (delta < 1.0% or delta < 100.0ms) (89)

name	old	new	delta	delta_pct
AST.NumDecls	67,055	67,055	0	0.0%
AST.NumDependencies	155,308	155,294	-14	-0.01%
AST.NumImportedExternalDefinitions	991,341	991,341	0	0.0%
AST.NumInfixOperators	25,081	25,081	0	0.0%
AST.NumLinkLibraries	0	0	0	0.0%
AST.NumLoadedModules	184,316	184,316	0	0.0%
AST.NumLocalTypeDecls	113	113	0	0.0%
AST.NumObjCMethods	12,675	12,675	0	0.0%
AST.NumPostfixOperators	13	13	0	0.0%
AST.NumPrecedenceGroups	12,994	12,994	0	0.0%
AST.NumPrefixOperators	71	71	0	0.0%
AST.NumReferencedDynamicNames	102	102	0	0.0%
AST.NumReferencedMemberNames	3,083,408	3,083,408	0	0.0%
AST.NumReferencedTopLevelNames	213,714	213,714	0	0.0%
AST.NumSourceBuffers	295,839	295,839	0	0.0%
AST.NumSourceLines	2,144,353	2,144,353	0	0.0%
AST.NumSourceLinesPerSecond	1,353,100	1,353,650	550	0.04%
AST.NumTotalClangImportedEntities	3,650,250	3,666,590	16,340	0.45%
AST.NumUsedConformances	190,894	190,894	0	0.0%
Driver.ChildrenMaxRSS	72,629,645,312	72,779,403,264	149,757,952	0.21%
Driver.DriverDepCascadingDynamic	0	0	0	0.0%
Driver.DriverDepCascadingExternal	0	0	0	0.0%
Driver.DriverDepCascadingMember	0	0	0	0.0%
Driver.DriverDepCascadingNominal	0	0	0	0.0%
Driver.DriverDepCascadingTopLevel	0	0	0	0.0%
Driver.DriverDepDynamic	0	0	0	0.0%
Driver.DriverDepExternal	0	0	0	0.0%
Driver.DriverDepMember	0	0	0	0.0%
Driver.DriverDepNominal	0	0	0	0.0%
Driver.DriverDepTopLevel	0	0	0	0.0%
Driver.NumDriverJobsRun	13,878	13,878	0	0.0%
Driver.NumDriverJobsSkipped	0	0	0	0.0%
Driver.NumDriverPipePolls	250,269	251,532	1,263	0.5%
Driver.NumDriverPipeReads	289,715	287,315	-2,400	-0.83%
Driver.NumProcessFailures	0	0	0	0.0%
Frontend.MaxMallocUsage	385,201,178,720	388,256,138,720	3,054,960,000	0.79%
Frontend.NumProcessFailures	0	0	0	0.0%
IRModule.NumIRAliases	95,532	95,532	0	0.0%
IRModule.NumIRBasicBlocks	3,278,350	3,278,350	0	0.0%
IRModule.NumIRComdatSymbols	0	0	0	0.0%
IRModule.NumIRFunctions	1,542,687	1,542,687	0	0.0%
IRModule.NumIRGlobals	1,650,418	1,650,418	0	0.0%
IRModule.NumIRIFuncs	0	0	0	0.0%
IRModule.NumIRInsts	40,165,326	40,165,326	0	0.0%
IRModule.NumIRNamedMetaData	67,867	67,867	0	0.0%
IRModule.NumIRValueSymbols	2,887,932	2,887,932	0	0.0%
LLVM.NumLLVMBytesOutput	865,623,482	865,623,252	-230	-0.0%
Parse.NumFunctionsParsed	123,148	123,148	0	0.0%
Parse.NumIterableDeclContextParsed	889,457	889,457	0	0.0%
SILModule.NumSILGenDefaultWitnessTables	0	0	0	0.0%
SILModule.NumSILGenFunctions	791,744	791,744	0	0.0%
SILModule.NumSILGenGlobalVariables	27,186	27,186	0	0.0%
SILModule.NumSILGenVtables	10,231	10,231	0	0.0%
SILModule.NumSILGenWitnessTables	32,056	32,056	0	0.0%
SILModule.NumSILOptDefaultWitnessTables	0	0	0	0.0%
SILModule.NumSILOptFunctions	1,115,006	1,115,006	0	0.0%
SILModule.NumSILOptGlobalVariables	27,887	27,887	0	0.0%
SILModule.NumSILOptVtables	16,473	16,473	0	0.0%
SILModule.NumSILOptWitnessTables	68,885	68,885	0	0.0%
Sema.AccessLevelRequest	1,973,124	1,976,688	3,564	0.18%
Sema.DefaultAndMaxAccessLevelRequest	46,902	46,909	7	0.01%
Sema.EnumRawTypeRequest	14,410	14,410	0	0.0%
Sema.ExtendedNominalRequest	2,932,882	2,943,704	10,822	0.37%
Sema.InheritedDeclsReferencedRequest	3,522,981	3,555,829	32,848	0.93%
Sema.InheritedTypeRequest	427,253	427,609	356	0.08%
Sema.IsDynamicRequest	1,579,559	1,579,559	0	0.0%
Sema.IsObjCRequest	1,359,998	1,361,071	1,073	0.08%
Sema.NamedLazyMemberLoadFailureCount	18,704	18,797	93	0.5%
Sema.NamedLazyMemberLoadSuccessCount	13,464,642	13,463,766	-876	-0.01%
Sema.NominalTypeLookupDirectCount	25,467,821	25,533,826	66,005	0.26%
Sema.NumConstraintScopes	16,174,388	16,181,733	7,345	0.05%
Sema.NumConstraintsConsideredForEdgeContraction	44,392,610	44,394,734	2,124	0.0%
Sema.NumDeclsValidated	1,691,998	1,691,998	0	0.0%
Sema.NumFunctionsTypechecked	853,337	853,337	0	0.0%
Sema.NumGenericSignatureBuilders	935,453	939,950	4,497	0.48%
Sema.NumLazyGenericEnvironments	6,975,446	7,020,092	44,646	0.64%
Sema.NumLazyGenericEnvironmentsLoaded	171,011	171,111	100	0.06%
Sema.NumLazyIterableDeclContexts	5,295,936	5,311,968	16,032	0.3%
Sema.NumLeafScopes	11,036,952	11,043,113	6,161	0.06%
Sema.NumTypesDeserialized	12,224,408	12,293,912	69,504	0.57%
Sema.NumTypesValidated	1,097,653	1,097,653	0	0.0%
Sema.NumUnloadedLazyIterableDeclContexts	3,646,712	3,640,334	-6,378	-0.17%
Sema.RequirementRequest	57,460	57,460	0	0.0%
Sema.SelfBoundsFromWhereClauseRequest	4,861,575	4,904,129	42,554	0.88%
Sema.SetterAccessLevelRequest	112,442	112,442	0	0.0%
Sema.SuperclassDeclRequest	65,058	65,194	136	0.21%
Sema.SuperclassTypeRequest	30,358	30,358	0	0.0%
Sema.TypeDeclsFromWhereClauseRequest	26,845	26,852	7	0.03%
Sema.UnderlyingTypeDeclsReferencedRequest	177,512	178,400	888	0.5%

Release

release brief

Regressed (0)

name	old	new	delta	delta_pct

Improved (1)

name	old	new	delta	delta_pct
Frontend.NumInstructionsExecuted	80,357,896,580,119	50,019,090,775,890	-30,338,805,804,229	-37.75% ✅

Unchanged (delta < 1.0% or delta < 100.0ms) (2)

name	old	new	delta	delta_pct
LLVM.NumLLVMBytesOutput	762,626,112	762,635,856	9,744	0.0%
time.swift-driver.wall	3621.7s	3624.0s	2.3s	0.06%

release detailed

Regressed (0)

name	old	new	delta	delta_pct

Improved (0)

name	old	new	delta	delta_pct

Unchanged (delta < 1.0% or delta < 100.0ms) (23)

name	old	new	delta	delta_pct
AST.NumImportedExternalDefinitions	184,868	184,868	0	0.0%
AST.NumLoadedModules	11,580	11,580	0	0.0%
AST.NumTotalClangImportedEntities	624,476	624,476	0	0.0%
AST.NumUsedConformances	191,718	191,718	0	0.0%
IRModule.NumIRBasicBlocks	2,725,967	2,725,967	0	0.0%
IRModule.NumIRFunctions	1,321,909	1,321,909	0	0.0%
IRModule.NumIRGlobals	1,439,450	1,439,450	0	0.0%
IRModule.NumIRInsts	25,601,425	25,601,425	0	0.0%
IRModule.NumIRValueSymbols	2,577,655	2,577,655	0	0.0%
LLVM.NumLLVMBytesOutput	762,626,112	762,635,856	9,744	0.0%
SILModule.NumSILGenFunctions	547,486	547,486	0	0.0%
SILModule.NumSILOptFunctions	723,654	723,654	0	0.0%
Sema.NumConformancesDeserialized	1,656,657	1,656,657	0	0.0%
Sema.NumConstraintScopes	14,592,344	14,592,344	0	0.0%
Sema.NumDeclsDeserialized	4,394,337	4,394,337	0	0.0%
Sema.NumDeclsValidated	871,881	871,881	0	0.0%
Sema.NumFunctionsTypechecked	367,803	367,803	0	0.0%
Sema.NumGenericSignatureBuilders	152,973	152,973	0	0.0%
Sema.NumLazyGenericEnvironments	897,508	897,508	0	0.0%
Sema.NumLazyGenericEnvironmentsLoaded	16,423	16,423	0	0.0%
Sema.NumLazyIterableDeclContexts	569,943	569,943	0	0.0%
Sema.NumTypesDeserialized	2,331,680	2,331,680	0	0.0%
Sema.NumTypesValidated	407,582	407,582	0	0.0%

xedin

I agree with @davidungar, but I also think that this is a viable for the time being until one of us gets to implement a better algorithm here...

xedin · 2019-01-20T07:43:18Z

@jrose-apple for types A { .a1, .a2 } and B { .b1, .b2 } could a tuple of (A, B) be expanded into a Cartesian product (disjunction) of constructor pairs e.g. (.a1, .b1), (.a1, .b2), etc.?

CodaFi

The optimization is sound and warranted. There are probably a few more places where it applies as well.

CodaFi · 2019-01-20T18:04:01Z

@davidungar It's important to note that, while we're seeing space complexity blowups, the overall operations involved in the algorithm itself are relatively cheap to compute and scale well with the problem space. If you take a step back and look at where we're seeing these cases, it's consistently large products of large enums. If we were losing on straight-line switches I'd be more fervently looking into a replacement. Even then, it seems like the replacements might only postpone the inevitable.

The ideas we discussed before about optimizing the check seemed attractive at first (briefly - a hypothetical quasi-linear-time algorithm for a quick checks then the full monty if you fail that) but the inherently infinitely-recursive nature of the problem space means it resists the kinds of caching you have in mind. My experience tells me that we can make this check fast by focusing on memoizing or pruning (sub)-computations since that's the generalization of the solution @jrose-apple has identified here: we're duplicating a ton of work we've already performed down different subtrees.

davidungar · 2019-01-20T19:21:00Z

@CodaFi Thanks for your response. I trust your judgement here; I've not studied this algorithm closely enough. Forgive the repetition, but I'm hoping to get to a point where we put no further effort into this area. I harbor this hope because I feel that we can more effectively improve the experience for our users by investing our time elsewhere, just a personal opinion. I had thought that the counter I had put into this code a while ago might accomplish that feat, and am dismayed that this code continues to divert us from other pursuits.

jrose-apple · 2019-01-22T17:28:53Z

@jrose-apple for types A { .a1, .a2 } and B { .b1, .b2 } could a tuple of (A, B) be expanded into a Cartesian product (disjunction) of constructor pairs e.g. (.a1, .b1), (.a1, .b2), etc.?

That ends up not helping because of @unknown default. Since you are allowed but not required to match an unknown default for a non-resilient enum (or any enum in Swift 4.2 mode), the difference between the total space and the covered space is pretty much never 0 anymore. (We also tried it that one time and it produced worse diagnostics, matching every case of an ignored payload enum instead of just ignoring it.)

Also, recursive enums. :-(

Something much simpler could just say covered or not-covered, which is what I think we concluded last year when we discussed the topic.

This is a true statement, but it doesn't seem helpful. If you're missing a case from a switch, how would you ever find out which case it was? You can't even break apart the switch like you can with a complex expression.

By the way, I don't believe the compiler perf instruction counter there—it says we executed three times as many instructions but the overall run time didn't go up by more than a few percent. So I guess that wasn't useful after all.

jrose-apple · 2019-01-22T17:34:57Z

By the way…

I had thought that the counter I had put into this code a while ago might accomplish that feat, and am dismayed that this code continues to divert us from other pursuits.

The impetus to look into this is pretty much directly from Robert removing the last source of unsoundness from the implementation, an early exit that approximated whether the switch was covered. Without that, an existing moderately complicated switch was suddenly being checked "properly", and thus ran afoul of the counter-based timeout. So we might be closer to that goal now, where we shouldn't regress anything for our users as long as enum capabilities don't change again. On the other hand, we might just not have encountered some other bad patterns that other languages handle fine and Swift does not.

davidungar · 2019-01-22T17:51:57Z

People who use complex case statements will be well-served by these 1400 lines of code. I hope we don't have to put much more effort into them (the code, not the people!). Jordan- and Robert- cycles are rare and precious!

jrose-apple · 2019-01-22T19:50:08Z

Thanks to all three of you for reviewing!

…expletives-to-quadratic-quality Cut down on exponential growth in checking switch exhaustivity. (cherry picked from commit 37c30b1)

xedin · 2019-01-22T20:15:10Z

@jrose-apple If we remove recursive enums from consideration (assuming that it's easy to detect enums like that) - wouldn't it be a better algorithm to eagerly expand types into their Cartesian products and match linearly and handle @unknown default separately after subtraction is done?

jrose-apple · 2019-01-22T23:08:39Z

I don't thiiink so but I'll come back and think about it again later!

davidungar · 2019-01-22T23:20:43Z

@xedin Your comment reminds me of the sort of algorithm I expected to see there before I looked at the code. My naive intuition felt that what you describe ought to be much faster for some cases.

…s-to-quadratic-quality (#22045) Cut down on exponential growth in checking switch exhaustivity. rdar://problem/47365349 (cherry picked from commit 37c30b1)

jrose-apple requested review from xedin, davidungar and CodaFi January 19, 2019 02:36

jrose-apple added 2 commits January 18, 2019 18:36

Don't bother forming a Space just to check if a case is redundant

f5a8f26

Saves a few copies of linked lists. No real impact expected, certainly no user-facing changes.

jrose-apple force-pushed the from-exponential-expletives-to-quadratic-quality branch from 9234619 to 671944d Compare January 19, 2019 02:37

jrose-apple mentioned this pull request Jan 19, 2019

[WIP] Quick-and-dirty speed-up for certain large switches #21981

Closed

davidungar reviewed Jan 19, 2019

View reviewed changes

xedin approved these changes Jan 20, 2019

View reviewed changes

CodaFi approved these changes Jan 20, 2019

View reviewed changes

jrose-apple merged commit 37c30b1 into swiftlang:master Jan 22, 2019

jrose-apple deleted the from-exponential-expletives-to-quadratic-quality branch January 22, 2019 19:50

jrose-apple mentioned this pull request Jan 22, 2019

[Jan02] Cut down on exponential growth in checking switch exhaustivity #22045

Merged

Cut down on exponential growth in checking switch exhaustivity. #21999

Cut down on exponential growth in checking switch exhaustivity. #21999

Uh oh!

Conversation

jrose-apple commented Jan 19, 2019

Uh oh!

jrose-apple commented Jan 19, 2019

Uh oh!

jrose-apple commented Jan 19, 2019

Uh oh!

jrose-apple commented Jan 19, 2019

Uh oh!

davidungar left a comment

Choose a reason for hiding this comment

Uh oh!

swift-ci commented Jan 19, 2019

Build comment file:

Summary for master full

Debug-batch

debug-batch brief

debug-batch detailed

Release

release brief

release detailed

Uh oh!

xedin left a comment

Choose a reason for hiding this comment

Uh oh!

xedin commented Jan 20, 2019

Uh oh!

CodaFi left a comment

Choose a reason for hiding this comment

Uh oh!

CodaFi commented Jan 20, 2019

Uh oh!

davidungar commented Jan 20, 2019

Uh oh!

jrose-apple commented Jan 22, 2019

Uh oh!

jrose-apple commented Jan 22, 2019

Uh oh!

davidungar commented Jan 22, 2019

Uh oh!

jrose-apple commented Jan 22, 2019

Uh oh!

xedin commented Jan 22, 2019

Uh oh!

jrose-apple commented Jan 22, 2019

Uh oh!

davidungar commented Jan 22, 2019

Uh oh!

Uh oh!