Tweak FSO explosion heuristics #16756

Closed
wants to merge 12 commits into from

gottesmm
Contributor

This is still in progress.

rdar://39957093

@gottesmm gottesmm force-pushed the func_sig_opts_codesize_fix branch 2 times, most recently from 59741ef to 5c9616e Compare June 4, 2018 20:59
@gottesmm
Contributor Author

gottesmm commented Jun 4, 2018

@swift-ci smoke test

@gottesmm
Contributor Author

gottesmm commented Jun 4, 2018

@swift-ci performance

@gottesmm
Contributor Author

gottesmm commented Jun 4, 2018

Or actually, I guess I should make sure it compiles first before looking at code-size numbers.

@gottesmm
Contributor Author

gottesmm commented Jun 4, 2018

Something we may want to try later as well: if we have a large non-trivial type of which only the trivial parts are used, we could create a new struct that contains only those trivial leaf parts (maybe flattened?) and just pass that in. We won't have the code-size impact from creating many arguments, since we aren't creating many arguments, and the IRGen passes should still be able to change it to be passed by pointer.

Getting rid of the ARC overhead for such structs would improve performance.

@gottesmm gottesmm force-pushed the func_sig_opts_codesize_fix branch 2 times, most recently from 2381324 to 88c0508 Compare June 20, 2018 23:13
@gottesmm gottesmm changed the title [NO-MERGE] Tweak FSO explosion heuristics Tweak FSO explosion heuristics Jun 20, 2018
@gottesmm
Contributor Author

Finishing this up.

@gottesmm gottesmm force-pushed the func_sig_opts_codesize_fix branch 2 times, most recently from 3b8ffe3 to 41ea273 Compare June 21, 2018 20:51
@gottesmm
Contributor Author

Some notes for those watching:

  1. I tried disabling our partial PRE of trivial arguments. It turns out we /are/ dependent on this feature of FSO: if we turn it off, the stdlib increases in size by 1%. But it does give us significantly more on the benchmark suite. I am going to investigate what we can do about this.
  2. This is just the first part of a few more heuristics that I am going to add.

@gottesmm
Contributor Author

@swift-ci smoke test

1 similar comment
@gottesmm
Contributor Author

@swift-ci smoke test

@gottesmm
Contributor Author

Some data.

This reduces the number of FSO thunks in the stdlib by 20%. In Benchmark_O, it reduces them by 16%.

@gottesmm gottesmm force-pushed the func_sig_opts_codesize_fix branch from 41ea273 to 098003c Compare June 21, 2018 20:58
@gottesmm
Contributor Author

@swift-ci smoke test

1 similar comment
@gottesmm
Contributor Author

@swift-ci smoke test

@gottesmm
Contributor Author

@swift-ci smoke compiler performance

1 similar comment
@gottesmm
Contributor Author

@swift-ci smoke compiler performance

@gottesmm
Contributor Author

@swift-ci smoke test compiler performance

1 similar comment
@gottesmm
Contributor Author

@swift-ci smoke test compiler performance

@gottesmm
Contributor Author

NOTE: I am expecting a few FSO test failures since I still need to update the tests.

@swift-ci
Contributor

Build comment file:

Summary for master smoketest

Unexpected test results, excluded stats for ReactiveCocoa

Regressions found (see below)

Debug

debug brief

Regressed (1)
name old new delta delta_pct
time.swift-driver.wall 76.7s 77.8s 1.1s 1.41% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (1)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 36,083,564 36,083,664 100 0.0%

debug detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 84,845 84,845 0 0.0%
AST.NumLoadedModules 12,998 12,998 0 0.0%
AST.NumTotalClangImportedEntities 258,395 258,395 0 0.0%
AST.NumUsedConformances 8,764 8,764 0 0.0%
IRModule.NumIRBasicBlocks 127,389 127,389 0 0.0%
IRModule.NumIRFunctions 75,647 75,647 0 0.0%
IRModule.NumIRGlobals 68,681 68,681 0 0.0%
IRModule.NumIRInsts 1,423,020 1,423,020 0 0.0%
IRModule.NumIRValueSymbols 129,248 129,248 0 0.0%
LLVM.NumLLVMBytesOutput 36,083,564 36,083,664 100 0.0%
SILModule.NumSILGenFunctions 42,780 42,780 0 0.0%
SILModule.NumSILOptFunctions 47,121 47,121 0 0.0%
Sema.NumConformancesDeserialized 356,925 356,925 0 0.0%
Sema.NumConstraintScopes 899,499 899,499 0 0.0%
Sema.NumDeclsDeserialized 2,112,347 2,112,347 0 0.0%
Sema.NumDeclsValidated 189,609 189,609 0 0.0%
Sema.NumFunctionsTypechecked 48,071 48,071 0 0.0%
Sema.NumGenericSignatureBuilders 81,994 81,994 0 0.0%
Sema.NumLazyGenericEnvironments 423,779 423,779 0 0.0%
Sema.NumLazyGenericEnvironmentsLoaded 38,680 38,680 0 0.0%
Sema.NumLazyIterableDeclContexts 340,146 340,146 0 0.0%
Sema.NumTypesDeserialized 2,331,851 2,331,851 0 0.0%
Sema.NumTypesValidated 163,037 163,037 0 0.0%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 26,665,036 26,634,724 -30,312 -0.11%
time.swift-driver.wall 157.9s 156.7s -1.2s -0.75%

release detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 8,857 8,857 0 0.0%
AST.NumLoadedModules 408 408 0 0.0%
AST.NumTotalClangImportedEntities 25,770 25,770 0 0.0%
AST.NumUsedConformances 8,770 8,770 0 0.0%
IRModule.NumIRBasicBlocks 148,649 148,277 -372 -0.25%
IRModule.NumIRFunctions 56,679 56,640 -39 -0.07%
IRModule.NumIRGlobals 54,037 54,044 7 0.01%
IRModule.NumIRInsts 1,220,641 1,219,251 -1,390 -0.11%
IRModule.NumIRValueSymbols 101,459 101,427 -32 -0.03%
LLVM.NumLLVMBytesOutput 26,665,036 26,634,724 -30,312 -0.11%
SILModule.NumSILGenFunctions 22,181 22,181 0 0.0%
SILModule.NumSILOptFunctions 35,901 35,823 -78 -0.22%
Sema.NumConformancesDeserialized 86,063 86,063 0 0.0%
Sema.NumConstraintScopes 798,228 798,228 0 0.0%
Sema.NumDeclsDeserialized 205,415 205,415 0 0.0%
Sema.NumDeclsValidated 59,866 59,866 0 0.0%
Sema.NumFunctionsTypechecked 10,716 10,716 0 0.0%
Sema.NumGenericSignatureBuilders 9,750 9,750 0 0.0%
Sema.NumLazyGenericEnvironments 32,340 32,340 0 0.0%
Sema.NumLazyGenericEnvironmentsLoaded 4,556 4,556 0 0.0%
Sema.NumLazyIterableDeclContexts 20,652 20,652 0 0.0%
Sema.NumTypesDeserialized 269,226 269,226 0 0.0%
Sema.NumTypesValidated 33,026 33,026 0 0.0%

@gottesmm
Contributor Author

@swift-ci test compiler performance

1 similar comment
@gottesmm
Contributor Author

@swift-ci test compiler performance

@gottesmm
Contributor Author

@swift-ci smoke benchmark

2 similar comments
@gottesmm
Contributor Author

@swift-ci smoke benchmark

@gottesmm
Contributor Author

@swift-ci smoke benchmark

gottesmm added 3 commits July 6, 2018 18:31
In the future, when we support specializing vtables and witness tables this will
become a different question. But for today, this is correct.
@gottesmm gottesmm force-pushed the func_sig_opts_codesize_fix branch from 42c59f5 to 4429861 Compare July 7, 2018 01:41
@gottesmm
Contributor Author

gottesmm commented Jul 7, 2018

Found the problem!

@gottesmm
Contributor Author

gottesmm commented Jul 7, 2018

@swift-ci test compiler performance

2 similar comments
@gottesmm
Contributor Author

gottesmm commented Jul 7, 2018

@swift-ci test compiler performance

@gottesmm
Contributor Author

gottesmm commented Jul 7, 2018

@swift-ci test compiler performance

@swift-ci
Contributor

swift-ci commented Jul 7, 2018

Build comment file:

Summary for master full

Unexpected test results, excluded stats for Core, xcproj, StencilSwiftKit, CoreStore, Turnstile, ObjectMapper

Regressions found (see below)

Debug-batch

debug-batch brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 1,118,356,756 1,118,329,774 -26,982 -0.0%
time.swift-driver.wall 2199.7s 2204.2s 4.6s 0.21%

debug-batch detailed

Regressed (2)
name old new delta delta_pct
Driver.NumDriverPipePolls 261,818 269,228 7,410 2.83% ⛔
Driver.NumDriverPipeReads 286,732 293,275 6,543 2.28% ⛔
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (78)
name old new delta delta_pct
AST.NumASTBytesAllocated 27,738,494,478 27,738,001,530 -492,948 -0.0%
AST.NumDecls 81,585 81,585 0 0.0%
AST.NumDependencies 163,291 163,291 0 0.0%
AST.NumImportedExternalDefinitions 1,385,796 1,385,796 0 0.0%
AST.NumInfixOperators 25,948 25,948 0 0.0%
AST.NumLinkLibraries 0 0 0 0.0%
AST.NumLoadedModules 203,549 203,549 0 0.0%
AST.NumLocalTypeDecls 6 6 0 0.0%
AST.NumObjCMethods 22,398 22,398 0 0.0%
AST.NumPostfixOperators 14 14 0 0.0%
AST.NumPrecedenceGroups 15,898 15,898 0 0.0%
AST.NumPrefixOperators 123 123 0 0.0%
AST.NumReferencedDynamicNames 179 179 0 0.0%
AST.NumReferencedMemberNames 3,738,331 3,738,331 0 0.0%
AST.NumReferencedTopLevelNames 267,416 267,416 0 0.0%
AST.NumSourceBuffers 354,227 354,227 0 0.0%
AST.NumSourceLines 2,296,744 2,296,744 0 0.0%
AST.NumSourceLinesPerSecond 1,675,169 1,675,032 -137 -0.01%
AST.NumTotalClangImportedEntities 4,555,589 4,555,589 0 0.0%
AST.NumUsedConformances 191,779 191,779 0 0.0%
Driver.ChildrenMaxRSS 69,907,623,936 69,956,560,896 48,936,960 0.07%
Driver.DriverDepCascadingDynamic 0 0 0 0.0%
Driver.DriverDepCascadingExternal 0 0 0 0.0%
Driver.DriverDepCascadingMember 0 0 0 0.0%
Driver.DriverDepCascadingNominal 0 0 0 0.0%
Driver.DriverDepCascadingTopLevel 0 0 0 0.0%
Driver.DriverDepDynamic 0 0 0 0.0%
Driver.DriverDepExternal 0 0 0 0.0%
Driver.DriverDepMember 0 0 0 0.0%
Driver.DriverDepNominal 0 0 0 0.0%
Driver.DriverDepTopLevel 0 0 0 0.0%
Driver.NumDriverJobsRun 16,311 16,311 0 0.0%
Driver.NumDriverJobsSkipped 0 0 0 0.0%
Driver.NumProcessFailures 0 0 0 0.0%
Frontend.NumProcessFailures 0 0 0 0.0%
IRModule.NumIRAliases 12,783 12,777 -6 -0.05%
IRModule.NumIRBasicBlocks 3,689,471 3,688,993 -478 -0.01%
IRModule.NumIRComdatSymbols 0 0 0 0.0%
IRModule.NumIRFunctions 2,108,881 2,108,825 -56 -0.0%
IRModule.NumIRGlobals 2,235,017 2,235,015 -2 -0.0%
IRModule.NumIRIFuncs 0 0 0 0.0%
IRModule.NumIRInsts 41,522,338 41,521,046 -1,292 -0.0%
IRModule.NumIRNamedMetaData 79,965 79,965 0 0.0%
IRModule.NumIRValueSymbols 3,846,383 3,846,319 -64 -0.0%
LLVM.NumLLVMBytesOutput 1,118,356,756 1,118,329,774 -26,982 -0.0%
Parse.NumFunctionsParsed 146,373 146,373 0 0.0%
SILModule.NumSILGenDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILGenFunctions 1,498,599 1,498,599 0 0.0%
SILModule.NumSILGenGlobalVariables 24,237 24,237 0 0.0%
SILModule.NumSILGenVtables 16,084 16,084 0 0.0%
SILModule.NumSILGenWitnessTables 42,120 42,120 0 0.0%
SILModule.NumSILOptDefaultWitnessTables 0 0 0 0.0%
SILModule.NumSILOptFunctions 1,365,702 1,365,587 -115 -0.01%
SILModule.NumSILOptGlobalVariables 24,990 24,990 0 0.0%
SILModule.NumSILOptVtables 24,614 24,614 0 0.0%
SILModule.NumSILOptWitnessTables 73,353 73,353 0 0.0%
Sema.AccessLevelRequest 1,904,716 1,904,716 0 0.0%
Sema.DefaultAndMaxAccessLevelRequest 58,172 58,172 0 0.0%
Sema.EnumRawTypeRequest 14,390 14,390 0 0.0%
Sema.InheritedTypeRequest 94,672 94,672 0 0.0%
Sema.NamedLazyMemberLoadFailureCount 23,420 23,420 0 0.0%
Sema.NamedLazyMemberLoadSuccessCount 3,842,043 3,842,043 0 0.0%
Sema.NominalTypeLookupDirectCount 27,911,806 27,911,595 -211 -0.0%
Sema.NumConformancesDeserialized 4,717,531 4,717,531 0 0.0%
Sema.NumConstraintScopes 9,582,433 9,582,433 0 0.0%
Sema.NumConstraintsConsideredForEdgeContraction 27,569,411 27,569,411 0 0.0%
Sema.NumDeclsDeserialized 33,258,545 33,258,545 0 0.0%
Sema.NumDeclsValidated 2,834,695 2,834,695 0 0.0%
Sema.NumFunctionsTypechecked 916,570 916,570 0 0.0%
Sema.NumGenericSignatureBuilders 1,603,856 1,603,856 0 0.0%
Sema.NumLazyGenericEnvironments 6,424,500 6,424,500 0 0.0%
Sema.NumLazyGenericEnvironmentsLoaded 731,861 731,861 0 0.0%
Sema.NumLazyIterableDeclContexts 5,573,439 5,573,439 0 0.0%
Sema.NumTypesDeserialized 35,179,054 35,179,054 0 0.0%
Sema.NumTypesValidated 2,992,395 2,992,395 0 0.0%
Sema.NumUnloadedLazyIterableDeclContexts 3,778,985 3,778,985 0 0.0%
Sema.SetterAccessLevelRequest 109,472 109,472 0 0.0%
Sema.SuperclassTypeRequest 130,441 130,441 0 0.0%

Release

release brief

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (2)
name old new delta delta_pct
LLVM.NumLLVMBytesOutput 975,607,352 975,050,096 -557,256 -0.06%
time.swift-driver.wall 3609.1s 3611.4s 2.2s 0.06%

release detailed

Regressed (0)
name old new delta delta_pct
Improved (0)
name old new delta delta_pct
Unchanged (delta < 1.0% or delta < 100.0ms) (23)
name old new delta delta_pct
AST.NumImportedExternalDefinitions 300,341 300,341 0 0.0%
AST.NumLoadedModules 18,081 18,081 0 0.0%
AST.NumTotalClangImportedEntities 981,719 981,687 -32 -0.0%
AST.NumUsedConformances 197,185 197,185 0 0.0%
IRModule.NumIRBasicBlocks 3,056,849 3,040,596 -16,253 -0.53%
IRModule.NumIRFunctions 1,705,413 1,703,852 -1,561 -0.09%
IRModule.NumIRGlobals 1,860,412 1,859,940 -472 -0.03%
IRModule.NumIRInsts 29,267,121 29,204,832 -62,289 -0.21%
IRModule.NumIRValueSymbols 3,268,126 3,266,068 -2,058 -0.06%
LLVM.NumLLVMBytesOutput 975,607,352 975,050,096 -557,256 -0.06%
SILModule.NumSILGenFunctions 724,600 724,600 0 0.0%
SILModule.NumSILOptFunctions 976,931 974,474 -2,457 -0.25%
Sema.NumConformancesDeserialized 2,240,761 2,239,856 -905 -0.04%
Sema.NumConstraintScopes 9,445,306 9,445,306 0 0.0%
Sema.NumDeclsDeserialized 6,912,017 6,910,170 -1,847 -0.03%
Sema.NumDeclsValidated 2,053,799 2,053,799 0 0.0%
Sema.NumFunctionsTypechecked 409,266 409,266 0 0.0%
Sema.NumGenericSignatureBuilders 345,727 345,647 -80 -0.02%
Sema.NumLazyGenericEnvironments 1,195,187 1,194,661 -526 -0.04%
Sema.NumLazyGenericEnvironmentsLoaded 147,885 147,568 -317 -0.21%
Sema.NumLazyIterableDeclContexts 811,169 811,121 -48 -0.01%
Sema.NumTypesDeserialized 8,406,624 8,403,576 -3,048 -0.04%
Sema.NumTypesValidated 1,353,321 1,353,321 0 0.0%

@gottesmm
Contributor Author

gottesmm commented Jul 7, 2018

@graydon Question: are there scripts/facilities for gathering this data per project? I would like to get a finer-grained picture here and would rather not script it myself if you already have something. = )

nate-chandler added a commit to nate-chandler/swift that referenced this pull request Sep 19, 2019
The new rule is that an argument will be exploded if the following
conditions hold:

(1) The argument has greater than zero and less than or equal to three
    live leaves.

(2) The argument has greater than zero non-trivial leaves.

This change is based heavily on @GottesM's
swiftlang#16756 .

rdar://problem/39957093
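
The two conditions in this commit message can be read as a small predicate. The following is only a minimal Python sketch of that reading, not the compiler's C++ implementation: the `Leaf` class and `should_explode` function are hypothetical names, and counting non-trivial leaves over all leaves (rather than only live ones) is an assumption, since the message does not say which is meant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """Hypothetical stand-in for one leaf of an argument's projection tree."""
    name: str
    is_trivial: bool  # trivial leaves need no retain/release traffic
    is_live: bool     # the function body actually uses this leaf


def should_explode(leaves):
    """Return True if the argument satisfies conditions (1) and (2)."""
    live = sum(1 for leaf in leaves if leaf.is_live)
    # Assumption: condition (2) counts every non-trivial leaf, live or dead.
    non_trivial = sum(1 for leaf in leaves if not leaf.is_trivial)
    return 0 < live <= 3 and non_trivial > 0


# One live trivial leaf plus one dead non-trivial leaf: both conditions hold.
example = [Leaf("count", True, True), Leaf("storage", False, False)]
exploded = should_explode(example)
```

Under this reading, an argument with no live leaves, with more than three live leaves, or with only trivial leaves is left unexploded.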
nate-chandler added a commit to nate-chandler/swift that referenced this pull request Sep 20, 2019
Added getAllLeafTypes to ProjectionTree.  The new method vends, via an
out parameter, a vector containing the types of all the leaves in a
projection tree in the order that they appear.  The method uses a new
convenience on ProjectionTreeNode, isLeaf, to include only the types of
those nodes which are leaves.

Excerpted from @GottesM's swiftlang#16756.
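
As a rough illustration of what a method like this does, here is a minimal Python model of collecting leaf types through an out parameter. It is a sketch under assumptions, not the ProjectionTree code: the `Node` class, the string type names, and the reading of "the order that they appear" as a left-to-right depth-first walk are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical projection tree node: a type plus its projected fields."""
    type_name: str
    children: list = field(default_factory=list)

    def is_leaf(self):
        # A node with no further projections is a leaf.
        return not self.children


def get_all_leaf_types(root, out):
    """Append the type of every leaf under root to out, left to right."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_leaf():
            out.append(node.type_name)
        else:
            # Push children reversed so the leftmost child is visited first.
            stack.extend(reversed(node.children))


# struct S { a: Int, b: Pair }, struct Pair { x: String, y: Bool }
tree = Node("S", [Node("Int"), Node("Pair", [Node("String"), Node("Bool")])])
leaf_types = []
get_all_leaf_types(tree, leaf_types)
```

Interior nodes such as `S` and `Pair` contribute nothing to the result; only leaves do.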
nate-chandler added a commit to nate-chandler/swift that referenced this pull request Sep 20, 2019
Added getUsers to ProjectionTree.  The new method vends, via an out
parameter, a set of all the users of all of the nodes in the projection
tree that are themselves not in the projection tree by way of
getNonProjUsers.  Took this opportunity to tweak getNonProjUsers to vend
a const ArrayRef rather than a SmallVector.

Excerpted from @GottesM's swiftlang#16756.
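
A simplified Python model of the same idea follows: walk every node of the tree and gather, into a caller-provided set, the users that are not themselves projections in the tree. The `Node` class, its fields, and the user names are hypothetical; the tuple returned by `get_non_proj_users` only loosely mirrors the read-only view described above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical projection tree node with its non-projection users."""
    name: str
    non_proj_users: tuple = ()
    children: list = field(default_factory=list)

    def get_non_proj_users(self):
        # A tuple acts as a read-only view of the stored users.
        return self.non_proj_users


def get_users(root, out):
    """Add the non-projection users of every node under root to the set out."""
    stack = [root]
    while stack:
        node = stack.pop()
        out.update(node.get_non_proj_users())
        stack.extend(node.children)


tree = Node("arg", ("apply_f",), [
    Node("arg.a", ("store", "apply_f")),
    Node("arg.b", (), [Node("arg.b.x", ("release",))]),
])
users = set()
get_users(tree, users)
```

Because the result is a set, a user that touches several nodes of the tree is reported once.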
nate-chandler added a commit to nate-chandler/swift that referenced this pull request Sep 20, 2019
…leases.

Added getPartiallyPostDomReleaseSet to
ConsumedArgToEpilogueReleaseMatcher.  Given an argument, the new method
returns the array of releases of the argument if there is an array
thereof and if the releases therein do not jointly post-dominate the
argument.

Excerpted from @GottesM's swiftlang#16756.
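
The phrase "jointly post-dominate" can be illustrated on a toy control-flow graph: a set of releases jointly post-dominates the argument when every path from function entry to an exit passes through at least one of them. The sketch below is a simplified Python model with hypothetical names (`cfg`, block labels, both function names). It assumes at most one release per block and ignores infinite loops, so it is not the matcher's actual algorithm.

```python
def releases_jointly_post_dominate(cfg, entry, release_blocks):
    """True if every entry-to-exit path hits a block in release_blocks."""
    seen, stack = set(), [entry]
    while stack:
        block = stack.pop()
        if block in seen or block in release_blocks:
            continue  # paths through a release block are already covered
        seen.add(block)
        successors = cfg.get(block, [])
        if not successors:
            return False  # reached an exit without passing a release
        stack.extend(successors)
    return True


def get_partially_post_dom_release_set(cfg, entry, release_blocks):
    """Return the releases only if they exist and do not cover every path."""
    if not release_blocks:
        return None
    if releases_jointly_post_dominate(cfg, entry, release_blocks):
        return None
    return sorted(release_blocks)


# Diamond CFG: entry branches to left and right, which both reach exit.
diamond = {"entry": ["left", "right"], "left": ["exit"],
           "right": ["exit"], "exit": []}
partial = get_partially_post_dom_release_set(diamond, "entry", {"left"})
full = get_partially_post_dom_release_set(diamond, "entry", {"left", "right"})
```

In the diamond, a release only on the left branch is a partial set, while releases on both branches cover every path and so yield no result.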
nate-chandler added a commit to nate-chandler/swift that referenced this pull request Sep 20, 2019
Replaced some namespace qualified references to ArrayRef and Optional
with the unqualified type.  Reordered the includes per clang-format.

Excerpted from @GottesM's swiftlang#16756.
nate-chandler added a commit to nate-chandler/swift that referenced this pull request Sep 24, 2019
The new rule is that an argument will be exploded if one of the
following sets of conditions holds:

(1) (a) Specializing the function will result in a thunk.  That is, the
        thunk that is generated cannot be inlined everywhere.
    (b) The argument has dead non-trivial leaves.
    (c) The argument has fewer than three live leaves.

(2) (a) Specializing the function will not result in a thunk.  That is,
        the thunk that is generated will be inlined everywhere and
        eliminated as dead code.
    (b) The argument has dead potentially trivial leaves.
    (c) The argument has fewer than six live leaves.

This change is based heavily on @GottesM's
swiftlang#16756 .

rdar://problem/39957093
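
The two sets of conditions in this later commit message can likewise be sketched as a predicate. This is a minimal Python reading, not the compiler's code: `Leaf`, `should_explode_with_thunk_check`, and the `leaves_thunk` flag are hypothetical names, and "dead potentially trivial leaves" is interpreted here as "any dead leaf, trivial or not", which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """Hypothetical stand-in for one leaf of an argument's projection tree."""
    name: str
    is_trivial: bool
    is_live: bool


def should_explode_with_thunk_check(leaves, leaves_thunk):
    """leaves_thunk: True if specializing leaves a thunk that is not inlined away."""
    live = sum(1 for leaf in leaves if leaf.is_live)
    dead = [leaf for leaf in leaves if not leaf.is_live]
    if leaves_thunk:
        # Set (1): a thunk remains, so require dead non-trivial leaves
        # and fewer than three live leaves.
        return any(not leaf.is_trivial for leaf in dead) and live < 3
    # Set (2): the thunk disappears, so any dead leaf suffices
    # (assumed reading) and up to five live leaves are allowed.
    return bool(dead) and live < 6


arg = [Leaf("a", True, True), Leaf("b", True, True),
       Leaf("c", True, True), Leaf("pad", True, False)]
with_thunk = should_explode_with_thunk_check(arg, leaves_thunk=True)
without_thunk = should_explode_with_thunk_check(arg, leaves_thunk=False)
```

The example argument has three live leaves and one dead trivial leaf, so it is exploded only when the thunk would be eliminated, which matches the looser limits of set (2).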
@gottesmm gottesmm closed this Mar 12, 2020
2 participants