SIL optimizer: fix a compile time performance problem in UpdatingInstructionIteratorRegistry #41239

Merged
eeckstein merged 1 commit into swiftlang:main from fix-compile-time-perf
Feb 7, 2022

Conversation

eeckstein
Contributor

C++ closures can implicitly malloc.
Avoid this by capturing only `this` and nothing else.

Reduces the time spent in the SIL pass pipeline by 25% when compiling the stdlib core.

rdar://88567996
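
The effect described above can be sketched in isolation. The diff itself is not shown in this conversation, so the following is only an illustration under stated assumptions: it assumes the closures end up stored in `std::function`, whose small-object buffer on mainstream standard libraries holds a lambda that captures only `this` but not one that also copies a large value. `Registry`, `makeFatCallback`, `makeThinCallback`, `countAllocations` and `gSink` are hypothetical names, not taken from the Swift sources.

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Count every call to the global allocator so the effect is observable.
static std::size_t gAllocCount = 0;

void *operator new(std::size_t size) {
  ++gAllocCount;
  if (void *p = std::malloc(size ? size : 1))
    return p;
  throw std::bad_alloc();
}
void operator delete(void *p) noexcept { std::free(p); }
void operator delete(void *p, std::size_t) noexcept { std::free(p); }

// Hypothetical stand-in for a registry that hands out callbacks.
struct Registry {
  std::array<void *, 16> state{}; // 128 bytes on a typical 64-bit target
  int notifications = 0;

  // Copies `state` into the closure. The closure is then too large for the
  // small-object buffer of common std::function implementations, so
  // constructing the std::function is expected to heap-allocate.
  std::function<void()> makeFatCallback() {
    auto copy = state;
    return [this, copy] { notifications += (copy[0] == nullptr); };
  }

  // Captures only `this` (one pointer) and reads the state through it.
  std::function<void()> makeThinCallback() {
    return [this] { notifications += (state[0] == nullptr); };
  }
};

// Keeps the most recent callback alive so its storage cannot be optimized away.
static std::function<void()> gSink;

// Returns how many heap allocations happened while creating and invoking
// the callback produced by `make`.
template <typename MakeFn>
std::size_t countAllocations(MakeFn make) {
  std::size_t before = gAllocCount;
  gSink = make();
  gSink();
  return gAllocCount - before;
}
```

Calling `countAllocations([&] { return registry.makeFatCallback(); })` and the same with `makeThinCallback()` makes the difference visible. The exact buffer size is implementation-defined, so the thin variant avoiding allocation is an expectation about common standard libraries rather than a language guarantee.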

@eeckstein eeckstein force-pushed the fix-compile-time-perf branch from b7a8789 to d418192 Compare February 7, 2022 15:19
@eeckstein
Contributor Author

@swift-ci smoke test

@eeckstein
Contributor Author

@swift-ci benchmark

@eeckstein eeckstein requested a review from atrick February 7, 2022 15:19
@swift-ci
Contributor

swift-ci commented Feb 7, 2022

Performance (x86_64): -O

Improvement                     OLD   NEW   DELTA   RATIO
ArrayAppendGenericStructs       2090  1330  -36.4%  1.57x (?)
FlattenListLoop                 2552  1673  -34.4%  1.53x (?)
StringBuilderWithLongSubstring  1630  1470  -9.8%   1.11x (?)
FlattenListFlatMap              6763  6101  -9.8%   1.11x (?)

Code size: -O

Performance (x86_64): -Osize

Improvement                    OLD   NEW   DELTA   RATIO
FlattenListFlatMap             6669  3975  -40.4%  1.68x (?)
FlattenListLoop                2644  1770  -33.1%  1.49x (?)
Set.subtracting.Seq.Empty.Box  268   222   -17.2%  1.21x (?)
DictionaryKeysContainsCocoa    25    23    -8.0%   1.09x (?)

Code size: -Osize

Performance (x86_64): -Onone

Regression                                OLD    NEW    DELTA   RATIO
ObjectiveCBridgeFromNSSetAnyObjectForced  5540   6120   +10.5%  0.91x (?)

Improvement                               OLD    NEW    DELTA   RATIO
RandomDoubleDef                           55700  51200  -8.1%   1.09x (?)
RandomDoubleOpaqueDef                     56200  52000  -7.5%   1.08x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

Contributor

@atrick atrick left a comment

Thank you! Great compile time fix.

@eeckstein eeckstein merged commit c1d3cd9 into swiftlang:main Feb 7, 2022
@eeckstein eeckstein deleted the fix-compile-time-perf branch February 7, 2022 18:38