Make the demangler in the runtime use stack allocated memory. #22655

eeckstein · 2019-02-15T21:41:18Z

A big part of the change reduces the size of the demangler's Node. This is done by disallowing nodes with children from also having index or text payloads.
In some cases those payloads were not needed anyway, because the information can be derived later.
In other cases the fix was to insert an additional child node that carries the index/text payload.
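
As a rough illustration of the rule above, here is a minimal sketch of a node that carries either children or a payload, never both. `SketchNode`, `makeParentWithIndex`, and the kind strings are hypothetical names chosen for this example and are not the real `swift::Demangle::Node` API. The sketch only demonstrates the invariant and the "extra child" workaround; it makes no attempt to be memory-efficient.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a demangler node (not the real swift::Demangle::Node).
class SketchNode {
public:
  enum class Payload { None, Text, Index, Children };

  // A node without any payload yet.
  explicit SketchNode(std::string kind) : kind_(std::move(kind)) {}

  // A leaf node that carries an index payload.
  SketchNode(std::string kind, std::uint64_t index)
      : kind_(std::move(kind)), payload_(Payload::Index), index_(index) {}

  // Adding a child is only legal if the node has no text/index payload.
  void addChild(SketchNode child) {
    assert(payload_ == Payload::None || payload_ == Payload::Children);
    payload_ = Payload::Children;
    children_.push_back(std::move(child));
  }

  Payload payload() const { return payload_; }
  const std::string &kind() const { return kind_; }
  const std::vector<SketchNode> &children() const { return children_; }

  std::uint64_t index() const {
    assert(payload_ == Payload::Index);
    return index_;
  }

private:
  std::string kind_;
  Payload payload_ = Payload::None;
  std::uint64_t index_ = 0;
  std::vector<SketchNode> children_;
};

// A node that would previously have held both a child and an index now gets
// the index as an additional leaf child (the "Index" kind is an assumption).
SketchNode makeParentWithIndex(std::string kind, std::uint64_t index,
                               SketchNode existingChild) {
  SketchNode parent(std::move(kind));
  parent.addChild(std::move(existingChild));
  parent.addChild(SketchNode("Index", index));
  return parent;
}
```

With this shape, a consumer reads the index from the last child instead of from the parent node itself.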

The demangler now supports stack allocated memory for its allocator.
The demangler can be initialized with preallocated memory on the stack. Only on overflow does the bump pointer allocator malloc new memory.
In addition, a new demangler instance can "borrow" the free memory of an existing demangler. This is useful because the runtime invokes the demangler recursively. With this feature, all the nested demanglers can share a single stack-allocated space.
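
The following is a minimal sketch of the idea, not the code in this PR: a bump pointer allocator that starts from caller-provided memory, falls back to `malloc` only on overflow, and can lend its free memory to a nested allocator. `SketchBumpAllocator`, its members, and the 1024-byte minimum slab size are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical bump allocator; names do not mirror the real NodeFactory API.
class SketchBumpAllocator {
public:
  // Start from caller-provided memory, e.g. a char array on the stack.
  SketchBumpAllocator(void *buffer, std::size_t size)
      : cur_(static_cast<char *>(buffer)), end_(cur_ + size) {}

  // Borrow the free memory of `parent`. The parent must not allocate while
  // the borrower is alive; its bump pointer is left untouched, so the memory
  // becomes available again once the borrower is destroyed.
  explicit SketchBumpAllocator(SketchBumpAllocator *parent)
      : cur_(parent->cur_), end_(parent->end_), lender_(parent) {
    assert(!parent->lent_ && "free memory is already lent out");
    parent->lent_ = true;
  }

  SketchBumpAllocator(const SketchBumpAllocator &) = delete;
  SketchBumpAllocator &operator=(const SketchBumpAllocator &) = delete;

  ~SketchBumpAllocator() {
    for (void *slab : heapSlabs_)
      std::free(slab);
    if (lender_)
      lender_->lent_ = false;
  }

  // `align` must be a power of two.
  void *allocate(std::size_t size,
                 std::size_t align = alignof(std::max_align_t)) {
    assert(!lent_ && "memory is currently lent to a nested allocator");
    assert(align != 0 && (align & (align - 1)) == 0);
    std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
    std::uintptr_t start = (reinterpret_cast<std::uintptr_t>(cur_) + mask) & ~mask;
    std::uintptr_t limit = reinterpret_cast<std::uintptr_t>(end_);
    if (start <= limit && size <= limit - start) {
      cur_ = reinterpret_cast<char *>(start) + size;
      return reinterpret_cast<void *>(start);
    }
    // Overflow: only now fall back to the heap.
    std::size_t needed = size + align;
    std::size_t slabSize = needed > kMinSlabSize ? needed : kMinSlabSize;
    char *slab = static_cast<char *>(std::malloc(slabSize));
    if (!slab)
      return nullptr;
    heapSlabs_.push_back(slab);
    cur_ = slab;
    end_ = slab + slabSize;
    return allocate(size, align); // Fits now: the slab has room for size + align.
  }

  std::size_t heapSlabCount() const { return heapSlabs_.size(); }

private:
  static constexpr std::size_t kMinSlabSize = 1024; // Arbitrary sketch value.
  char *cur_;
  char *end_;
  SketchBumpAllocator *lender_ = nullptr;
  bool lent_ = false;
  std::vector<void *> heapSlabs_; // Only touched after an overflow.
};
```

A caller would typically declare a suitably aligned `char` array on the stack, hand it to the outermost allocator, and construct each nested allocator from a pointer to its parent. In this sketch the slab list is a `std::vector`, which itself uses the heap, but only after the preallocated memory has already overflowed.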

rdar://problem/47357709

The Node size reduction also implements single or double children as "inline" children, which avoids needing a separate node vector for children. Together with the payload restriction described above, this reduces the needed size for node trees by over 2x.
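
To make the inline-children idea more concrete, here is a hedged sketch of one possible layout, in which the text payload, the index payload, up to two inline child pointers, and an out-of-line child array all share one union. `CompactNode` and its members are hypothetical, the real Node layout may differ, and the sketch uses `new[]` for the spilled array where the real demangler would presumably use its own allocator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical compact node; layout and names are illustrative only.
struct CompactNode {
  enum class PayloadKind : std::uint8_t {
    None, Text, Index, OneChild, TwoChildren, ManyChildren
  };
  struct TextRef { const char *data; std::size_t size; };
  struct ChildVector { CompactNode **nodes; std::uint32_t size; std::uint32_t capacity; };

  // Exactly one member is meaningful at a time, selected by payloadKind.
  union {
    TextRef text;
    std::uint64_t index;
    CompactNode *inlineChildren[2];
    ChildVector children;
  };
  std::uint16_t kind;
  PayloadKind payloadKind;

  explicit CompactNode(std::uint16_t k)
      : index(0), kind(k), payloadKind(PayloadKind::None) {}
  CompactNode(const CompactNode &) = delete;
  CompactNode &operator=(const CompactNode &) = delete;
  ~CompactNode() {
    if (payloadKind == PayloadKind::ManyChildren)
      delete[] children.nodes;
  }

  std::size_t numChildren() const {
    switch (payloadKind) {
    case PayloadKind::OneChild: return 1;
    case PayloadKind::TwoChildren: return 2;
    case PayloadKind::ManyChildren: return children.size;
    default: return 0;
    }
  }

  CompactNode *getChild(std::size_t i) const {
    assert(i < numChildren());
    return payloadKind == PayloadKind::ManyChildren ? children.nodes[i]
                                                    : inlineChildren[i];
  }

  void addChild(CompactNode *child) {
    switch (payloadKind) {
    case PayloadKind::None: // First child is stored inline.
      inlineChildren[0] = child;
      inlineChildren[1] = nullptr;
      payloadKind = PayloadKind::OneChild;
      break;
    case PayloadKind::OneChild: // Second child is stored inline as well.
      inlineChildren[1] = child;
      payloadKind = PayloadKind::TwoChildren;
      break;
    case PayloadKind::TwoChildren: { // Third child: spill to a separate array.
      CompactNode *first = inlineChildren[0];
      CompactNode *second = inlineChildren[1];
      CompactNode **nodes = new CompactNode *[4];
      nodes[0] = first;
      nodes[1] = second;
      nodes[2] = child;
      children = ChildVector{nodes, 3, 4};
      payloadKind = PayloadKind::ManyChildren;
      break;
    }
    case PayloadKind::ManyChildren:
      if (children.size == children.capacity) { // Grow by doubling.
        std::uint32_t newCapacity = children.capacity * 2;
        CompactNode **grown = new CompactNode *[newCapacity];
        std::copy(children.nodes, children.nodes + children.size, grown);
        delete[] children.nodes;
        children.nodes = grown;
        children.capacity = newCapacity;
      }
      children.nodes[children.size++] = child;
      break;
    default:
      assert(false && "nodes with text or index payloads cannot have children");
    }
  }
};

// Holds on typical 32-bit and 64-bit targets; says nothing about the real Node.
static_assert(sizeof(CompactNode) <= 24, "sketch node should stay compact");
```

Because most nodes in a demangle tree have at most two children, a layout along these lines lets the common case avoid any separate allocation for the child list.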


eeckstein · 2019-02-15T21:41:57Z

@swift-ci test

eeckstein · 2019-02-15T21:42:04Z

@swift-ci benchmark

swift-ci · 2019-02-15T22:29:13Z

Performance: -O

TEST	OLD	NEW	DELTA	RATIO
Regression
StringAdder	427	470	+10.1%	0.91x (?)
StringBuilderSmallReservingCapacity	350	381	+8.9%	0.92x (?)
Improvement
SortStringsUnicode	3565	3315	-7.0%	1.08x (?)

Performance: -Osize

TEST	OLD	NEW	DELTA	RATIO
Regression
StringBuilder	327	370	+13.1%	0.88x
StringBuilderSmallReservingCapacity	341	382	+12.0%	0.89x (?)
Improvement
SortStringsUnicode	3560	3310	-7.0%	1.08x (?)

Performance: -Onone

TEST	OLD	NEW	DELTA	RATIO
Regression
StrComplexWalk	6680	7330	+9.7%	0.91x (?)
Improvement
ArrayOfGenericPOD2	1179	1065	-9.7%	1.11x (?)
ArrayOfPOD	855	775	-9.4%	1.10x (?)
Dictionary3	759	704	-7.2%	1.08x (?)
SortStringsUnicode	5205	4840	-7.0%	1.08x (?)

Code size: -swiftlibs

TEST	OLD	NEW	DELTA	RATIO
Regression
libswiftRemoteMirror.dylib	364544	368640	+1.1%	0.99x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

jckarter · 2019-02-15T22:31:21Z

lib/Demangling/Demangler.cpp

  if (CurrentSlab) {
+#ifdef NODE_FACTORY_DEBUGGING
+    std::cerr << indent() << "## clear: allocated memory = " << allocatedMemory  << "\n";
+#endif


Would LLVM's DEBUG(...) macros work from the runtime?

eeckstein

I didn't try, but we are not linking LLVM into the runtime.

jckarter

Looks good, thanks for doing this Erik!

eeckstein added 4 commits February 15, 2019 09:29

Demangler: improve debug logging.

b04ebcf

Log allocated memory and indent according to the nesting level

Runtime: make the demangler use stack allocated memory.

80e86fb

This reduces the amount of mallocs significantly.

eeckstein requested a review from DougGregor February 15, 2019 21:41

eeckstein requested a review from jckarter February 15, 2019 21:42

jckarter reviewed Feb 15, 2019

jckarter approved these changes Feb 16, 2019

eeckstein merged commit bf909ca into swiftlang:master Feb 18, 2019

eeckstein deleted the stack-allocated-demangler branch February 18, 2019 16:57
