Implementation of new mangling (but not enabled yet) #6040

Merged
merged 7 commits into from
Dec 3, 2016

Conversation

eeckstein
Contributor

This PR contains several commits which add the implementation of the new mangling scheme.

It is mostly NFC (no functional change), since the new mangling is not enabled yet and still needs to be wired up in several places in the compiler.
The only functional change is that the demangler can already demangle new symbol names.

For details see the individual commit messages and docs/ABI.rst, which contains the full definition of the new mangling scheme.

A quick experiment showed that the new scheme reduces the trie size by ~20% and the string table (nlist) by ~10% for the stdlib dylib.
About half of the trie-size reduction (roughly 10 of the 20 percentage points) is due to a more compact mangling (e.g. using word substitutions).
The other half is due to the reversed ("post-fix") order of the mangling, which results in more common prefixes in symbol names.
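To illustrate why moving the 'operator' to the end can shrink a trie, here is a minimal sketch that counts the nodes of a plain character-level trie for the same set of toy symbols in both orders. The symbol strings, the one-letter kind codes (`g`, `s`, `m`) and the helper `trie_node_count` are made up for this illustration; they are not the real mangling, and the Mach-O export trie is a compressed structure, so the node count is only a rough proxy for its size.

```python
def trie_node_count(names):
    """Count the nodes of an uncompressed character trie built from names."""
    root = {}
    count = 0
    for name in names:
        node = root
        for ch in name:
            if ch not in node:
                node[ch] = {}
                count += 1
            node = node[ch]
    return count


# Hypothetical entities (module, type, member) with length-prefixed parts.
entities = [
    ("4main", "3Foo", "3bar"),
    ("4main", "3Foo", "3baz"),
    ("4main", "3Qux", "5value"),
]
# Hypothetical one-letter codes for the kind of symbol.
kinds = ["g", "s", "m"]

# Kind first: symbols of the same entity diverge at the first character.
operator_first = [k + "".join(e) for e in entities for k in kinds]
# Kind last ("post-fix"): symbols of the same entity share the whole context.
operator_last = ["".join(e) + k for e in entities for k in kinds]

if __name__ == "__main__":
    print("operator first:", trie_node_count(operator_first))
    print("operator last: ", trie_node_count(operator_last))
```

In this toy set the names have the same length in both orders, yet the post-fix variant needs far fewer trie nodes because the shared context is stored only once.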

…g/demangling/remangling

Also add the missing DestructiveInjectEnumTag entry.
…not appear in symbol names.

Such characters (like ‘.’) can be punycode-encoded just like non-ASCII Unicode characters.

These are the main changes:

*) Change the order of the mangling to a post-fix-like structure.
This is the biggest change.
It will help to get more common prefixes in the mangled names, which optimizes the trie in the Mach-O object files.
The length of the mangled names will mostly stay the same, but the order of 'operands' inside the mangling is more or less reversed.
This change also required using different 'operator' characters in some cases.

*) Word-substitutions
Similar to the S-substitutions, but finer grained. See section 'Identifiers'.
Reduces the size of mangled names in general.

*) Combined substitutions
A more efficient way to mangle multiple A-substitutions (which were S-substitutions in the old scheme).
Reduces the size of mangled names with lots of substitutions, e.g. specialized functions.

*) Change the '_T' prefix to '_S'
Because it's basically a completely new mangling scheme.
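As a rough illustration of the word-substitution idea, the sketch below splits identifiers into words and replaces a repeated word with a back-reference to its first occurrence. Everything here is an assumption made for the example: the camelCase splitting rule, the `("lit", word)` / `("ref", index)` token format and the function names are hypothetical, and the actual encoding is the one defined in the 'Identifiers' section of docs/ABI.rst.

```python
import re


def split_words(identifier):
    """Split an alphanumeric identifier on camelCase boundaries (toy rule)."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", identifier)


def encode(identifiers):
    """Replace repeated words with references into a table of seen words."""
    table = []  # words in order of first occurrence
    encoded = []
    for ident in identifiers:
        tokens = []
        for word in split_words(ident):
            if word in table:
                tokens.append(("ref", table.index(word)))
            else:
                table.append(word)
                tokens.append(("lit", word))
        encoded.append(tokens)
    return encoded


def decode(encoded):
    """Rebuild the identifiers by replaying the same word table."""
    table = []
    identifiers = []
    for tokens in encoded:
        parts = []
        for kind, value in tokens:
            if kind == "lit":
                table.append(value)
                parts.append(value)
            else:
                parts.append(table[value])
        identifiers.append("".join(parts))
    return identifiers


if __name__ == "__main__":
    names = ["ArraySlice", "ArrayIterator", "makeIterator"]
    encoded = encode(names)
    print(encoded)
    print(decode(encoded) == names)
```

None of the three identifiers in the usage example repeats as a whole, so a substitution that only works on whole identifiers would save nothing here, while the word-level table reuses `Array` and `Iterator`.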
…ng scheme.

The following classes provide symbol mangling for specific purposes:
*) Mangler: the base mangler class, just providing some basic utilities
*) ASTMangler: for mangling AST declarations
*) SpecializationMangler: to be used in the optimizer for mangling specialized function names
*) IRGenMangler: for mangling all kinds of symbols in IRGen

None of these classes are used yet, so it’s basically NFC.

Another change is that some demangler node types are added (either because they were missing or because the new demangler needs them).
Those new nodes also need to be handled in the old demangler, but this should also be NFC, as those nodes are not created by the old demangler.

My plan is to keep the old and new mangling implementations in parallel for some time. After that we can remove the old mangler.
Currently the new implementation is scoped in the NewMangling namespace. This namespace should be renamed after the old mangler is removed.
This makes it easier to switch between the old and new mangling scheme.
@eeckstein
Contributor Author

@swift-ci Please smoke test

@eeckstein eeckstein merged commit 1ff2154 into swiftlang:master Dec 3, 2016
@eeckstein eeckstein deleted the newmangling branch December 5, 2016 16:50