[ABI] Introduce indirect symbolic references to context descriptors. #20005

Merged
merged 3 commits into from
Oct 24, 2018
74 changes: 69 additions & 5 deletions docs/ABI/Mangling.rst
@@ -9,9 +9,15 @@ Mangling
--------
::

mangled-name ::= '$s' global
mangled-name ::= '$s' global // Swift stable mangling
mangled-name ::= '_T0' global // Swift 4.0
mangled-name ::= '$S' global // Swift 4.2

All Swift-mangled names begin with this prefix.
All Swift-mangled names begin with a common prefix. Since Swift 4.0, the
compiler has used variations of the mangling described in this document, though
pre-stable versions may not exactly conform to this description. By using
distinct prefixes, tools can attempt to accommodate bugs and version variations
in pre-stable versions of Swift.
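
As a rough illustration of how a tool might use the distinct prefixes described above, the following sketch classifies a name by its leading characters. The enum and function names are hypothetical and are not part of this change; the sketch only checks the three prefixes listed in the grammar.

```cpp
#include <string_view>

// Hypothetical labels for the three prefixes listed in the grammar above.
enum class ManglingPrefix { Stable, Swift4_0, Swift4_2, Unknown };

// Returns which documented prefix `name` starts with, if any.
// Note that "$s" and "$S" differ only in case, so the comparison
// must be case-sensitive.
ManglingPrefix classifyManglingPrefix(std::string_view name) {
  auto startsWith = [&](std::string_view prefix) {
    return name.substr(0, prefix.size()) == prefix;
  };
  if (startsWith("$s")) return ManglingPrefix::Stable;
  if (startsWith("$S")) return ManglingPrefix::Swift4_2;
  if (startsWith("_T0")) return ManglingPrefix::Swift4_0;
  return ManglingPrefix::Unknown;
}
```

A tool could branch on the result to apply version-specific workarounds before handing the remainder of the string to a demangler.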

The basic mangling scheme is a list of 'operators' where the operators are
structured in a post-fix order. For example the mangling may start with an
@@ -20,8 +26,6 @@ identifier has to be interpreted::

4Test3FooC // The trailing 'C' says that 'Foo' is a class in module 'Test'
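
To make the post-fix structure of the ``4Test3FooC`` example concrete, here is a minimal sketch that splits such a string into its leading length-prefixed identifiers and the operator characters that follow. It is deliberately simplified: it assumes plain ``<decimal length><characters>`` identifiers only and ignores everything else the real mangling allows (substitutions, multi-character operators, and so on). All names in it are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

struct ParsedOperators {
  std::vector<std::string> identifiers; // in order of appearance
  std::string trailingOperators;        // whatever follows the identifiers
};

// Reads leading <decimal length><characters> identifiers, then records the
// remaining characters as operators. Returns false on malformed input.
bool parseLeadingIdentifiers(std::string_view text, ParsedOperators &out) {
  auto isDigit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };
  std::size_t pos = 0;
  while (pos < text.size() && isDigit(text[pos])) {
    std::size_t length = 0;
    while (pos < text.size() && isDigit(text[pos])) {
      length = length * 10 + static_cast<std::size_t>(text[pos] - '0');
      if (length > text.size()) return false; // cannot fit; also avoids overflow
      ++pos;
    }
    if (length == 0 || length > text.size() - pos) return false;
    out.identifiers.emplace_back(text.substr(pos, length));
    pos += length;
  }
  out.trailingOperators = std::string(text.substr(pos));
  return true;
}
```

For ``4Test3FooC`` this yields the identifiers ``Test`` and ``Foo`` followed by the operator ``C``, which a demangler would then apply to the two preceding operands.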



Operators are either identifiers or a sequence of one or more characters,
like ``C`` for class.
All operators share the same name-space. Important operators are a single
@@ -42,6 +46,48 @@ mangled name will start with the module name (after the ``_S``).
In the following, productions which are only _part_ of an operator, are
named with uppercase letters.

Symbolic references
~~~~~~~~~~~~~~~~~~~

The Swift compiler emits mangled names into binary images to encode
references to types for runtime instantiation and reflection. In a binary,
these mangled names may embed pointers to runtime data
structures in order to more efficiently represent locally-defined types.
We call these pointers **symbolic references**.
These references will be introduced by a control character in the range
`\x01` ... `\x1F`, which indicates the kind of symbolic reference, followed by
some number of arbitrary bytes *which may include null bytes*. Code that
processes mangled names out of Swift binaries needs to be aware of symbolic
references in order to properly terminate strings; a null terminator may be
part of a symbolic reference.

::

symbolic-reference ::= [\x01-\x17] .{4} // Relative symbolic reference
#if sizeof(void*) == 8
symbolic-reference ::= [\x18-\x1F] .{8} // Absolute symbolic reference
#elif sizeof(void*) == 4
symbolic-reference ::= [\x18-\x1F] .{4} // Absolute symbolic reference
#endif

Symbolic references are only valid in compiler-emitted metadata structures
and must only appear in read-only parts of a binary image. APIs and tools
that interpret Swift mangled names from potentially uncontrolled inputs must
refuse to interpret symbolic references.
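
Because a symbolic reference may contain null bytes, ``strlen`` is not a safe way to find the end of a mangled name taken from a binary. The sketch below shows one way a tool might compute the length instead, following the byte ranges in the grammar above. The function name and the explicit ``available`` bound are assumptions made for this illustration, and it is only meaningful for trusted, compiler-emitted data.

```cpp
#include <cstddef>
#include <optional>

// Returns the number of bytes before the terminating null of the mangled
// name at `data`, skipping over symbolic reference payloads. Returns
// std::nullopt if no terminator is found within `available` bytes or if a
// payload would run past the end of the buffer.
std::optional<std::size_t>
mangledNameLength(const char *data, std::size_t available,
                  std::size_t pointerSize = sizeof(void *)) {
  std::size_t pos = 0;
  while (pos < available) {
    unsigned char c = static_cast<unsigned char>(data[pos]);
    if (c == 0)
      return pos; // a null outside any payload terminates the name

    std::size_t payload = 0;
    if (c >= 0x01 && c <= 0x17)
      payload = 4;            // relative symbolic reference
    else if (c >= 0x18 && c <= 0x1F)
      payload = pointerSize;  // absolute symbolic reference

    if (payload > available - pos - 1)
      return std::nullopt;    // truncated payload
    pos += 1 + payload;
  }
  return std::nullopt;        // ran out of bytes without a terminator
}
```

A tool that handles potentially uncontrolled input would instead reject the string as soon as it sees a byte in the ``\x01`` ... ``\x1F`` range, as the paragraph above requires.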

The following symbolic reference kinds are currently implemented:

::

{any-generic-type, protocol} ::= '\x01' .{4} // Reference points directly to context descriptor
{any-generic-type, protocol} ::= '\x02' .{4} // Reference points indirectly to context descriptor
// The grammatical role of the symbolic reference is determined by the
// kind of context descriptor referenced

protocol-conformance-ref ::= '\x03' .{4} // Reference points directly to protocol conformance descriptor (NOT IMPLEMENTED)
protocol-conformance-ref ::= '\x04' .{4} // Reference points indirectly to protocol conformance descriptor (NOT IMPLEMENTED)


Globals
~~~~~~~

@@ -549,18 +595,36 @@ Generics

::

protocol-conformance ::= type protocol module generic-signature?
protocol-conformance-context ::= protocol module generic-signature?

protocol-conformance ::= type protocol-conformance-context

``<protocol-conformance>`` refers to a type's conformance to a protocol. The
named module is the one containing the extension or type declaration that
declared the conformance.

::

protocol-conformance ::= type protocol

If ``type`` is a generic parameter or associated type of one, then no module
is mangled, because the conformance must be resolved from the generic
environment.

protocol-conformance ::= context identifier protocol identifier generic-signature? // Property behavior conformance

Property behaviors are implemented using private protocol conformances.

::

concrete-protocol-conformance ::= type protocol-conformance-ref
protocol-conformance-ref ::= protocol module?

A compact representation used to represent mangled protocol conformance witness
arguments at runtime. The ``module`` is only specified for conformances that
are "retroactive", meaning that the context in which the conformance is defined
is in neither the protocol nor the type module.
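
The retroactivity condition described above reduces to a comparison of three module names. A minimal sketch, using a hypothetical helper that is not part of the compiler:

```cpp
#include <string_view>

// A conformance is "retroactive" when the module that declares it is
// neither the module of the protocol nor the module of the conforming type.
bool isRetroactiveConformance(std::string_view conformanceModule,
                              std::string_view protocolModule,
                              std::string_view typeModule) {
  return conformanceModule != protocolModule &&
         conformanceModule != typeModule;
}
```

Only when this predicate holds would the optional ``module`` be mangled in a ``protocol-conformance-ref``.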

::

generic-signature ::= requirement* 'l' // one generic parameter
24 changes: 17 additions & 7 deletions include/swift/AST/ASTMangler.h
@@ -46,12 +46,22 @@ class ASTMangler : public Mangler {
/// If disabled, it is an error to try to mangle such an entity.
bool AllowNamelessEntities = false;

/// If nonnull, provides a callback to encode symbolic references to
/// type contexts.
std::function<bool (const DeclContext *Context)>
CanSymbolicReference;

std::vector<std::pair<const DeclContext *, unsigned>> SymbolicReferences;
/// If enabled, some entities will be emitted as symbolic reference
/// placeholders. The offsets of these references will be stored in the
/// `SymbolicReferences` vector, and it is up to the consumer of the mangling
/// to fill these in.
bool AllowSymbolicReferences = false;

public:
using SymbolicReferent = llvm::PointerUnion<const NominalTypeDecl *,
const ProtocolConformance *>;
protected:

/// If set, the mangler calls this function to determine whether to symbolic
/// reference a given entity. Defaults to always returning true.
std::function<bool (SymbolicReferent)> CanSymbolicReference;

std::vector<std::pair<SymbolicReferent, unsigned>> SymbolicReferences;

public:
enum class SymbolKind {
@@ -292,7 +302,7 @@ class ASTMangler : public Mangler {

void appendOpParamForLayoutConstraint(LayoutConstraint Layout);

void appendSymbolicReference(const DeclContext *context);
void appendSymbolicReference(SymbolicReferent referent);

std::string mangleTypeWithoutPrefix(Type type) {
appendType(type);
8 changes: 7 additions & 1 deletion include/swift/Demangling/Demangle.h
@@ -33,6 +33,8 @@ namespace llvm {
namespace swift {
namespace Demangle {

enum class SymbolicReferenceKind : uint8_t;

struct DemangleOptions {
bool SynthesizeSugarOnTypes = false;
bool DisplayDebuggerGeneratedModule = true;
@@ -473,7 +475,9 @@ void mangleIdentifier(const char *data, size_t length,
/// This should always round-trip perfectly with demangleSymbolAsNode.
std::string mangleNode(const NodePointer &root);

using SymbolicResolver = llvm::function_ref<Demangle::NodePointer (const void *)>;
using SymbolicResolver =
llvm::function_ref<Demangle::NodePointer (SymbolicReferenceKind,
const void *)>;

/// \brief Remangle a demangled parse tree, using a callback to resolve
/// symbolic references.
@@ -537,6 +541,8 @@ class DemanglerPrinter {
return std::move(*this << std::forward<T>(x));
}

DemanglerPrinter &writeHex(unsigned long long n) &;

std::string &&str() && { return std::move(Stream); }

llvm::StringRef getStringRef() const { return Stream; }
4 changes: 2 additions & 2 deletions include/swift/Demangling/DemangleNodes.def
@@ -143,6 +143,7 @@ NODE(PrefixOperator)
NODE(PrivateDeclName)
NODE(PropertyDescriptor)
CONTEXT_NODE(Protocol)
CONTEXT_NODE(ProtocolSymbolicReference)
NODE(ProtocolConformance)
NODE(ProtocolDescriptor)
NODE(ProtocolConformanceDescriptor)
@@ -172,13 +173,13 @@ NODE(SpecializationIsFragile)
CONTEXT_NODE(Static)
CONTEXT_NODE(Structure)
CONTEXT_NODE(Subscript)
CONTEXT_NODE(SymbolicReference)
NODE(Suffix)
NODE(ThinFunctionType)
NODE(Tuple)
NODE(TupleElement)
NODE(TupleElementName)
NODE(Type)
CONTEXT_NODE(TypeSymbolicReference)
CONTEXT_NODE(TypeAlias)
NODE(TypeList)
NODE(TypeMangling)
@@ -192,7 +193,6 @@ NODE(TypeMetadataLazyCache)
NODE(UncurriedFunctionType)
#define REF_STORAGE(Name, ...) NODE(Name)
#include "swift/AST/ReferenceStorage.def"
CONTEXT_NODE(UnresolvedSymbolicReference)
CONTEXT_NODE(UnsafeAddressor)
CONTEXT_NODE(UnsafeMutableAddressor)
NODE(ValueWitness)
18 changes: 15 additions & 3 deletions include/swift/Demangling/Demangler.h
@@ -280,6 +280,17 @@ class CharVector : public Vector<char> {
}
};

/// Kinds of symbolic reference supported.
enum class SymbolicReferenceKind : uint8_t {
/// A symbolic reference to a context descriptor, representing the
/// (unapplied generic) context.
Context,
};

using SymbolicReferenceResolver_t = NodePointer (SymbolicReferenceKind,
Directness,
int32_t, const void *);

/// The demangler.
///
/// It de-mangles a string and it also owns the returned node-tree. This means
@@ -301,7 +312,7 @@ class Demangler : public NodeFactory {
StringRef Words[MaxNumWords];
int NumWords = 0;

std::function<NodePointer (int32_t, const void *)> SymbolicReferenceResolver;
std::function<SymbolicReferenceResolver_t> SymbolicReferenceResolver;

bool nextIf(StringRef str) {
if (!Text.substr(Pos).startswith(str)) return false;
@@ -472,7 +483,8 @@ class Demangler : public NodeFactory {

NodePointer demangleObjCTypeName();
NodePointer demangleTypeMangling();
NodePointer demangleSymbolicReference(const void *at);
NodePointer demangleSymbolicReference(unsigned char rawKind,
const void *at);

void dump();

@@ -483,7 +495,7 @@ class Demangler : public NodeFactory {

/// Install a resolver for symbolic references in a mangled string.
void setSymbolicReferenceResolver(
std::function<NodePointer (int32_t, const void*)> resolver) {
std::function<SymbolicReferenceResolver_t> resolver) {
SymbolicReferenceResolver = resolver;
}

19 changes: 10 additions & 9 deletions include/swift/Demangling/TypeDecoder.h
@@ -114,7 +114,7 @@ class TypeDecoder {
case NodeKind::Enum:
case NodeKind::Structure:
case NodeKind::TypeAlias: // This can show up for imported Clang decls.
case NodeKind::SymbolicReference:
case NodeKind::TypeSymbolicReference:
{
BuiltNominalTypeDecl typeDecl = BuiltNominalTypeDecl();
BuiltType parent = BuiltType();
@@ -228,7 +228,8 @@ class TypeDecoder {
IsClassBound);
}

case NodeKind::Protocol: {
case NodeKind::Protocol:
case NodeKind::ProtocolSymbolicReference: {
if (auto Proto = decodeMangledProtocolType(Node)) {
return Builder.createProtocolCompositionType(Proto, BuiltType(),
/*IsClassBound=*/false);
@@ -473,14 +474,14 @@ class TypeDecoder {
}

private:
bool decodeMangledNominalType(const Demangle::NodePointer &node,
bool decodeMangledNominalType(Demangle::NodePointer node,
BuiltNominalTypeDecl &typeDecl,
BuiltType &parent) {
if (node->getKind() == NodeKind::Type)
return decodeMangledNominalType(node->getChild(0), typeDecl, parent);

Demangle::NodePointer nominalNode;
if (node->getKind() == NodeKind::SymbolicReference) {
if (node->getKind() == NodeKind::TypeSymbolicReference) {
// A symbolic reference can be directly resolved to a nominal type.
nominalNode = node;
} else {
@@ -519,19 +520,19 @@ class TypeDecoder {
return true;
}

BuiltProtocolDecl decodeMangledProtocolType(
const Demangle::NodePointer &node) {
BuiltProtocolDecl decodeMangledProtocolType(Demangle::NodePointer node) {
if (node->getKind() == NodeKind::Type)
return decodeMangledProtocolType(node->getChild(0));

if (node->getNumChildren() < 2 || node->getKind() != NodeKind::Protocol)
if ((node->getNumChildren() < 2 || node->getKind() != NodeKind::Protocol)
&& node->getKind() != NodeKind::ProtocolSymbolicReference)
return BuiltProtocolDecl();

return Builder.createProtocolDecl(node);
}

bool decodeMangledFunctionInputType(
const Demangle::NodePointer &node,
Demangle::NodePointer node,
std::vector<FunctionParam<BuiltType>> &params,
bool &hasParamFlags) {
// Look through a couple of sugar nodes.
@@ -542,7 +543,7 @@ class TypeDecoder {
}

auto decodeParamTypeAndFlags =
[&](const Demangle::NodePointer &typeNode,
[&](Demangle::NodePointer typeNode,
FunctionParam<BuiltType> &param) -> bool {
Demangle::NodePointer node = typeNode;

12 changes: 12 additions & 0 deletions include/swift/IRGen/Linking.h
@@ -32,6 +32,7 @@ class Triple;
namespace swift {
namespace irgen {
class IRGenModule;
class Alignment;

/// Determine if the triple uses the DLL storage.
bool useDllStorage(const llvm::Triple &triple);
@@ -936,6 +937,17 @@ class LinkEntity {

return getDecl()->isWeakImported(module);
}

/// Return the source file whose codegen should trigger emission of this
/// link entity, if one can be identified.
const SourceFile *getSourceFileForEmission() const;

/// Get the preferred alignment for the definition of this entity.
Alignment getAlignment(IRGenModule &IGM) const;

/// Get the default LLVM type to use for forward declarations of this
/// entity.
llvm::Type *getDefaultDeclarationType(IRGenModule &IGM) const;
#undef LINKENTITY_GET_FIELD
#undef LINKENTITY_SET_FIELD
};
31 changes: 23 additions & 8 deletions include/swift/Reflection/TypeRefBuilder.h
@@ -354,15 +354,30 @@ class TypeRefBuilder {
// demangling out of the referenced context descriptors in the target
// process.
Dem.setSymbolicReferenceResolver(
[this, &reader](int32_t offset, const void *base) -> Demangle::NodePointer {
// Resolve the reference to a remote address.
auto remoteAddress = getRemoteAddrOfTypeRefPointer(base);
if (remoteAddress == 0)
[this, &reader](SymbolicReferenceKind kind,
Directness directness,
int32_t offset, const void *base) -> Demangle::NodePointer {
// Resolve the reference to a remote address.
auto remoteAddress = getRemoteAddrOfTypeRefPointer(base);
if (remoteAddress == 0)
return nullptr;

auto address = remoteAddress + offset;
if (directness == Directness::Indirect) {
if (auto indirectAddress = reader.readPointerValue(address)) {
address = *indirectAddress;
} else {
return nullptr;

return reader.readDemanglingForContextDescriptor(remoteAddress + offset,
Dem);
});
}
}

switch (kind) {
case Demangle::SymbolicReferenceKind::Context:
return reader.readDemanglingForContextDescriptor(address, Dem);
}

return nullptr;
});
}

TypeConverter &getTypeConverter() { return TC; }
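
The address computation performed by the resolver in this hunk can be isolated into a small standalone sketch. It assumes, as the ``remoteAddress + offset`` expression suggests, that the 32-bit offset is measured from the address of the offset field itself, and that an indirect reference stores a pointer to the descriptor at that computed address. ``PointerReader`` and ``resolveReferenceAddress`` are hypothetical stand-ins for the real ``reader.readPointerValue`` call, not actual Swift APIs.

```cpp
#include <cstdint>
#include <functional>
#include <optional>

enum class Directness { Direct, Indirect };

// Hypothetical stand-in for a remote memory reader: given an address in the
// target process, returns the pointer-sized value stored there, if readable.
using PointerReader = std::function<std::optional<uint64_t>(uint64_t)>;

// Computes the address of the referenced descriptor.
//  - Direct:   the descriptor lives at fieldAddress + offset.
//  - Indirect: fieldAddress + offset holds a pointer to the descriptor.
// Returns std::nullopt if the field address is null or the read fails.
std::optional<uint64_t>
resolveReferenceAddress(uint64_t fieldAddress, int32_t offset,
                        Directness directness,
                        const PointerReader &readPointer) {
  if (fieldAddress == 0)
    return std::nullopt;

  // Sign-extend the offset so that negative offsets move backwards.
  uint64_t address =
      fieldAddress + static_cast<uint64_t>(static_cast<int64_t>(offset));

  if (directness == Directness::Indirect)
    return readPointer(address);
  return address;
}
```

Keeping the indirection step separate from the kind dispatch mirrors the structure of the lambda above: the address is settled first, and only then does the ``switch`` on the reference kind decide how to read a demangling from it.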