Polymorphism
As of 0.6.0, MessagePack for CLI supports polymorphism for object members and collection items. There are two kinds of polymorphism: "known subtypes based polymorphism" and "runtime type based polymorphism."
There are several use cases for polymorphism.
- Serialize polymorphic collections (issue #58). Sometimes you want to serialize heterogeneous collection items.
- Serialize a 'rich' domain model which has its own data and logic (issue #47). You can deserialize objects and invoke their virtual methods.
Known subtypes based polymorphism:
- Pros
  - Easy to interoperate. Known subtypes based polymorphism uses a simple format, so you can easily implement a counterpart system.
  - Naturally secure. You control the possible instance types via custom attributes, so there is little chance of injecting malicious code unless you also load an untrusted assembly.
- Cons
  - You must continuously maintain the known subtype list(s).
  - All types must be known at compile time.

Runtime type based polymorphism:
- Pros
  - Easy to use and maintain. You just put a custom attribute on the member, and it will continue to work in the future.
  - You don't have to know the possible subtypes at compile time.
- Cons
  - It uses a format based on native .NET type identifiers, so it is hard to keep interoperability because other systems must interpret the information and translate it to their own type system.
  - You cannot control the possible subtypes, which might hurt the stability of your application.
  - If the binary to deserialize comes from an external environment, it may contain malicious type information. Attackers can specify type(s) whose default constructor causes significant side effects such as file or registry manipulation.
You can mark members (fields or properties) as polymorphic by applying a custom attribute to them, as follows:
// Known subtypes based polymorphism.
[MessagePackKnownType( 0, typeof( FileInfo ) )]
[MessagePackKnownType( 1, typeof( DirectoryInfo ) )]
public FileSystemInfo Info { get; set; }
// Runtime type based polymorphism.
[MessagePackRuntimeType]
public object Data { get; set; }
As you might expect, you cannot apply both kinds of polymorphism attribute to the same member.
You can also specify polymorphism for collections themselves, for each collection item, for each dictionary key/value, and for each Tuple item. This table shows the valid combinations and meanings of the attributes:
|Kind|Attribute|Target|Note|
|---|---|:---:|---|
|Known Subtype Based|`MessagePackKnownTypeAttribute`|Non-collection objects or collections themselves||
||`MessagePackKnownCollectionItemTypeAttribute`|Collection items or dictionary values|For example, items of a `List<object>` typed property value.|
||`MessagePackKnownDictionaryKeyTypeAttribute`|Dictionary keys|For example, keys of a `Dictionary<object, object>` typed property value.|
||`MessagePackKnownTupleItemTypeAttribute`|An item of a tuple|The 1st argument specifies the 1-based item number (n of the ItemN property).|
|Runtime Type Based|`MessagePackRuntimeTypeAttribute`|Non-collection objects or collections themselves||
||`MessagePackRuntimeCollectionItemTypeAttribute`|Collection items or dictionary values|For example, items of a `List<object>` typed property value.|
||`MessagePackRuntimeDictionaryKeyTypeAttribute`|Dictionary keys|For example, keys of a `Dictionary<object, object>` typed property value.|
||`MessagePackRuntimeTupleItemTypeAttribute`|An item of a tuple|The 1st argument specifies the 1-based item number (n of the ItemN property).|
As the table shows, for collection typed members (that is, the type implements `IEnumerable` but not `IDictionary`, and is not sealed) and dictionary typed members, you can specify both that the collection itself is polymorphic and that its keys/items are polymorphic. In addition, you can specify polymorphism for tuple item(s). Note that `System.Tuple` types are sealed, so you cannot mark a `Tuple` typed member itself as polymorphic.
For all other members, the following defaults apply to collection and `System.Object` typed members:
- A `System.Object` typed member is deserialized as a boxed `MessagePackObject` when the member is not marked with any of the polymorphism attributes above.
- The concrete type of a deserialized abstract collection typed member is determined by the `SerializationContext.DefaultCollectionTypes` registration. The defaults are `List<T>` and `Dictionary<TKey, TValue>`.
This section describes the type information format for developers of interoperable implementations.
- Objects' type information is serialized together with their values.
- The type information and the values are serialized within a single array.
- The type information consists of their data.
- The type information itself will be encoded in an array.
With known subtypes based polymorphism, the value is encoded as a simple 2-element array.
[<StringTypeCode>, <Data>]
In the figure above, "StringTypeCode" is the type code string specified in the custom attributes. It is encoded in MessagePack `str` (`raw`) format and should be as compact as possible. "Data" is the serialized object value, and its form is an array or a map.
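The envelope described above can be sketched by hand for a counterpart implementation. This is only an illustration under stated assumptions: `KnownTypeEnvelope`, `EncodeStr`, and `Wrap` are hypothetical names that are not part of MessagePack for CLI, the "Data" bytes are assumed to be already serialized, and type codes longer than 255 bytes are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; it is not part of MessagePack for CLI.
public static class KnownTypeEnvelope
{
    // Encodes a string as MessagePack str, choosing the most compact header.
    public static byte[] EncodeStr(string value)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(value);
        var buffer = new List<byte>();
        if (utf8.Length <= 31)
        {
            buffer.Add((byte)(0xA0 | utf8.Length)); // fixstr: length lives in the header byte
        }
        else if (utf8.Length <= 0xFF)
        {
            buffer.Add(0xD9);                       // str 8: one length byte follows
            buffer.Add((byte)utf8.Length);
        }
        else
        {
            throw new ArgumentException("Type code is too long for this sketch.", nameof(value));
        }

        buffer.AddRange(utf8);
        return buffer.ToArray();
    }

    // Wraps already serialized object data (an array or a map) as [<StringTypeCode>, <Data>].
    public static byte[] Wrap(string typeCode, byte[] serializedData)
    {
        var buffer = new List<byte> { 0x92 };       // fixarray with 2 elements
        buffer.AddRange(EncodeStr(typeCode));
        buffer.AddRange(serializedData);
        return buffer.ToArray();
    }
}
```

For example, `KnownTypeEnvelope.Wrap("0", new byte[] { 0x91, 0x2A })` wraps a one-element array holding 42 and yields the bytes `0x92 0xA1 0x30 0x91 0x2A`.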
With runtime type based polymorphism, the value is encoded as a 2-element array.
[<EncodedNETType>, <Data>]
In the figure above, "Data" is the serialized object value, and its form is an array or a map. "EncodedNETType" is structured data formatted as a 6-element array, and it is equivalent to the .NET assembly qualified type name. This table shows the contents of the structured type information, that is, the mapping between the assembly qualified name and the structured data:
|Index|Type|Content|
|:---:|:---:|---|
|0|`integer`|Format ID. Only 1 is valid. Discussed later.|
|1|`str` (`raw`)|Compressed type full name. Discussed later.|
|2|`str` (`raw`)|Assembly's simple name.|
|3|`array`|Assembly's version as a 4-element `int` array.|
|4|`str` (`raw`)|Assembly's culture name. `nil` for a neutral assembly.|
|5|`bin` (`raw`)|Assembly's public key token. `nil` for `null`.|
Note that Format ID `1` means that the type information uses the "Compressed Format", which compresses the type name. Many type names start with their namespace, and that prefix often matches the simple name of the declaring assembly, so we can save space by omitting the duplicated substring. The format replaces such a prefix with '.'.
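The compression rule can be sketched as a pair of string functions. This is a minimal sketch, not the library's actual implementation: `TypeNameCompression`, `Compress`, and `Decompress` are hypothetical names, and the sketch assumes the prefix is only dropped when the assembly simple name is followed by a '.' in the type's full name.

```csharp
using System;

// Hypothetical helper for illustration; it is not part of MessagePack for CLI.
public static class TypeNameCompression
{
    // Drops the assembly simple name when it is a namespace prefix of the
    // type's full name, leaving a name that starts with '.'.
    public static string Compress(string typeFullName, string assemblySimpleName)
    {
        string prefix = assemblySimpleName + ".";
        return typeFullName.StartsWith(prefix, StringComparison.Ordinal)
            ? typeFullName.Substring(assemblySimpleName.Length)
            : typeFullName;
    }

    // Restores the full name: a leading '.' stands for the assembly simple name.
    public static string Decompress(string compressedName, string assemblySimpleName)
    {
        return compressedName.StartsWith(".", StringComparison.Ordinal)
            ? assemblySimpleName + compressedName
            : compressedName;
    }
}
```

A name whose prefix does not match the assembly simple name, such as `System.String` with the assembly `mscorlib`, passes through unchanged in both directions.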
For example, for the type whose assembly qualified name is "TheCompany.TheProduct.TheComponent.TheLayer.TheType, TheCompany.TheProduct.TheComponent, Version=1.2.3.4, Culture=neutral, PublicKeyToken=null", the resulting type information is logically as follows:
[1, ".TheLayer.TheType", "TheCompany.TheProduct.TheComponent", [1, 2, 3, 4], nil, nil]
The physical format looks like the following:
0x96 0x01 0xB12E5468654C617965722E54686554797065 0xD922546865436F6D70616E792E54686550726F647563742E546865436F6D706F6E656E74 0x94 0x01 0x02 0x03 0x04 0xC0 0xC0
This is a 63-byte binary instead of a 142-byte UTF-8 encoded string.
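The byte sequence above can be reproduced with a small hand-written encoder, which may help when checking a counterpart implementation. This is a sketch under stated assumptions: `EncodedNetType` and its members are hypothetical names that are not part of MessagePack for CLI, the type name is assumed to be compressed already, strings and tokens longer than 255 bytes are not handled, and version parts must fit in a positive fixint (0 to 127).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; it is not part of MessagePack for CLI.
public static class EncodedNetType
{
    // Writes nil for null, otherwise MessagePack str with the most compact header.
    private static void WriteStr(List<byte> buffer, string value)
    {
        if (value == null)
        {
            buffer.Add(0xC0);                              // nil
            return;
        }

        byte[] utf8 = Encoding.UTF8.GetBytes(value);
        if (utf8.Length <= 31)
        {
            buffer.Add((byte)(0xA0 | utf8.Length));        // fixstr
        }
        else if (utf8.Length <= 0xFF)
        {
            buffer.Add(0xD9);                              // str 8
            buffer.Add((byte)utf8.Length);
        }
        else
        {
            throw new ArgumentException("String is too long for this sketch.");
        }

        buffer.AddRange(utf8);
    }

    // Builds the 6-element type information array in index order 0 to 5.
    public static byte[] Encode(string compressedTypeName, string assemblySimpleName,
        Version version, string cultureName, byte[] publicKeyToken)
    {
        var buffer = new List<byte> { 0x96, 0x01 };        // fixarray of 6, then Format ID 1
        WriteStr(buffer, compressedTypeName);
        WriteStr(buffer, assemblySimpleName);

        buffer.Add(0x94);                                  // fixarray of 4 version parts
        foreach (int part in new[] { version.Major, version.Minor, version.Build, version.Revision })
        {
            if (part < 0 || part > 127)
            {
                throw new ArgumentOutOfRangeException(nameof(version), "Sketch supports parts 0 to 127 only.");
            }

            buffer.Add((byte)part);                        // positive fixint
        }

        WriteStr(buffer, cultureName);                     // null means a neutral culture

        if (publicKeyToken == null)
        {
            buffer.Add(0xC0);                              // nil
        }
        else
        {
            if (publicKeyToken.Length > 0xFF)
            {
                throw new ArgumentException("Token is too long for this sketch.", nameof(publicKeyToken));
            }

            buffer.Add(0xC4);                              // bin 8
            buffer.Add((byte)publicKeyToken.Length);
            buffer.AddRange(publicKeyToken);
        }

        return buffer.ToArray();
    }
}
```

Calling `EncodedNetType.Encode(".TheLayer.TheType", "TheCompany.TheProduct.TheComponent", new Version(1, 2, 3, 4), null, null)` returns 63 bytes that start with `0x96 0x01 0xB1` and end with `0xC0 0xC0`.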