Writing custom Picklers/Unpicklers
Here we give you a guide on how to write a custom Pickler/Unpickler pair for scala pickling. This guide is for the 0.10.x series of picklers, and outlines the set of scenarios you're likely to encounter.
Let's start by creating a simple pickler for a very simple type:
class Foo(val x: String)
This type is basically a simple wrapper around a string. Our first basic pickler will look like the following:
import scala.pickling._
object Foo {
  implicit object pickler extends AbstractPicklerUnpickler[Foo] {
    private val stringPickler = implicitly[Pickler[String]]
    private val stringUnpickler = implicitly[Unpickler[String]]
    override val tag = FastTypeTag[Foo]
    override def pickle(picklee: Foo, builder: PBuilder): Unit = {
      builder.hintTag(tag) // This is always required
      builder.beginEntry(picklee)
      builder.putField("x", { fieldBuilder =>
        stringPickler.pickle(picklee.x, fieldBuilder)
      })
      builder.endEntry()
    }
    override def unpickle(tag: String, reader: PReader): Any = {
      val x = stringUnpickler.unpickleEntry(reader.readField("x"))
      new Foo(x.asInstanceOf[String])
    }
  }
}
There's a lot going on in this code, so let's walk through it step by step.
The first thing we have to do is grab the picklers we will use when pickling/unpickling the members of our object. Since the Foo class consists of only a single string, this means grabbing the pickler/unpickler for String:
private val stringPickler = implicitly[Pickler[String]]
private val stringUnpickler = implicitly[Unpickler[String]]
Pickling requires a type-string to handle subclassing at runtime. Because of this, every pickler is required to provide a tag before sending any data down. The Pickling framework has a convenience method for autogenerating these type strings from a literal type.
override val tag = FastTypeTag[Foo]
Now we can write the pickling logic. Pickling logic follows this form:
- Hint your type tag to the builder.
- Begin the pickling entry.
- Serialize any fields (Note: This is different for collections).
- Inform the builder we are done with the current entry.
We can see these four stages in our logic:
override def pickle(picklee: Foo, builder: PBuilder): Unit = {
  builder.hintTag(tag) // This is always required
  builder.beginEntry(picklee)
  builder.putField("x", { fieldBuilder =>
    stringPickler.pickle(picklee.x, fieldBuilder)
  })
  builder.endEntry()
}
The unpickling logic is a bit different from the pickling logic. This is because the pickling framework requires the use of type tags to ensure proper runtime decoding of class hierarchies: the tag has to be read before an unpickler can be chosen. As such, the unpickling logic that needs to be written does not include any beginEntry/endEntry calls, as these will occur elsewhere in the framework. Instead, all we have to do is grab our fields and return a value.
override def unpickle(tag: String, reader: PReader): Any = {
  val x = stringUnpickler.unpickleEntry(reader.readField("x"))
  new Foo(x.asInstanceOf[String])
}
You can see here that we immediately begin reading fields out of the pickle reader, and use these fields to instantiate the value. Note that during unpickling we always return Any, and therefore the values must be cast.
If you're pickling a value which is allowed to be null, you will need to add some explicit handling of null in the pickle/unpickle methods.
First, the pickle method should explicitly delegate to the null pickler:
override def pickle(picklee: Foo, builder: PBuilder): Unit =
  if (picklee == null) {
    builder.hintTag(Defaults.nullPickler.tag)
    Defaults.nullPickler.pickle(null, builder)
  } else {
    builder.hintTag(tag) // This is always required
    builder.beginEntry(picklee)
    builder.putField("x", { fieldBuilder =>
      stringPickler.pickle(picklee.x, fieldBuilder)
    })
    builder.endEntry()
  }
Next, the unpickler needs to check to see if the pickle is tagged as null:
override def unpickle(tag: String, reader: PReader): Any =
  if (tag == FastTypeTag.Null.key) null else {
    val x = stringUnpickler.unpickleEntry(reader.readField("x"))
    new Foo(x.asInstanceOf[String])
  }
You can see the benefit of having a type tag, as we're using it to distinguish a serialized null value from an instance of Foo.
You may have noticed that the Foo class is NOT final, so it could have subclasses. Our current pickler will drop any information from these subclasses. Ideally, we'd like to be able to handle possible subclasses in our pickler. To do so, we'll need to rely on the runtime registry of picklers and unpicklers built into scala pickling.
We first need to alter our pickle method to check whether we're pickling a subclass of Foo, and delegate to its pickler in that case.
override def pickle(picklee: Foo, builder: PBuilder): Unit =
  if (picklee == null) {
    builder.hintTag(Defaults.nullPickler.tag)
    Defaults.nullPickler.pickle(null, builder)
  } else if (picklee.getClass != classOf[Foo]) {
    // We are pickling a subclass, so delegate down to its pickler
    val clazz = picklee.getClass
    val classLoader = clazz.getClassLoader
    val realTag =
      FastTypeTag.mkRaw(clazz, scala.reflect.runtime.universe.runtimeMirror(classLoader))
    // Look up a preregistered pickler for this tag,
    // or possibly create a new one on the fly.
    val pickler =
      scala.pickling.internal.currentRuntime.picklers.genPickler(classLoader, clazz, realTag)
    // Pickle the entry; the cast is needed because the looked-up pickler is untyped
    pickler.asInstanceOf[Pickler[Foo]].pickle(picklee, builder)
  } else {
    builder.hintTag(tag) // This is always required
    builder.beginEntry(picklee)
    builder.putField("x", { fieldBuilder =>
      stringPickler.pickle(picklee.x, fieldBuilder)
    })
    builder.endEntry()
  }
This has the basic algorithm outline of:
- Check to see if we're on a subclass of Foo.
- Create a new type tag for the actual subclass.
- Look up (or create) a pickler for the subclass and delegate to its implementation.
Next, we also need to adjust our unpickler to handle the new (unknown) subclass.
override def unpickle(tag: String, reader: PReader): Any =
  if (tag == FastTypeTag.Null.key) null
  else if (tag == this.tag.key) {
    // The parameter `tag` shadows the pickler's own tag, hence `this.tag`
    val x = stringUnpickler.unpickleEntry(reader.readField("x"))
    new Foo(x.asInstanceOf[String])
  } else {
    // Look up an unpickler, or generate one on the fly
    val up = scala.pickling.internal.currentRuntime.picklers.genUnpickler(
      scala.pickling.internal.currentMirror, tag)
    up.unpickle(tag, reader)
  }
Here, we simply check to see if the tag matches our expected value, and if not we look up the registered unpickler for the given tag.
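The tag-based dispatch can be illustrated without the library at all. The following is only a minimal sketch of the idea: UnpicklerRegistry, FooUnpickler, the subclass Bar, and the Map of field values are hypothetical stand-ins invented for this example, not part of scala pickling's API.

```scala
import scala.collection.mutable

class Foo(val x: String)
// Hypothetical subclass used only to exercise the fallback path.
class Bar(x0: String, val y: Int) extends Foo(x0)

// Hypothetical registry keyed by tag string; a stand-in for the runtime lookup.
object UnpicklerRegistry {
  private val byTag = mutable.Map.empty[String, Map[String, String] => Any]

  def register(tag: String)(f: Map[String, String] => Any): Unit = {
    byTag(tag) = f
  }

  def lookup(tag: String): Map[String, String] => Any =
    byTag.getOrElse(tag, throw new NoSuchElementException(s"no unpickler for $tag"))
}

object FooUnpickler {
  val tag = "Foo"

  // Handle our own tag directly; delegate every other tag to the registry.
  def unpickle(actualTag: String, fields: Map[String, String]): Any =
    if (actualTag == tag) new Foo(fields("x"))
    else UnpicklerRegistry.lookup(actualTag)(fields)
}
```

Once an unpickler for "Bar" is registered, FooUnpickler.unpickle("Bar", ...) returns a Bar, while the "Foo" tag is still handled locally.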
So, while this custom pickler works great, you may notice immediately that the resulting pickle is a bit verbose:
scala> import scala.pickling._, json._, Defaults._
scala> new Foo("hello").pickle
res0: JsonPickle("{
  "$tag": "Foo",
  "x": {
    "$tag": "java.lang.String",
    "hello"
  }
}")
You'll see that the pickling library is storing the type tag for java.lang.String, just as it is for our object. While these tags are great for allowing us to handle classes that may have subclasses, java.lang.String is final and we'd prefer not to waste pickle size just to store a tag we already know about from the Foo pickler.
To remove the tag, we give the pickler an "elide" hint. While elide hints are completely optional for a pickling format, all of the built-in formats make use of them.
Let's alter our pickle method to be as follows:
override def pickle(picklee: Foo, builder: PBuilder): Unit = {
  builder.hintTag(tag) // This is always required
  builder.beginEntry(picklee)
  builder.putField("x", { fieldBuilder =>
    fieldBuilder.hintStaticallyElided() // Tells the PBuilder to drop the hint.
    stringPickler.pickle(picklee.x, fieldBuilder)
  })
  builder.endEntry()
}
Now when we pickle our object, we get:
scala> new Foo("hello").pickle
res1: JsonPickle("{
  "$tag": "Foo",
  "x": "hello"
}")
This is great! Unfortunately, we may no longer be able to unpickle the object. Since we've elided the type during pickling, we ALSO need to tell the reader that the type is elided during unpickling:
override def unpickle(tag: String, reader: PReader): Any = {
  val xreader = reader.readField("x")
  xreader.hintStaticallyElided() // Tells the reader not to expect a type tag.
  val x = stringUnpickler.unpickleEntry(xreader)
  new Foo(x.asInstanceOf[String])
}
Now we are able to pickle and unpickle successfully.
Pickling has built-in support for sharing object values while pickling and unpickling. This feature can be a space-saving optimization in the event that some values are often repeated, but can also be used to help pickle circular data structures.
Note: While pickling (and the default formats) provides hooks for sharing instances out of the box, not all formats are required to support sharing. Please check the documentation of your format to ensure that it is able to handle sharing.
Let's create a circular data structure that we'd like to pickle and unpickle.
final class Node(val name: String, var edge: Node)
The class Node represents the node of a list. It can contain a pointer to another node within the list, and the list may be circular. Let's first start by creating a simple pickler for this data structure.
import scala.pickling._
object Node {
  implicit object pickler extends AbstractPicklerUnpickler[Node] {
    private val stringPickler = implicitly[Pickler[String]]
    private val stringUnpickler = implicitly[Unpickler[String]]
    override val tag = FastTypeTag[Node]
    override def pickle(picklee: Node, builder: PBuilder): Unit = {
      builder.hintTag(tag) // This is always required
      builder.beginEntry(picklee)
      builder.putField("name", { fieldBuilder =>
        fieldBuilder.hintStaticallyElided()
        stringPickler.pickle(picklee.name, fieldBuilder)
      })
      builder.putField("edge", { fieldBuilder =>
        fieldBuilder.hintStaticallyElided()
        pickle(picklee.edge, fieldBuilder)
      })
      builder.endEntry()
    }
    override def unpickle(tag: String, reader: PReader): Any = {
      val nameReader = reader.readField("name")
      nameReader.hintStaticallyElided() // The name's tag was elided when pickling
      val name = stringUnpickler.unpickleEntry(nameReader)
      val edge = unpickleEntry(reader.readField("edge"))
      new Node(name.asInstanceOf[String], edge.asInstanceOf[Node])
    }
  }
}
So far, this structure functions just fine. However, it will recurse infinitely if the list is circular. To fix this, we need to alter the pickle method so that we can store reference ids and share them.
override def pickle(picklee: Node, builder: PBuilder): Unit = {
  // Check to see if we've encountered this `Node` before
  val oid = scala.pickling.internal.lookupPicklee(picklee)
  // Inform the builder of the previous `oid` (or -1 if this is new)
  builder.hintOid(oid)
  builder.hintTag(tag) // This is always required
  builder.beginEntry(picklee)
  // Pickling 0.10.x - do not pickle values if we were previously pickled
  if (oid == -1) {
    builder.putField("name", { fieldBuilder =>
      fieldBuilder.hintStaticallyElided()
      stringPickler.pickle(picklee.name, fieldBuilder)
    })
    builder.putField("edge", { fieldBuilder =>
      fieldBuilder.hintStaticallyElided()
      pickle(picklee.edge, fieldBuilder)
    })
  }
  builder.endEntry()
}
The above code is doing three things:
- It checks whether the current Node was seen before. This also registers the node with the current picklee cache.
- It informs the builder of the previous object id, if there was one.
- It does not serialize any of the values if there was a previous object id.
Note: This breaks if the format does not support object sharing. In Pickling 0.11.x this will be corrected and the if check will no longer be necessary.
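To make the "look up, and register on first sight" behavior concrete, here is a minimal library-free sketch. PickleeRegistry and SharingSketch are hypothetical names invented for this example; the sketch only models the behavior described above (an identity-based cache that returns -1 for a new object) and is not how scala pickling implements lookupPicklee.

```scala
import java.util.IdentityHashMap

final class Node(val name: String, var edge: Node)

// Hypothetical stand-in for the picklee cache; not the library's implementation.
final class PickleeRegistry {
  // Identity-based, so two equal-looking nodes are still distinct entries.
  private val seen = new IdentityHashMap[AnyRef, Integer]()
  private var nextOid = 0

  /** Returns the oid assigned earlier, or registers the object and returns -1. */
  def lookup(picklee: AnyRef): Int = {
    val previous = seen.get(picklee)
    if (previous != null) previous.intValue
    else {
      seen.put(picklee, Integer.valueOf(nextOid))
      nextOid += 1
      -1
    }
  }
}

object SharingSketch {
  /** Walks the list, emitting one line per entry; a repeated node becomes a reference. */
  def describe(start: Node): List[String] = {
    val registry = new PickleeRegistry
    val out = List.newBuilder[String]
    var current = start
    var done = false
    while (!done && current != null) {
      val oid = registry.lookup(current)
      if (oid == -1) {
        out += s"node ${current.name}" // first visit: write the fields
        current = current.edge
      } else {
        out += s"ref $oid" // seen before: write only the reference and stop
        done = true
      }
    }
    out.result()
  }
}
```

For a two-node cycle a -> b -> a, the walk terminates after emitting the two nodes and a single reference back to the first one, which is the property the oid check in the pickle method relies on.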
Next, we need to update the unpickler to handle sharing as well. We do so via:
override def unpickle(tag: String, reader: PReader): Any =
  if (tag == FastTypeTag.Ref.key) {
    Defaults.refUnpickler.unpickle(tag, reader)
  } else {
    // Grab the `oid` that will match up with the exact same `oid` from the pickler.
    val oid = scala.pickling.internal.preregisterUnpicklee()
    val name = stringUnpickler.unpickleEntry(reader.readField("name"))
    // Register the value of this pickle with the cache. We MUST do this
    // before we recurse down fields, because a nested reference may point
    // back to this node.
    val result = new Node(name.asInstanceOf[String], null)
    scala.pickling.internal.registerUnpicklee(result, oid)
    // Assign the edge to our node
    val edge = unpickleEntry(reader.readField("edge"))
    result.edge = edge.asInstanceOf[Node]
    result
  }
The above code follows this algorithm:
- First, check to see if we're just looking up a reference from the cache.
- If we are not unpickling a previous value, preregister for the oid you'll need for your object.
- Create an instance of Node and register it with the object cache. This instance will be used everywhere the reader finds an oid reference.
- Read any fields which may have circular references. Note: this must be done after registering an object instance, or we will fail to handle nested recursive structures.
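The ordering constraint (register first, then read the fields) can be seen in a small library-free model. Everything here is an assumption made for illustration: the Pickled data type, UnpickleeRegistry, and UnpickleSketch are hypothetical and only mimic the preregister/register steps described above; they are not scala pickling's types.

```scala
import scala.collection.mutable.ArrayBuffer

final class Node(val name: String, var edge: Node)

// Hypothetical in-memory pickle shape used only for this sketch.
sealed trait Pickled
final case class Entry(name: String, edge: Pickled) extends Pickled
final case class Ref(oid: Int) extends Pickled
case object NullEntry extends Pickled

// Hypothetical stand-in for the unpicklee cache.
final class UnpickleeRegistry {
  private val slots = ArrayBuffer.empty[AnyRef]

  /** Reserves the next oid, in the same order the pickler assigned them. */
  def preregister(): Int = { slots += null; slots.length - 1 }

  def register(value: AnyRef, oid: Int): Unit = { slots(oid) = value }

  def lookup(oid: Int): AnyRef = slots(oid)
}

object UnpickleSketch {
  def unpickle(p: Pickled, registry: UnpickleeRegistry): Node = p match {
    case NullEntry => null
    case Ref(oid) =>
      // A reference: return the instance registered earlier.
      registry.lookup(oid).asInstanceOf[Node]
    case Entry(name, edge) =>
      val oid = registry.preregister()
      val result = new Node(name, null)
      // Register BEFORE recursing, so a nested Ref(oid) can find this node.
      registry.register(result, oid)
      result.edge = unpickle(edge, registry)
      result
  }
}
```

If the register call were moved after the recursive unpickle, the nested Ref(0) in Entry("a", Entry("b", Ref(0))) would resolve to an empty slot and the cycle would be lost.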