This repository was archived by the owner on Feb 20, 2019. It is now read-only.

Writing custom Picklers Unpicklers

Jeremy edited this page Jun 16, 2016 · 7 revisions

Writing custom picklers

This guide shows how to write a custom Pickler/Unpickler pair for Scala Pickling. It targets the 0.10.x series of picklers and covers the scenarios you're most likely to encounter.

A basic pickler/unpickler

Let's start by creating a simple pickler for a very simple type:

class Foo(val x: String)

This type is basically a simple wrapper around a string. Our first basic pickler will look like the following:

import scala.pickling._
object Foo {
  implicit object pickler extends AbstractPicklerUnpickler[Foo] {
    private val stringPickler = implicitly[Pickler[String]]
    private val stringUnpickler = implicitly[Unpickler[String]]

    override val tag = FastTypeTag[Foo]

    override def pickle(picklee: Foo, builder: PBuilder): Unit = {
      builder.hintTag(tag) // This is always required
      builder.beginEntry(picklee)
         builder.putField("x", { fieldBuilder =>
            stringPickler.pickle(picklee.x, fieldBuilder)
         })
      builder.endEntry()
    }

    override def unpickle(tag: String, reader: PReader): Any = {
      val x = stringUnpickler.unpickleEntry(reader.readField("x"))
      new Foo(x.asInstanceOf[String])
    }
  }
}

There's a lot going on in this code, so let's walk through it step by step.

Grabbing picklers for our nested members

The first thing we have to do is grab the picklers we will use when pickling/unpickling the members of our object. Since the Foo class consists of only a single string, this means grabbing the pickler/unpickler for String:

private val stringPickler = implicitly[Pickler[String]]
private val stringUnpickler = implicitly[Unpickler[String]]

Creating a stringy type tag

Pickling requires a type-string to handle subclassing at runtime. Because of this, every pickler is required to provide a tag before sending any data down. The Pickling framework has a convenience method for autogenerating these type strings from a literal type.

override val tag = FastTypeTag[Foo]

Writing the pickling logic

Now we can write the pickling logic, which generally takes the following form:

  1. Hint your type tag to the builder
  2. Begin the pickling entry
  3. Serialize any fields (Note: This is different for collections)
  4. Inform the builder we are done with the current entry.

We can see these four stages in our logic:

override def pickle(picklee: Foo, builder: PBuilder): Unit = {
  builder.hintTag(tag) // This is always required
  builder.beginEntry(picklee)
     builder.putField("x", { fieldBuilder =>
        stringPickler.pickle(picklee.x, fieldBuilder)
     })
  builder.endEntry()
}

Writing the unpickling logic

The unpickling logic is a bit different from the pickling logic. This is because the pickling framework requires the use of type tags to ensure proper runtime decoding of class hierarchies. As such, the unpickling logic that needs to be written does not include any beginEntry/endEntry calls, as these will occur elsewhere in the framework. Instead, all we have to do is read our fields and return a value.

override def unpickle(tag: String, reader: PReader): Any = {
  val x = stringUnpickler.unpickleEntry(reader.readField("x"))
  new Foo(x.asInstanceOf[String])
}

You can see here that we immediately begin reading fields out of the pickle reader, and use these fields to instantiate the value. Note that during unpickling we always return Any, and therefore the unpickled values need to be cast.

Handling null

If you're pickling a value which is allowed to be null, you will need to add some explicit handling of null in the pickle/unpickle methods.

First, the pickle method should explicitly delegate to the null pickler:

override def pickle(picklee: Foo, builder: PBuilder): Unit =
  if(picklee == null) {
    builder.hintTag(Defaults.nullPickler.tag)
    Defaults.nullPickler.pickle(null, builder)
  } else {
    builder.hintTag(tag) // This is always required
    builder.beginEntry(picklee)
       builder.putField("x", { fieldBuilder =>
          stringPickler.pickle(picklee.x, fieldBuilder)
       })
    builder.endEntry()
  }

Next, the unpickler needs to check to see if the pickle is tagged as null:

override def unpickle(tag: String, reader: PReader): Any =
  if(tag == FastTypeTag.Null.key) null else {
    val x = stringUnpickler.unpickleEntry(reader.readField("x"))
    new Foo(x.asInstanceOf[String])
  }

You can see the benefit of having a type tag, as we're using this to catch a serialized null value versus an instance of Foo.

Handling unknown subclasses

You may have noticed that the Foo class is NOT final, so it could have subclasses. Our current pickler will drop any information from these subclasses. Ideally, we'd like to be able to handle possible subclasses in our pickler. To do so, we'll need to rely on the runtime registry of picklers and unpicklers built into Scala Pickling.

We first need to alter our pickle method to check whether we're pickling a subclass of Foo, and delegate to its pickler in that case.

override def pickle(picklee: Foo, builder: PBuilder): Unit =
  if(picklee == null) {
    builder.hintTag(Defaults.nullPickler.tag)
    Defaults.nullPickler.pickle(null, builder)
  } else if(picklee.getClass != classOf[Foo]) {
    // We are pickling a subclass, let's delegate down to its pickler
    val clazz = picklee.getClass
    val classLoader = clazz.getClassLoader
    val realTag =
      FastTypeTag.mkRaw(clazz, scala.reflect.runtime.universe.runtimeMirror(classLoader))
    // Look up a preregistered pickler for this tag,
    // or possibly create a new one on the fly.
    val pickler =
      scala.pickling.internal.currentRuntime.picklers.genPickler(classLoader, clazz, realTag)
    // Pickle the entry; the cast is harmless if the lookup is already typed
    pickler.asInstanceOf[Pickler[Foo]].pickle(picklee, builder)
  } else {
    builder.hintTag(tag) // This is always required
    builder.beginEntry(picklee)
       builder.putField("x", { fieldBuilder =>
          stringPickler.pickle(picklee.x, fieldBuilder)
       })
    builder.endEntry()
  }

In outline, the algorithm is:

  1. Check to see if we're on a subclass of Foo.
  2. Create a new string type tag for the actual subclass.
  3. Look up (or create) a pickler for the subclass and delegate to its implementation.

Next, we also need to adjust our unpickler to handle the new (unknown) subclass.

override def unpickle(tag: String, reader: PReader): Any =
  if(tag == FastTypeTag.Null.key) null
  else if(tag == this.tag.key) {
    // `this.tag` is the pickler's own tag; the `tag` parameter shadows it
    val x = stringUnpickler.unpickleEntry(reader.readField("x"))
    new Foo(x.asInstanceOf[String])
  } else {
    // Look up an unpickler, or generate one on the fly
    val up = scala.pickling.internal.currentRuntime.picklers.genUnpickler(
      scala.pickling.internal.currentMirror, tag)
    up.unpickle(tag, reader)
  }

Here, we simply check to see if the tag matches our expected value, and if not we look for the registered unpickler for the given tag.
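The dispatch idea can be illustrated without the library. The following is a minimal sketch, not the library's implementation: `TagRegistrySketch`, its `register` method, the `Bar` subclass, and the use of a plain `Map[String, String]` in place of a reader are all hypothetical names introduced for illustration. Unlike the real runtime, this toy registry cannot generate an unpickler on the fly, so it fails for unknown tags.

```scala
import scala.collection.mutable

// Repeated here so the sketch compiles on its own.
class Foo(val x: String)
// Hypothetical subclass that the Foo unpickler knows nothing about.
class Bar(x: String, val y: String) extends Foo(x)

// Toy stand-in for the runtime registry: unpicklers keyed by type tag string.
object TagRegistrySketch {
  type Fields = Map[String, String] // stands in for a PReader

  private val registry = mutable.Map.empty[String, Fields => Any]

  def register(tag: String)(unpickle: Fields => Any): Unit =
    registry.update(tag, unpickle)

  val fooTag = "Foo"

  // Handle our own tag directly; delegate every other tag to the registry.
  def unpickleFoo(tag: String, fields: Fields): Any =
    if (tag == fooTag) new Foo(fields("x"))
    else registry.get(tag) match {
      case Some(up) => up(fields)
      case None =>
        throw new IllegalArgumentException(s"No unpickler registered for tag $tag")
    }
}
```

Registering an unpickler under the tag "Bar" and then calling `unpickleFoo("Bar", ...)` returns a `Bar`, even though `unpickleFoo` itself only knows how to build a `Foo`.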

Eliding type tags

While this custom pickler works, you may notice that the resulting pickle is a bit verbose:

scala> import scala.pickling._, json._, Defaults._

scala> new Foo("hello").pickle
res0:  JsonPickle("{
  "$tag": "Foo",
  "x": {
    "$tag": "java.lang.String",
    "hello"
  }
}")

You'll see that the pickling library is storing the type tag for java.lang.String, just as it is for our object. While these tags are great for allowing us to handle classes that may have subclasses, java.lang.String is final and we'd prefer not to have to waste pickle size just to store a tag we already know about from the Foo pickler.

To remove the tag, we give an "elide" hint to the builder. While honoring elide hints is completely optional for a pickling format, all of the built-in formats make use of them.

Let's alter our pickle method to be as follows:

override def pickle(picklee: Foo, builder: PBuilder): Unit = {
  builder.hintTag(tag) // This is always required
  builder.beginEntry(picklee)
     builder.putField("x", { fieldBuilder =>
        fieldBuilder.hintStaticallyElided()  // Tells the PBuilder to drop the type tag.
        stringPickler.pickle(picklee.x, fieldBuilder)
     })
  builder.endEntry()
}

Now when we pickle our object, we get:

scala> new Foo("hello").pickle
res1:  JsonPickle("{
  "$tag": "Foo",
  "x": "hello"
}")

This is great! Unfortunately, we may no longer be able to unpickle the object. Since we've elided the type during pickling, we also need to tell the reader that the type is elided during unpickling:

override def unpickle(tag: String, reader: PReader): Any = {
  val xreader = reader.readField("x")
  xreader.hintStaticallyElided()  // Tells the reader not to expect a type tag.
  val x = stringUnpickler.unpickleEntry(xreader)
  new Foo(x.asInstanceOf[String])
}

Now we are able to pickle and unpickle successfully.

Sharing pickling values and circular structures

Pickling has built-in support for sharing object values while pickling and unpickling. This feature can be a space-saving optimization in the event that some values are often repeated, but it can also be used to help pickle circular data structures.

Note: While pickling (and the default formats) provides hooks for sharing instances out of the box, not all formats are required to support sharing. Please check the documentation of your format to ensure that it is able to handle sharing.

Let's create a circular data structure that we'd like to pickle and unpickle.

final class Node(val name: String, var edge: Node)

The class Node represents a node of a linked list. It holds a pointer to another node within the list, and the list may be circular. Let's start by creating a simple pickler for this data structure.

import scala.pickling._
object Node {
  implicit object pickler extends AbstractPicklerUnpickler[Node] {
    private val stringPickler = implicitly[Pickler[String]]
    private val stringUnpickler = implicitly[Unpickler[String]]

    override val tag = FastTypeTag[Node]

    override def pickle(picklee: Node, builder: PBuilder): Unit = {
      builder.hintTag(tag) // This is always required
      builder.beginEntry(picklee)
         builder.putField("name", { fieldBuilder =>
            fieldBuilder.hintStaticallyElided()
            stringPickler.pickle(picklee.name, fieldBuilder)
         })
         builder.putField("edge", { fieldBuilder =>
           fieldBuilder.hintStaticallyElided()
           pickle(picklee.edge, fieldBuilder)
         })
      builder.endEntry()
    }

    override def unpickle(tag: String, reader: PReader): Any = {
      val name = stringUnpickler.unpickleEntry(reader.readField("name"))
      val edge = unpickleEntry(reader.readField("edge"))
      new Node(name.asInstanceOf[String], edge.asInstanceOf[Node])
    }
  }
}

So far, this pickler functions just fine. However, it will recurse infinitely if the list is circular. To fix this, we need to alter the pickle method so that we can store reference ids and share them.

override def pickle(picklee: Node, builder: PBuilder): Unit = {
  // Check to see if we've encountered this `Node` before
  val oid = scala.pickling.internal.lookupPicklee(picklee)
  // Inform the builder of the previous `oid` (or -1 if this is new)
  builder.hintOid(oid)
  builder.hintTag(tag) // This is always required
  builder.beginEntry(picklee)
  // Pickling 0.10.x - do not pickle values if we were previously pickled
  if(oid == -1) {
     builder.putField("name", { fieldBuilder =>
        fieldBuilder.hintStaticallyElided()
        stringPickler.pickle(picklee.name, fieldBuilder)
     })
     builder.putField("edge", { fieldBuilder =>
       fieldBuilder.hintStaticallyElided()
       pickle(picklee.edge, fieldBuilder)
     })
  }
  builder.endEntry()
}

The above code is doing three things:

  1. It checks whether the current Node was seen before. This also registers the node with the current picklee cache.
  2. It informs the builder of the previous object id, if there was one.
  3. It does not serialize any values if there was a previous object id. Note: This breaks if the format does not support object sharing. In Pickling 0.11.x this will be corrected and the if check is no longer necessary.
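To make the first step concrete, here is a minimal sketch of a lookup that returns -1 for a new object (registering it as a side effect) and the previously assigned id otherwise. It only models the behaviour described above. `PickleeRegistry` and `PickleeRegistryDemo` are hypothetical names, and the identity-based map is an assumption about how such a cache would reasonably be built, not a description of the library's internals.

```scala
import java.util.IdentityHashMap

// Toy stand-in for the picklee cache; not the library's implementation.
final class PickleeRegistry {
  // Keyed by reference identity (an assumption), so two equal but distinct
  // objects receive separate ids.
  private val seen = new IdentityHashMap[AnyRef, Integer]()
  private var nextOid = 0

  // Returns the previously assigned oid, or -1 after registering a new object.
  def lookup(picklee: AnyRef): Int = {
    val previous = seen.get(picklee)
    if (previous != null) previous.intValue()
    else {
      seen.put(picklee, Integer.valueOf(nextOid))
      nextOid += 1
      -1
    }
  }
}

object PickleeRegistryDemo {
  // Looks up two objects twice each: first sightings give -1, repeats give the oid.
  def run(): List[Int] = {
    val registry = new PickleeRegistry
    val a = new Object
    val b = new Object
    List(registry.lookup(a), registry.lookup(b), registry.lookup(a), registry.lookup(b))
  }
}
```

In a circular list, the second visit to the head node is the lookup that returns a non-negative id, which is what lets the pickle method stop recursing.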

Next, we need to update the unpickler to handle sharing as well. We do so via:

override def unpickle(tag: String, reader: PReader): Any =
  if(tag == FastTypeTag.Ref.key) {
    Defaults.refUnpickler.unpickle(tag, reader)
  } else {
    // Grab the `oid` that will match up with the exact same `oid` from the pickler.
    val oid = scala.pickling.internal.preregisterUnpicklee()
    val name = stringUnpickler.unpickleEntry(reader.readField("name"))
    // Register the value of this pickle with the cache.  We MUST do this
    // before we recurse down fields, because a nested field may refer back
    // to this node. The edge is not known yet, so it starts out as null.
    val result = new Node(name.asInstanceOf[String], null)
    scala.pickling.internal.registerUnpicklee(result, oid)
    // Assign the edge to our node
    val edge = unpickleEntry(reader.readField("edge"))
    result.edge = edge.asInstanceOf[Node]
    result
  }

The above code follows this algorithm:

  1. First, check to see if we're just looking up a reference from the cache.
  2. If not unpickling a previous value, preregister to obtain the oid you'll need for your object.
  3. Create an instance of Node and register it with the object cache. This instance will be used everywhere the reader finds an oid reference.
  4. Read any fields which may have circular references. Note: This must be done after registering an object instance, or we will fail to handle nested recursive structures.
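The ordering in steps 2 to 4 can be shown with a small library-free model. This is a sketch under stated assumptions: `Encoded`, `Entry`, `Ref`, `NullEntry`, and `CircularDecodeSketch` are hypothetical names, and the toy wire format (a full entry or a back-reference to an earlier oid) only approximates what a format that supports sharing would carry.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy wire format: a full entry, a back-reference to an earlier oid, or null.
sealed trait Encoded
final case class Entry(name: String, edge: Encoded) extends Encoded
final case class Ref(oid: Int) extends Encoded
case object NullEntry extends Encoded

// Repeated here so the sketch compiles on its own.
final class Node(val name: String, var edge: Node)

object CircularDecodeSketch {
  def decode(root: Encoded): Node = {
    // Object cache indexed by oid; oids are handed out in encounter order.
    val cache = ArrayBuffer.empty[Node]

    def go(e: Encoded): Node = e match {
      case NullEntry => null
      case Ref(oid)  => cache(oid) // step 1: resolve a reference from the cache
      case Entry(name, edge) =>
        val oid = cache.size       // step 2: reserve the oid for this object
        cache += null
        val result = new Node(name, null)
        cache(oid) = result        // step 3: register BEFORE recursing
        result.edge = go(edge)     // step 4: fields that may refer back to us
        result
    }

    go(root)
  }
}
```

Decoding `Entry("a", Entry("b", Ref(0)))` yields a two-node cycle: the node named "b" points back at the very instance created for "a". If the registration were moved after the recursive call, `Ref(0)` would resolve to the still-empty slot and the cycle would be lost.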