[Design] New Pickling Generators
Currently, the algorithm to generate picklers is optimised for very efficient runtime pickling. However, we've discovered some areas of unsafety in the interaction between the current IR classes and the Scala compiler. Specifically, there are times when the Scala compiler will elide symbol information that WOULD be available via runtime reflection. In these instances, incorrect picklers are generated.
We'd like to enhance the algorithms used to generate picklers with a more nuanced approach, with the following goals:
- Correctness - Any algorithm should be accurate. In this vein, we should be able to correctly utilize existing conventions, such as JPA annotations, java.io.Serializable, case class, etc.
- Efficiency - Any algorithm should retain as much efficiency as possible, ideally operating as fast as hand-rolled code would. The preference for statically generated code, rather than runtime reflection, should account for most of the speed-up.
- Completeness - Pickling should be a drop-in replacement for serialization, similar to java.io.Serializable or Kryo. This means pickling should attempt to handle ALL possible scenarios of objects needing to be serialized, and only force the user to hand-roll code when they desire. This does not mean we discourage hand-written picklers, only that we'd like the "out of the box" picklers to be solid and something you can rely on.
- Runtime configuration - There are some mechanisms that may not be available on all platforms, e.g. the use of sun.misc.Unsafe may be the most efficient means of interacting with a class, but would not be available in all environments.
The new approach will divide the Pickler/Unpickler generation into several layers/algorithms. Some of these algorithms may only be used during runtime, while others can only be used for static compilation. Additionally, we will split the generated picklers into several encodings for how to actually serialize/reify classes at runtime.
Case classes are a Scala language construct. Most case classes should be serializable as is.
java.io.Serializable is a Java language construct. Serializable denotes that certain mechanisms are available to serialize a class safely. For pickling, we can additionally add the following assumptions:
- If there is no writeObject/readObject method, we can assume "simple" serialization, and mimic it in our generated picklers.
- If we cannot unify the constructor with all serializable fields, we should be able to shove the field values into the object via reflection/Unsafe.
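The first assumption could be checked with plain Java reflection. The following is only a minimal sketch, not the generator's actual logic: the object name SerializableCheck, the helper isSimpleSerializable, and the Plain/Custom classes are hypothetical names introduced for illustration. It relies on the fact that Java serialization looks for private writeObject(ObjectOutputStream) and readObject(ObjectInputStream) hooks declared on each class in the hierarchy.

```scala
import java.io.{ObjectInputStream, ObjectOutputStream, Serializable}

object SerializableCheck {
  // True when `cls` itself declares a method with this name and parameter type.
  private def declares(cls: Class[_], name: String, param: Class[_]): Boolean =
    try { cls.getDeclaredMethod(name, param); true }
    catch { case _: NoSuchMethodException => false }

  // The class and all of its superclasses, nearest first.
  private def hierarchy(c: Class[_]): List[Class[_]] =
    if (c == null) Nil else c :: hierarchy(c.getSuperclass)

  /** Hypothetical helper: Serializable, and nothing in the hierarchy customizes the stream. */
  def isSimpleSerializable(cls: Class[_]): Boolean =
    classOf[Serializable].isAssignableFrom(cls) &&
      !hierarchy(cls).exists { c =>
        declares(c, "writeObject", classOf[ObjectOutputStream]) ||
        declares(c, "readObject", classOf[ObjectInputStream])
      }

  // Illustration-only classes.
  class Plain(var bar: Int) extends Serializable
  class Custom(var bar: Int) extends Serializable {
    private def writeObject(o: ObjectOutputStream): Unit = o.defaultWriteObject()
  }
}
```

A static generator would need the compile-time equivalent of this check; the runtime version above only shows which signatures matter.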
JPA annotations are a Java extension. These annotations describe how to take a class and serialize it into a database. However, we should also be able to use these same annotations to serialize classes out to a pickle format.
Singletons are a Java idiom. We can try to infer the existence of a static singleton instance and create a pickler which simply grabs the singleton instance (not serializing anything).
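For Scala objects, one way to find the static singleton instance at runtime is the static MODULE$ field that the Scala compiler emits on an object's class. The sketch below assumes that encoding; SingletonLookup and singletonOf are hypothetical names, and a Java-style singleton would need a different convention (for example a static INSTANCE field) that is not covered here.

```scala
import java.lang.reflect.Modifier

object SingletonLookup {
  // Illustration-only singleton.
  object Bar { val answer = 42 }

  /** Hypothetical helper: the singleton instance, if `cls` is the class of a Scala object. */
  def singletonOf(cls: Class[_]): Option[AnyRef] =
    try {
      val field = cls.getField("MODULE$")
      // Only a static field can be read without an instance.
      if (Modifier.isStatic(field.getModifiers)) Option(field.get(null)) else None
    } catch {
      case _: NoSuchFieldException => None
    }
}
```

An unpickler built on this would return the looked-up instance and read nothing from the pickle.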
Most runtime serialization frameworks also take a crack at serializing unknown objects (or unannotated). These are objects that don't fall into any special class, or have user serialization annotations. These objects CAN be serialized, but it can also be unsafe to do so. Right now, the existing pickling implementation handles these quite well at runtime, and not so well at compile time.
Here is a set of pickler implementations, in order of preference, we'd like to use. These represent what code in a Pickler/Unpickler would look like.
The direct implementation involves trying to directly call the constructor of the class.
Applicable
- Scala Case classes
- JPA annotated classes (if methods are not private)
- Classes fully defined by constructor (if we can detect it)
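As a rough illustration of what directly generated code could look like, here is a hand-written sketch for a one-field case class. MiniWriter and MiniReader are hypothetical stand-ins for a pickle format and are not the library's reader/writer API; FooPickler is likewise an illustrative name.

```scala
import scala.collection.mutable.ArrayBuffer

object DirectSketch {
  // Hypothetical, minimal "format": a flat sequence of Ints.
  final class MiniWriter {
    private val buf = ArrayBuffer.empty[Int]
    def putInt(i: Int): Unit = buf += i
    def result: Vector[Int] = buf.toVector
  }
  final class MiniReader(data: Vector[Int]) {
    private var pos = 0
    def nextInt(): Int = { val v = data(pos); pos += 1; v }
  }

  case class Foo(x: Int)

  // "Direct": a plain accessor call when pickling, a plain constructor call when unpickling.
  object FooPickler {
    def pickle(foo: Foo, w: MiniWriter): Unit = w.putInt(foo.x)
    def unpickle(r: MiniReader): Foo = Foo(r.nextInt())
  }
}
```

Nothing here uses reflection, which is why this encoding is listed first in the order of preference.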
The unsafe implementation involves the use of sun.misc.Unsafe to look up field layouts for classes. When pickling, we rip the underlying primitives directly out of the class. When instantiating, we construct instances and shove data into memory locations directly.
Applicable
- Not running in Android (no Unsafe)
- java.io.Serializable (without fancy writeObject/readObject)
- Can use on any object, but not 100% safe.
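A minimal sketch of the Unsafe-based encoding follows, assuming a JVM that still ships sun.misc.Unsafe (its memory-access methods are deprecated on recent JDKs). UnsafeSketch, readBar, and makeFoo are hypothetical names; the value accessor on Foo exists only so the result can be inspected.

```scala
object UnsafeSketch {
  // Unsafe has no public constructor; the common workaround is to read its static field.
  private val unsafe: sun.misc.Unsafe = {
    val f = classOf[sun.misc.Unsafe].getDeclaredField("theUnsafe")
    f.setAccessible(true)
    f.get(null).asInstanceOf[sun.misc.Unsafe]
  }

  class Foo(private var bar: Int) extends java.io.Serializable {
    def value: Int = bar
  }

  // Offset of the `bar` field inside a Foo instance, looked up once.
  private val barOffset: Long =
    unsafe.objectFieldOffset(classOf[Foo].getDeclaredField("bar"))

  // Pickling side: read the primitive straight out of the object.
  def readBar(foo: Foo): Int = unsafe.getInt(foo, barOffset)

  // Unpickling side: allocate without running any constructor, then write the field.
  def makeFoo(bar: Int): Foo = {
    val foo = unsafe.allocateInstance(classOf[Foo]).asInstanceOf[Foo]
    unsafe.putInt(foo, barOffset, bar)
    foo
  }
}
```

Because allocateInstance skips the constructor, any invariants the constructor would establish are not enforced, which is one reason this encoding is "not 100% safe".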
The reflection implementation involves calling the constructor of a class via Java reflection and setting values reflectively.
Applicable
- JPA classes (Where not all fields/methods are public or constructor cannot be unified)
- java.io.Serializable (without fancy writeObject/readObject)
- Requires being able to make private members accessible (i.e. not possible under some security managers).
- Can use on any object, but not 100% safe.
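The reflection-based encoding might look roughly like the sketch below. ReflectionSketch, readBar, and makeFoo are hypothetical names, and passing a placeholder constructor argument before overwriting the field is just one possible way to get an instance; the value accessor exists only for inspection.

```scala
import java.lang.reflect.Field

object ReflectionSketch {
  class Foo(private var bar: Int) extends java.io.Serializable {
    def value: Int = bar
  }

  // Look the private field up once and open it for access.
  private val barField: Field = {
    val f = classOf[Foo].getDeclaredField("bar")
    f.setAccessible(true) // may be rejected under a restrictive security manager
    f
  }

  // The single-Int constructor of Foo.
  private val ctor = classOf[Foo].getDeclaredConstructor(classOf[Int])

  // Pickling side: read the private field reflectively.
  def readBar(foo: Foo): Int = barField.getInt(foo)

  // Unpickling side: construct with a placeholder, then set the real value.
  def makeFoo(bar: Int): Foo = {
    val foo = ctor.newInstance(Int.box(0))
    barField.setInt(foo, bar)
    foo
  }
}
```

Unlike the Unsafe variant, this runs a real constructor, at the cost of reflective call overhead.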
- Unsafe vs. Java reflection should actually be a RUNTIME decision (if possible)
- We should outline what "constructor unification" is, in my vernacular.
Here is a set of examples, and how pickling should operate on each:
case class Foo(x: Int)
- Pickling should always be able to generate a Foo pickler/unpickler if there is an implicit Int pickler available.
- The pickler/unpickler should be generated with the direct method (i.e. just calls constructor directly)
object Bar { .. }
- Pickling should issue a warning that singletons are not serialized across JVMs, but instead just referenced.
- Pickling should always be able to generate a Bar pickler.
- The pickler/unpickler should be generated with the direct method.
class Foo(private var bar: Int) extends java.io.Serializable
- Pickling should issue a debug/warning message that it needs to use Reflection to implement this pickler.
- Pickling should always be able to generate the Foo pickler IF an Int pickler is available implicitly.
- Pickling should encode a pickling/unpickling strategy that can be customized at runtime to either use reflection or unsafe algorithms.
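One way to picture a strategy that is "customized at runtime" is a small selector that checks whether sun.misc.Unsafe can be loaded and falls back to reflection otherwise. This is only a sketch: FieldStrategy, UseUnsafe, UseReflection, and the preferUnsafe switch are hypothetical names, not part of the proposal.

```scala
object StrategySketch {
  sealed trait FieldStrategy
  case object UseUnsafe extends FieldStrategy
  case object UseReflection extends FieldStrategy

  /** True when the sun.misc.Unsafe class can be loaded on this platform. */
  def unsafeAvailable: Boolean =
    try { Class.forName("sun.misc.Unsafe"); true }
    catch { case _: ClassNotFoundException => false }

  // `preferUnsafe` stands in for whatever user-facing configuration is chosen.
  def choose(preferUnsafe: Boolean, available: Boolean = unsafeAvailable): FieldStrategy =
    if (preferUnsafe && available) UseUnsafe else UseReflection
}
```

The generated pickler would then only record which fields to move, and leave the choice of how to move them to a selector like this.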
class Foo(private var bar: Int) extends java.io.Serializable {
  // Java serialization looks for a private method with exactly this signature.
  private def writeObject(o: java.io.ObjectOutputStream): Unit = {
    ...
  }
}
Note: "Complicated" here means that either writeObject or readObject is implemented.
- Pickling should issue a warning that this serializable class has custom write/read methods. This can be configured to always be an error, via import pickling.static.simpleSerializableOnly.
- Short Term - Pickling can fall back to runtime pickler generation. If import pickling.static.staticOnly is available, we issue an error instead.
- Long Term - Read the bytecode of readObject/writeObject and generate the pickler to mimic the behavior, only interacting with a PReader/PWriter instead of an ObjectStream.
class Foo {
  public Foo() {}
  @Id private int id;
  @Field private String name;
}
- Pickling should issue a debug/warning message that it needs to use Reflection to implement this pickler.
- Pickling should always be able to generate the Foo pickler IF an Int pickler and a String pickler are available implicitly.
- Pickling should encode a pickling/unpickling strategy that can be customized at runtime to either use reflection or unsafe algorithms.