Presentation: Jackson Performance
Although Jackson JSON Processor is fast out-of-the-box, with default settings and common usage patterns, there are ways to make it process things even faster.
This presentation looks at a couple of things you can use that can make a big difference in performance, for cases where every last drop of CPU power matters.
(Note: this section is inspired by the "Jackson Performance: Best Practices" page on the FasterXML Jackson wiki.)
There are some basic ground rules to follow to ensure that Jackson processes things at an optimal level. These are things you should "do anyway", even if you do not have actual performance problems: think of them as an interpretation of the "Boy Scout Rule" ("Always leave the campground cleaner than you found it"). Note that the guidelines are shown in loosely decreasing order of importance.
- Reuse heavy-weight objects: `ObjectMapper` (data-binding) and `JsonFactory` (streaming API)
  - To a lesser degree, you may also want to reuse `ObjectReader` and `ObjectWriter` instances -- this is just icing on the cake, but they are fully thread-safe and reusable
- Close things that need to be closed: `JsonParser`, `JsonGenerator`
  - This helps reuse underlying resources such as symbol tables and input/output buffers
  - There is nothing to close for `ObjectMapper`
- Use "unrefined" (least processed) forms of input: that is, do not decorate input sources and output targets:
  - Input: `byte[]` is best if you have it; `InputStream` next best; then `Reader` -- and in every case, do NOT read the input into a String first!
  - Output: `OutputStream` is best; `Writer` second best; calling `writeValueAsString()` is the least efficient (why construct an intermediate String?)
  - Rationale: Jackson is very good at finding the most efficient (sometimes zero-copy) way to consume/produce JSON-encoded data -- let it do its magic
- If you need to re-process, replay -- don't re-parse
  - Sometimes you need to process things in multiple phases; for example, you may need to parse part of the JSON to figure out further processing or data-binding rules, and/or modify an intermediate representation for further processing
  - Instead of writing intermediate forms back out as JSON (which incurs both JSON writing and reading overhead), it is better to use a more efficient intermediate form
  - The most efficient intermediate form is `TokenBuffer` (a flat sequence of JSON tokens), followed by the JSON tree model (`JsonNode`)
  - You may also want to use `ObjectMapper.convertValue()` to convert between Object types
- Use the `ObjectReader` method `readValues()` for reading sequences of the same POJO type
  - Functionally equivalent to calling `readValue()` multiple times, but both more convenient AND (slightly) more efficient
- Prefer `ObjectReader`/`ObjectWriter` over `ObjectMapper`
  - `ObjectReader` and `ObjectWriter` are safer to use -- they are fully immutable and freely shareable between threads -- but they can also be a bit more efficient, since they can avoid some of the lookups that `ObjectMapper` has to do
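The reuse guidelines above can be sketched as follows. This is a minimal example, assuming a modern Jackson 2.x (`readerFor()`/`writerFor()` were added in 2.6; older versions use `reader()`/`writerWithType()`), with a hypothetical `Point` class standing in for your own types:

```java
import java.io.IOException;
import java.io.InputStream;

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.ObjectWriter;

class Point { // hypothetical value class for illustration
    public int x, y;
}

public class Json {
    // Heavy-weight: construct once, reuse for the lifetime of the application
    static final ObjectMapper MAPPER = new ObjectMapper();

    // Fully immutable, freely shareable between threads
    static final ObjectReader POINT_READER = MAPPER.readerFor(Point.class);
    static final ObjectWriter POINT_WRITER = MAPPER.writerFor(Point.class);

    // readValues(): reads a sequence of same-typed values, slightly more
    // efficiently than calling readValue() once per document
    static void readPoints(InputStream in) throws IOException {
        try (MappingIterator<Point> it = POINT_READER.readValues(in)) {
            while (it.hasNextValue()) {
                Point p = it.nextValue();
                // ... process p ...
            }
        }
    }
}
```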
Once you have reviewed "the basics" discussed above, you may want to consider other tasks specifically aimed at further improving performance.
There are two main criteria that differentiate approaches listed below:
- Ease -- how much work is involved in making the change
- Compatibility -- is the resulting system interoperable with "Plain Old JSON" usage?
The big benefit of Jackson Databind API is the ease of use: with just a line or two of code you can convert between POJOs and JSON. But this convenience is not completely free: there is overhead involved in some of the automated processing, such as that of handling POJO property values using Java Reflection API (compared to explicit calls to getters and setters).
So one straightforward (if laborious) possibility is to rewrite data conversion to use the Jackson Streaming API.
With the Streaming API one constructs `JsonParser`s and `JsonGenerator`s, and uses low-level calls to read and write JSON as tokens.
If you explicitly rewrite all the conversions to use the Streaming API instead of data binding, you may be able to increase throughput by 30-40%, without any changes to the actual JSON produced. But writing and maintaining the low-level code takes time and effort, so whether you want to do this depends on how much you are willing to invest for a moderate speedup.
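As an illustration, a hand-written conversion for the hypothetical `Point` class used later in this presentation might look like this (a sketch, assuming Jackson 2.x streaming API):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

class Point { // hypothetical value class for illustration
    public int x, y;
}

public class PointStreaming {
    // Reused factory, per the ground rules above
    static final JsonFactory FACTORY = new JsonFactory();

    static void write(OutputStream out, Point p) throws IOException {
        try (JsonGenerator g = FACTORY.createGenerator(out)) {
            g.writeStartObject();
            g.writeNumberField("x", p.x);
            g.writeNumberField("y", p.y);
            g.writeEndObject();
        }
    }

    static Point read(InputStream in) throws IOException {
        Point p = new Point();
        try (JsonParser jp = FACTORY.createParser(in)) {
            if (jp.nextToken() != JsonToken.START_OBJECT) {
                throw new IOException("Expected START_OBJECT");
            }
            while (jp.nextToken() == JsonToken.FIELD_NAME) {
                String name = jp.getCurrentName();
                jp.nextToken(); // advance to the value token
                if ("x".equals(name)) {
                    p.x = jp.getIntValue();
                } else if ("y".equals(name)) {
                    p.y = jp.getIntValue();
                } else {
                    jp.skipChildren(); // ignore unknown properties
                }
            }
        }
        return p;
    }
}
```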
One possible trade-off is to rewrite only parts of the process; specifically, optimizing the most commonly used conversions: these are usually leaf-level classes (classes that have only primitive- or String-valued properties). You can achieve this by writing `JsonSerializer`s and `JsonDeserializer`s for a small number of types; Jackson can happily combine its own default POJO serializers and deserializers with custom overrides for specific types.
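A sketch of such a partial rewrite, again using a hypothetical leaf-level `Point` class; the custom serializer is registered via a `SimpleModule`, and all other types keep Jackson's default handling:

```java
import java.io.IOException;

import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.JsonSerializer;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializerProvider;
import com.fasterxml.jackson.databind.module.SimpleModule;

class Point { // hypothetical value class for illustration
    public int x, y;
}

// Hand-written serializer for one commonly used leaf-level type
class PointSerializer extends JsonSerializer<Point> {
    @Override
    public void serialize(Point p, JsonGenerator g, SerializerProvider provider)
            throws IOException {
        g.writeStartObject();
        g.writeNumberField("x", p.x);
        g.writeNumberField("y", p.y);
        g.writeEndObject();
    }
}

public class Setup {
    static ObjectMapper createMapper() {
        ObjectMapper mapper = new ObjectMapper();
        SimpleModule module = new SimpleModule();
        module.addSerializer(Point.class, new PointSerializer());
        mapper.registerModule(module);
        // Point now uses the custom serializer; everything else still uses
        // Jackson's default POJO handling
        return mapper;
    }
}
```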
Another kind of trade-off is to consider Smile binary format, which was developed as part of Jackson 1.6.
Smile is a binary format that is 100% compatible with the logical JSON data model, similar to how "binary XML" (like Fast Infoset) relates to standard textual XML. This means that conversion between JSON and Smile can be done efficiently and without loss of information. It also means that the API for working with Smile-encoded data is 100% regular Jackson API: the only difference is that the underlying factory is of type `SmileFactory` instead of `JsonFactory`. This factory is provided by the Jackson Smile Module.
Converting a service (or client) to use Smile is very easy: just create an `ObjectMapper` that uses a `SmileFactory`. The potential challenge is that such a change is visible to clients; this may or may not be a problem (depending on whether the content format can be auto-negotiated, as is done with JAX-RS). But it is a visible change either way.
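The conversion really is just one constructor argument; a minimal sketch (assumes the `jackson-dataformat-smile` module is on the classpath, and uses a hypothetical `Point` class):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.smile.SmileFactory;

class Point { // hypothetical value class for illustration
    public int x, y;
}

public class SmileExample {
    public static void main(String[] args) throws Exception {
        // Same ObjectMapper API as always; only the underlying factory changes
        ObjectMapper smileMapper = new ObjectMapper(new SmileFactory());

        Point p = new Point();
        p.x = 27;
        p.y = 15;

        byte[] encoded = smileMapper.writeValueAsBytes(p); // binary, not text
        Point back = smileMapper.readValue(encoded, Point.class);
    }
}
```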
Use of a binary format may be problematic more generally, as well; dealing with binary formats is very difficult from Javascript (and this is true for ALL binary formats, including `protobuf` and `thrift`) -- and for Javascript specifically, it is SLOWER than handling JSON -- but it may also be problematic from languages that do not yet have a Smile codec available. Currently Smile support is provided by the `libsmile` library written in C (in addition to the standard Java implementation).
Finally, debugging binary formats is more difficult than debugging textual data formats, as some kind of reader will be needed.
Performance improvements from using Smile are similar to using Streaming API (30 - 50% improvement), but an additional bonus is that size of the data will decrease as well; typically by similar amount (30-50%).
Note that performance improvements are more significant with redundant data like streams of similar Objects ("big data", such as Map/Reduce data streams); this is because Smile can use back-references to all but eliminate repeating property names and short String values (like enumerated values).
Finally, note that as with JSON, you can also choose between Streaming API and databinding when using Smile as the underlying format. Doing this will combine performance benefits.
As a very new option (to be included in the soon-to-be-released Jackson 2.1), it will be possible to change the actual JSON structure used for serializing Java Objects.
For example, consider the case of a hypothetical `Point` class:
```java
public class Point {
    public int x, y;
}
```
which would typically be serialized as:
{"x":27, "y":15}
However, if one declares it as:
```java
@JsonFormat(shape=JsonFormat.Shape.ARRAY)
@JsonPropertyOrder(alphabetic=true)
public class Point {
    public int x, y;
}
```
we would instead get:
[27,15]
which basically just eliminates property names by using positional values for indicating which property value is stored where. This can lead to significant compaction of the serialized JSON content; and this translates quite directly to performance. It is also worth noting that this works equally well for "simple" non-repeating data (like request/response messages), as property names are simply eliminated.
As with Smile, this change is directly visible to the client, and either requires that the client use Jackson, or that it implement similar functionality. Nonetheless, this format is slightly easier to read (or at least debug) and to process with scripting languages.
Since this feature is brand new, it has not been extensively performance tested, but initial results suggest that it can achieve improvements similar to use of Smile or hand-written Streaming API based converters. And this feature can be combined with the Smile format as well.
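Putting the annotations together, a round trip might look like this (a sketch, assuming Jackson 2.1+; with alphabetic ordering, position 0 is `x` and position 1 is `y`):

```java
import com.fasterxml.jackson.annotation.JsonFormat;
import com.fasterxml.jackson.annotation.JsonPropertyOrder;
import com.fasterxml.jackson.databind.ObjectMapper;

@JsonFormat(shape = JsonFormat.Shape.ARRAY)
@JsonPropertyOrder(alphabetic = true)
class Point {
    public int x, y;
}

public class PojoAsArray {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        Point p = new Point();
        p.x = 27;
        p.y = 15;

        // Serializes as a positional array instead of an object
        String json = mapper.writeValueAsString(p); // "[27,15]"

        // Deserialization uses the same positional convention
        Point back = mapper.readValue("[27,15]", Point.class);
    }
}
```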
After going through a couple of compromises (easy OR compatible), there is one approach that is both (yay!): the Jackson Afterburner Module.
What the Afterburner module does is optimize the underlying serializers and deserializers by:
- Using byte code generation to replace Java Reflection calls (used for field and method access and constructor calls) with actual byte code -- similar to how one would write explicit field accesses and method calls in Java code
- Inlining the handling of a small set of basic types (`String`, `int`, `long` -- possibly more in future), so that if the default serializer/deserializer is used, calls are replaced by equivalent standard handling (which eliminates a couple of method calls, and possible argument/return value boxing)
- Speculative "match/parse" of ordered property names, using special matching calls in `JsonParser` -- this can eliminate symbol table lookups if field names are serialized in the order Jackson expects (which may be indicated by use of `@JsonPropertyOrder`)
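Enabling Afterburner takes a single module registration (assumes the `jackson-module-afterburner` dependency is on the classpath):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.afterburner.AfterburnerModule;

public class Mappers {
    static ObjectMapper createMapper() {
        ObjectMapper mapper = new ObjectMapper();
        // Replaces Reflection-based property access with generated byte code;
        // no other code changes, and the JSON produced is unchanged
        mapper.registerModule(new AfterburnerModule());
        return mapper;
    }
}
```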
Since these optimizations more or less mimic the more efficient patterns used by "hand-written" converters (i.e. our first option, use of the Streaming API), performance improvements could theoretically reach the level of such converters. In practice we have observed improvements in the 60-70% range of this maximum (that is, Afterburner can eliminate about 2/3 of the overhead that standard databinding has over hand-written alternatives).
Approaches discussed so far have different levels of maturity, and this may affect your choices:
- Streaming API - based converters ("hand-written"): Streaming API has been available since the first Jackson release
- Smile format: First introduced in Jackson 1.6, very stable, both format and parser/generator implementations
- Significant amount of real heavy production use by projects like Elastic Search
- Afterburner: Has been available since Jackson 1.8 -- not experimental, but has not been used as heavily as Smile.
- POJO-as-array: Experimental, included in Jackson 2.1; work in progress
But do you need to choose just one approach? Absolutely not!
In fact, bundling can save you big here: you can combine most of the approaches. Specifically:
- Choice of Smile over JSON is compatible with all the other choices and can vary independently.
- Choices of "POJO-as-array" and Afterburner are compatible with all choices other than hand-written Streaming API converters
So, you could consider combinations like:
- Use the Smile format, but write your code using the Streaming API: this is what some frameworks (like Elastic Search) do
- Use Afterburner with "POJO-as-Array", with either regular JSON or Smile, if you are after maximal performance.
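The second combination is just the two earlier snippets composed (assumes both the Smile and Afterburner modules are on the classpath; "POJO-as-Array" is added per class via `@JsonFormat`):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.smile.SmileFactory;
import com.fasterxml.jackson.module.afterburner.AfterburnerModule;

public class FastMapper {
    static ObjectMapper create() {
        // Smile for a compact binary encoding...
        ObjectMapper mapper = new ObjectMapper(new SmileFactory());
        // ...plus Afterburner for byte-code-generated (de)serializers.
        // Value classes can additionally be annotated with
        // @JsonFormat(shape = JsonFormat.Shape.ARRAY) for POJO-as-Array.
        mapper.registerModule(new AfterburnerModule());
        return mapper;
    }
}
```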
With "extreme" combinations such as those listed above, use of plain old JSON can meet or exceed the performance of fast binary formats such as `protobuf`, `thrift` or `avro`.
And with Smile, both processing speed and data sizes can exceed those alternatives (as small, even faster!).
Although not all combinations discussed above are included, the JVM Serializers benchmark can give some idea of the improvements, as it includes results for JSON/Smile and Streaming-API/databind/Afterburner combinations.