Realizing the software of the future...today

Get the code from Bitbucket. If you find this useful, donate to the cause.

Q2 2023 Updates for Java Serialization comparison

When you really can't let a completed project go...

By Nico Vaidyanathan Hidalgo

I completed my investigation of existing object serialization in Java, wrote Loial, and published detailed technical reports comparing available options. Yet, like a dog sniffing the bone after eating the meat, I smelled a few nagging questions...

  1. How does Loial compare to some of the other libraries I discovered and omitted?
  2. Could changes in library versions--such as the release of Gson 2.10 or Jackson 2.14--affect results?
  3. Wouldn't it be cool to automate some of the comparison generation process to quickly re-run analysis with new libraries and/or versions?

This eventually led me to integrate more libraries into the serialization framework performance benchmark. I decided to defer the full architectural/leanness analysis unless splashy new outliers (such as something that outperformed Loial) demanded attention. I added the following:

It strikes me that it might be fairer to group results by wire format type--such as binary/custom, JSON, YAML. It's possible that certain inherent properties of the wire formats lead to similar performance among competitors within that space. I'm also still thinking about writing code generation plugins for Loial to generate JSON and YAML efficiently. I continue to defer these investigations in favor of other initiatives, but they're worth noting.
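To make the grouping idea concrete, here is a minimal sketch of how results could be bucketed by wire format before ranking. It is not part of the benchmark or of Riff-Raff: the class, enum, and record names are hypothetical, and the scores in `main` are made-up placeholders, not measurements.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical sketch: rank libraries only against others that share a wire format. */
final class WireFormatGroupingSketch {

    /** Wire format categories named in the text. */
    enum WireFormat { BINARY_OR_CUSTOM, JSON, YAML }

    /** One library's throughput score; the field names are illustrative. */
    record Score(String library, WireFormat format, double opsPerMs) {}

    /** Sorts fastest-first, then groups; each group keeps that fastest-first order. */
    static Map<WireFormat, List<Score>> groupByFormat(List<Score> scores) {
        return scores.stream()
                .sorted(Comparator.comparingDouble(Score::opsPerMs).reversed())
                .collect(Collectors.groupingBy(Score::format, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        // Placeholder numbers for illustration only, not benchmark results.
        List<Score> scores = List.of(
                new Score("Jackson", WireFormat.JSON, 10.0),
                new Score("Gson", WireFormat.JSON, 12.0),
                new Score("Kryo", WireFormat.BINARY_OR_CUSTOM, 30.0));
        groupByFormat(scores).forEach((format, group) -> {
            System.out.println(format);
            group.forEach(s -> System.out.println("  " + s.library() + " " + s.opsPerMs()));
        });
    }
}
```

Formats with no entries (YAML in this toy input) simply do not appear in the map, so a chart generator could skip empty categories without special handling.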

Recognizing that manually parsing JMH console output and suffering further through Excel Hell would not scale, I started thinking about automation. I realized that the Gradle JMH plugin can emit CSV. In retrospect, it was absolutely silly not to choose that in the first place. I wrote Riff-Raff to turn JMH output into comparison graphs.
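For a sense of what consuming that CSV involves, here is a minimal sketch of reading JMH CSV rows and ranking them by score. It is not Riff-Raff's code: the class and method names are hypothetical, the benchmark names in `main` are invented, and the column headers ("Benchmark", "Mode", "Score", "Score Error (99.9%)", "Unit") are what I assume a typical JMH CSV report contains, so check them against your own output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch (not Riff-Raff): parse JMH CSV lines and rank them by score. */
final class JmhCsvSketch {

    /** One benchmark row; the field names are illustrative. */
    record Result(String benchmark, String mode, double score, double error, String unit) {}

    /** Splits a line on commas that sit outside double quotes and drops the quotes. */
    static List<String> splitCsvLine(String line) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean quoted = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                quoted = !quoted;
            } else if (c == ',' && !quoted) {
                cells.add(cell.toString());
                cell.setLength(0);
            } else {
                cell.append(c);
            }
        }
        cells.add(cell.toString());
        return cells;
    }

    /** Returns the index of the first header cell starting with the prefix, or -1. */
    static int column(List<String> header, String prefix) {
        for (int i = 0; i < header.size(); i++) {
            if (header.get(i).startsWith(prefix)) return i;
        }
        return -1;
    }

    /** Parses a header line plus data lines, locating columns by name, not position. */
    static List<Result> parse(List<String> lines) {
        List<String> header = splitCsvLine(lines.get(0));
        int name = header.indexOf("Benchmark");
        int mode = header.indexOf("Mode");
        int score = header.indexOf("Score");
        int error = column(header, "Score Error"); // assumed header prefix
        int unit = header.indexOf("Unit");
        List<Result> results = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            if (line.isBlank()) continue;
            List<String> cells = splitCsvLine(line);
            double err = error < 0 ? Double.NaN : Double.parseDouble(cells.get(error));
            results.add(new Result(cells.get(name), cells.get(mode),
                    Double.parseDouble(cells.get(score)), err, cells.get(unit)));
        }
        return results;
    }

    /** Orders results from highest to lowest score (best first for throughput mode). */
    static List<Result> rankByScore(List<Result> results) {
        return results.stream()
                .sorted(Comparator.comparingDouble(Result::score).reversed())
                .toList();
    }

    public static void main(String[] args) {
        // Invented rows for illustration only, not benchmark results.
        List<String> lines = List.of(
                "\"Benchmark\",\"Mode\",\"Threads\",\"Samples\",\"Score\",\"Score Error (99.9%)\",\"Unit\"",
                "\"demo.LibraryA.serialize\",\"thrpt\",1,5,100.5,2.0,\"ops/ms\"",
                "\"demo.LibraryB.serialize\",\"thrpt\",1,5,250.0,4.5,\"ops/ms\"");
        rankByScore(parse(lines)).forEach(System.out::println);
    }
}
```

Looking columns up by header name rather than position keeps the parser working if a report adds parameter columns at the end.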

Results

G1

Throughput

Mutable Single Boolean

data for MUTABLE_SINGLE_BOOLEAN

Mutable Three Fields

data for MUTABLE_THREE_FIELDS

Mutable Five Fields

data for MUTABLE_FIVE_FIELDS

Mutable Ten Fields

data for MUTABLE_TEN_FIELDS

Immutable Single Boolean (not a Record)

data for IMMUTABLE_SINGLE_BOOLEAN

Single Boolean Record

data for SINGLE_BOOLEAN_RECORD

Three Field Record

data for THREE_FIELD_RECORD

Discussion

Loial continues to outperform the competition by a significant margin. Alibaba's new FastJSON marks an interesting new entry in the serialization space by outperforming the popular Jackson library. Kryo still outperforms FastJSON on objects with five or more fields, and at a glance FastJSON's codebase doesn't strike me as containing any particularly interesting ideas worth borrowing, the way Kryo's use of Fibonacci Hashing was. But it is worth keeping an eye on.

Shenandoah

Throughput

Mutable Single Boolean

data for MUTABLE_SINGLE_BOOLEAN

Mutable Three Fields

data for MUTABLE_THREE_FIELDS

Mutable Five Fields

data for MUTABLE_FIVE_FIELDS

Mutable Ten Fields

data for MUTABLE_TEN_FIELDS

Immutable Single Boolean (not a Record)

data for IMMUTABLE_SINGLE_BOOLEAN

Single Boolean Record

data for SINGLE_BOOLEAN_RECORD

Three Field Record

data for THREE_FIELD_RECORD

Discussion

Overall throughput appears to suffer a bit on Shenandoah. The slight drop-off is practically insignificant, and the effect shrinks as the number of fields in the objects grows. The results are likely not statistically significantly different, but they do show that changing the GC algorithm from G1 to Shenandoah will not be a magic bullet for object serialization performance.
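One rough way to screen a claim like "likely not statistically significantly different" is to check whether the score ± error intervals that JMH reports for two runs overlap. The sketch below is a heuristic, not a formal significance test, and the names and numbers in it are hypothetical placeholders, not measurements from this benchmark.

```java
/** Hypothetical sketch: crude overlap check on two "score ± error" measurements. */
final class IntervalOverlapSketch {

    /** A score with its reported error margin; the field names are illustrative. */
    record Measurement(double score, double error) {
        double low() { return score - error; }
        double high() { return score + error; }
    }

    /** True when the two intervals share at least one point. */
    static boolean overlaps(Measurement a, Measurement b) {
        return a.low() <= b.high() && b.low() <= a.high();
    }

    public static void main(String[] args) {
        // Placeholder numbers for illustration only, not benchmark results.
        Measurement g1 = new Measurement(100.0, 5.0);
        Measurement shenandoah = new Measurement(97.0, 4.0);
        System.out.println("Intervals overlap: " + overlaps(g1, shenandoah));
    }
}
```

Overlapping intervals only mean this screen cannot separate the two runs. Intervals that do not overlap are a stronger hint that the difference is real.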

ZGC

Throughput

Mutable Single Boolean

data for MUTABLE_SINGLE_BOOLEAN

Mutable Three Fields

data for MUTABLE_THREE_FIELDS

Mutable Five Fields

data for MUTABLE_FIVE_FIELDS

Mutable Ten Fields

data for MUTABLE_TEN_FIELDS

Immutable Single Boolean (not a Record)

data for IMMUTABLE_SINGLE_BOOLEAN

Single Boolean Record

data for SINGLE_BOOLEAN_RECORD

Three Field Record

data for THREE_FIELD_RECORD

Discussion

All in all, the ZGC results look very similar to both Shenandoah and G1. This once again suggests that the garbage collection algorithm does not have a significant impact in this domain. Perhaps additional tweaks of parameters related to these algorithms would prove fruitful.

Conclusion

Generally, results are unchanged from the initial work in 2022. Loial performs significantly better than all other offerings. For those looking for different licensing or a more widely used, though less performant, library, Moshi and FastJSON are interesting new options. FastJSON, in particular, usually outperforms Jackson.

Nevertheless, this work benefits the community by providing a complete performance benchmarking solution. Developers can now use it to track and compare a wide range of Java serialization solutions, investigate performance differences between versions of those libraries, and use it as a target for performance optimization and verification. Riff-Raff itself is not tied to serialization library performance benchmarking. It is suitable for rapid visualization of JMH output results for any JMH performance investigation.

Want to see future work like this, including similar work with Java App Servers or HTTP servers? Donate!