Measuring GC latencies in Haskell, OCaml, Racket
James Fisher has a blog post on a case where GHC’s runtime system imposed unpleasant latencies on their Haskell program:
Low latency, large working set, and GHC’s garbage collector: pick two of three
The blog post proposes a very simple, synthetic benchmark that exhibits the issue — basically, latencies incurred by copy time — with latencies of 50ms that are considered excessive. I thought it would be amusing to reproduce the synthetic benchmark in OCaml and Racket, to see how other GCs handle this.
Without further ado, the main takeaways are as follows: the OCaml GC has no issue with large objects in its old generation, as it uses mark-and-sweep instead of copying collection, and exhibits worst-case pauses of less than 3ms on this benchmark.
The Racket GC also does not copy the old generation, but its incremental GC is still in its infancy (compared to the throughput-oriented settings, which work well), so the results are not as good. It currently suffers from a “ramp-up” effect, which I will describe, that causes large pauses at the beginning of the benchmark (up to 120ms latency), but in its steady state the longest pauses are around 22ms.
Please keep in mind that the original benchmark is designed to exercise a very specific workload that triggers worst-case behavior in GHC’s garbage collector. This does not mean that GHC’s latencies are bad in general, or that the other tested languages have smaller latencies in general.
The implementations I use, with a Makefile encapsulating the logic for running and analyzing them, are available in a GitLab repository: