Improving Garbage Collection Observability with Performance Tracing
Debugging garbage collectors for performance and correctness is notoriously difficult. Among the arsenal of tools available to systems engineers, support for one of the most powerful, tracing, is lacking in most garbage collectors. Instead, engineers must rely on counting, sampling, and logging. Counting and sampling are limited to statistical analyses while logging is limited to hard-wired metrics. This results in cognitive friction, curtailing innovation and optimization.
We demonstrate that tracing is well suited to GC performance debugging. We leverage the modular design of MMTk to deliver a powerful VM and collector-neutral tool.
We find that tracing allows: cheap insertion of tracepoints—just 14 lines of code and no measurable run-time overhead, decoupling of the declaration of tracepoints from tracing logic, high fidelity measurement able to detect subtle performance regressions, while also allowing interrogation of a running binary.
Our tools crisply highlight several classes of performance bug, such as poor scalability in multi-threaded GCs, and lock contention in the allocation sequence. These observations uncover optimization opportunities in collectors, and even reveal bugs in application programs.
We showcase tracing as a powerful tool for GC designers and practitioners. Tracing can uncover missed opportunities and lead to novel algorithms and new engineering practices.