Performance

It is difficult to give exact overhead numbers for instrumenting a system. In systems with high throughput and low processing time, measuring every small operation can make the footprint of instrumentation significant compared to the work performed. In a calculation-intensive system with longer processing times, the cost of recording metrics is amortized over the longer work time.

Observing a system takes time and work, so the more that is instrumented and observed, the higher the overhead. Lightbend Telemetry has flexible configuration for deciding which parts of the system are instrumented and at what granularity. Systems with low processing times, or systems with temporary, short-lived actors, can be configured accordingly. Selecting the right parts of a system to instrument allows you to control the overhead of the telemetry.

Lightbend Telemetry performance overhead

In our performance tests, the time overhead for measuring a single message send and the processing of that message, collecting all relevant metrics [1], is up to two microseconds on standard server hardware [2]. In the overall running of a realistic sample application [3], this leads to a 3% overhead on throughput when all actors have telemetry enabled, including the reporting of metrics to a monitoring solution. Another application will not necessarily see the same overhead, since it depends on the system architecture, the amount of communication and work time, and the telemetry configuration.
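To make the relationship between per-message overhead and work time concrete, here is a minimal sketch of the arithmetic. It assumes a deliberately simplified model in which messages are processed one after another, so throughput is the inverse of the time per message. The function name `throughput_reduction` and the sample work times are hypothetical. Only the two-microsecond upper bound comes from the measurement above. This is not how the 3% figure was measured.

```python
def throughput_reduction(work_us: float, overhead_us: float = 2.0) -> float:
    """Return the fraction of throughput lost to a fixed per-message overhead.

    Simplified model (an assumption, not the benchmark's method): messages
    are handled sequentially, so throughput is 1 / (time per message).
    Without telemetry a message takes work_us; with telemetry it takes
    work_us + overhead_us.
    """
    if work_us < 0 or overhead_us < 0:
        raise ValueError("times must be non-negative")
    total_us = work_us + overhead_us
    if total_us == 0:
        return 0.0
    # (1/work - 1/total) / (1/work) simplifies to overhead / total.
    return overhead_us / total_us


if __name__ == "__main__":
    # Hypothetical per-message work times, in microseconds.
    for work_us in (1.0, 10.0, 100.0, 1000.0):
        loss = throughput_reduction(work_us)
        print(f"work {work_us:7.1f} us -> throughput reduction {loss:.1%}")
```

Under this model the same fixed overhead dominates when messages take about a microsecond to process and becomes negligible once processing takes hundreds of microseconds, which is why the telemetry configuration matters most for systems with short processing times.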

[1] Mailbox size, message time in mailbox, message processing time
[2] Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
[3] An application that analyzes documents