Cloudflow (formerly Pipelines) Kafka
These monitors are used by Lightbend Pipelines to track the health of Kafka clients. They are not meant for monitoring Kafka workloads other than Lightbend Pipelines applications. The metrics come from the Kafka Java Consumer and Kafka Java Producer libraries, via jmx_exporter, and describe throughput for message handling. More details are available in the Lightbend Pipelines documentation.
pipelines_kafka_consumer_throughput
Kafka consumers take batches of messages from brokers, process them, then take the next batch in a continuous loop. Throughput for a consumer is the average number of messages read per second. The pipelines_kafka_consumer_throughput monitor warns if that throughput is unusual compared to previous throughput. This warning occurs if throughput rises or drops more than three standard deviations from the average throughput, which by the empirical rule means roughly a 99.7% chance that something unusual is happening.
The Kafka Java Consumer provides throughput per partition as kafka_consumer_consumer_fetch_manager_metrics_records_consumed_rate, which Console aggregates per topic into kafka_consumer_topic_consumed_rate. That total throughput per topic is the input to this monitor.
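The sketch below illustrates the two steps just described: summing per-partition rates into a per-topic rate, and flagging a sample that falls more than three standard deviations from the mean of recent samples. It is a minimal illustration only, not Console's actual implementation; the object `ThroughputCheck`, its methods, and all sample values are hypothetical.

```scala
// Hypothetical sketch of the consumer-throughput check; Console's real logic may differ.
object ThroughputCheck {

  // Per-topic throughput is the sum of the per-partition rates reported by
  // kafka_consumer_consumer_fetch_manager_metrics_records_consumed_rate.
  def topicRate(partitionRates: Seq[Double]): Double =
    partitionRates.sum

  // Flag a sample that is more than three standard deviations away from the
  // mean of recent throughput samples (the three-sigma / empirical rule).
  def isAnomalous(history: Seq[Double], current: Double): Boolean = {
    if (history.size < 2) false
    else {
      val mean   = history.sum / history.size
      val stdDev = math.sqrt(history.map(x => math.pow(x - mean, 2)).sum / history.size)
      stdDev > 0 && math.abs(current - mean) > 3 * stdDev
    }
  }
}
```

For instance, if recent per-topic rates hover around 1,000 messages per second with a small spread, a sudden reading near zero falls far outside the three-sigma band and `isAnomalous` returns true, which corresponds to the warning this monitor raises.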
pipelines_kafka_producer_throughput
Kafka producers send batches of messages to brokers. Throughput for a producer is the average number of messages sent per second. The pipelines_kafka_producer_throughput monitor warns if that throughput is anomalous compared to average throughput, in a similar manner to the pipelines_kafka_consumer_throughput monitor.
The Kafka Java Producer provides throughput per partition as kafka_producer_producer_metrics_record_send_rate, which Console aggregates per topic into kafka_producer_topic_send_rate for the input to this monitor.
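Because the producer check mirrors the consumer one, the hypothetical `ThroughputCheck` sketch above applies unchanged; only the input series differs. The values below are made-up samples, not real metrics.

```scala
// Hypothetical reuse of the ThroughputCheck sketch, fed with per-topic send
// rates aggregated from kafka_producer_producer_metrics_record_send_rate.
val sendRateHistory: Seq[Double] = Seq(1200.0, 1180.0, 1215.0, 1190.0) // recent kafka_producer_topic_send_rate samples
val latestSendRate: Double       = 450.0                               // sudden drop in send rate

val unusual = ThroughputCheck.isAnomalous(sendRateHistory, latestSendRate) // true: well beyond three sigma
```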
pipelines_kafka_consumer_lag
Kafka consumers read messages from a broker after some producer has written those messages. “Lag” is the number of messages between a producer’s latest message and what the consumer is currently reading. So for example, if a producer has written 100 messages, but a consumer has only read 10 messages so far, lag would be 90 messages.
The pipelines_kafka_consumer_lag monitor alerts if that lag trends upward over time, meaning the consumer is not keeping up with the producer.
The Kafka Java Consumer provides lag per partition as kafka_consumer_consumer_fetch_manager_metrics_records_lag_max, which Console aggregates per topic and client instance into kafka_consumer_topic_lag_max for the input to this monitor.
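One simple way to express "lag trends upward over time" is a least-squares slope over a window of lag samples, as sketched below. This is an assumption for illustration; Console's actual trend test is not spelled out here, and the object `LagTrend` and its methods are hypothetical.

```scala
// Hypothetical upward-trend test on kafka_consumer_topic_lag_max samples.
object LagTrend {

  // Least-squares slope of lag values observed at evenly spaced intervals.
  // A persistently positive slope means lag keeps growing, i.e. the consumer
  // is falling behind the producer.
  def slope(lagSamples: Seq[Double]): Double = {
    val n     = lagSamples.size.toDouble
    val xs    = lagSamples.indices.map(_.toDouble)
    val xMean = xs.sum / n
    val yMean = lagSamples.sum / n
    val num   = xs.zip(lagSamples).map { case (x, y) => (x - xMean) * (y - yMean) }.sum
    val den   = xs.map(x => math.pow(x - xMean, 2)).sum
    if (den == 0) 0.0 else num / den
  }

  def trendingUp(lagSamples: Seq[Double], threshold: Double = 0.0): Boolean =
    lagSamples.size >= 2 && slope(lagSamples) > threshold
}
```

For a lag series such as 10, 25, 60, 140 messages, the slope is clearly positive and `trendingUp` returns true, matching the situation in which this monitor alerts.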