Akka

Lightbend Telemetry can capture data for the following Akka-related features.

Cinnamon Akka module dependency

After adding the Cinnamon Agent as described in the setup, make sure that you add the Cinnamon Akka module dependency to your build file:

sbt
libraryDependencies += Cinnamon.library.cinnamonAkka
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-akka_2.11</artifactId>
  <version>2.5.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-akka_2.11', version: '2.5.2'
}

Actor metrics

The following metrics are recorded for instrumented actors, with the type of metric in parentheses:

  • Running actors (counter) — the number of running actors (of an actor class or group).

  • Mailbox size (counter) — statistics for actor mailbox sizes.

  • Stash size (counter) — statistics for actor stash sizes.

  • Mailbox time (recorder) — statistics for the time that messages are in the mailbox.

  • Processed messages (rate) — the number of messages that actors have processed in the selected time frame.

  • Processing time (recorder) — statistics for the processing time of actors.

  • Sent messages (rate) — statistics for the number of sent messages per actor.

All time-related metrics use nanoseconds unless specified otherwise.

Router metrics

The following router metrics are available:

  • Processed messages (rate) — the number of messages that routers have processed in the selected time frame.

  • Processing time (recorder) — statistics for the processing time of the router logic.

Note: Router metrics are only available for router actors, i.e. they are not available when routers are used directly.

Note: Use the setting routers = off to prevent router metrics from being created; see router exclude settings.
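
As a minimal sketch, the routers setting would sit inside an actor selection in the configuration. The "/user/*" selection and the report-by value are illustrative assumptions, and the exact placement should be checked against the router exclude settings documentation.

```hocon
cinnamon.akka {
  actors {
    # Hypothetical selection: all actors under /user
    "/user/*" {
      report-by = class
      # Do not create router metrics for this selection
      routers = off
    }
  }
}
```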

Actor remote metrics

The following remote metrics are recorded for instrumented actors, with the type of metric in parentheses:

  • Sent messages (rate) — statistics for the number of sent remote messages.

  • Sent message size (bytes) — statistics for remote sent message sizes.

  • Serialization time (recorder) — statistics for the time that serialization takes.

  • Received messages (rate) — statistics for the number of received remote messages.

  • Received message size (bytes) — statistics for remote received message sizes.

  • Deserialization time (recorder) — statistics for the time that deserialization takes.

  • Node quarantine (event) — node quarantine event information.

  • Phi accrual value (gauge) — statistics for the Phi accrual failure detector. A Phi value represents the connection between two nodes: self and remote. A self node can have a connection to any number of remote nodes, and each connection has its own Phi value. Internally in Akka the Phi accrual value can become Double.Infinity. If this happens, Cinnamon converts the value to 1024*1024, because most visualizers cannot handle infinity. A value of 1048576 (1024*1024) therefore means that the Phi value has reached infinity.

  • Phi accrual threshold value (gauge) — the configured Phi accrual threshold value.

All time-related metrics use nanoseconds unless specified otherwise.

Note: Timing of serialization/deserialization is turned off by default. To enable it, you need to add this setting to your configuration: “cinnamon.akka.remote.serialization-timing = on”

Note: Phi accrual metrics and node quarantine events are turned off by default. To enable them, you need to add this setting to your configuration: “cinnamon.akka.remote.failure-detector-metrics = on”
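
The two settings named in the notes above can also be written as one nested block, since HOCON treats dotted paths and nested objects as equivalent. This is only a sketch of how they might appear together in application.conf; enable only the ones you need.

```hocon
cinnamon.akka.remote {
  # Record serialization and deserialization time (off by default)
  serialization-timing = on
  # Record Phi accrual metrics and node quarantine events (off by default)
  failure-detector-metrics = on
}
```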

Actor selection

Actor configuration supports selecting and grouping actors for instrumentation by actor class, package, subtree, or instance, so that telemetry and metric aggregation can be tailored to the application. Details on how to configure actor telemetry can be found under actor configuration.
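
A rough sketch of what such a selection might look like follows. The actor class name and subtree path are hypothetical, and the actors and report-by keys are assumptions here; the actor configuration page has the authoritative syntax.

```hocon
cinnamon.akka {
  actors {
    # Select one actor class (hypothetical name) and aggregate metrics per class
    "com.example.OrderActor" {
      report-by = class
    }
    # Select a subtree (hypothetical path) and report each actor instance separately
    "/user/workers/*" {
      report-by = instance
    }
  }
}
```

Reporting by instance produces one set of metrics per actor, so it is usually better suited to a small number of long-lived actors.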

Actor events

Out-of-the-ordinary events are automatically recorded for instrumented actors. Events may trigger a debug snapshot when using the OverOps plugin. Actor events include:

  • Failures — when an actor fails, catches an exception, or logs an error.

  • Unhandled messages — when an actor does not handle a message sent to it.

  • Dead letters — when a message is sent to an actor that no longer exists.

Cluster events

These are the types of Akka clustering events that Cinnamon observes:

  • Domain events — cluster domain events like leader changed, role leader changed or cluster shutting down.

  • Member events — cluster member events like member up, unreachable, reachable, exited or removed.

  • Singleton events — cluster singleton events with information about node, actor singleton class and name.

  • Shard region started event — cluster shard region started event with information about actor, node, type name and type of shard region (normal/proxy).

  • Shard region stopped event — cluster shard region stopped event with information about actor, node, type name and type of shard region (normal/proxy).

Note: Cluster events are turned off by default. To enable them, you need to add these settings to your configuration: “cinnamon.akka.cluster.domain-events = on”, “cinnamon.akka.cluster.member-events = on”, “cinnamon.akka.cluster.singleton-events = on” and/or “cinnamon.akka.cluster.shard-region-info = on”
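
Written as a nested block, the cluster event settings listed in the note might look like this. All four are shown enabled for illustration; each can be switched on independently.

```hocon
cinnamon.akka.cluster {
  # Leader changed, role leader changed, cluster shutting down
  domain-events = on
  # Member up, unreachable, reachable, exited, removed
  member-events = on
  # Cluster singleton events
  singleton-events = on
  # Shard region started and stopped events
  shard-region-info = on
}
```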

Cluster metrics

The following cluster metrics are recorded, with the type of metric in parentheses:

  • Shard region delivered messages (rate) — statistics for the number of messages that have been delivered by the shard region actor (regardless of where the shard resides).

Note: Cluster-related metrics are turned off by default. To enable them, you need to add this setting to your configuration: “cinnamon.akka.cluster.shard-region-info = on”

Split Brain Resolver events

Running Split Brain Resolver (SBR), a plugin available with the Lightbend Production Suite, makes your cluster more resilient. If you have SBR running, Lightbend Telemetry can keep track of its activity. If there is a split in your cluster, events are created with information about the node running SBR, the decision, and the available and unreachable nodes.

Note: Split Brain Resolver events are turned off by default. To enable them, you need to add this setting to your configuration: “cinnamon.akka.cluster.split-brain-resolver-events = on”
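
In a configuration file that already has a cinnamon.akka.cluster block, the setting from the note could be added there, for example:

```hocon
cinnamon.akka.cluster {
  # Create events when the Split Brain Resolver makes a decision (off by default)
  split-brain-resolver-events = on
}
```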

Threshold events

Thresholds can be specified for some of the metrics. If a threshold is exceeded, an event is fired, which also triggers an OverOps debug snapshot. Alerts and integration with notification systems are available in OverOps. Thresholds are supported for:

  • Mailbox size — mailbox queue grows too large.

  • Stash size — stash queue grows too large.

  • Mailbox time — message has been in the mailbox for too long.

  • Processing time — message processing takes too long.

  • Remote large message sent — a message larger than the threshold has been sent.

  • Remote large message received — a message larger than the threshold has been received.

For more information see metric thresholds configuration.
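
As an illustration, actor thresholds might be declared inside an actor selection along these lines. The package selection, the key names, and all values are assumptions for the sketch; the remote large message thresholds are left out because their setting names are not given here. The metric thresholds configuration page is the reference for the actual keys.

```hocon
cinnamon.akka {
  actors {
    # Hypothetical selection: all actors in the com.example package
    "com.example.*" {
      report-by = class
      thresholds {
        # Fire an event when the mailbox holds more than 1000 messages
        mailbox-size = 1000
        # Fire an event when the stash holds more than 50 messages
        stash-size = 50
        # Fire an event when a message waits in the mailbox longer than 3 seconds
        mailbox-time = 3s
        # Fire an event when processing a message takes longer than 500 ms
        processing-time = 500ms
      }
    }
  }
}
```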

Stopwatch

Stopwatch provides a timer that follows asynchronous flows. A Stopwatch can be started in one actor and then flow through to others via message sends. You can use it to gather time metrics for “hot paths” within message flows that cross multiple actors. Intervals are marked programmatically with start and stop points within the application using an Akka extension Stopwatch API. Time metrics are recorded for Stopwatches and threshold events can be configured. For more details see the Stopwatch extension.

Dispatcher metrics

The following metrics can be recorded for instrumented dispatchers, with the type of metric in parentheses:

Basic metrics

These are metrics that are built into the standard ForkJoinPool and ThreadPool ExecutorService implementations in Java and Scala. They are polled periodically by the instrumentation.

ForkJoinPool

  • Parallelism — the parallelism setting

  • Pool size (counter) — the current size of the thread pool

  • Active threads (counter) — an estimate of the number of threads running or stealing tasks

  • Running threads (counter) — an estimate of the number of threads not blocked in managed synchronization

  • Queued tasks (counter) — an estimate of the total number of tasks currently in queues

ThreadPool

  • Core pool size (counter) — the minimum size of the thread pool

  • Max pool size (counter) — the maximum size of the thread pool

  • Pool size (counter) — the current size of the thread pool

  • Active threads (counter) — an estimate of the number of threads running tasks

  • Processed tasks (counter) — an estimate of the number of processed tasks

Time metrics

Additional detailed time metrics for dispatchers.

  • Queue size (counter) — the number of tasks waiting to be processed

  • Queue time (recorder) — statistics for how long tasks are in the queue

  • Processing (counter) — how many tasks are being processed right now

  • Processing time (recorder) — statistics for how long the processing takes

All time-related metrics use nanoseconds unless specified otherwise.

Dispatcher selection

Dispatcher configuration supports selecting which dispatchers should be instrumented, and what type of instrumentation should be performed for them, so that telemetry can be tailored to the application. Details on how to configure dispatcher telemetry can be found under dispatcher configuration.
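
A sketch of how the two kinds of dispatcher instrumentation might be selected is shown below. The basic-information and time-information keys and the dispatcher name are assumptions here; consult the dispatcher configuration page for the supported options.

```hocon
cinnamon.akka {
  dispatchers {
    # Polled basic metrics (pool size, active threads, ...) for every dispatcher
    basic-information {
      names = ["*"]
    }
    # Detailed queue and processing time metrics for one hypothetical dispatcher
    time-information {
      names = ["my-blocking-dispatcher"]
    }
  }
}
```

Limiting time metrics to selected dispatchers keeps the more detailed instrumentation on the dispatchers where queueing delays actually matter.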

Detailed information

For specific information on how to configure actors and dispatchers see: