OpenTracing

OpenTracing is an open standard for distributed tracing. Distributed tracing can be used for optimizing end-user latency (the trace gives a breakdown of where time has been spent in distributed requests), root-cause analysis for errors (errors can be annotated in the trace and show how other parts of a distributed system relate to an error), and understanding the bigger picture of the system (traces can give insight into the distinct pieces of a distributed system and how they are connected).

As an example, consider a simple message flow across actors: actor A sends messages to actors B and C (where C is running in a different actor system), and actor C sends a message to actor D. A possible trace for this message flow is described conceptually below.

A trace shows a dataflow or an execution path through a distributed system. Each span in the trace represents a logical unit of work. In the case of actors, each span represents the processing of a message by an actor. The duration of the span is recorded. Spans may be nested to model causal relationships, with spans referencing other spans, and for actor tracing these relationships are message sends. Events can be logged within a span.
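The relationship between spans can be pictured with a small data model. The following is a minimal sketch only: the `Span` case class, its fields, and the `causalPath` helper are hypothetical names chosen for illustration and are not part of the OpenTracing or Cinnamon APIs. It encodes the A, B, C, D message flow from the example, with each span pointing at the span whose message send caused it.

```scala
// Hypothetical data model for illustration only; not the OpenTracing or Cinnamon API.
object TraceSketch {
  // One logical unit of work: here, an actor processing one message.
  final case class Span(
      id: Int,
      operation: String,       // the actor that processed the message
      parentId: Option[Int],   // the span whose message send caused this one
      startMicros: Long,
      endMicros: Long,
      logs: List[String] = Nil // events logged within the span
  ) {
    def durationMicros: Long = endMicros - startMicros
  }

  // The message flow from the example: A sends to B and C, C sends to D.
  // Timestamps are made-up values.
  val trace: List[Span] = List(
    Span(1, "actor A", None, 0L, 40L, logs = List("sent message to B", "sent message to C")),
    Span(2, "actor B", Some(1), 50L, 80L),
    Span(3, "actor C", Some(1), 120L, 170L, logs = List("sent message to D")),
    Span(4, "actor D", Some(3), 180L, 200L)
  )

  // Follow parent references up to the root and return the path root-first.
  def causalPath(span: Span): List[String] = {
    val byId = trace.map(s => s.id -> s).toMap
    def loop(current: Span, acc: List[String]): List[String] =
      current.parentId.flatMap(byId.get) match {
        case Some(parent) => loop(parent, parent.operation :: acc)
        case None         => acc
      }
    loop(span, List(span.operation))
  }
}
```

Calling `TraceSketch.causalPath` on the span for actor D walks the references back through actor C to actor A, which is the kind of causal chain a tracing UI renders as nested spans.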

An actor trace shows the flow of messages, and records when messages were processed and how long it took to process each message. Message sends to other actors are logged within the trace span, as well as any actor events such as actor failures, unhandled messages, dead letters, or logged errors and warnings.

Actor configuration

Actors need to be enabled for tracing, similar to metrics and events. This is an extension of the actor configuration, with a traceable setting that can be enabled for any actor selection.

For example, actors can be selected by class or path and then enabled as traceable, such as in the following configuration:

cinnamon.akka {
  actors {
    "com.example.a.b.*" {
      report-by = class
      traceable = on
    }
    "/user/x/*" {
      report-by = class
      traceable = on
    }
  }
}

System messages

Akka system messages (special internal messages for managing actors) can also be traced. This is off by default, but can be enabled with this configuration:

cinnamon.opentracing {
  akka {
    trace-system-messages = on
  }
}

Akka HTTP configuration

Akka HTTP endpoints need to be enabled for tracing, similar to metrics. This is an extension of the Akka HTTP configuration, with a traceable setting that can be enabled for any endpoint selection.

For example, endpoint paths can be selected using a wildcard and then enabled as traceable, such as in the following configuration:

cinnamon.akka.http {
  clients {
    "*:*" {
      paths {
        "*" {
          metrics = on
          traceable = on
        }
      }
    }
  }
  servers {
    "*:*" {
      paths {
        "*" {
          metrics = on
          traceable = on
        }
      }
    }
  }
}

Akka HTTP internal actors

An Akka HTTP server also creates actors under the /user guardian. If you have enabled actor tracing with a /user/* selection, then internal Akka HTTP and Akka Streams actors will also appear in traces. To trace only application actors, you can select actors by package instead, or you can exclude the internal actor packages from a /user/* selection, as in the following configuration:

cinnamon.akka {
  actors {
    "/user/*" {
      excludes = ["akka.http.*", "akka.stream.*"]
      report-by = class
      traceable = on
    }
  }
}

Scala Futures

Actor traces will be automatically propagated through Scala Futures, but scheduled Futures or callbacks will not be automatically represented as trace spans. To enable tracing of Futures, there is a naming API to indicate Futures or Future callbacks that should be traced and to specify the trace span operation name.

For example, there is a named alternative to Future.apply which allows scheduled Futures to be traced:

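import scala.concurrent.Future
// Future.apply requires an implicit ExecutionContext; the global one is used here for brevity
import scala.concurrent.ExecutionContext.Implicits.global

// this Future is not traceable
val future = Future {
  "compute all the things"
}

import com.lightbend.cinnamon.scala.future.named._

// this Future is traceable and named "compute"
val tracedFuture = FutureNamed("compute") {
  "compute all the things"
}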

There are also named alternatives for the callback operations, which are added implicitly as extension methods on Future. For example, to name and trace a mapped transform operation between actors, the mapNamed method can be used in place of map:

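import akka.pattern.{ ask, pipe }
import akka.util.Timeout
import scala.concurrent.duration._
import com.lightbend.cinnamon.scala.future.named._

// tracedActor, message, and transform are placeholders for application code;
// an implicit ExecutionContext is assumed to be in scope
implicit val timeout: Timeout = Timeout(5.seconds) // required by ask (?)

val foo = tracedActor("foo")
val bar = tracedActor("bar")

// ask foo, which returns a Future of the reply
val future = foo ? message

// name the map operation so that it is recorded as a "transform" span
val transformed = future.mapNamed("transform") {
  value => transform(value)
}

// send the transformed result to bar
transformed pipeTo bar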

This transformation will then show up as its own trace span, between the actor spans for foo and bar.

Active span

OpenTracing supports recording logs to a trace span and also attaching baggage to the trace context — key:value string pairs which are propagated with the trace, similar to a logging MDC. Cinnamon provides access to the currently active span and there are utility methods for logging and attaching baggage to this span.

Span logs

Cinnamon includes utility methods for logging events or structured data to the currently active span.

The Cinnamon ActiveSpan API can be imported with:

import com.lightbend.cinnamon.opentracing.ActiveSpan

You can log an event to the active span:

ActiveSpan.log("something")

You can log structured data (a Java Map) to the active span:

ActiveSpan.log(ImmutableMap.of("a", "one", "b", "two"))
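The example above builds the Java Map with Guava's ImmutableMap. From Scala code it may be more convenient to start from a Scala Map and convert it. The sketch below shows only the conversion, using the standard library; the object and value names are hypothetical, and the final call is left as a comment because it assumes that ActiveSpan.log accepts any java.util.Map with String keys.

```scala
import scala.collection.JavaConverters._

object SpanLogFields {
  // Build the structured data as an ordinary Scala Map ...
  val fields: Map[String, String] = Map("a" -> "one", "b" -> "two")

  // ... and convert it to a java.util.Map view of the same entries.
  val javaFields: java.util.Map[String, String] = fields.asJava

  // With Cinnamon on the classpath, the log call would then be (assumption, see above):
  // ActiveSpan.log(javaFields)
}
```

`scala.collection.JavaConverters` is used here because it is available in Scala 2.11, matching the artifacts shown in this document. On Scala 2.13 and later it is deprecated in favour of `scala.jdk.CollectionConverters`, which provides the same `asJava` method.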

Trace baggage

Cinnamon includes utility methods for attaching baggage to the trace context — key:value string pairs which are propagated with the trace.

The Cinnamon ActiveSpan API can be imported with:

import com.lightbend.cinnamon.opentracing.ActiveSpan

A baggage item (a key:value string pair) can be attached to the current trace:

ActiveSpan.setBaggageItem("token", "abc123")

Baggage items can also be accessed from anywhere deeper in a trace:

ActiveSpan.getBaggageItem("token")

Note: Baggage items are transferred throughout the trace, both locally and remotely, which can introduce some extra overhead.
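To see why baggage adds overhead, it can help to picture the trace context as a small value that is copied along with every message or request. The following is a conceptual sketch only: `TraceContext`, `withBaggageItem`, `child`, and `baggageChars` are hypothetical names and do not reflect how Cinnamon or any OpenTracing tracer stores its context.

```scala
// Hypothetical model for illustration only; not the OpenTracing or Cinnamon API.
object BaggageSketch {
  // The part of a trace that travels with every message or request.
  final case class TraceContext(traceId: Long, spanId: Long, baggage: Map[String, String]) {
    // Attaching an item yields a context that carries the extra pair from here on.
    def withBaggageItem(key: String, value: String): TraceContext =
      copy(baggage = baggage + (key -> value))

    // A child span gets its own span id but inherits all baggage.
    def child(newSpanId: Long): TraceContext = copy(spanId = newSpanId)

    // Rough amount of text that has to be sent along on each hop.
    def baggageChars: Int = baggage.map { case (k, v) => k.length + v.length }.sum
  }

  val root: TraceContext = TraceContext(traceId = 1L, spanId = 1L, baggage = Map.empty)
  val tagged: TraceContext = root.withBaggageItem("token", "abc123")

  // Two hops further down the trace, the item is still available.
  val downstream: TraceContext = tagged.child(newSpanId = 2L).child(newSpanId = 3L)
}
```

In this model every hop after the item is attached carries the extra characters, so small keys and values, and few of them, keep the cost low.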

Tracing configuration

The OpenTracing integrations for both Jaeger and Zipkin build on the Jaeger client. The tracer supports the configuration described below.

Setting a service name for each node is useful. The service name can be configured specifically for tracing using the service-name setting (see the example below); otherwise it is based on the application name from the shared Cinnamon metadata. You can use the cinnamon.application setting to configure the same name for both metrics and tracing.

Note: Tracing can produce a very high volume of data, so sampling is applied (at the beginning of a trace). The sampler used, and its settings, can be configured. The default sampler is a rate-limiting sampler that captures up to 10 traces per second.

On the Example tab, there is a configuration that sets the service-name to my-component and configures a rate-limiting sampler with a maximum of 25 traces per second:

Required

There is nothing to configure if you want to use the default OpenTracing settings, which use the rate-limiting sampler with up to 10 traces per second.

Example
cinnamon.opentracing {
  tracer {
    service-name = "my-component"

    sampler = rate-limiting-sampler

    rate-limiting-sampler {
      max-traces-per-second = 25
    }
  }
}
Reference
cinnamon.opentracing {
  tracer {

    # Service name for this application, defaults to main class when not set
    service-name = null

    # Trace sampler to use
    sampler = rate-limiting-sampler

    rate-limiting-sampler {
      # Maximum number of sampled traces per second
      max-traces-per-second = 10
    }

    probabilistic-sampler {
      # Probabilistic sampling rate, between 0.0 and 1.0
      sampling-rate = 0.001
    }

    const-sampler {
      # Constant decision on whether to sample traces
      # Note: this sampler is NOT recommended for production
      decision = true
    }

    # Log trace spans with SLF4J (can be used for debugging the tracer)
    # Set `cinnamon.opentracing.tracer.reporters += trace-logging`
    trace-logging {
      # Name of SLF4J logger to use when logging
      logger = "cinnamon.opentracing.Tracer"
    }

  }
}

Note: These settings are defined in the reference.conf. You only need to specify any of these settings when you want to override the defaults.
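The three samplers in the reference configuration differ in how they decide, at the start of a trace, whether that trace is recorded. The following sketch illustrates one way each decision could work. It is not the Jaeger client or Cinnamon implementation: the trait, the class names, and the token-bucket approach used for rate limiting are assumptions made for illustration, and the current time is passed in explicitly so that the behaviour is easy to follow.

```scala
// Hypothetical sketches for illustration only; not the Jaeger client or Cinnamon implementation.
object SamplerSketch {
  trait Sampler {
    // Called once, when a new trace starts.
    def isSampled(nowNanos: Long): Boolean
  }

  // const-sampler: the same decision for every trace.
  final class ConstSampler(decision: Boolean) extends Sampler {
    def isSampled(nowNanos: Long): Boolean = decision
  }

  // probabilistic-sampler: keeps roughly `samplingRate` of all traces.
  final class ProbabilisticSampler(samplingRate: Double) extends Sampler {
    require(samplingRate >= 0.0 && samplingRate <= 1.0, "sampling rate must be between 0.0 and 1.0")
    private val random = new scala.util.Random()
    // nextDouble() is in [0.0, 1.0), so 0.0 never samples and 1.0 always samples.
    def isSampled(nowNanos: Long): Boolean = random.nextDouble() < samplingRate
  }

  // rate-limiting-sampler: a token bucket that refills at `maxTracesPerSecond`.
  final class RateLimitingSampler(maxTracesPerSecond: Double) extends Sampler {
    private val maxBalance: Double = math.max(maxTracesPerSecond, 1.0)
    private var balance: Double = maxBalance
    private var lastNanos: Option[Long] = None

    def isSampled(nowNanos: Long): Boolean = synchronized {
      // Refill credits for the time elapsed since the previous decision.
      val elapsedSeconds = lastNanos.fold(0.0)(last => (nowNanos - last) / 1e9)
      lastNanos = Some(nowNanos)
      balance = math.min(maxBalance, balance + elapsedSeconds * maxTracesPerSecond)
      // Spend one credit per sampled trace; without a full credit the trace is skipped.
      if (balance >= 1.0) { balance -= 1.0; true } else false
    }
  }
}
```

In this sketch a rate-limiting sampler bounds the volume of trace data regardless of load, whereas a probabilistic sampler produces data in proportion to traffic. That difference is one reason to prefer rate limiting as a default when traffic levels are not known in advance.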

Jaeger reporter

Jaeger is a distributed tracing system with support for OpenTracing.

Cinnamon Jaeger dependency

First make sure that your build is configured to use the Cinnamon Agent.

To enable the Jaeger reporter, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingJaeger
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-jaeger_2.11</artifactId>
  <version>2.5.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-jaeger_2.11', version: '2.5.2'
}

Jaeger configuration

Jaeger reporting can be configured. On the Example tab, there is a configuration that sets a different endpoint for the Jaeger agent by configuring the host and port settings:

Required

There is nothing to configure if you want to use the default Jaeger settings that will communicate with localhost on port 5775.

Example
cinnamon.opentracing {
  jaeger {
    host = "localhost"
    port = 5432
  }
}
Reference
cinnamon.opentracing {
  jaeger {

    # Host for Jaeger trace span collector
    host = "localhost"

    # UDP port for Jaeger trace span collector
    port = 5775

    # Max size for UDP packets
    max-packet-size = 65000

    # Flush interval for trace span reporter
    flush-interval = 1s

    # Max queue size of trace span reporter
    max-queue-size = 1000

  }
}

Note: These settings are defined in the reference.conf. You only need to specify any of these settings when you want to override the defaults.
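The flush-interval and max-queue-size settings suggest a reporter that buffers finished spans and sends them in batches. The sketch below shows one plausible reading of how two such settings interact. It is an assumption made for illustration, not the Jaeger client or Cinnamon implementation: `QueueingReporter`, its methods, and the policy of dropping spans when the queue is full are hypothetical, and the timer that would call `flush` once per interval is left out.

```scala
import scala.collection.mutable

// Hypothetical sketch for illustration only; not the Jaeger client or Cinnamon implementation.
object ReporterSketch {
  final class QueueingReporter(maxQueueSize: Int, send: List[String] => Unit) {
    private val queue = mutable.Queue.empty[String]
    private var dropped = 0

    // Called when a span finishes: enqueue it, or drop it if the queue is full.
    // Returns true if the span was accepted.
    def report(span: String): Boolean = synchronized {
      if (queue.size < maxQueueSize) { queue += span; true }
      else { dropped += 1; false }
    }

    // Would be called once per flush interval: send everything queued so far as one batch.
    def flush(): Unit = synchronized {
      if (queue.nonEmpty) {
        send(queue.toList)
        queue.clear()
      }
    }

    // Number of spans that were discarded because the queue was full.
    def droppedCount: Int = synchronized(dropped)
  }
}
```

Under this reading, a longer flush interval or a smaller queue makes it more likely that spans are discarded during bursts, while a larger queue trades memory for fewer lost spans.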

Running Jaeger

See the Jaeger documentation for running Jaeger. The Jaeger getting started shows how to run Jaeger locally for development and testing.

[Figure: an example actor trace in Jaeger]

Zipkin reporter

Zipkin is a distributed tracing system with support for OpenTracing.

Cinnamon Zipkin dependency

First make sure that your build is configured to use the Cinnamon Agent.

To enable the Zipkin reporter, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingZipkin
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-zipkin_2.11</artifactId>
  <version>2.5.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-zipkin_2.11', version: '2.5.2'
}

Zipkin configuration

The default Zipkin sender is the URL connection sender, which can be used for sending trace spans directly to the Zipkin API. This sender can be configured. On the Example tab there is a configuration that sets a different endpoint for the Zipkin trace span collector by configuring the endpoint setting:

Required

There is nothing to configure if you want to use the default Zipkin settings that will communicate with localhost on port 9411 using the URL connection sender.

Example
cinnamon.opentracing {
  zipkin {
    url-connection {
      endpoint = "http://my.zipkin.host:9411/api/v1/spans"
    }
  }
}
Reference
cinnamon.opentracing {
  zipkin {

    # Flush interval for trace span reporter
    flush-interval = 1s

    # Max queue size of trace span reporter
    max-queue-size = 1000

    # Zipkin sender to use for reporting trace spans
    sender = url-connection

    # URL connection sender for reporting directly to a Zipkin API endpoint
    url-connection {
      # POST URL for Zipkin's v1 api, usually "http://zipkinhost:9411/api/v1/spans"
      endpoint = "http://localhost:9411/api/v1/spans"

      # Encoding to use for trace spans (thrift or json)
      encoding = "thrift"

      # Timeout for establishing URL connection
      connect-timeout = 10s

      # Timeout for connection reads
      read-timeout = 60s

      # Whether GZIP compression is enabled
      compression = true

      # Maximum size of messages
      max-message-size = 5MiB
    }

  }
}

Note: These settings are defined in the reference.conf. You only need to specify any of these settings when you want to override the defaults.

See the following sections for configuring the Zipkin sender for Kafka or Scribe.

Zipkin Kafka sender

The Zipkin reporter can be configured to send trace spans to a Kafka topic. This sender supports Kafka 0.10.2+.

To enable the Zipkin Kafka sender, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingZipkinKafka
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-zipkin-kafka_2.11</artifactId>
  <version>2.5.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-zipkin-kafka_2.11', version: '2.5.2'
}

You can then configure the Zipkin reporter to use the Kafka sender. You must specify the Kafka bootstrap servers to use. You can also override any of the producer configs using the properties configuration section:

Required
cinnamon.opentracing {
  zipkin {
    sender = kafka

    kafka {
      bootstrap-servers = ["my.kafka.host1:9091", "my.kafka.host2:9091"]
    }
  }
}
Reference
cinnamon.opentracing {
  zipkin {
    kafka {
      # Initial set of kafka servers to connect to (must be specified)
      bootstrap-servers = []

      # Kafka topic to send trace spans to
      topic = "zipkin"

      # Encoding to use for trace spans (thrift or json)
      encoding = "thrift"

      # Property overrides for producer configs (http://kafka.apache.org/0102/documentation.html#producerconfigs)
      properties {}

      # Maximum size of messages
      max-message-size = 1MB
    }
  }
}

Note: These settings are defined in the reference.conf. You only need to specify any of these settings when you want to override the defaults.

Zipkin Scribe sender

The Zipkin reporter can be configured to send trace spans to Scribe.

To enable the Zipkin Scribe sender, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingZipkinScribe
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-zipkin-scribe_2.11</artifactId>
  <version>2.5.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-zipkin-scribe_2.11', version: '2.5.2'
}

You can then configure the Zipkin reporter to use the Scribe sender. On the Example tab there is a configuration that changes the Scribe endpoint using the host and port settings:

Required
cinnamon.opentracing {
  zipkin {
    sender = scribe
  }
}
Example
cinnamon.opentracing {
  zipkin {
    sender = scribe

    scribe {
      host = "my.scribe.host"
      port = 9410
    }
  }
}
Reference
cinnamon.opentracing {
  zipkin {
    scribe {
      # Host of Scribe trace collector
      host = "localhost"

      # Port of Scribe trace collector
      port = 9410

      # Timeout for socket reads
      socket-timeout = 60s

      # Timeout for connections
      connect-timeout = 10s

      # Maximum size of messages (scribe default is 16384000 bytes)
      max-message-size = 16000KiB
    }
  }
}

Note: These settings are defined in the reference.conf. You only need to specify any of these settings when you want to override the defaults.

Running Zipkin

See the Zipkin documentation for running Zipkin. The Zipkin quickstart shows how to run Zipkin locally for development and testing.

[Figure: an example actor trace in Zipkin]

Custom OpenTracing tracers

It’s possible to create OpenTracing-compatible tracers programmatically by providing a Cinnamon TracerFactory that creates the Tracer directly.

For example, the LightStep Tracer can be used by implementing a TracerFactory such as:

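// the package must match the factory-class setting used in the configuration
package sample

import com.lightbend.cinnamon.opentracing.TracerFactory
import io.opentracing.Tracer

class LightStepTracerFactory extends TracerFactory {
  // creates the Tracer directly, here a LightStep tracer built from its options
  def create(): Tracer = {
    new com.lightstep.tracer.jre.JRETracer(
      new com.lightstep.tracer.shared.Options.OptionsBuilder()
        .withAccessToken("{your_access_token}")
        .build()
    )
  }
}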

Then configure Cinnamon to use this tracer:

cinnamon.opentracing {
  tracers = [lightstep]

  lightstep {
    factory-class = "sample.LightStepTracerFactory"
  }
}