Graphite is a popular storage platform that is often used with Grafana, a common visualization tool for monitoring. Graphite focuses on time-series data, is designed to scale, and works well on inexpensive hardware. With Lightbend Monitoring, one option we provide is to push data to StatsD, the collector, which in turn sends aggregates to Graphite.
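To make the push to StatsD concrete, here is a minimal sketch of what a client does on the wire. It assumes the standard StatsD line format (`name:value|type`, with an optional `|@rate` suffix) sent over UDP to the default port 8125. The function names, metric names, and host below are illustrative only and are not part of Lightbend Monitoring.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=1.0):
    """Build one StatsD line such as 'app.requests:1|c'.

    metric_type is a StatsD type code: 'c' (counter), 'g' (gauge),
    'ms' (timer), or 's' (set).
    """
    line = f"{name}:{value}|{metric_type}"
    if sample_rate < 1.0:
        # Tell StatsD that only a fraction of events is being reported.
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line to a StatsD daemon over UDP (fire and forget).

    The host and port are assumptions; 8125 is the usual StatsD default.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

A call such as `send_metric(format_metric("app.requests", 1, "c"))` would report one request. StatsD then aggregates the values it receives over its flush interval and forwards the aggregates to Graphite.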
Elasticsearch defines itself as a scalable, distributed, real-time search and analytics engine. It lets you explore and analyze your data through full-text search capabilities backed by Lucene. Beyond that, Elasticsearch also provides structured data searches and supports human language, relationships, and geolocation. We use Elasticsearch, through our Elasticsearch reporter, as one of the stores for events and metrics.
Cassandra is well known for its reliability and ability to scale. Less well known is that it is also commonly used to store time-series data, specifically metrics. According to DataStax, a major commercial vendor of Cassandra, storing time-series data is such a common use case that they provide guidelines on how best to use Cassandra for this purpose.
As with Kafka, we don’t have native support for Cassandra, but there are open source projects that support StatsD integration, and Grafana’s Datasources API enables plugin development for any database that can communicate via HTTP.
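As an illustration of what storing metrics in Cassandra can look like, one widely used pattern is to bucket each metric's data by time so that no single partition grows without bound. The sketch below is an assumption for illustration, not a quotation of the DataStax guidelines: the table layout in the comment, the one-day bucket size, and the function name are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Tuple

# Hypothetical table this key would target (CQL, shown for context only):
#   CREATE TABLE metrics (
#       metric text, day text, ts timestamp, value double,
#       PRIMARY KEY ((metric, day), ts)
#   );
# The partition key is (metric, day); rows are ordered by ts inside it.


def partition_key(metric: str, ts: datetime) -> Tuple[str, str]:
    """Return (metric, UTC day) so one partition holds one metric for one day.

    ts must be timezone-aware; it is converted to UTC before bucketing.
    """
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (metric, day)
```

With this layout, a query for one metric over a time range reads a small, predictable number of partitions. The bucket size is a tuning choice that depends on how many points each metric produces.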
Blueflood is a high-throughput, low-latency, multi-tenant distributed metric processing system behind Rackspace Metrics. Data is stored in Cassandra, which provides a fault-tolerant and highly available environment, and can be used to construct dashboards, generate reports and graphs, or for any other use involving time-series data. It focuses on near-realtime data: data is queryable mere milliseconds after ingestion.
KairosDB is a fast time-series database built atop Cassandra. It provides a rich set of features, such as collectors for multiple protocols, a REST API, a web UI, aggregators, native client libraries, and a plugin architecture for extensibility.
Netflix Atlas is a dimensional time-series data store for near-realtime operational data. Its design features an in-memory store, which allows Atlas to gather and report large quantities of metrics quickly and efficiently. Netflix developed Atlas in response to the massive explosion of metrics related to their streaming system. It supports tags, normalization of data (smoothing), and a variety of metric types such as gauges, rates, and counters.