Introduction

Cloudflow provides tools for developing, composing, deploying, operating and monitoring distributed stream processing applications. Cloudflow enterprise features are available with a Lightbend Platform Subscription. Lightbend also sponsors an open source version of Cloudflow.

Enterprise features include:

This guide introduces Cloudflow and describes the Cloudflow enterprise features that are available with a Lightbend Platform subscription.

To use the enterprise features of Cloudflow and take advantage of Lightbend Subscription support, you must use the installer documented in this guide.

The benefits and challenges of streaming architectures

Technologies like mobile, the Internet of Things (IoT), Big Data analytics, machine learning, and others are driving enterprises to modernize how they process large volumes of data. A rapidly growing percentage of that data now arrives in the form of data streams, and a growing percentage of those streams require near-real-time processing.

The streaming landscape has been evolving rapidly, with tools like Spark, Flink, and Kafka Streams emerging from the world of large-scale batch processing, while projects like Reactive Streams and Akka Streams have emerged from the world of application development and high-performance networking.

The demand for availability, scalability, and resilience is forcing streaming architectures to become more like microservice architectures. Conversely, successful organizations building microservices find that their data needs grow with their organization while their data sources become more stream-like and more real-time. Hence, streaming data and microservice architectures are converging.

It can be quite hard to develop, deploy, and operate large-scale microservices-based systems that can take advantage of streaming data and seamlessly integrate with systems for analytics processing and machine learning. Individual technologies may be well documented from the development side, but often have little information on deployment and production. This makes combining them into fully integrated systems no easy task. Cloudflow aims to make this easier by integrating the most popular streaming frameworks into a single platform for creating and running distributed applications.

Streaming application requirements

Stream processing is a discipline and a set of techniques for extracting information from unbounded data. Streaming applications apply stream processing to provide actionable insights from data as it arrives into the system. The growing popularity of streaming applications is driven by:

  • the increasing availability of data from many sources.

  • the need of enterprises to speed up their reaction time to that data.

We characterize streaming applications as a connected graph of stream-processing components, where each component specializes in a particular task, following the 'right tool for the job' premise. The figure below, An Abstract Streaming Application, generically illustrates an application that processes data events:

  • The first circle on the left represents an initial stage for capturing or accepting data. This could be an HTTP endpoint to accept data from remote clients, a connection to a Kafka topic, or input from an internal system in an enterprise.

  • The next circle to the right represents a processing phase that applies some logic to the data, such as business rules, statistical data analysis, or a machine learning model that implements the business aspect of the application. This processing component may add information to the event and send it as valid data to an external system, or flag the data as invalid and report it.

  • The final two circles on the right show the two different data output paths, valid and invalid.

    Fig. 1 - An Abstract Streaming Application
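
The three stages in the figure can be sketched in a few lines of plain Python. This is only an illustration of the graph shape (capture, process, then split into valid and invalid outputs), not the Cloudflow API. The Event type, the function names, and the 0 to 100 validity rule are all assumptions invented for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Event:
    """Hypothetical event type; the field names are illustrative only."""
    device_id: str
    reading: float
    tags: dict = field(default_factory=dict)


def ingress(raw: Iterable[Tuple[str, float]]) -> Iterator[Event]:
    """Capture stage: stands in for an HTTP endpoint or a Kafka topic."""
    for device_id, reading in raw:
        yield Event(device_id, reading)


def process(events: Iterable[Event]) -> Iterator[Tuple[bool, Event]]:
    """Processing stage: apply a made-up business rule and enrich valid events."""
    for event in events:
        is_valid = 0.0 <= event.reading <= 100.0  # assumed rule, for illustration
        if is_valid:
            event.tags["status"] = "checked"  # add information to the event
        yield is_valid, event


def run(raw: Iterable[Tuple[str, float]]) -> Tuple[List[Event], List[Event]]:
    """Wire the stages together and route each event to one of two outputs."""
    valid: List[Event] = []
    invalid: List[Event] = []
    for is_valid, event in process(ingress(raw)):
        (valid if is_valid else invalid).append(event)
    return valid, invalid


valid_out, invalid_out = run([("a", 21.5), ("b", -3.0), ("c", 99.0)])
print([e.device_id for e in valid_out])    # ['a', 'c']
print([e.device_id for e in invalid_out])  # ['b']
```

In a real deployment each stage would typically run as a separate, independently scaled component connected by durable streams rather than as in-process generators, which is where the differing requirements described next come from.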

Each component in the illustration presents different application requirements, scalability concerns, and often different Kubernetes deployment strategies. Such specialized needs make even a simple streaming application non-trivial to develop and deploy. For large enterprises with complex use cases, creating streaming data applications that can extract actionable business value quickly is challenging.

This guide last published: 2020-04-03