Introduction

Cloudflow provides tools for developing, composing, deploying, operating and monitoring distributed stream processing applications. Cloudflow enterprise features are available with a Lightbend Platform Subscription. Lightbend also sponsors an open source version of Cloudflow.

Enterprise features include:

This guide introduces Cloudflow and describes the Cloudflow enterprise features that are available with a Lightbend Platform Subscription. The guide for the open source version, at cloudflow.io, describes how to develop streamlets.

To use the enterprise features of Cloudflow and take advantage of Lightbend Subscription support, you must use the installer documented in this guide.

The benefits and challenges of streaming architectures

Technologies like mobile, the Internet of Things (IoT), Big Data analytics, machine learning, and others are driving enterprises to modernize how they process large volumes of data. A rapidly growing percentage of that data now arrives as data streams, and a growing percentage of those streams require near-real-time processing.

The streaming landscape has been evolving rapidly, with tools like Spark, Flink, and Kafka Streams emerging from the world of large-scale batch processing, while projects like Reactive Streams and Akka Streams have emerged from the world of application development and high-performance networking.

The demand for availability, scalability, and resilience is forcing streaming architectures to become more like microservice architectures. Conversely, successful organizations building microservices find their data needs grow with their organization while their data sources are becoming more stream-like and more real-time. Hence, there is a unification happening between streaming data and microservice architectures.

It can be quite hard to develop, deploy, and operate large-scale microservices-based systems that take advantage of streaming data and integrate seamlessly with systems for analytics processing and machine learning. Individual technologies may be well documented from the development side, but often provide little information on deployment and production use. This makes combining them into a single integrated system difficult. Cloudflow aims to make this easier by integrating the most popular streaming frameworks into a single platform for creating and running distributed applications.

What can Cloudflow do for you?

Cloudflow allows you to quickly build and deploy large, distributed stream processing applications by removing the need for you to develop connections between different incoming, processing, and outgoing flows. It spares you from understanding how to configure multiple technologies to make them work together. Instead, you can concentrate on business logic and let Cloudflow handle the rest.

You compose a Cloudflow application from smaller stream processing units called streamlets. Each streamlet represents a discrete chunk of stream processing logic with data being safely persisted at the edges using pre-defined schemas. Streamlets can be scaled up and down to process partitioned data streams. Streamlets can be written using multiple streaming runtimes, such as Akka Streams and Spark. This exposes the full power of the underlying runtime and its libraries while providing a higher-level abstraction for composing streamlets and expressing data schemas.
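
To make the idea of schema-checked edges concrete, the following is a minimal sketch in Python. It is a toy model under stated assumptions, not the Cloudflow API: `Schema`, `Streamlet`, and the `to_fahrenheit` example are hypothetical names chosen for illustration, and real streamlets are written against a streaming runtime such as Akka Streams or Spark.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]


@dataclass
class Schema:
    """Hypothetical stand-in for a pre-defined schema: field name -> Python type."""
    name: str
    fields: Dict[str, type]

    def validate(self, record: Record) -> Record:
        # Reject records whose field names or value types do not match the schema.
        if set(record) != set(self.fields):
            raise ValueError(f"{self.name}: unexpected fields {sorted(record)}")
        for field_name, expected in self.fields.items():
            if not isinstance(record[field_name], expected):
                raise ValueError(f"{self.name}: {field_name} is not {expected.__name__}")
        return record


@dataclass
class Streamlet:
    """Toy streamlet (not the Cloudflow API): one inlet, one outlet, logic in between."""
    name: str
    inlet: Schema
    outlet: Schema
    logic: Callable[[Record], Record]

    def run(self, records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            # Validate at both edges so only well-formed data crosses them.
            yield self.outlet.validate(self.logic(self.inlet.validate(record)))


celsius = Schema("Celsius", {"sensor": str, "celsius": float})
fahrenheit = Schema("Fahrenheit", {"sensor": str, "fahrenheit": float})

to_fahrenheit = Streamlet(
    name="to-fahrenheit",
    inlet=celsius,
    outlet=fahrenheit,
    logic=lambda r: {"sensor": r["sensor"], "fahrenheit": r["celsius"] * 9 / 5 + 32},
)

converted = list(to_fahrenheit.run([{"sensor": "s1", "celsius": 100.0}]))
```

In this sketch the business logic is a plain function, and the schemas at the inlet and outlet are the only contract the streamlet exposes to its neighbors.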

You compose streamlets into larger systems using application blueprints, which specify how streamlets are connected together. Cloudflow takes care of deploying the streamlets together as a single application and ensures that the connections translate into data flowing between the streamlets at runtime.
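
As a rough illustration of what a blueprint expresses, the sketch below models one in Python as a set of named streamlets plus outlet-to-inlet connections. It is a toy model, not Cloudflow's blueprint format or API: `StreamletSpec`, `Blueprint`, `verify`, and the streamlet and schema names are all hypothetical, and the rule that connected ports must share a schema is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StreamletSpec:
    """Hypothetical description of a streamlet's ports: port name -> schema name."""
    inlets: Dict[str, str] = field(default_factory=dict)
    outlets: Dict[str, str] = field(default_factory=dict)


@dataclass
class Blueprint:
    """Toy blueprint (not Cloudflow's format): streamlets plus their connections."""
    streamlets: Dict[str, StreamletSpec]
    connections: List[Tuple[str, str]]  # ("streamlet.outlet", "streamlet.inlet")

    def verify(self) -> List[str]:
        """Return a list of problems; an empty list means the blueprint is consistent."""
        problems = []
        for source, target in self.connections:
            src_name, _, out_port = source.partition(".")
            dst_name, _, in_port = target.partition(".")
            src = self.streamlets.get(src_name)
            dst = self.streamlets.get(dst_name)
            if src is None or out_port not in src.outlets:
                problems.append(f"unknown outlet: {source}")
            elif dst is None or in_port not in dst.inlets:
                problems.append(f"unknown inlet: {target}")
            elif src.outlets[out_port] != dst.inlets[in_port]:
                # Assumed rule: both ends of a connection must use the same schema.
                problems.append(f"schema mismatch: {source} -> {target}")
        return problems


streamlets = {
    "ingress": StreamletSpec(outlets={"out": "SensorData"}),
    "metrics": StreamletSpec(inlets={"in": "SensorData"}, outlets={"out": "Metric"}),
    "egress": StreamletSpec(inlets={"in": "Metric"}),
}

good = Blueprint(streamlets, [("ingress.out", "metrics.in"), ("metrics.out", "egress.in")])
bad = Blueprint(streamlets, [("ingress.out", "egress.in")])

good_problems = good.verify()
bad_problems = bad.verify()
```

Checking the wiring before deployment, as `verify` does here, is one way a platform can catch mismatched connections early rather than at runtime.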

Cloudflow provides tooling for developing streamlets, composing them into applications, deploying those applications to your clusters, as well as operating and observing deployed applications. A custom UI for Cloudflow in Lightbend Console allows you to visualize data flows as shown in the example below.

Figure 1. Cloudflow application visualization in Lightbend Console

This simple example demonstrates the benefits of visualization over trying to build a mental model from configuration files and source code. In the screenshot above, the data flows from left to right. Each streamlet icon shows its name and its service type, in this case Akka Streams or Spark Streaming.

Next, learn more about the parts of a Cloudflow application.

This guide last published: 2020-01-09