Sizing recommendations

The resources required for resilient, scalable clusters depend on the types of applications you plan to deploy and on their anticipated usage.

Node sizing

Kubernetes defines two types of nodes, master and worker. A master node manages and coordinates the worker nodes. It runs the API server, scheduler, and controller manager. See Kubernetes Concepts Overview for more details.

In addition, OpenShift recommends specializing nodes for the following purposes:

  • A single bootstrap node is often used as a staging location for building and maintaining the cluster. This is the host where you install and run Ansible. It can also be used as a jumpbox for accessing other nodes, for example over SSH.

  • For production environments, dedicated infrastructure nodes are recommended, so that registry and router pods run apart from application workloads.

  • Node ConfigMaps and host labels can be used to further determine the placement of pods by the scheduler. For example, some high-storage nodes would be best for databases. However, segmenting the available nodes has some disadvantages. In particular, each segment requires some headroom to reschedule pods in the event of failure, and specializing nodes reduces the scheduler's ability to make use of available resources across the cluster.

Sections on this page provide node sizing recommendations for production configurations. For large deployments, consider OpenShift cluster limits when planning distribution. For non-production clusters, downsize as you see fit, based on anticipated load and on whether redundancy and failover capabilities are needed. See Architecture overview in the OpenShift documentation for further details.

This chapter discusses sizing in the context of physical hardware. For cloud deployments, use the descriptions of your cloud provider’s image configurations to choose appropriate analogs.

Bootstrap Node

A bootstrap node runs Ansible and a high-availability (HA) TCP/Layer 3 load balancer, such as HAProxy, which balances the following TCP ports across all master nodes: 80, 443, 8080, 8181, 2181, 5050. An unencrypted (passphrase-less) SSH key is required for authenticating with the cluster nodes over SSH. Encrypted SSH keys are not supported.
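For illustration only, here is a minimal haproxy.cfg sketch for one of the ports above. The master hostnames are placeholders, and running every listed port as a plain TCP pass-through is an assumption rather than a prescribed configuration; a similar listen block would be repeated for each of the other ports.

    # Layer 4 (TCP) pass-through for one balanced port; repeat per port (80, 443, 8080, 8181, 2181, 5050).
    listen masters-443
        bind *:443
        mode tcp
        option tcplog
        balance roundrobin
        server master-1 master-1.example.com:443 check
        server master-2 master-2.example.com:443 check
        server master-3 master-3.example.com:443 check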

The bootstrap node does not require a lot of resources:

Table 1. Bootstrap Resource Recommendations

  Resource    Amount
  ---------   ---------
  Processor   4 cores
  Memory      16 GB RAM
  Hard disk   60 GB

However, you may want to allocate additional resources, such as disk space, for storing larger libraries of playbooks and source images.

See Preparing your hosts in the OpenShift documentation for more information.

OpenShift Master Nodes

OpenShift master nodes run the services that manage cluster resources and together form the control plane, including the API server, controller manager server, and etcd. A single master, also running etcd, can be sufficient for non-critical clusters, such as those used for development and testing. For failover, production clusters should have at least three masters. Review OpenShift’s Planning your installation for more information.

Because etcd performs frequent, small storage reads and writes, storage that handles small read/write operations quickly, such as an SSD, is recommended for etcd. Lightbend strongly recommends configuring durable storage using RAID, with the RAID controllers backed by a battery backup unit (BBU) and their cache set to writeback mode.

In a default install of OpenShift, ZooKeeper and Exhibitor run on OpenShift Master nodes as systemd services. ZooKeeper is used by a number of OpenShift and Lightbend Platform components, such as Kafka. For ZooKeeper to be fault tolerant, you should always provision an odd number of OpenShift Master nodes. ZooKeeper can only maintain quorum if the majority of nodes are available. For example, a three-node ZooKeeper cluster can tolerate the loss of one node and a five-node ZooKeeper cluster can tolerate the loss of up to two nodes.
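As a rule of thumb, an ensemble of n ZooKeeper nodes maintains quorum as long as floor(n/2) + 1 members are available: 2 of 3, or 3 of 5. That is why three nodes tolerate one failure and five tolerate two, while an even count (for example, four nodes) tolerates no more failures than the next smaller odd count.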

Following Red Hat’s guidance, here are our minimum recommendations:

Table 2. Master Node Resource Recommendations

  Resource     Amount
  ----------   -----------------------------------------
  Node count   1 (development/test), 3 or 5 (production)
  Processor    4 cores/node
  Memory       32 GB RAM
  Disk space   120 GB solid-state drive (SSD)

General-Purpose Worker Nodes

Worker nodes are where most of your storage and compute activity occurs. Their requirements are more diverse, depending on the kinds of services and workloads that run on them.

OpenShift clusters should have a load balancer or node with a public IP address, while the rest of the nodes can have only private IP addresses. The hardware recommendations below apply to both public and private nodes.

It is common to have a variety of configurations, depending on how a particular worker will be used in the cluster. For example, some high-storage nodes are best for databases. Nodes can be labeled, and specific systems and frameworks can be restricted to nodes that carry particular labels, as sketched below. OpenShift provides recommended practices for node hosts.
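As a sketch only (the node name, label key, and image below are placeholders, not OpenShift defaults), a node can be labeled and a pod restricted to nodes carrying that label roughly as follows:

    # Label a high-storage worker node (node name and label key are hypothetical):
    #   oc label node worker-5 disktype=ssd
    # Then pin a database pod to nodes with that label using nodeSelector:
    apiVersion: v1
    kind: Pod
    metadata:
      name: sample-database
    spec:
      nodeSelector:
        disktype: ssd
      containers:
        - name: db
          image: postgres:10   # placeholder image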

The following configuration is suitable for general-purpose applications and infrastructure nodes:

Table 3. General-Purpose Node Resource Recommendations

  Resource    Amount         Discussion
  ---------   ------------   ------------------------------------------------------------
  Nodes       6 or more      Higher for busier clusters
  Processor   2 cores/node   Much higher for compute-intensive jobs (e.g., Spark)
  Memory      16 GB RAM      Much higher for memory-intensive jobs (e.g., SQL joins)
  Hard disk   60 GB+         Much higher for local caching of data, such as state in Spark
                             streaming jobs, or GlusterFS 'converged mode' nodes

Use enough nodes to provide at least 35-40 cores if you install all Lightbend Platform streaming components. For smaller non-production clusters, install only the services you really need.
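For example, at the 2 cores/node baseline in Table 3, 35-40 cores works out to roughly 18-20 worker nodes; fewer, larger nodes (say, 8 cores each) would bring that down to about 5. These counts are illustrative only and should be adjusted to your actual workloads.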