Architecture #

Flink Kubernetes Operator (Operator) acts as a control plane to manage the complete deployment lifecycle of Apache Flink applications. The Operator can be installed on a Kubernetes cluster using Helm. In most production environments it is typically deployed in a designated namespace and controls Flink deployments in one or more managed namespaces. The custom resource descriptor (CRD) that describes the schema of a FlinkDeployment is a cluster wide resource. For a CRD, the declaration must be registered before any resources of that CRDs kind(s) can be used, and the registration process sometimes takes a few seconds.

Note: There is no support at this time for upgrading or deleting CRDs using Helm.

Control Loop #

The Operator follow the Kubernetes principles, notably the control loop:

Users can interact with the operator using the Kubernetes command-line tool, kubectl. The Operator continuously tracks cluster events relating to the FlinkDeployment custom resource. When the operator receives a new event, it will take action to adjust the Kubernetes cluster to the desired state as part of its reconciliation loop. The initial loop consists of the following high-level steps:

User submits a FlinkDeployment custom resource(CR) using kubectl
The operator launches the Flink cluster deployment and creates an ingress rule for UI access
The JobManager creates TaskManager pods
The JobManager submits the job

The CR can be (re)applied on the cluster any time. The Operator makes continuous adjustments to imitate the desired state until the current state becomes the desired state. All lifecycle management operations are realized using this very simple principle in the Operator.

The Operator is built with the Java Operator SDK and uses the Native Kubernetes Integration for launching Flink deployments and submitting jobs under the hood. The Java Operator SDK is a higher level framework and related tooling to support writing Kubernetes Operators in Java. Both the Java Operator SDK and Flink’s native kubernetes integration itself is using the Fabric8 Kubernetes Client to interact with the Kubernetes API Server.

State Machine of JobManager Deployment #

The Operator manages the lifecycle of the JobManager Deployment. Its state machine is as follows:

The possible transitions usually indicate that there are some underlying changes:

MISSING -> DEPLOYING: A new JM deployment exists and is being created
DEPLOYING -> DEPLOYED_NOT_READY: The JM deployment exists and passes checks of the availability of replicas and JM port connectivity. Now, it is waiting the REST service to be ready.
DEPLOYED_NOT_READY -> READY: JM can serve requests.
READY -> READY: JM works fine.
READY -> DEPLOYED_NOT_READY: JM REST service becomes unavailable.
READY -> ERROR: REST service is unavailable and JM deployment failed(e.g. in CrashLoopBackoff state).
READY -> MISSING: JM deployment does not exist(e.g. deleted by kubectl or by SUSPEND action).
ERROR -> ERROR: JM deployment failed.
DEPLOYING -> DEPLOYING: The JM deployment exists and is still being created.
DEPLOYING -> ERROR: JM deployment failed.
MISSING -> MISSING: JM deployment does not exist.