This page provides instructions on how to run Flink in a fully distributed fashion on a static (but possibly heterogeneous) cluster.
Flink runs on all UNIX-like environments, e.g. Linux, Mac OS X, and Cygwin (for Windows), and expects the cluster to consist of one master node and one or more worker nodes. Before you start to set up the system, make sure you have the following software installed on each node: a working Java installation (see the JAVA_HOME note below) and ssh (the cluster scripts connect to the worker nodes via SSH, so sshd must be running there and passwordless access from the master is needed for the scripts to work unattended).
If your cluster does not fulfill these software requirements, you will need to install or upgrade the respective software.
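A quick way to check both requirements (worker-host is a placeholder for one of your worker nodes; run this from the master):

java -version        # should print the version of your Java installation
ssh worker-host true # should succeed without a password prompt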
JAVA_HOME Configuration
Flink requires the JAVA_HOME environment variable to be set on the master and all worker nodes and to point to the directory of your Java installation. You can set this variable in conf/flink-conf.yaml via the env.java.home key.
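For example, assuming Java is installed under /usr/lib/jvm/java-8-openjdk (a placeholder path; adjust it to your installation), the entry in conf/flink-conf.yaml would look like this:

# env.java.home points to the directory of the Java installation (placeholder path)
env.java.home: /usr/lib/jvm/java-8-openjdk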
Go to the downloads page and get the ready-to-run package. Make sure to pick the Flink package matching your Hadoop version. If you don’t plan to use Hadoop, pick any version.
After downloading the latest release, copy the archive to your master node and extract it:
tar xzf flink-*.tgz
cd flink-*
After having extracted the system files, you need to configure Flink for the cluster by editing conf/flink-conf.yaml.
Set the jobmanager.rpc.address key to point to your master node. Furthermore, define the maximum amount of main memory the JVM is allowed to allocate on each node by setting the jobmanager.heap.mb and taskmanager.heap.mb keys.
The values are given in MB. If some worker nodes have more main memory that you want to allocate to the Flink system, you can overwrite the default value by setting the environment variable FLINK_TM_HEAP on the respective node.
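For example, on a worker node with extra memory you might export the variable in an environment that is visible when the TaskManager is started there (e.g. the node's shell profile); the value 8192 below is a placeholder and, like taskmanager.heap.mb, is given in MB:

export FLINK_TM_HEAP=8192   # heap size in MB for the TaskManager on this node only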
Finally, you must provide a list of all nodes in your cluster that shall be used as worker nodes. To do so, edit the file conf/slaves, similar to the HDFS configuration, and enter the IP address or host name of each worker node. Each worker node will later run a TaskManager.
Place each entry on its own line, as in the following example:
192.168.0.100
192.168.0.101
.
.
.
192.168.0.150
The Flink directory must be available on every worker under the same path. You can use a shared NFS directory, or copy the entire Flink directory to every worker node.
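If you copy instead of using a shared directory, one way is a small loop over conf/slaves; this is a sketch that assumes passwordless SSH and /opt/flink as the (identical) install path on all nodes:

# run from the master node, inside the Flink directory
for worker in $(cat conf/slaves); do
  rsync -a /opt/flink/ "$worker:/opt/flink/"
done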
Please see the configuration page for details and additional configuration options.
In particular, the amount of available memory per TaskManager (taskmanager.heap.mb), the number of available CPUs per machine (taskmanager.numberOfTaskSlots), the total number of CPUs in the cluster (parallelism.default), and the temporary directories (taskmanager.tmp.dirs) are very important configuration values.
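Putting these together, a minimal conf/flink-conf.yaml for a small cluster could look like the following sketch; the host name, memory sizes, slot count, parallelism, and directory are placeholder values to adapt to your hardware:

# placeholder values; adapt to your cluster
jobmanager.rpc.address: master-host
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 2048
taskmanager.numberOfTaskSlots: 4
parallelism.default: 8
taskmanager.tmp.dirs: /tmp/flink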
The following script starts a JobManager on the local node and connects via SSH to all worker nodes listed in the slaves file to start a TaskManager on each node. Assuming that you are on the master node and inside the Flink directory:
bin/start-cluster.sh
Your Flink system is now up and running. The JobManager on the local node will accept jobs at the configured RPC port.
To stop Flink, there is also a stop-cluster.sh script.
You can add both JobManager and TaskManager instances to your running cluster with the bin/jobmanager.sh and bin/taskmanager.sh scripts:
bin/jobmanager.sh (start cluster)|stop|stop-all
bin/taskmanager.sh start|stop|stop-all
Make sure to call these scripts on the hosts on which you want to start or stop the respective instance.
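For example, to add a TaskManager on an additional worker node to the running cluster (the host address and path below are placeholders):

ssh 192.168.0.151        # log into the new worker node
cd /opt/flink            # the Flink directory, same path as on the other nodes
bin/taskmanager.sh start # starts a TaskManager that registers with the running JobManager

If you also add the new host to conf/slaves, stop-cluster.sh will include it when shutting the cluster down.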