This documentation is for an out-of-date version of Apache Flink. We recommend you use the latest stable version.
Flink comes with an integrated interactive Python Shell.
It can be used in a local setup as well as in a cluster setup.
To use the shell with an integrated Flink cluster, you can simply install PyFlink with PyPi and execute the shell directly:
To run the shell on a cluster, please see the Setup section below.
Usage
The shell only supports Table API currently.
The Table Environments are automatically prebound after startup.
Use “bt_env” and “st_env” to access BatchTableEnvironment and StreamTableEnvironment respectively.
Table API
The example below is a simple program in the Python shell:
Setup
To get an overview of what options the Python Shell provides, please use
Local
To use the shell with an integrated Flink cluster just execute:
Remote
To use it with a running cluster, please start the Python shell with the keyword remote
and supply the host and port of the JobManager with:
Yarn Python Shell cluster
The shell can deploy a Flink cluster to YARN, which is used exclusively by the
shell. The number of YARN containers can be controlled by the parameter -n <arg>.
The shell deploys a new Flink cluster on YARN and connects the
cluster. You can also specify options for YARN cluster such as memory for
JobManager, name of YARN application, etc.
For example, to start a Yarn cluster for the Python Shell with two TaskManagers
use the following:
For all other options, see the full reference at the bottom.
Yarn Session
If you have previously deployed a Flink cluster using the Flink Yarn Session,
the Python shell can connect with it using the following command: