Quick Start #
This document provides a quick introduction to using Flink ML. It guides you through submitting a simple Flink job that trains a machine learning model and uses it to provide a prediction service.
Flink ML requires Python 3.6, 3.7, or 3.8. Run the following command to check that your Python version meets this requirement:
$ python --version # the version printed here must be 3.6, 3.7 or 3.8
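The same check can be scripted, which may be handy in a setup script. The following is a minimal sketch that compares the running interpreter against the versions listed above; the names `SUPPORTED_VERSIONS` and `is_supported` are illustrative and are not part of Flink ML.

```python
import sys

# Versions listed as supported in this guide.
SUPPORTED_VERSIONS = {(3, 6), (3, 7), (3, 8)}


def is_supported(version_info=None):
    """Return True if the (major, minor) pair is one of the supported versions."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


# Report on the interpreter that is running this script.
major, minor = sys.version_info[:2]
status = "supported" if is_supported() else "not supported"
print(f"Python {major}.{minor} is {status} by Flink ML")
```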
Installation of Flink ML Python SDK #
Flink ML Python SDK is available on PyPI and can be installed as follows:
$ python -m pip install apache-flink-ml
You can also build the Flink ML Python SDK from source by following the development guide.
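Whichever installation route you choose, a quick way to confirm the SDK is visible to your interpreter is to ask Python whether the package can be located. The sketch below assumes the SDK exposes its API under the `pyflink.ml` package; the helper `module_available` is a hypothetical name used only for illustration.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be located on the import path."""
    try:
        # For a dotted name, parent packages are imported as a side effect.
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `pyflink`) is itself missing.
        return False


# Assumption: the SDK's Python API lives under `pyflink.ml`.
print("pyflink.ml importable:", module_available("pyflink.ml"))
```

If this prints `False`, the install most likely went into a different Python environment than the one running the script.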
Run Flink ML example job #
After setting up Flink ML Python SDK, you can run a Flink ML example job as follows.
$ python -m pyflink.examples.ml.clustering.kmeans_example
The command above creates a Flink mini-cluster and executes Flink ML’s
kmeans_example job. There are also example jobs for other Flink ML algorithms,
and you can find them in
A sample output in your terminal is as follows.
Features: [9.6,0.0] Cluster Id: 0
Features: [9.0,0.6] Cluster Id: 0
Features: [0.0,0.3] Cluster Id: 1
Features: [0.0,0.0] Cluster Id: 1
Features: [0.3,3.0] Cluster Id: 1
Features: [9.0,0.0] Cluster Id: 0
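To build intuition for what this output means, here is a plain Python sketch of the k-means procedure applied to the same six points. It is not Flink ML's implementation and does not use the Flink ML API; it is a textbook version of the algorithm (assign each point to its nearest centroid, recompute centroids, repeat). The seeding with the first and third points is an arbitrary choice made for this illustration.

```python
# The six feature vectors from the sample output above.
POINTS = [(9.6, 0.0), (9.0, 0.6), (0.0, 0.3), (0.0, 0.0), (0.3, 3.0), (9.0, 0.0)]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, initial_centroids, max_iter=10):
    """Textbook k-means: returns (cluster id per point, final centroids)."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean(members) if members else centroids[k])
        if new_centroids == centroids:
            break  # Converged: centroids no longer move.
        centroids = new_centroids
    return assignments, centroids


# Illustrative seeding (an assumption of this sketch): first and third points.
labels, centers = kmeans(POINTS, [POINTS[0], POINTS[2]])
for point, label in zip(POINTS, labels):
    print(f"Features: {list(point)}\tCluster Id: {label}")
```

With this seeding, the three points near x = 9 end up in cluster 0 and the three points near x = 0 end up in cluster 1, the same grouping as in the sample output. Which group receives which id depends on the initial centroids, so the numbering can differ under other seedings.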
Now you have successfully run a Flink ML job.