This documentation describes how to prepare Flink for YARN execution on a MapR cluster.
The instructions below assume MapR version 5.2.0. After following them, you will be able to submit Flink on YARN jobs or sessions to a MapR cluster.
To run on MapR, Flink needs to be built with MapR’s own Hadoop and ZooKeeper distribution. Build Flink using Maven with the following command from the project root directory:
mvn clean install -DskipTests -Pvendor-repos,mapr \
    -Dhadoop.version=2.7.0-mapr-1607 \
    -Dzookeeper.version=3.4.5-mapr-1604
The vendor-repos build profile adds MapR’s repository to the build so that
MapR’s Hadoop / ZooKeeper dependencies can be fetched. The mapr build
profile additionally resolves some dependency clashes between MapR and
Flink, and ensures that the native MapR libraries on the cluster
nodes are used. Both profiles must be activated.
By default, the mapr profile builds with the Hadoop / ZooKeeper dependencies
for MapR version 5.2.0, so you don’t need to explicitly override
the hadoop.version and zookeeper.version properties.
For other MapR versions, override these properties with the appropriate
values. The corresponding Hadoop / ZooKeeper distributions for each MapR version
can be found in the MapR documentation.
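As a rough illustration of the override described above, the sketch below assembles the Maven command line from an optional pair of versions. The helper name mapr_build_cmd is hypothetical and not part of Flink; it only prints the command, and the example values are the MapR 5.2.0 ones from the build command above.

```shell
# Hypothetical helper (not part of Flink): print the Maven build command.
# $1 = Hadoop version, $2 = ZooKeeper version; both are optional.
mapr_build_cmd() {
  cmd="mvn clean install -DskipTests -Pvendor-repos,mapr"
  if [ -n "${1:-}" ]; then
    cmd="$cmd -Dhadoop.version=$1"
  fi
  if [ -n "${2:-}" ]; then
    cmd="$cmd -Dzookeeper.version=$2"
  fi
  echo "$cmd"
}

# No arguments: rely on the mapr profile defaults (MapR 5.2.0).
mapr_build_cmd
# Explicit versions, here the MapR 5.2.0 values used earlier.
mapr_build_cmd 2.7.0-mapr-1607 3.4.5-mapr-1604
```

For another MapR release, pass the Hadoop and ZooKeeper versions that MapR lists for that release.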
The client that submits Flink jobs to MapR also needs the following setup.
Ensure that MapR’s JAAS config file is picked up to avoid login failures:
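A minimal sketch of this step, assuming the JAAS file sits at MapR's default client location, /opt/mapr/conf/mapr.login.conf, and that the Flink launch scripts of your version pass JVM_ARGS to the client JVM. Adjust the path if your installation differs.

```shell
# Assumption: mapr.login.conf lives in the default MapR client directory.
# JVM_ARGS is expected to be picked up by Flink's launch scripts.
export JVM_ARGS="-Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf"
```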
Make sure that the yarn.nodemanager.resource.cpu-vcores property is set in yarn-site.xml:
<!-- in /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/yarn-site.xml -->
<configuration>
...
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>...</value>
</property>
...
</configuration>
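One way to confirm that the property is present is a plain text search of the file. The sketch below uses a hypothetical helper, has_yarn_property, that only looks for the name element on a single line; it does not validate the XML or the value.

```shell
# Hypothetical helper: succeed if property name $2 is declared in file $1.
# Uses a fixed-string match, so dots in the name are not treated as regex.
has_yarn_property() {
  grep -qF "<name>$2</name>" "$1"
}

# Path taken from the comment in the yarn-site.xml snippet above.
site=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/yarn-site.xml
if [ -f "$site" ]; then
  if has_yarn_property "$site" yarn.nodemanager.resource.cpu-vcores; then
    echo "cpu-vcores is set"
  else
    echo "cpu-vcores is missing from $site" >&2
  fi
fi
```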
Also remember to set the YARN_CONF_DIR or HADOOP_CONF_DIR environment
variable to the path where
yarn-site.xml is located:
export YARN_CONF_DIR=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/
export HADOOP_CONF_DIR=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/
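A quick sanity check for these variables, sketched under the assumption that at least one of them should point at a directory containing yarn-site.xml. The function name conf_dir_ok is hypothetical.

```shell
# Hypothetical check: succeed if YARN_CONF_DIR or HADOOP_CONF_DIR
# points at a directory that contains yarn-site.xml.
conf_dir_ok() {
  for dir in "${YARN_CONF_DIR:-}" "${HADOOP_CONF_DIR:-}"; do
    if [ -n "$dir" ] && [ -f "$dir/yarn-site.xml" ]; then
      return 0
    fi
  done
  return 1
}

if conf_dir_ok; then
  echo "yarn-site.xml found"
else
  echo "yarn-site.xml not found via YARN_CONF_DIR or HADOOP_CONF_DIR" >&2
fi
```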
Make sure that the MapR native libraries are picked up in the classpath:
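A minimal sketch of this step, assuming the MapR client libraries are installed under /opt/mapr/lib (the same /opt/mapr prefix used elsewhere on this page) and that the Flink launch scripts of your version honor FLINK_CLASSPATH.

```shell
# Assumption: MapR libraries are under /opt/mapr/lib. The quotes keep the
# shell from expanding '*'; the JVM treats it as a classpath wildcard.
export FLINK_CLASSPATH='/opt/mapr/lib/*'
```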
If you’ll be starting Flink on YARN sessions with yarn-session.sh, the
below is also required:
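A minimal sketch of this step, assuming the same /opt/mapr/lib library location and that the session launch script of your Flink version reads CC_CLASSPATH.

```shell
# Assumption: MapR libraries are under /opt/mapr/lib; quoted so that the
# literal '*' reaches the JVM as a classpath wildcard.
export CC_CLASSPATH='/opt/mapr/lib/*'
```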
Note: In Flink 1.2.0, Flink’s Kerberos authentication for YARN execution has a bug that prevents it from working with MapR Security. Please upgrade to a later Flink version in order to use Flink with a secured MapR cluster. For more details, please see FLINK-5949.
Flink’s Kerberos authentication is independent of MapR’s Security authentication. With the above build procedures and environment variable setups, Flink does not require any additional configuration to work with MapR Security.
Users simply need to log in using MapR’s maprlogin
utility. Users who have not acquired MapR login credentials cannot
submit Flink jobs; submission fails with:
java.lang.Exception: unable to establish the security context
Caused by: o.a.f.r.security.modules.SecurityModule$SecurityInstallException: Unable to set the Hadoop login user
Caused by: java.io.IOException: failure to login: Unable to obtain MapR credentials
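To avoid this error, obtain a MapR ticket before submitting. The sketch below assumes password-based login (maprlogin password); a Kerberos-enabled cluster would use maprlogin kerberos instead. The wrapper name mapr_login is hypothetical.

```shell
# Hypothetical wrapper: log in to MapR and show the resulting ticket.
# Returns 1 when the maprlogin utility is not on PATH.
mapr_login() {
  if ! command -v maprlogin >/dev/null 2>&1; then
    echo "maprlogin not found on PATH" >&2
    return 1
  fi
  # Prompts for the password, then prints the ticket that was obtained.
  maprlogin password && maprlogin print
}
```

Call mapr_login once in the shell that will run the Flink submission commands.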