@Public public abstract class ExecutionEnvironment extends Object
The ExecutionEnvironment is the context in which a program is executed. A LocalEnvironment will cause execution in the current JVM; a RemoteEnvironment will cause execution on a remote setup.
The environment provides methods to control the job execution (such as setting the parallelism) and to interact with the outside world (data access).
Please note that the execution environment needs strong type information for the input and return types of all operations that are executed. This means that the environment needs to know that the return value of an operation is, for example, a Tuple of String and Integer. Because the Java compiler throws much of the generic type information away, most methods attempt to re-obtain that information using reflection. In certain cases, it may be necessary to manually supply that information to some of the methods.
See Also: LocalEnvironment, RemoteEnvironment
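The usual entry point is getExecutionEnvironment(), which yields a LocalEnvironment when the program is started from an IDE and the cluster's context environment when it is submitted through the command line client. A minimal sketch (the element values are arbitrary):

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class BatchJobSketch {
    public static void main(String[] args) throws Exception {
        // Local environment in the IDE, cluster context environment when submitted via the CLI client.
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> words = env.fromElements("flink", "batch", "api");

        // print() is an eager sink: it collects the result and triggers execution itself,
        // so no separate call to execute() is needed for this pipeline.
        words.filter(w -> w.length() > 4).print();
    }
}
```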
Modifier and Type | Field and Description |
---|---|
protected JobID | jobID - The ID of the session, defined by this execution environment. |
protected JobExecutionResult | lastJobExecutionResult - Result from the latest execution, to make it retrievable when using eager execution methods. |
protected static org.slf4j.Logger | LOG - The logger used by the environment and its subclasses. |
protected long | sessionTimeout - The session timeout in seconds. |
Modifier | Constructor and Description |
---|---|
protected | ExecutionEnvironment() - Creates a new Execution Environment. |
Modifier and Type | Method and Description |
---|---|
void | addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass) - Adds a new Kryo default serializer to the Runtime. |
<T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void | addDefaultKryoSerializer(Class<?> type, T serializer) - Adds a new Kryo default serializer to the Runtime. |
static boolean | areExplicitEnvironmentsAllowed() - Checks whether it is currently permitted to explicitly instantiate a LocalEnvironment or a RemoteEnvironment. |
static CollectionEnvironment | createCollectionsEnvironment() - Creates a CollectionEnvironment that uses Java Collections underneath. |
<X> DataSource<X> | createInput(InputFormat<X,?> inputFormat) - Generic method to create an input DataSet with an InputFormat. |
<X> DataSource<X> | createInput(InputFormat<X,?> inputFormat, TypeInformation<X> producedType) - Generic method to create an input DataSet with an InputFormat. |
static LocalEnvironment | createLocalEnvironment() - Creates a LocalEnvironment. |
static LocalEnvironment | createLocalEnvironment(Configuration customConfiguration) - Creates a LocalEnvironment. |
static LocalEnvironment | createLocalEnvironment(int parallelism) - Creates a LocalEnvironment. |
static ExecutionEnvironment | createLocalEnvironmentWithWebUI(Configuration conf) - Creates a LocalEnvironment for local program execution that also starts the web monitoring UI. |
Plan | createProgramPlan() - Creates the program's Plan. |
Plan | createProgramPlan(String jobName) - Creates the program's Plan. |
Plan | createProgramPlan(String jobName, boolean clearSinks) - Creates the program's Plan. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, Configuration clientConfiguration, String... jarFiles) - Creates a RemoteEnvironment. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, int parallelism, String... jarFiles) - Creates a RemoteEnvironment. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, String... jarFiles) - Creates a RemoteEnvironment. |
JobExecutionResult | execute() - Triggers the program execution. |
abstract JobExecutionResult | execute(String jobName) - Triggers the program execution. |
<X> DataSource<X> | fromCollection(Collection<X> data) - Creates a DataSet from the given non-empty collection. |
<X> DataSource<X> | fromCollection(Collection<X> data, TypeInformation<X> type) - Creates a DataSet from the given non-empty collection. |
<X> DataSource<X> | fromCollection(Iterator<X> data, Class<X> type) - Creates a DataSet from the given iterator. |
<X> DataSource<X> | fromCollection(Iterator<X> data, TypeInformation<X> type) - Creates a DataSet from the given iterator. |
<X> DataSource<X> | fromElements(Class<X> type, X... data) - Creates a new data set that contains the given elements. |
<X> DataSource<X> | fromElements(X... data) - Creates a new data set that contains the given elements. |
<X> DataSource<X> | fromParallelCollection(SplittableIterator<X> iterator, Class<X> type) - Creates a new data set that contains elements in the iterator. |
<X> DataSource<X> | fromParallelCollection(SplittableIterator<X> iterator, TypeInformation<X> type) - Creates a new data set that contains elements in the iterator. |
DataSource<Long> | generateSequence(long from, long to) - Creates a new data set that contains a sequence of numbers. |
ExecutionConfig | getConfig() - Gets the config object that defines execution parameters. |
static int | getDefaultLocalParallelism() - Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment(). |
static ExecutionEnvironment | getExecutionEnvironment() - Creates an execution environment that represents the context in which the program is currently executed. |
abstract String | getExecutionPlan() - Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph. |
JobID | getId() - Gets the JobID by which this environment is identified. |
String | getIdString() - Gets the JobID by which this environment is identified, as a string. |
JobExecutionResult | getLastJobExecutionResult() - Returns the JobExecutionResult of the last executed job. |
int | getNumberOfExecutionRetries() - Deprecated. This method will be replaced by getRestartStrategy(). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
int | getParallelism() - Gets the parallelism with which operations are executed by default. |
RestartStrategies.RestartStrategyConfiguration | getRestartStrategy() - Returns the specified restart strategy configuration. |
long | getSessionTimeout() - Gets the session timeout for this environment. |
protected static void | initializeContextEnvironment(ExecutionEnvironmentFactory ctx) - Sets a context environment factory that creates the context environment for running programs with pre-configured environments. |
CsvReader | readCsvFile(String filePath) - Creates a CSV reader to read a comma separated value (CSV) file. |
<X> DataSource<X> | readFile(FileInputFormat<X> inputFormat, String filePath) |
<X> DataSource<X> | readFileOfPrimitives(String filePath, Class<X> typeClass) - Creates a DataSet that represents the primitive type produced by reading the given file line wise. |
<X> DataSource<X> | readFileOfPrimitives(String filePath, String delimiter, Class<X> typeClass) - Creates a DataSet that represents the primitive type produced by reading the given file in a delimited way. |
DataSource<String> | readTextFile(String filePath) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
DataSource<String> | readTextFile(String filePath, String charsetName) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
DataSource<StringValue> | readTextFileWithValue(String filePath) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
DataSource<StringValue> | readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
void | registerCachedFile(String filePath, String name) - Registers a file at the distributed cache under the given name. |
void | registerCachedFile(String filePath, String name, boolean executable) - Registers a file at the distributed cache under the given name. |
protected void | registerCachedFilesWithPlan(Plan p) - Registers all files that were registered at this execution environment's cache registry with the given plan's cache registry. |
void | registerType(Class<?> type) - Registers the given type with the serialization stack. |
void | registerTypeWithKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass) - Registers the given Serializer via its class as a serializer for the given type at the KryoSerializer. |
<T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void | registerTypeWithKryoSerializer(Class<?> type, T serializer) - Registers the given type with a Kryo Serializer. |
protected static void | resetContextEnvironment() - Un-sets the context environment factory. |
static void | setDefaultLocalParallelism(int parallelism) - Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment(). |
void | setNumberOfExecutionRetries(int numberOfExecutionRetries) - Deprecated. This method will be replaced by setRestartStrategy(RestartStrategies.RestartStrategyConfiguration). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
void | setParallelism(int parallelism) - Sets the parallelism for operations executed through this environment. |
void | setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) - Sets the restart strategy configuration. |
void | setSessionTimeout(long timeout) - Sets the session timeout to hold the intermediate results of a job. |
abstract void | startNewSession() - Starts a new session, discarding the previous data flow and all of its intermediate results. |
protected static final org.slf4j.Logger LOG
protected JobExecutionResult lastJobExecutionResult
protected JobID jobID
protected long sessionTimeout
protected ExecutionEnvironment()
public ExecutionConfig getConfig()
public int getParallelism()
Gets the parallelism with which operations are executed by default. Operations can individually override this value to use a specific parallelism via Operator.setParallelism(int). Other operations may need to run with a different parallelism - for example, calling DataSet.reduce(org.apache.flink.api.common.functions.ReduceFunction) over the entire set will eventually insert an operation that runs non-parallel (parallelism of one).
Returns: The parallelism used by operations, or ExecutionConfig.PARALLELISM_DEFAULT if the environment's default parallelism should be used.

public void setParallelism(int parallelism)
Sets the parallelism for operations executed through this environment.
This method overrides the default parallelism for this environment. The LocalEnvironment uses by default a value equal to the number of hardware contexts (CPU cores / threads). When executing the program via the command line client from a JAR file, the default parallelism is the one configured for that setup.
Parameters:
parallelism - The parallelism
@PublicEvolving public void setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration)
Sets the restart strategy configuration.
Parameters:
restartStrategyConfiguration - Restart strategy configuration to be set
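For example, a fixed-delay strategy with three attempts and a ten-second pause could be configured as sketched below; the numbers are arbitrary:

```java
// RestartStrategies: org.apache.flink.api.common.restartstrategy;
// Time: org.apache.flink.api.common.time; TimeUnit: java.util.concurrent.
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
    3,                                 // restart a failed job at most 3 times
    Time.of(10, TimeUnit.SECONDS)));   // wait 10 seconds between attempts
```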
@PublicEvolving public RestartStrategies.RestartStrategyConfiguration getRestartStrategy()
Returns the specified restart strategy configuration.

@Deprecated @PublicEvolving public void setNumberOfExecutionRetries(int numberOfExecutionRetries)
Deprecated. This method will be replaced by setRestartStrategy(org.apache.flink.api.common.restartstrategy.RestartStrategies.RestartStrategyConfiguration). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries.
A value of -1 indicates that the system default value (as defined in the configuration) should be used.
Parameters:
numberOfExecutionRetries - The number of times the system will try to re-execute failed tasks.

@Deprecated @PublicEvolving public int getNumberOfExecutionRetries()
Deprecated. This method will be replaced by getRestartStrategy(). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries.
Returns: The number of execution retries. A value of -1 indicates that the system default value (as defined in the configuration) should be used.

public JobExecutionResult getLastJobExecutionResult()
Returns the JobExecutionResult of the last executed job.

@PublicEvolving public JobID getId()
Gets the JobID by which this environment is identified.
See Also: getIdString()

@PublicEvolving public String getIdString()
Gets the JobID by which this environment is identified, as a string.
See Also: getId()

@PublicEvolving public void setSessionTimeout(long timeout)
Sets the session timeout to hold the intermediate results of a job.
Parameters:
timeout - The timeout, in seconds.

@PublicEvolving public long getSessionTimeout()
Gets the session timeout for this environment.

@PublicEvolving public abstract void startNewSession() throws Exception
Starts a new session, discarding the previous data flow and all of its intermediate results.
Throws:
Exception
public <T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void addDefaultKryoSerializer(Class<?> type, T serializer)
Adds a new Kryo default serializer to the Runtime.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by Java serialization.
Parameters:
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.

public void addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Adds a new Kryo default serializer to the Runtime.
Parameters:
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.

public <T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void registerTypeWithKryoSerializer(Class<?> type, T serializer)
Registers the given type with a Kryo Serializer.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by Java serialization.
Parameters:
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.

public void registerTypeWithKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Registers the given Serializer via its class as a serializer for the given type at the KryoSerializer.
Parameters:
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.

public void registerType(Class<?> type)
Registers the given type with the serialization stack.
Parameters:
type - The class of the type to register.
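A self-contained sketch of the registration calls. LegacyId and LegacyIdSerializer are hypothetical placeholder classes; any Kryo Serializer that also implements java.io.Serializable can be passed as an instance.

```java
import java.io.Serializable;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.apache.flink.api.java.ExecutionEnvironment;

public class KryoRegistrationSketch {

    // Hypothetical domain type that Flink would otherwise serialize with generic Kryo handling.
    public static class LegacyId {
        public String value;
        public LegacyId() {}
        public LegacyId(String value) { this.value = value; }
    }

    // Custom Kryo serializer; it implements Serializable so the instance variant can ship it to workers.
    public static class LegacyIdSerializer extends Serializer<LegacyId> implements Serializable {
        @Override
        public void write(Kryo kryo, Output output, LegacyId id) {
            output.writeString(id.value);
        }
        @Override
        public LegacyId read(Kryo kryo, Input input, Class<LegacyId> type) {
            return new LegacyId(input.readString());
        }
    }

    public static void main(String[] args) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Use this serializer as the default for LegacyId where no other registration applies.
        env.addDefaultKryoSerializer(LegacyId.class, new LegacyIdSerializer());

        // Or bind the serializer class explicitly to the type.
        env.registerTypeWithKryoSerializer(LegacyId.class, LegacyIdSerializer.class);

        // Plain type registration without a custom serializer.
        env.registerType(LegacyId.class);
    }
}
```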
public DataSource<String> readTextFile(String filePath)
Creates a DataSet that represents the Strings produced by reading the given file line wise. The file will be read with the system's default character set.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
Returns: A DataSet that represents the data read from the given file as text lines.

public DataSource<String> readTextFile(String filePath, String charsetName)
Creates a DataSet that represents the Strings produced by reading the given file line wise. The Charset with the given name will be used to read the files.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
charsetName - The name of the character set used to read the file.
Returns: A DataSet that represents the data read from the given file as text lines.
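A brief sketch, assuming env is an ExecutionEnvironment; both paths are placeholders:

```java
// Read with the system's default character set.
DataSet<String> lines = env.readTextFile("hdfs:///logs/access.log");

// Read an ISO-8859-1 encoded file with an explicit charset.
DataSet<String> legacy = env.readTextFile("file:///data/legacy.txt", "ISO-8859-1");
```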
public DataSource<StringValue> readTextFileWithValue(String filePath)
Creates a DataSet that represents the Strings produced by reading the given file line wise. This method is similar to readTextFile(String), but it produces a DataSet with mutable StringValue objects, rather than Java Strings. StringValues can be used to tune implementations to be less object and garbage collection heavy.
The file will be read with the system's default character set.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
Returns: A DataSet that represents the data read from the given file as text lines.

public DataSource<StringValue> readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines)
Creates a DataSet that represents the Strings produced by reading the given file line wise. This method is similar to readTextFile(String, String), but it produces a DataSet with mutable StringValue objects, rather than Java Strings. StringValues can be used to tune implementations to be less object and garbage collection heavy.
The Charset with the given name will be used to read the files.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
charsetName - The name of the character set used to read the file.
skipInvalidLines - A flag to indicate whether to skip lines that cannot be read with the given character set.

public <X> DataSource<X> readFileOfPrimitives(String filePath, Class<X> typeClass)
Creates a DataSet that represents the primitive type produced by reading the given file line wise. This method is similar to readCsvFile(String) with a single field, but it produces a DataSet not wrapped in Tuple1.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
typeClass - The primitive type class to be read.
Returns: A DataSet that represents the data read from the given file as primitive type.

public <X> DataSource<X> readFileOfPrimitives(String filePath, String delimiter, Class<X> typeClass)
Creates a DataSet that represents the primitive type produced by reading the given file in a delimited way. This method is similar to readCsvFile(String) with a single field, but it produces a DataSet not wrapped in Tuple1.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
delimiter - The delimiter of the given file.
typeClass - The primitive type class to be read.
Returns: A DataSet that represents the data read from the given file as primitive type.

public CsvReader readCsvFile(String filePath)
Creates a CSV reader to read a comma separated value (CSV) file.
Parameters:
filePath - The path of the CSV file.

public <X> DataSource<X> readFile(FileInputFormat<X> inputFormat, String filePath)
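A sketch of configuring the returned CsvReader for a hypothetical semicolon-delimited file with a header line (env is an ExecutionEnvironment, the path is a placeholder):

```java
DataSet<Tuple2<String, Integer>> people = env.readCsvFile("file:///data/people.csv")
    .fieldDelimiter(";")                    // the default delimiter is ","
    .ignoreFirstLine()                      // skip the header row
    .types(String.class, Integer.class);    // field types; produces Tuple2<String, Integer>
```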
public <X> DataSource<X> createInput(InputFormat<X,?> inputFormat)
Generic method to create an input DataSet with an InputFormat. The DataSet will not be immediately created - instead, this method returns a DataSet that will be lazily created from the input format once the program is executed.
Since all data sets need specific information about their types, this method needs to determine the type of the data produced by the input format. It will attempt to determine the data type by reflection, unless the input format implements the ResultTypeQueryable interface. In the latter case, this method will invoke the ResultTypeQueryable.getProducedType() method to determine the data type produced by the input format.
Parameters:
inputFormat - The input format used to create the data set.
Returns: A DataSet that represents the data created by the input format.
See Also: createInput(InputFormat, TypeInformation)

public <X> DataSource<X> createInput(InputFormat<X,?> inputFormat, TypeInformation<X> producedType)
Generic method to create an input DataSet with an InputFormat. The DataSet will not be immediately created - instead, this method returns a DataSet that will be lazily created from the input format once the program is executed.
The DataSet is typed to the given TypeInformation. This method is intended for input formats where the return type cannot be determined by reflection analysis, and that do not implement the ResultTypeQueryable interface.
Parameters:
inputFormat - The input format used to create the data set.
producedType - The TypeInformation for the produced data set.
Returns: A DataSet that represents the data created by the input format.
See Also: createInput(InputFormat)
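A sketch that pairs Flink's TextInputFormat with an explicitly given produced type, so no reflection analysis is needed (env is an ExecutionEnvironment, the path is a placeholder):

```java
// TextInputFormat: org.apache.flink.api.java.io; Path: org.apache.flink.core.fs;
// BasicTypeInfo: org.apache.flink.api.common.typeinfo.
TextInputFormat format = new TextInputFormat(new Path("file:///data/input.txt"));
DataSet<String> lines = env.createInput(format, BasicTypeInfo.STRING_TYPE_INFO);
```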
public <X> DataSource<X> fromCollection(Collection<X> data)
Creates a DataSet from the given non-empty collection. The framework will try and determine the exact type from the collection elements. In case of generic elements, it may be necessary to manually supply the type information via fromCollection(Collection, TypeInformation).
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
See Also: fromCollection(Collection, TypeInformation)

public <X> DataSource<X> fromCollection(Collection<X> data, TypeInformation<X> type)
Creates a DataSet from the given non-empty collection. The returned DataSet is typed to the given TypeInformation.
Parameters:
data - The collection of elements to create the data set from.
type - The TypeInformation for the produced data set.
See Also: fromCollection(Collection)

public <X> DataSource<X> fromCollection(Iterator<X> data, Class<X> type)
Creates a DataSet from the given iterator. Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
type - The class of the data produced by the iterator. Must not be a generic class.
See Also: fromCollection(Iterator, TypeInformation)

public <X> DataSource<X> fromCollection(Iterator<X> data, TypeInformation<X> type)
Creates a DataSet from the given iterator. This method is useful for cases where the type is generic, and the type class (as given in fromCollection(Iterator, Class)) does not supply all type information.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
type - The TypeInformation for the produced data set.
See Also: fromCollection(Iterator, Class)

@SafeVarargs public final <X> DataSource<X> fromElements(X... data)
Creates a new data set that contains the given elements. The elements must all be of the same type, for example all String or all Integer. The sequence of elements must not be empty.
The framework will try and determine the exact type from the collection elements. In case of generic elements, it may be necessary to manually supply the type information via fromCollection(Collection, TypeInformation).
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The elements to make up the data set.

@SafeVarargs public final <X> DataSource<X> fromElements(Class<X> type, X... data)
Creates a new data set that contains the given elements.
Parameters:
type - The base class type for every element in the collection.
data - The elements to make up the data set.
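A sketch of both variants, assuming env is an ExecutionEnvironment; the explicit TypeInformation form is shown for a generic Tuple2 element type:

```java
// Simple element types: the type is extracted automatically.
DataSet<String> words = env.fromElements("to", "be", "or", "not");

// Generic element types: supply the type information explicitly
// (TypeHint and TypeInformation live in org.apache.flink.api.common.typeinfo).
List<Tuple2<String, Integer>> counts = Arrays.asList(Tuple2.of("be", 2), Tuple2.of("or", 1));
DataSet<Tuple2<String, Integer>> typed =
    env.fromCollection(counts, TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {}));
```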
public <X> DataSource<X> fromParallelCollection(SplittableIterator<X> iterator, Class<X> type)
Creates a new data set that contains elements in the iterator.
Because the iterator will remain unmodified until the actual execution happens, the type of data returned by the iterator must be given explicitly in the form of the type class (this is due to the fact that the Java compiler erases the generic type information).
Parameters:
iterator - The iterator that produces the elements of the data set.
type - The class of the data produced by the iterator. Must not be a generic class.
See Also: fromParallelCollection(SplittableIterator, TypeInformation)

public <X> DataSource<X> fromParallelCollection(SplittableIterator<X> iterator, TypeInformation<X> type)
Creates a new data set that contains elements in the iterator.
Because the iterator will remain unmodified until the actual execution happens, the type of data returned by the iterator must be given explicitly in the form of the type information. This method is useful for cases where the type is generic. In that case, the type class (as given in fromParallelCollection(SplittableIterator, Class)) does not supply all type information.
Parameters:
iterator - The iterator that produces the elements of the data set.
type - The TypeInformation for the produced data set.
See Also: fromParallelCollection(SplittableIterator, Class)
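A sketch using NumberSequenceIterator, a SplittableIterator implementation shipped with Flink in org.apache.flink.util (env is an ExecutionEnvironment):

```java
// A parallel source of the numbers 1 to 10,000,000.
DataSet<Long> numbers = env.fromParallelCollection(
    new NumberSequenceIterator(1L, 10_000_000L),
    BasicTypeInfo.LONG_TYPE_INFO);
```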
public DataSource<Long> generateSequence(long from, long to)
Creates a new data set that contains a sequence of numbers.
Parameters:
from - The number to start at (inclusive).
to - The number to stop at (inclusive).
Returns: A DataSet containing the numbers in the [from, to] interval.

public JobExecutionResult execute() throws Exception
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with a generated default name.
Throws:
Exception - Thrown, if the program execution fails.
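A sketch of a complete pipeline with an explicit sink and job name; the output path is a placeholder, and the returned JobExecutionResult carries the runtime and accumulator results:

```java
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

env.generateSequence(1, 10_000)
   .filter(x -> x % 2 == 0)
   .writeAsText("file:///tmp/evens");   // a sink; nothing runs until execute() is called

JobExecutionResult result = env.execute("even-numbers");
System.out.println("Job took " + result.getNetRuntime() + " ms");
```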
public abstract JobExecutionResult execute(String jobName) throws Exception
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with the given job name.
Throws:
Exception - Thrown, if the program execution fails.
public abstract String getExecutionPlan() throws Exception
Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph.
Throws:
Exception - Thrown, if the compiler could not be instantiated, or the master could not be contacted to retrieve information relevant to the execution planning.

public void registerCachedFile(String filePath, String name)
Registers a file at the distributed cache under the given name. The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
Parameters:
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.

public void registerCachedFile(String filePath, String name, boolean executable)
Registers a file at the distributed cache under the given name. The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
Parameters:
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.
executable - Flag indicating whether the file should be executable.
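A sketch of registering a file and reading it back inside a rich function; the path and the registration name are placeholders:

```java
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.registerCachedFile("hdfs:///shared/stopwords.txt", "stopwords");

DataSet<String> words = env.fromElements("the", "flink", "of", "dataset");
DataSet<String> filtered = words.filter(new RichFilterFunction<String>() {
    private Set<String> stopwords;

    @Override
    public void open(Configuration parameters) throws Exception {
        // The registered file is available as a local file on every worker.
        File file = getRuntimeContext().getDistributedCache().getFile("stopwords");
        stopwords = new HashSet<>(Files.readAllLines(file.toPath()));
    }

    @Override
    public boolean filter(String value) {
        return !stopwords.contains(value);
    }
});
```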
protected void registerCachedFilesWithPlan(Plan p) throws IOException
Registers all files that were registered at this execution environment's cache registry with the given plan's cache registry.
Parameters:
p - The plan to register files at.
Throws:
IOException - Thrown if checks for existence and sanity fail.

@Internal public Plan createProgramPlan()
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PlanExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
This automatically starts a new stage of execution.

@Internal public Plan createProgramPlan(String jobName)
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PlanExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
This automatically starts a new stage of execution.
Parameters:
jobName - The name attached to the plan (displayed in logs and monitoring).

@Internal public Plan createProgramPlan(String jobName, boolean clearSinks)
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PlanExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
Parameters:
jobName - The name attached to the plan (displayed in logs and monitoring).
clearSinks - Whether or not to start a new stage of execution.

public static ExecutionEnvironment getExecutionEnvironment()
Creates an execution environment that represents the context in which the program is currently executed. If the program is invoked standalone, this method returns a local execution environment, as returned by createLocalEnvironment(). If the program is invoked from within the command line client to be submitted to a cluster, this method returns the execution environment of this cluster.

@PublicEvolving public static CollectionEnvironment createCollectionsEnvironment()
Creates a CollectionEnvironment that uses Java Collections underneath. This will execute in a single thread in the current JVM. It is very fast but will fail if the data does not fit into memory. Parallelism will always be 1. This is useful during implementation and for debugging.

public static LocalEnvironment createLocalEnvironment()
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. The default parallelism of the local environment is the number of hardware contexts (CPU cores / threads), unless it was specified differently by setDefaultLocalParallelism(int).

public static LocalEnvironment createLocalEnvironment(int parallelism)
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. It will use the parallelism specified in the parameter.
Parameters:
parallelism - The parallelism for the local environment.

public static LocalEnvironment createLocalEnvironment(Configuration customConfiguration)
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in.
Parameters:
customConfiguration - Pass a custom configuration to the LocalEnvironment.

@PublicEvolving public static ExecutionEnvironment createLocalEnvironmentWithWebUI(Configuration conf)
Creates a LocalEnvironment for local program execution that also starts the web monitoring UI.
The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in.
If the configuration key 'jobmanager.web.port' was set in the configuration, that particular port will be used for the web UI. Otherwise, the default port (8081) will be used.
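A sketch that pins the web UI to a specific port via the 'jobmanager.web.port' key mentioned above (the port value is arbitrary):

```java
// Configuration: org.apache.flink.configuration.Configuration
Configuration conf = new Configuration();
conf.setInteger("jobmanager.web.port", 8082);   // optional; the UI defaults to port 8081
ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
```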
public static ExecutionEnvironment createRemoteEnvironment(String host, int port, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The execution will use the cluster's default parallelism, unless the parallelism is set explicitly via setParallelism(int).
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.

public static ExecutionEnvironment createRemoteEnvironment(String host, int port, Configuration clientConfiguration, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The custom configuration is used to configure Akka-specific configuration parameters for the client only; program parallelism can be set via setParallelism(int).
Cluster configuration has to be done in the remotely running Flink instance.
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
clientConfiguration - Configuration used by the client that connects to the cluster.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.

public static ExecutionEnvironment createRemoteEnvironment(String host, int port, int parallelism, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The execution will use the specified parallelism.
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
parallelism - The parallelism to use during the execution.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.
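A sketch with placeholder host, port, and JAR path:

```java
ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment(
    "jobmanager.example.com",        // JobManager host (placeholder)
    6123,                            // JobManager RPC port (placeholder)
    "/home/user/target/my-job.jar"); // JAR containing the user-defined functions (placeholder)
env.setParallelism(16);
```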
public static int getDefaultLocalParallelism()
Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().

public static void setDefaultLocalParallelism(int parallelism)
Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().
Parameters:
parallelism - The parallelism to use as the default local parallelism.

protected static void initializeContextEnvironment(ExecutionEnvironmentFactory ctx)
Sets a context environment factory that creates the context environment for running programs with pre-configured environments. When the context environment factory is set, no other environments can be explicitly used.
Parameters:
ctx - The context environment factory.

protected static void resetContextEnvironment()
Un-sets the context environment factory. After this, getExecutionEnvironment() will again return a default local execution environment, and it is possible to explicitly instantiate the LocalEnvironment and the RemoteEnvironment.

@Internal public static boolean areExplicitEnvironmentsAllowed()
Checks whether it is currently permitted to explicitly instantiate a LocalEnvironment or a RemoteEnvironment.