@Deprecated @Public public class ExecutionEnvironment extends Object
The ExecutionEnvironment is the context in which a program is executed. A LocalEnvironment will cause execution in the current JVM; a RemoteEnvironment will cause execution on a remote setup.
The environment provides methods to control the job execution (such as setting the parallelism) and to interact with the outside world (data access).
Please note that the execution environment needs strong type information for the input and return types of all operations that are executed. This means that the environment needs to know, for example, that the return value of an operation is a Tuple of String and Integer. Because the Java compiler throws much of the generic type information away, most methods attempt to re-obtain that information using reflection. In certain cases, it may be necessary to manually supply that information to some of the methods.
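The following is a minimal sketch of why type information can get lost and how it can be supplied by hand. It uses only the JDK, not Flink's own type extraction. `ErasureSketch` and `TypeToken` are hypothetical names introduced for illustration, and `Map.Entry<String, Integer>` stands in for a Tuple of String and Integer.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ErasureSketch {

    /** Hypothetical helper: an anonymous subclass records T in its generic superclass. */
    public abstract static class TypeToken<T> {
        public Type captured() {
            ParameterizedType superType =
                    (ParameterizedType) getClass().getGenericSuperclass();
            return superType.getActualTypeArguments()[0];
        }
    }

    /** After erasure, both lists share a single runtime class. */
    public static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    /** The anonymous subclass keeps the full generic type available to reflection. */
    public static Type capturedPairType() {
        return new TypeToken<Map.Entry<String, Integer>>() {}.captured();
    }

    public static void main(String[] args) {
        System.out.println("Same runtime class: " + sameRuntimeClass());
        System.out.println("Captured type: " + capturedPairType());
    }
}
```

The element type of a plain generic instance is not visible at runtime, whereas a type declared in a class signature is. This is one reason reflection succeeds in some cases and needs an explicit hint in others.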
Modifier and Type | Field and Description |
---|---|
protected JobExecutionResult | lastJobExecutionResult Deprecated. Result from the latest execution, to make it retrievable when using eager execution methods. |
protected static org.slf4j.Logger | LOG Deprecated. The logger used by the environment and its subclasses. |
Modifier | Constructor and Description |
---|---|
protected | ExecutionEnvironment() Deprecated. Creates a new Execution Environment. |
 | ExecutionEnvironment(Configuration configuration) Deprecated. Creates a new ExecutionEnvironment that will use the given Configuration to configure the PipelineExecutor. |
 | ExecutionEnvironment(Configuration configuration, ClassLoader userClassloader) Deprecated. Creates a new ExecutionEnvironment that will use the given Configuration to configure the PipelineExecutor. |
 | ExecutionEnvironment(PipelineExecutorServiceLoader executorServiceLoader, Configuration configuration, ClassLoader userClassloader) Deprecated. Creates a new ExecutionEnvironment that will use the given Configuration to configure the PipelineExecutor. |
Modifier and Type | Method and Description |
---|---|
void | addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass) Deprecated. Adds a new Kryo default serializer to the Runtime. |
<T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void | addDefaultKryoSerializer(Class<?> type, T serializer) Deprecated. Adds a new Kryo default serializer to the Runtime. |
static boolean | areExplicitEnvironmentsAllowed() Deprecated. Checks whether it is currently permitted to explicitly instantiate a LocalEnvironment or a RemoteEnvironment. |
void | clearJobListeners() Deprecated. Clears all registered JobListeners. |
void | configure(ReadableConfig configuration, ClassLoader classLoader) Deprecated. Sets all relevant options contained in the ReadableConfig, such as PipelineOptions.CACHED_FILES. |
static CollectionEnvironment | createCollectionsEnvironment() Deprecated. Creates a CollectionEnvironment that uses Java Collections underneath. |
<X> DataSource<X> | createInput(InputFormat<X,?> inputFormat) Deprecated. Generic method to create an input DataSet with an InputFormat. |
<X> DataSource<X> | createInput(InputFormat<X,?> inputFormat, TypeInformation<X> producedType) Deprecated. Generic method to create an input DataSet with an InputFormat. |
static LocalEnvironment | createLocalEnvironment() Deprecated. Creates a LocalEnvironment. |
static LocalEnvironment | createLocalEnvironment(Configuration customConfiguration) Deprecated. Creates a LocalEnvironment. |
static LocalEnvironment | createLocalEnvironment(int parallelism) Deprecated. Creates a LocalEnvironment. |
static ExecutionEnvironment | createLocalEnvironmentWithWebUI(Configuration conf) Deprecated. Creates a LocalEnvironment for local program execution that also starts the web monitoring UI. |
Plan | createProgramPlan() Deprecated. Creates the program's Plan. |
Plan | createProgramPlan(String jobName) Deprecated. Creates the program's Plan. |
Plan | createProgramPlan(String jobName, boolean clearSinks) Deprecated. Creates the program's Plan. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, Configuration clientConfiguration, String... jarFiles) Deprecated. Creates a RemoteEnvironment. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, int parallelism, String... jarFiles) Deprecated. Creates a RemoteEnvironment. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, String... jarFiles) Deprecated. Creates a RemoteEnvironment. |
JobExecutionResult | execute() Deprecated. Triggers the program execution. |
JobExecutionResult | execute(String jobName) Deprecated. Triggers the program execution. |
JobClient | executeAsync() Deprecated. Triggers the program execution asynchronously. |
JobClient | executeAsync(String jobName) Deprecated. Triggers the program execution asynchronously. |
<X> DataSource<X> | fromCollection(Collection<X> data) Deprecated. Creates a DataSet from the given non-empty collection. |
<X> DataSource<X> | fromCollection(Collection<X> data, TypeInformation<X> type) Deprecated. Creates a DataSet from the given non-empty collection. |
<X> DataSource<X> | fromCollection(Iterator<X> data, Class<X> type) Deprecated. Creates a DataSet from the given iterator. |
<X> DataSource<X> | fromCollection(Iterator<X> data, TypeInformation<X> type) Deprecated. Creates a DataSet from the given iterator. |
<X> DataSource<X> | fromElements(Class<X> type, X... data) Deprecated. Creates a new data set that contains the given elements. |
<X> DataSource<X> | fromElements(X... data) Deprecated. Creates a new data set that contains the given elements. |
<X> DataSource<X> | fromParallelCollection(SplittableIterator<X> iterator, Class<X> type) Deprecated. Creates a new data set that contains elements in the iterator. |
<X> DataSource<X> | fromParallelCollection(SplittableIterator<X> iterator, TypeInformation<X> type) Deprecated. Creates a new data set that contains elements in the iterator. |
DataSource<Long> | generateSequence(long from, long to) Deprecated. Creates a new data set that contains a sequence of numbers. |
ExecutionConfig | getConfig() Deprecated. Gets the config object that defines execution parameters. |
Configuration | getConfiguration() Deprecated. |
static int | getDefaultLocalParallelism() Deprecated. Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment(). |
static ExecutionEnvironment | getExecutionEnvironment() Deprecated. Creates an execution environment that represents the context in which the program is currently executed. |
String | getExecutionPlan() Deprecated. Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph. |
PipelineExecutorServiceLoader | getExecutorServiceLoader() Deprecated. |
protected List<JobListener> | getJobListeners() Deprecated. Gets the config JobListeners. |
JobExecutionResult | getLastJobExecutionResult() Deprecated. Returns the JobExecutionResult of the last executed job. |
int | getNumberOfExecutionRetries() Deprecated. This method will be replaced by getRestartStrategy(). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
int | getParallelism() Deprecated. Gets the parallelism with which operations are executed by default. |
RestartStrategies.RestartStrategyConfiguration | getRestartStrategy() Deprecated. Returns the specified restart strategy configuration. |
ClassLoader | getUserCodeClassLoader() Deprecated. |
protected static void | initializeContextEnvironment(ExecutionEnvironmentFactory ctx) Deprecated. Sets a context environment factory that creates the context environment for running programs with pre-configured environments. |
CsvReader | readCsvFile(String filePath) Deprecated. Creates a CSV reader to read a comma separated value (CSV) file. |
<X> DataSource<X> | readFile(FileInputFormat<X> inputFormat, String filePath) Deprecated. |
<X> DataSource<X> | readFileOfPrimitives(String filePath, Class<X> typeClass) Deprecated. Creates a DataSet that represents the primitive type produced by reading the given file line by line. |
<X> DataSource<X> | readFileOfPrimitives(String filePath, String delimiter, Class<X> typeClass) Deprecated. Creates a DataSet that represents the primitive type produced by reading the given file in a delimited way. |
DataSource<String> | readTextFile(String filePath) Deprecated. Creates a DataSet that represents the Strings produced by reading the given file line by line. |
DataSource<String> | readTextFile(String filePath, String charsetName) Deprecated. Creates a DataSet that represents the Strings produced by reading the given file line by line. |
DataSource<StringValue> | readTextFileWithValue(String filePath) Deprecated. Creates a DataSet that represents the Strings produced by reading the given file line by line. |
DataSource<StringValue> | readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines) Deprecated. Creates a DataSet that represents the Strings produced by reading the given file line by line. |
void | registerCachedFile(String filePath, String name) Deprecated. Registers a file at the distributed cache under the given name. |
void | registerCachedFile(String filePath, String name, boolean executable) Deprecated. Registers a file at the distributed cache under the given name. |
void | registerJobListener(JobListener jobListener) Deprecated. Registers a JobListener in this environment. |
void | registerType(Class<?> type) Deprecated. Registers the given type with the serialization stack. |
void | registerTypeWithKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass) Deprecated. Registers the given Serializer via its class as a serializer for the given type at the KryoSerializer. |
<T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void | registerTypeWithKryoSerializer(Class<?> type, T serializer) Deprecated. Registers the given type with a Kryo Serializer. |
protected static void | resetContextEnvironment() Deprecated. Un-sets the context environment factory. |
static void | setDefaultLocalParallelism(int parallelism) Deprecated. Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment(). |
void | setNumberOfExecutionRetries(int numberOfExecutionRetries) Deprecated. This method will be replaced by setRestartStrategy(org.apache.flink.api.common.restartstrategy.RestartStrategies.RestartStrategyConfiguration). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
void | setParallelism(int parallelism) Deprecated. Sets the parallelism for operations executed through this environment. |
void | setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) Deprecated. Sets the restart strategy configuration. |
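protected static final org.slf4j.Logger LOG
The logger used by the environment and its subclasses.
protected JobExecutionResult lastJobExecutionResult
Result from the latest execution, to make it retrievable when using eager execution methods.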
@PublicEvolving public ExecutionEnvironment(Configuration configuration)
Creates a new ExecutionEnvironment that will use the given Configuration to configure the PipelineExecutor.
@PublicEvolving public ExecutionEnvironment(Configuration configuration, ClassLoader userClassloader)
Creates a new ExecutionEnvironment that will use the given Configuration to configure the PipelineExecutor.
In addition, this constructor allows specifying the user code ClassLoader.
@PublicEvolving public ExecutionEnvironment(PipelineExecutorServiceLoader executorServiceLoader, Configuration configuration, ClassLoader userClassloader)
Creates a new ExecutionEnvironment that will use the given Configuration to configure the PipelineExecutor.
In addition, this constructor allows specifying the PipelineExecutorServiceLoader and user code ClassLoader.
protected ExecutionEnvironment()
Creates a new Execution Environment.
@Internal public ClassLoader getUserCodeClassLoader()
@Internal public PipelineExecutorServiceLoader getExecutorServiceLoader()
@Internal public Configuration getConfiguration()
public ExecutionConfig getConfig()
Gets the config object that defines execution parameters.
protected List<JobListener> getJobListeners()
Gets the config JobListeners.
public int getParallelism()
Gets the parallelism with which operations are executed by default. Operations can individually override this value via Operator.setParallelism(int). Other operations may need to run with a different parallelism; for example, calling DataSet.reduce(org.apache.flink.api.common.functions.ReduceFunction) over the entire set will eventually insert an operation that runs non-parallel (parallelism of one).
Returns:
The parallelism used by operations, unless they override that value. This method returns ExecutionConfig.PARALLELISM_DEFAULT if the environment's default parallelism should be used.
public void setParallelism(int parallelism)
Sets the parallelism for operations executed through this environment.
This method overrides the default parallelism for this environment. By default, the LocalEnvironment uses a value equal to the number of hardware contexts (CPU cores / threads). When executing the program via the command line client from a JAR file, the default parallelism is the one configured for that setup.
Parameters:
parallelism - The parallelism
@PublicEvolving public void setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration)
Sets the restart strategy configuration.
Parameters:
restartStrategyConfiguration - Restart strategy configuration to be set
@PublicEvolving public RestartStrategies.RestartStrategyConfiguration getRestartStrategy()
Returns the specified restart strategy configuration.
@Deprecated @PublicEvolving public void setNumberOfExecutionRetries(int numberOfExecutionRetries)
Deprecated. This method will be replaced by setRestartStrategy(org.apache.flink.api.common.restartstrategy.RestartStrategies.RestartStrategyConfiguration). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries.
A value of -1 indicates that the system default value (as defined in the configuration) should be used.
Parameters:
numberOfExecutionRetries - The number of times the system will try to re-execute failed tasks.
@Deprecated @PublicEvolving public int getNumberOfExecutionRetries()
Deprecated. This method will be replaced by getRestartStrategy(). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries.
A value of -1 indicates that the system default value (as defined in the configuration) should be used.
public JobExecutionResult getLastJobExecutionResult()
Returns the JobExecutionResult of the last executed job.
public <T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void addDefaultKryoSerializer(Class<?> type, T serializer)
Adds a new Kryo default serializer to the Runtime.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by Java serialization.
Parameters:
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.
public void addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Adds a new Kryo default serializer to the Runtime.
Parameters:
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.
public <T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void registerTypeWithKryoSerializer(Class<?> type, T serializer)
Registers the given type with a Kryo Serializer.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by Java serialization.
Parameters:
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.
public void registerTypeWithKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Registers the given Serializer via its class as a serializer for the given type at the KryoSerializer.
Parameters:
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.
public void registerType(Class<?> type)
Registers the given type with the serialization stack.
Parameters:
type - The class of the type to register.
@PublicEvolving public void configure(ReadableConfig configuration, ClassLoader classLoader)
Sets all relevant options contained in the ReadableConfig, such as PipelineOptions.CACHED_FILES. It will reconfigure ExecutionEnvironment and ExecutionConfig.
It will change the value of a setting only if a corresponding option was set in the configuration. If a key is not present, the current value of a field will remain untouched.
Parameters:
configuration - a configuration to read the values from
classLoader - a class loader to use when loading classes
public DataSource<String> readTextFile(String filePath)
Creates a DataSet that represents the Strings produced by reading the given file line by line. The file will be read with the UTF-8 character set.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
Returns:
A DataSet that represents the data read from the given file as text lines.
public DataSource<String> readTextFile(String filePath, String charsetName)
Creates a DataSet that represents the Strings produced by reading the given file line by line. The Charset with the given name will be used to read the files.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
charsetName - The name of the character set used to read the file.
Returns:
A DataSet that represents the data read from the given file as text lines.
public DataSource<StringValue> readTextFileWithValue(String filePath)
Creates a DataSet that represents the Strings produced by reading the given file line by line. This method is similar to readTextFile(String), but it produces a DataSet with mutable StringValue objects rather than Java Strings. StringValues can be used to tune implementations to be less object and garbage collection heavy.
The file will be read with the UTF-8 character set.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
Returns:
A DataSet that represents the data read from the given file as text lines.
public DataSource<StringValue> readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines)
Creates a DataSet that represents the Strings produced by reading the given file line by line. This method is similar to readTextFile(String, String), but it produces a DataSet with mutable StringValue objects rather than Java Strings. StringValues can be used to tune implementations to be less object and garbage collection heavy.
The Charset with the given name will be used to read the files.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
charsetName - The name of the character set used to read the file.
skipInvalidLines - A flag to indicate whether to skip lines that cannot be read with the given character set.
public <X> DataSource<X> readFileOfPrimitives(String filePath, Class<X> typeClass)
Creates a DataSet that represents the primitive type produced by reading the given file line by line. This method is similar to readCsvFile(String) with a single field, but it produces the DataSet directly rather than through Tuple1.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
typeClass - The primitive type class to be read.
Returns:
A DataSet that represents the data read from the given file as primitive type.
public <X> DataSource<X> readFileOfPrimitives(String filePath, String delimiter, Class<X> typeClass)
Creates a DataSet that represents the primitive type produced by reading the given file in a delimited way. This method is similar to readCsvFile(String) with a single field, but it produces the DataSet directly rather than through Tuple1.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
delimiter - The delimiter of the given file.
typeClass - The primitive type class to be read.
Returns:
A DataSet that represents the data read from the given file as primitive type.
public CsvReader readCsvFile(String filePath)
Creates a CSV reader to read a comma separated value (CSV) file.
Parameters:
filePath - The path of the CSV file.
public <X> DataSource<X> readFile(FileInputFormat<X> inputFormat, String filePath)
public <X> DataSource<X> createInput(InputFormat<X,?> inputFormat)
Generic method to create an input DataSet with an InputFormat. The DataSet will not be immediately created; instead, this method returns a DataSet that will be lazily created from the input format once the program is executed.
Since all data sets need specific information about their types, this method needs to determine the type of the data produced by the input format. It will attempt to determine the data type by reflection, unless the input format implements the ResultTypeQueryable interface. In the latter case, this method will invoke the ResultTypeQueryable.getProducedType() method to determine the data type produced by the input format.
Parameters:
inputFormat - The input format used to create the data set.
Returns:
A DataSet that represents the data created by the input format.
See Also:
createInput(InputFormat, TypeInformation)
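The lookup order described above (ask the format if it can report its type, otherwise fall back to reflection) can be sketched in plain Java. This is not Flink's implementation. `SimpleFormat`, `SelfDescribing`, and `determineType` are hypothetical stand-ins for InputFormat, ResultTypeQueryable, and the internal type extraction.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ProducedTypeSketch {

    /** Hypothetical stand-in for an input format producing records of type T. */
    public interface SimpleFormat<T> { T next(); }

    /** Hypothetical stand-in for ResultTypeQueryable. */
    public interface SelfDescribing { Class<?> getProducedType(); }

    public static Class<?> determineType(SimpleFormat<?> format) {
        // 1. A format that describes itself is trusted directly.
        if (format instanceof SelfDescribing) {
            return ((SelfDescribing) format).getProducedType();
        }
        // 2. Otherwise inspect the class signature via reflection.
        for (Type implemented : format.getClass().getGenericInterfaces()) {
            if (implemented instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) implemented;
                Type arg = p.getActualTypeArguments()[0];
                if (p.getRawType() == SimpleFormat.class && arg instanceof Class) {
                    return (Class<?>) arg;
                }
            }
        }
        // 3. Reflection found only a type variable: the caller must supply the type.
        throw new IllegalStateException("Type cannot be determined; supply it explicitly");
    }

    /** The type is fixed in the class signature, so reflection can find it. */
    public static class LineFormat implements SimpleFormat<String> {
        @Override public String next() { return "line"; }
    }

    /** Generic, but reports its own type. */
    public static class GenericFormat<T> implements SimpleFormat<T>, SelfDescribing {
        private final Class<T> type;
        public GenericFormat(Class<T> type) { this.type = type; }
        @Override public T next() { return null; }
        @Override public Class<?> getProducedType() { return type; }
    }

    /** Generic and silent: neither route works. */
    public static class OpaqueFormat<T> implements SimpleFormat<T> {
        @Override public T next() { return null; }
    }
}
```

In this sketch, the `OpaqueFormat` case corresponds to the situation in which the overload taking an explicit TypeInformation is needed.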
public <X> DataSource<X> createInput(InputFormat<X,?> inputFormat, TypeInformation<X> producedType)
Generic method to create an input DataSet with an InputFormat. The DataSet will not be immediately created; instead, this method returns a DataSet that will be lazily created from the input format once the program is executed.
The DataSet is typed to the given TypeInformation. This method is intended for input formats where the return type cannot be determined by reflection analysis, and that do not implement the ResultTypeQueryable interface.
Parameters:
inputFormat - The input format used to create the data set.
Returns:
A DataSet that represents the data created by the input format.
See Also:
createInput(InputFormat)
public <X> DataSource<X> fromCollection(Collection<X> data)
Creates a DataSet from the given non-empty collection.
The framework will try to determine the exact type from the collection elements. In case of generic elements, it may be necessary to manually supply the type information via fromCollection(Collection, TypeInformation).
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
See Also:
fromCollection(Collection, TypeInformation)
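A small sketch of inferring a type from collection elements, assuming a simplified rule (all elements share one runtime class). It is plain JDK code rather than Flink's extraction logic, and `inferElementClass` is a hypothetical helper. It illustrates why the collection has to be non-empty and why generic elements lose their type parameters.

```java
import java.util.Collection;

public class ElementTypeSketch {

    /** Hypothetical helper: derives the element class by looking at the elements. */
    public static Class<?> inferElementClass(Collection<?> data) {
        if (data.isEmpty()) {
            // Without elements there is nothing to inspect.
            throw new IllegalArgumentException("Cannot infer a type from an empty collection");
        }
        Class<?> first = data.iterator().next().getClass();
        for (Object element : data) {
            if (element.getClass() != first) {
                throw new IllegalArgumentException("Elements have differing classes");
            }
        }
        // Only the raw class is visible here; for an element such as
        // ArrayList<String>, the String parameter cannot be recovered.
        return first;
    }
}
```

For a collection of Strings this returns `String.class`; for a collection of `ArrayList<String>` instances it returns only `ArrayList.class`, which is the case where an explicit type has to be supplied.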
public <X> DataSource<X> fromCollection(Collection<X> data, TypeInformation<X> type)
Creates a DataSet from the given non-empty collection.
The returned DataSet is typed to the given TypeInformation.
Parameters:
data - The collection of elements to create the data set from.
type - The TypeInformation for the produced data set.
See Also:
fromCollection(Collection)
public <X> DataSource<X> fromCollection(Iterator<X> data, Class<X> type)
Creates a DataSet from the given iterator.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
type - The class of the data produced by the iterator. Must not be a generic class.
See Also:
fromCollection(Iterator, TypeInformation)
public <X> DataSource<X> fromCollection(Iterator<X> data, TypeInformation<X> type)
Creates a DataSet from the given iterator. This method is useful for cases where the type is generic. In that case, the type class (as given in fromCollection(Iterator, Class)) does not supply all type information.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
type - The TypeInformation for the produced data set.
See Also:
fromCollection(Iterator, Class)
@SafeVarargs public final <X> DataSource<X> fromElements(X... data)
Creates a new data set that contains the given elements. The elements must all be of the same type, for example all String or all Integer. The sequence of elements must not be empty.
The framework will try to determine the exact type from the collection elements. In case of generic elements, it may be necessary to manually supply the type information via fromCollection(Collection, TypeInformation).
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The elements to make up the data set.
@SafeVarargs public final <X> DataSource<X> fromElements(Class<X> type, X... data)
Creates a new data set that contains the given elements.
Parameters:
type - The base class type for every element in the collection.
data - The elements to make up the data set.
public <X> DataSource<X> fromParallelCollection(SplittableIterator<X> iterator, Class<X> type)
Creates a new data set that contains elements in the iterator.
Because the iterator will remain unmodified until the actual execution happens, the type of data returned by the iterator must be given explicitly in the form of the type class (this is due to the fact that the Java compiler erases the generic type information).
Parameters:
iterator - The iterator that produces the elements of the data set.
type - The class of the data produced by the iterator. Must not be a generic class.
See Also:
fromParallelCollection(SplittableIterator, TypeInformation)
public <X> DataSource<X> fromParallelCollection(SplittableIterator<X> iterator, TypeInformation<X> type)
Creates a new data set that contains elements in the iterator.
Because the iterator will remain unmodified until the actual execution happens, the type of data returned by the iterator must be given explicitly in the form of the type information. This method is useful for cases where the type is generic. In that case, the type class (as given in fromParallelCollection(SplittableIterator, Class)) does not supply all type information.
Parameters:
iterator - The iterator that produces the elements of the data set.
type - The TypeInformation for the produced data set.
See Also:
fromParallelCollection(SplittableIterator, Class)
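To illustrate what a splittable source makes possible, here is a sketch that divides an inclusive number range into contiguous partitions, one per parallel reader. It is plain Java and not the SplittableIterator implementation. `RangeSplitSketch.split` is a hypothetical helper, and the even-split policy is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitSketch {

    /**
     * Splits the inclusive range [from, to] into numPartitions contiguous
     * inclusive ranges, each returned as {start, end}. When there are more
     * partitions than numbers, trailing partitions are empty (end == start - 1).
     */
    public static List<long[]> split(long from, long to, int numPartitions) {
        if (numPartitions < 1 || to < from) {
            throw new IllegalArgumentException("Need numPartitions >= 1 and from <= to");
        }
        long total = to - from + 1;          // both bounds are inclusive
        long base = total / numPartitions;   // minimum size of each partition
        long remainder = total % numPartitions;

        List<long[]> parts = new ArrayList<>();
        long start = from;
        for (int i = 0; i < numPartitions; i++) {
            // The first 'remainder' partitions take one extra element.
            long size = base + (i < remainder ? 1 : 0);
            parts.add(new long[] {start, start + size - 1});
            start += size;
        }
        return parts;
    }
}
```

Splitting [1, 10] three ways yields [1, 4], [5, 7], and [8, 10]. Every number appears in exactly one partition, so the parts can be read independently.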
public DataSource<Long> generateSequence(long from, long to)
Creates a new data set that contains a sequence of numbers.
Parameters:
from - The number to start at (inclusive).
to - The number to stop at (inclusive).
Returns:
A DataSet containing all numbers in the [from, to] interval.
public JobExecutionResult execute() throws Exception
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are, for example, printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with a generated default name.
Throws:
Exception - Thrown if the program execution fails.
public JobExecutionResult execute(String jobName) throws Exception
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are, for example, printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with the given job name.
Throws:
Exception - Thrown if the program execution fails.
@PublicEvolving public void registerJobListener(JobListener jobListener)
Registers a JobListener in this environment. The JobListener will be notified of specific job status changes.
@PublicEvolving public void clearJobListeners()
Clears all registered JobListeners.
@PublicEvolving public final JobClient executeAsync() throws Exception
Triggers the program execution asynchronously. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are, for example, printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with a generated default name.
@PublicEvolving public JobClient executeAsync(String jobName) throws Exception
Triggers the program execution asynchronously. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are, for example, printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with the given job name.
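The difference between the blocking execute() calls and the executeAsync() calls can be pictured with a JDK-only sketch. It does not model JobClient or JobExecutionResult. `runJob`, `executeBlocking`, and `executeAsync` below are hypothetical stand-ins, with a CompletableFuture playing the role of the handle returned on submission.

```java
import java.util.concurrent.CompletableFuture;
import java.util.stream.LongStream;

public class AsyncSubmitSketch {

    /** Hypothetical job: sums the numbers 1..1000. */
    static long runJob() {
        return LongStream.rangeClosed(1, 1000).sum();
    }

    /** Blocking style: the caller waits until the job has finished. */
    public static long executeBlocking() {
        return runJob();
    }

    /** Asynchronous style: the caller gets a handle immediately. */
    public static CompletableFuture<Long> executeAsync() {
        return CompletableFuture.supplyAsync(AsyncSubmitSketch::runJob);
    }

    public static void main(String[] args) {
        CompletableFuture<Long> handle = executeAsync();
        // Other work could happen here while the job runs.
        System.out.println("Async result: " + handle.join());
        System.out.println("Blocking result: " + executeBlocking());
    }
}
```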
public String getExecutionPlan() throws Exception
Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph.
Throws:
Exception - Thrown if the compiler could not be instantiated.
public void registerCachedFile(String filePath, String name)
Registers a file at the distributed cache under the given name.
The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
Parameters:
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.
public void registerCachedFile(String filePath, String name, boolean executable)
Registers a file at the distributed cache under the given name.
The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
Parameters:
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.
executable - Flag indicating whether the file should be executable.
@Internal public Plan createProgramPlan()
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PipelineExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations. This automatically starts a new stage of execution.
@Internal public Plan createProgramPlan(String jobName)
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PipelineExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations. This automatically starts a new stage of execution.
Parameters:
jobName - The name attached to the plan (displayed in logs and monitoring).
@Internal public Plan createProgramPlan(String jobName, boolean clearSinks)
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PipelineExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
Parameters:
jobName - The name attached to the plan (displayed in logs and monitoring).
clearSinks - Whether or not to start a new stage of execution.
public static ExecutionEnvironment getExecutionEnvironment()
Creates an execution environment that represents the context in which the program is currently executed. If the program is invoked standalone, this method returns a local execution environment, as returned by createLocalEnvironment(). If the program is invoked from within the command line client to be submitted to a cluster, this method returns the execution environment of this cluster.
@PublicEvolving public static CollectionEnvironment createCollectionsEnvironment()
Creates a CollectionEnvironment that uses Java Collections underneath. This will execute in a single thread in the current JVM. It is very fast but will fail if the data does not fit into memory. Parallelism will always be 1. This is useful during implementation and for debugging.
public static LocalEnvironment createLocalEnvironment()
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. The default parallelism of the local environment is the number of hardware contexts (CPU cores / threads), unless it was specified differently by setDefaultLocalParallelism(int).
public static LocalEnvironment createLocalEnvironment(int parallelism)
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. It will use the parallelism specified in the parameter.
Parameters:
parallelism - The parallelism for the local environment.
public static LocalEnvironment createLocalEnvironment(Configuration customConfiguration)
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. It will use the parallelism specified in the parameter.
Parameters:
customConfiguration - Pass a custom configuration to the LocalEnvironment.
@PublicEvolving public static ExecutionEnvironment createLocalEnvironmentWithWebUI(Configuration conf)
Creates a LocalEnvironment for local program execution that also starts the web monitoring UI.
The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. It will use the parallelism specified in the parameter.
If the configuration key 'rest.port' was set in the configuration, that particular port will be used for the web UI. Otherwise, the default port (8081) will be used.
public static ExecutionEnvironment createRemoteEnvironment(String host, int port, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The execution will use the cluster's default parallelism, unless the parallelism is set explicitly via setParallelism(int).
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.
public static ExecutionEnvironment createRemoteEnvironment(String host, int port, Configuration clientConfiguration, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The custom configuration file is used to configure Pekko-specific configuration parameters for the client only; program parallelism can be set via setParallelism(int).
Cluster configuration has to be done in the remotely running Flink instance.
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
clientConfiguration - Configuration used by the client that connects to the cluster.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.
public static ExecutionEnvironment createRemoteEnvironment(String host, int port, int parallelism, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The execution will use the specified parallelism.
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
parallelism - The parallelism to use during the execution.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.
public static int getDefaultLocalParallelism()
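Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().
public static void setDefaultLocalParallelism(int parallelism)
Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().
Parameters:
parallelism - The parallelism to use as the default local parallelism.
protected static void initializeContextEnvironment(ExecutionEnvironmentFactory ctx)
Sets a context environment factory that creates the context environment for running programs with pre-configured environments.
When the context environment factory is set, no other environments can be explicitly used.
Parameters:
ctx - The context environment factory.
protected static void resetContextEnvironment()
Un-sets the context environment factory. After this method is called, getExecutionEnvironment() will again return a default local execution environment, and it is possible to explicitly instantiate the LocalEnvironment and the RemoteEnvironment.
@Internal public static boolean areExplicitEnvironmentsAllowed()
Checks whether it is currently permitted to explicitly instantiate a LocalEnvironment or a RemoteEnvironment.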
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.