@Public public abstract class ExecutionEnvironment extends Object
The ExecutionEnvironment is the context in which a program is executed. A LocalEnvironment will cause execution in the current JVM; a RemoteEnvironment will cause execution on a remote setup.
The environment provides methods to control the job execution (such as setting the parallelism) and to interact with the outside world (data access).
Please note that the execution environment needs strong type information for the input and return types of all operations that are executed. This means that the environment needs to know that the return value of an operation is, for example, a Tuple of String and Integer. Because the Java compiler throws much of the generic type information away, most methods attempt to re-obtain that information using reflection. In certain cases, it may be necessary to manually supply that information to some of the methods.
See Also: LocalEnvironment, RemoteEnvironment
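The usual entry point is getExecutionEnvironment(), which yields a LocalEnvironment when the program is started from an IDE and the cluster's context environment when it is submitted through the command line client. A minimal sketch (the element values are arbitrary):

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class BatchJobSketch {
    public static void main(String[] args) throws Exception {
        // Local environment in the IDE, cluster context environment when submitted via the CLI client.
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> words = env.fromElements("flink", "batch", "api");

        // print() is an eager sink: it collects the result and triggers execution itself,
        // so no separate call to execute() is needed for this pipeline.
        words.filter(w -> w.length() > 4).print();
    }
}
```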
Modifier and Type | Field and Description |
---|---|
protected JobID | jobID - The ID of the session, defined by this execution environment. |
protected JobExecutionResult | lastJobExecutionResult - Result from the latest execution, to make it retrievable when using eager execution methods. |
protected static org.slf4j.Logger | LOG - The logger used by the environment and its subclasses. |
protected long | sessionTimeout - The session timeout in seconds. |
Modifier | Constructor and Description |
---|---|
protected | ExecutionEnvironment() - Creates a new Execution Environment. |
Modifier and Type | Method and Description |
---|---|
void | addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass) - Adds a new Kryo default serializer to the Runtime. |
<T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void | addDefaultKryoSerializer(Class<?> type, T serializer) - Adds a new Kryo default serializer to the Runtime. |
static boolean | areExplicitEnvironmentsAllowed() - Checks whether it is currently permitted to explicitly instantiate a LocalEnvironment or a RemoteEnvironment. |
static CollectionEnvironment | createCollectionsEnvironment() - Creates a CollectionEnvironment that uses Java Collections underneath. |
<X> DataSource<X> | createInput(InputFormat<X,?> inputFormat) - Generic method to create an input DataSet with an InputFormat. |
<X> DataSource<X> | createInput(InputFormat<X,?> inputFormat, TypeInformation<X> producedType) - Generic method to create an input DataSet with an InputFormat. |
static LocalEnvironment | createLocalEnvironment() - Creates a LocalEnvironment. |
static LocalEnvironment | createLocalEnvironment(Configuration customConfiguration) - Creates a LocalEnvironment. |
static LocalEnvironment | createLocalEnvironment(int parallelism) - Creates a LocalEnvironment. |
static ExecutionEnvironment | createLocalEnvironmentWithWebUI(Configuration conf) - Creates a LocalEnvironment for local program execution that also starts the web monitoring UI. |
Plan | createProgramPlan() - Creates the program's Plan. |
Plan | createProgramPlan(String jobName) - Creates the program's Plan. |
Plan | createProgramPlan(String jobName, boolean clearSinks) - Creates the program's Plan. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, Configuration clientConfiguration, String... jarFiles) - Creates a RemoteEnvironment. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, int parallelism, String... jarFiles) - Creates a RemoteEnvironment. |
static ExecutionEnvironment | createRemoteEnvironment(String host, int port, String... jarFiles) - Creates a RemoteEnvironment. |
JobExecutionResult | execute() - Triggers the program execution. |
abstract JobExecutionResult | execute(String jobName) - Triggers the program execution. |
<X> DataSource<X> | fromCollection(Collection<X> data) - Creates a DataSet from the given non-empty collection. |
<X> DataSource<X> | fromCollection(Collection<X> data, TypeInformation<X> type) - Creates a DataSet from the given non-empty collection. |
<X> DataSource<X> | fromCollection(Iterator<X> data, Class<X> type) - Creates a DataSet from the given iterator. |
<X> DataSource<X> | fromCollection(Iterator<X> data, TypeInformation<X> type) - Creates a DataSet from the given iterator. |
<X> DataSource<X> | fromElements(Class<X> type, X... data) - Creates a new data set that contains the given elements. |
<X> DataSource<X> | fromElements(X... data) - Creates a new data set that contains the given elements. |
<X> DataSource<X> | fromParallelCollection(SplittableIterator<X> iterator, Class<X> type) - Creates a new data set that contains elements in the iterator. |
<X> DataSource<X> | fromParallelCollection(SplittableIterator<X> iterator, TypeInformation<X> type) - Creates a new data set that contains elements in the iterator. |
DataSource<Long> | generateSequence(long from, long to) - Creates a new data set that contains a sequence of numbers. |
ExecutionConfig | getConfig() - Gets the config object that defines execution parameters. |
static int | getDefaultLocalParallelism() - Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment(). |
static ExecutionEnvironment | getExecutionEnvironment() - Creates an execution environment that represents the context in which the program is currently executed. |
abstract String | getExecutionPlan() - Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph. |
JobID | getId() - Gets the JobID by which this environment is identified. |
String | getIdString() - Gets the JobID by which this environment is identified, as a string. |
JobExecutionResult | getLastJobExecutionResult() - Returns the JobExecutionResult of the last executed job. |
int | getNumberOfExecutionRetries() - Deprecated. This method will be replaced by getRestartStrategy(). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
int | getParallelism() - Gets the parallelism with which operations are executed by default. |
RestartStrategies.RestartStrategyConfiguration | getRestartStrategy() - Returns the specified restart strategy configuration. |
long | getSessionTimeout() - Gets the session timeout for this environment. |
protected static void | initializeContextEnvironment(ExecutionEnvironmentFactory ctx) - Sets a context environment factory that creates the context environment for running programs with pre-configured environments. |
CsvReader | readCsvFile(String filePath) - Creates a CSV reader to read a comma separated value (CSV) file. |
<X> DataSource<X> | readFile(FileInputFormat<X> inputFormat, String filePath) |
<X> DataSource<X> | readFileOfPrimitives(String filePath, Class<X> typeClass) - Creates a DataSet that represents the primitive type produced by reading the given file line wise. |
<X> DataSource<X> | readFileOfPrimitives(String filePath, String delimiter, Class<X> typeClass) - Creates a DataSet that represents the primitive type produced by reading the given file in a delimited way. |
DataSource<String> | readTextFile(String filePath) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
DataSource<String> | readTextFile(String filePath, String charsetName) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
DataSource<StringValue> | readTextFileWithValue(String filePath) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
DataSource<StringValue> | readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines) - Creates a DataSet that represents the Strings produced by reading the given file line wise. |
void | registerCachedFile(String filePath, String name) - Registers a file at the distributed cache under the given name. |
void | registerCachedFile(String filePath, String name, boolean executable) - Registers a file at the distributed cache under the given name. |
protected void | registerCachedFilesWithPlan(Plan p) - Registers all files that were registered at this execution environment's cache registry with the given plan's cache registry. |
void | registerType(Class<?> type) - Registers the given type with the serialization stack. |
void | registerTypeWithKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass) - Registers the given Serializer via its class as a serializer for the given type at the KryoSerializer. |
<T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void | registerTypeWithKryoSerializer(Class<?> type, T serializer) - Registers the given type with a Kryo Serializer. |
protected static void | resetContextEnvironment() - Un-sets the context environment factory. |
static void | setDefaultLocalParallelism(int parallelism) - Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment(). |
void | setNumberOfExecutionRetries(int numberOfExecutionRetries) - Deprecated. This method will be replaced by setRestartStrategy(RestartStrategies.RestartStrategyConfiguration). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
void | setParallelism(int parallelism) - Sets the parallelism for operations executed through this environment. |
void | setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) - Sets the restart strategy configuration. |
void | setSessionTimeout(long timeout) - Sets the session timeout to hold the intermediate results of a job. |
abstract void | startNewSession() - Starts a new session, discarding the previous data flow and all of its intermediate results. |
protected static final org.slf4j.Logger LOG
protected JobExecutionResult lastJobExecutionResult
protected JobID jobID
protected long sessionTimeout
protected ExecutionEnvironment()
public ExecutionConfig getConfig()
public int getParallelism()
Gets the parallelism with which operations are executed by default. Operations can individually override this value to use a specific parallelism via Operator.setParallelism(int). Other operations may need to run with a different parallelism - for example, calling DataSet.reduce(org.apache.flink.api.common.functions.ReduceFunction) over the entire set will eventually insert an operation that runs non-parallel (parallelism of one).
Returns: The parallelism used by operations, or ExecutionConfig.PARALLELISM_DEFAULT if the environment's default parallelism should be used.

public void setParallelism(int parallelism)
Sets the parallelism for operations executed through this environment.
This method overrides the default parallelism for this environment. The LocalEnvironment uses by default a value equal to the number of hardware contexts (CPU cores / threads). When executing the program via the command line client from a JAR file, the default parallelism is the one configured for that setup.
Parameters:
parallelism - The parallelism
@PublicEvolving public void setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration)
Sets the restart strategy configuration.
Parameters:
restartStrategyConfiguration - Restart strategy configuration to be set
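For example, a fixed-delay strategy with three attempts and a ten-second pause could be configured as sketched below; the numbers are arbitrary:

```java
// RestartStrategies: org.apache.flink.api.common.restartstrategy;
// Time: org.apache.flink.api.common.time; TimeUnit: java.util.concurrent.
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
    3,                                 // restart a failed job at most 3 times
    Time.of(10, TimeUnit.SECONDS)));   // wait 10 seconds between attempts
```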
@PublicEvolving public RestartStrategies.RestartStrategyConfiguration getRestartStrategy()
Returns the specified restart strategy configuration.

@Deprecated @PublicEvolving public void setNumberOfExecutionRetries(int numberOfExecutionRetries)
Deprecated. This method will be replaced by setRestartStrategy(org.apache.flink.api.common.restartstrategy.RestartStrategies.RestartStrategyConfiguration). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries.
A value of -1 indicates that the system default value (as defined in the configuration) should be used.
Parameters:
numberOfExecutionRetries - The number of times the system will try to re-execute failed tasks.

@Deprecated @PublicEvolving public int getNumberOfExecutionRetries()
Deprecated. This method will be replaced by getRestartStrategy(). The RestartStrategies.FixedDelayRestartStrategyConfiguration contains the number of execution retries.
Returns: The number of execution retries. A value of -1 indicates that the system default value (as defined in the configuration) should be used.

public JobExecutionResult getLastJobExecutionResult()
Returns the JobExecutionResult of the last executed job.

@PublicEvolving public JobID getId()
Gets the JobID by which this environment is identified.
See Also: getIdString()

@PublicEvolving public String getIdString()
Gets the JobID by which this environment is identified, as a string.
See Also: getId()

@PublicEvolving public void setSessionTimeout(long timeout)
Sets the session timeout to hold the intermediate results of a job.
Parameters:
timeout - The timeout, in seconds.

@PublicEvolving public long getSessionTimeout()
Gets the session timeout for this environment.

@PublicEvolving public abstract void startNewSession() throws Exception
Starts a new session, discarding the previous data flow and all of its intermediate results.
Throws:
Exception
public <T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void addDefaultKryoSerializer(Class<?> type, T serializer)
Adds a new Kryo default serializer to the Runtime.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by Java serialization.
Parameters:
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.

public void addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Adds a new Kryo default serializer to the Runtime.
Parameters:
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.

public <T extends com.esotericsoftware.kryo.Serializer<?> & Serializable> void registerTypeWithKryoSerializer(Class<?> type, T serializer)
Registers the given type with a Kryo Serializer.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by Java serialization.
Parameters:
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.

public void registerTypeWithKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Registers the given Serializer via its class as a serializer for the given type at the KryoSerializer.
Parameters:
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.

public void registerType(Class<?> type)
Registers the given type with the serialization stack.
Parameters:
type - The class of the type to register.
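A self-contained sketch of the registration calls. LegacyId and LegacyIdSerializer are hypothetical placeholder classes; any Kryo Serializer that also implements java.io.Serializable can be passed as an instance.

```java
import java.io.Serializable;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.apache.flink.api.java.ExecutionEnvironment;

public class KryoRegistrationSketch {

    // Hypothetical domain type that Flink would otherwise serialize with generic Kryo handling.
    public static class LegacyId {
        public String value;
        public LegacyId() {}
        public LegacyId(String value) { this.value = value; }
    }

    // Custom Kryo serializer; it implements Serializable so the instance variant can ship it to workers.
    public static class LegacyIdSerializer extends Serializer<LegacyId> implements Serializable {
        @Override
        public void write(Kryo kryo, Output output, LegacyId id) {
            output.writeString(id.value);
        }
        @Override
        public LegacyId read(Kryo kryo, Input input, Class<LegacyId> type) {
            return new LegacyId(input.readString());
        }
    }

    public static void main(String[] args) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Use this serializer as the default for LegacyId where no other registration applies.
        env.addDefaultKryoSerializer(LegacyId.class, new LegacyIdSerializer());

        // Or bind the serializer class explicitly to the type.
        env.registerTypeWithKryoSerializer(LegacyId.class, LegacyIdSerializer.class);

        // Plain type registration without a custom serializer.
        env.registerType(LegacyId.class);
    }
}
```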
public DataSource<String> readTextFile(String filePath)
Creates a DataSet that represents the Strings produced by reading the given file line wise. The file will be read with the system's default character set.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
Returns: A DataSet that represents the data read from the given file as text lines.

public DataSource<String> readTextFile(String filePath, String charsetName)
Creates a DataSet that represents the Strings produced by reading the given file line wise. The Charset with the given name will be used to read the files.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
charsetName - The name of the character set used to read the file.
Returns: A DataSet that represents the data read from the given file as text lines.
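A brief sketch, assuming env is an ExecutionEnvironment; both paths are placeholders:

```java
// Read with the system's default character set.
DataSet<String> lines = env.readTextFile("hdfs:///logs/access.log");

// Read an ISO-8859-1 encoded file with an explicit charset.
DataSet<String> legacy = env.readTextFile("file:///data/legacy.txt", "ISO-8859-1");
```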
public DataSource<StringValue> readTextFileWithValue(String filePath)
Creates a DataSet that represents the Strings produced by reading the given file line wise. This method is similar to readTextFile(String), but it produces a DataSet with mutable StringValue objects, rather than Java Strings. StringValues can be used to tune implementations to be less object and garbage collection heavy.
The file will be read with the system's default character set.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
Returns: A DataSet that represents the data read from the given file as text lines.

public DataSource<StringValue> readTextFileWithValue(String filePath, String charsetName, boolean skipInvalidLines)
Creates a DataSet that represents the Strings produced by reading the given file line wise. This method is similar to readTextFile(String, String), but it produces a DataSet with mutable StringValue objects, rather than Java Strings. StringValues can be used to tune implementations to be less object and garbage collection heavy.
The Charset with the given name will be used to read the files.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
charsetName - The name of the character set used to read the file.
skipInvalidLines - A flag to indicate whether to skip lines that cannot be read with the given character set.

public <X> DataSource<X> readFileOfPrimitives(String filePath, Class<X> typeClass)
Creates a DataSet that represents the primitive type produced by reading the given file line wise. This method is similar to readCsvFile(String) with a single field, but it produces a DataSet not wrapped in Tuple1.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
typeClass - The primitive type class to be read.
Returns: A DataSet that represents the data read from the given file as primitive type.

public <X> DataSource<X> readFileOfPrimitives(String filePath, String delimiter, Class<X> typeClass)
Creates a DataSet that represents the primitive type produced by reading the given file in a delimited way. This method is similar to readCsvFile(String) with a single field, but it produces a DataSet not wrapped in Tuple1.
Parameters:
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
delimiter - The delimiter of the given file.
typeClass - The primitive type class to be read.
Returns: A DataSet that represents the data read from the given file as primitive type.

public CsvReader readCsvFile(String filePath)
Creates a CSV reader to read a comma separated value (CSV) file.
Parameters:
filePath - The path of the CSV file.

public <X> DataSource<X> readFile(FileInputFormat<X> inputFormat, String filePath)
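A sketch of configuring the returned CsvReader for a hypothetical semicolon-delimited file with a header line (env is an ExecutionEnvironment, the path is a placeholder):

```java
DataSet<Tuple2<String, Integer>> people = env.readCsvFile("file:///data/people.csv")
    .fieldDelimiter(";")                    // the default delimiter is ","
    .ignoreFirstLine()                      // skip the header row
    .types(String.class, Integer.class);    // field types; produces Tuple2<String, Integer>
```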
public <X> DataSource<X> createInput(InputFormat<X,?> inputFormat)
Generic method to create an input DataSet with an InputFormat. The DataSet will not be immediately created - instead, this method returns a DataSet that will be lazily created from the input format once the program is executed.
Since all data sets need specific information about their types, this method needs to determine the type of the data produced by the input format. It will attempt to determine the data type by reflection, unless the input format implements the ResultTypeQueryable interface. In the latter case, this method will invoke the ResultTypeQueryable.getProducedType() method to determine the data type produced by the input format.
Parameters:
inputFormat - The input format used to create the data set.
Returns: A DataSet that represents the data created by the input format.
See Also: createInput(InputFormat, TypeInformation)

public <X> DataSource<X> createInput(InputFormat<X,?> inputFormat, TypeInformation<X> producedType)
Generic method to create an input DataSet with an InputFormat. The DataSet will not be immediately created - instead, this method returns a DataSet that will be lazily created from the input format once the program is executed.
The DataSet is typed to the given TypeInformation. This method is intended for input formats where the return type cannot be determined by reflection analysis, and that do not implement the ResultTypeQueryable interface.
Parameters:
inputFormat - The input format used to create the data set.
producedType - The TypeInformation for the produced data set.
Returns: A DataSet that represents the data created by the input format.
See Also: createInput(InputFormat)
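A sketch that pairs Flink's TextInputFormat with an explicitly given produced type, so no reflection analysis is needed (env is an ExecutionEnvironment, the path is a placeholder):

```java
// TextInputFormat: org.apache.flink.api.java.io; Path: org.apache.flink.core.fs;
// BasicTypeInfo: org.apache.flink.api.common.typeinfo.
TextInputFormat format = new TextInputFormat(new Path("file:///data/input.txt"));
DataSet<String> lines = env.createInput(format, BasicTypeInfo.STRING_TYPE_INFO);
```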
public <X> DataSource<X> fromCollection(Collection<X> data)
Creates a DataSet from the given non-empty collection. The framework will try and determine the exact type from the collection elements. In case of generic elements, it may be necessary to manually supply the type information via fromCollection(Collection, TypeInformation).
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
See Also: fromCollection(Collection, TypeInformation)

public <X> DataSource<X> fromCollection(Collection<X> data, TypeInformation<X> type)
Creates a DataSet from the given non-empty collection. The returned DataSet is typed to the given TypeInformation.
Parameters:
data - The collection of elements to create the data set from.
type - The TypeInformation for the produced data set.
See Also: fromCollection(Collection)

public <X> DataSource<X> fromCollection(Iterator<X> data, Class<X> type)
Creates a DataSet from the given iterator. Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
type - The class of the data produced by the iterator. Must not be a generic class.
See Also: fromCollection(Iterator, TypeInformation)

public <X> DataSource<X> fromCollection(Iterator<X> data, TypeInformation<X> type)
Creates a DataSet from the given iterator. This method is useful for cases where the type is generic, and the type class (as given in fromCollection(Iterator, Class)) does not supply all type information.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The collection of elements to create the data set from.
type - The TypeInformation for the produced data set.
See Also: fromCollection(Iterator, Class)

@SafeVarargs public final <X> DataSource<X> fromElements(X... data)
Creates a new data set that contains the given elements. The elements must all be of the same type, for example all String or all Integer. The sequence of elements must not be empty.
The framework will try and determine the exact type from the collection elements. In case of generic elements, it may be necessary to manually supply the type information via fromCollection(Collection, TypeInformation).
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
Parameters:
data - The elements to make up the data set.

@SafeVarargs public final <X> DataSource<X> fromElements(Class<X> type, X... data)
Creates a new data set that contains the given elements.
Parameters:
type - The base class type for every element in the collection.
data - The elements to make up the data set.
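A sketch of both variants, assuming env is an ExecutionEnvironment; the explicit TypeInformation form is shown for a generic Tuple2 element type:

```java
// Simple element types: the type is extracted automatically.
DataSet<String> words = env.fromElements("to", "be", "or", "not");

// Generic element types: supply the type information explicitly
// (TypeHint and TypeInformation live in org.apache.flink.api.common.typeinfo).
List<Tuple2<String, Integer>> counts = Arrays.asList(Tuple2.of("be", 2), Tuple2.of("or", 1));
DataSet<Tuple2<String, Integer>> typed =
    env.fromCollection(counts, TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {}));
```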
public <X> DataSource<X> fromParallelCollection(SplittableIterator<X> iterator, Class<X> type)
Creates a new data set that contains elements in the iterator.
Because the iterator will remain unmodified until the actual execution happens, the type of data returned by the iterator must be given explicitly in the form of the type class (this is due to the fact that the Java compiler erases the generic type information).
Parameters:
iterator - The iterator that produces the elements of the data set.
type - The class of the data produced by the iterator. Must not be a generic class.
See Also: fromParallelCollection(SplittableIterator, TypeInformation)

public <X> DataSource<X> fromParallelCollection(SplittableIterator<X> iterator, TypeInformation<X> type)
Creates a new data set that contains elements in the iterator.
Because the iterator will remain unmodified until the actual execution happens, the type of data returned by the iterator must be given explicitly in the form of the type information. This method is useful for cases where the type is generic. In that case, the type class (as given in fromParallelCollection(SplittableIterator, Class)) does not supply all type information.
Parameters:
iterator - The iterator that produces the elements of the data set.
type - The TypeInformation for the produced data set.
See Also: fromParallelCollection(SplittableIterator, Class)
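A sketch using NumberSequenceIterator, a SplittableIterator implementation shipped with Flink in org.apache.flink.util (env is an ExecutionEnvironment):

```java
// A parallel source of the numbers 1 to 10,000,000.
DataSet<Long> numbers = env.fromParallelCollection(
    new NumberSequenceIterator(1L, 10_000_000L),
    BasicTypeInfo.LONG_TYPE_INFO);
```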
public DataSource<Long> generateSequence(long from, long to)
Creates a new data set that contains a sequence of numbers.
Parameters:
from - The number to start at (inclusive).
to - The number to stop at (inclusive).
Returns: A DataSet containing the numbers in the [from, to] interval.

public JobExecutionResult execute() throws Exception
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with a generated default name.
Throws:
Exception - Thrown, if the program execution fails.
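A sketch of a complete pipeline with an explicit sink and job name; the output path is a placeholder, and the returned JobExecutionResult carries the runtime and accumulator results:

```java
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

env.generateSequence(1, 10_000)
   .filter(x -> x % 2 == 0)
   .writeAsText("file:///tmp/evens");   // a sink; nothing runs until execute() is called

JobExecutionResult result = env.execute("even-numbers");
System.out.println("Job took " + result.getNetRuntime() + " ms");
```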
public abstract JobExecutionResult execute(String jobName) throws Exception
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results (DataSet.print()), writing results (e.g. DataSet.writeAsText(String), DataSet.write(org.apache.flink.api.common.io.FileOutputFormat, String)), or other generic data sinks created with DataSet.output(org.apache.flink.api.common.io.OutputFormat).
The program execution will be logged and displayed with the given job name.
Throws:
Exception - Thrown, if the program execution fails.
public abstract String getExecutionPlan() throws Exception
Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph.
Throws:
Exception - Thrown, if the compiler could not be instantiated, or the master could not be contacted to retrieve information relevant to the execution planning.

public void registerCachedFile(String filePath, String name)
Registers a file at the distributed cache under the given name. The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
Parameters:
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.

public void registerCachedFile(String filePath, String name, boolean executable)
Registers a file at the distributed cache under the given name. The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
Parameters:
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.
executable - Flag indicating whether the file should be executable.
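A sketch of registering a file and reading it back inside a rich function; the path and the registration name are placeholders:

```java
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.registerCachedFile("hdfs:///shared/stopwords.txt", "stopwords");

DataSet<String> words = env.fromElements("the", "flink", "of", "dataset");
DataSet<String> filtered = words.filter(new RichFilterFunction<String>() {
    private Set<String> stopwords;

    @Override
    public void open(Configuration parameters) throws Exception {
        // The registered file is available as a local file on every worker.
        File file = getRuntimeContext().getDistributedCache().getFile("stopwords");
        stopwords = new HashSet<>(Files.readAllLines(file.toPath()));
    }

    @Override
    public boolean filter(String value) {
        return !stopwords.contains(value);
    }
});
```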
protected void registerCachedFilesWithPlan(Plan p) throws IOException
Registers all files that were registered at this execution environment's cache registry with the given plan's cache registry.
Parameters:
p - The plan to register files at.
Throws:
IOException - Thrown if checks for existence and sanity fail.

@Internal public Plan createProgramPlan()
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PlanExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
This automatically starts a new stage of execution.

@Internal public Plan createProgramPlan(String jobName)
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PlanExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
This automatically starts a new stage of execution.
Parameters:
jobName - The name attached to the plan (displayed in logs and monitoring).

@Internal public Plan createProgramPlan(String jobName, boolean clearSinks)
Creates the program's Plan. The plan is a description of all data sources, data sinks, and operations and how they interact, as an isolated unit that can be executed with a PlanExecutor. Obtaining a plan and starting it with an executor is an alternative way to run a program and is only possible if the program consists only of distributed operations.
Parameters:
jobName - The name attached to the plan (displayed in logs and monitoring).
clearSinks - Whether or not to start a new stage of execution.

public static ExecutionEnvironment getExecutionEnvironment()
Creates an execution environment that represents the context in which the program is currently executed. If the program is invoked standalone, this method returns a local execution environment, as returned by createLocalEnvironment(). If the program is invoked from within the command line client to be submitted to a cluster, this method returns the execution environment of this cluster.

@PublicEvolving public static CollectionEnvironment createCollectionsEnvironment()
Creates a CollectionEnvironment that uses Java Collections underneath. This will execute in a single thread in the current JVM. It is very fast but will fail if the data does not fit into memory. Parallelism will always be 1. This is useful during implementation and for debugging.

public static LocalEnvironment createLocalEnvironment()
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. The default parallelism of the local environment is the number of hardware contexts (CPU cores / threads), unless it was specified differently by setDefaultLocalParallelism(int).

public static LocalEnvironment createLocalEnvironment(int parallelism)
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. It will use the parallelism specified in the parameter.
Parameters:
parallelism - The parallelism for the local environment.

public static LocalEnvironment createLocalEnvironment(Configuration customConfiguration)
Creates a LocalEnvironment. The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in.
Parameters:
customConfiguration - Pass a custom configuration to the LocalEnvironment.

@PublicEvolving public static ExecutionEnvironment createLocalEnvironmentWithWebUI(Configuration conf)
Creates a LocalEnvironment for local program execution that also starts the web monitoring UI.
The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in.
If the configuration key 'jobmanager.web.port' was set in the configuration, that particular port will be used for the web UI. Otherwise, the default port (8081) will be used.
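A sketch that pins the web UI to a specific port via the 'jobmanager.web.port' key mentioned above (the port value is arbitrary):

```java
// Configuration: org.apache.flink.configuration.Configuration
Configuration conf = new Configuration();
conf.setInteger("jobmanager.web.port", 8082);   // optional; the UI defaults to port 8081
ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
```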
public static ExecutionEnvironment createRemoteEnvironment(String host, int port, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The execution will use the cluster's default parallelism, unless the parallelism is set explicitly via setParallelism(int).
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.

public static ExecutionEnvironment createRemoteEnvironment(String host, int port, Configuration clientConfiguration, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The custom configuration is used to configure Akka-specific configuration parameters for the client only; program parallelism can be set via setParallelism(int).
Cluster configuration has to be done in the remotely running Flink instance.
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
clientConfiguration - Configuration used by the client that connects to the cluster.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.

public static ExecutionEnvironment createRemoteEnvironment(String host, int port, int parallelism, String... jarFiles)
Creates a RemoteEnvironment. The remote environment sends (parts of) the program to a cluster for execution. Note that all file paths used in the program must be accessible from the cluster. The execution will use the specified parallelism.
Parameters:
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
parallelism - The parallelism to use during the execution.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.
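A sketch with placeholder host, port, and JAR path:

```java
ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment(
    "jobmanager.example.com",        // JobManager host (placeholder)
    6123,                            // JobManager RPC port (placeholder)
    "/home/user/target/my-job.jar"); // JAR containing the user-defined functions (placeholder)
env.setParallelism(16);
```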
public static int getDefaultLocalParallelism()
Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().

public static void setDefaultLocalParallelism(int parallelism)
Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().
Parameters:
parallelism - The parallelism to use as the default local parallelism.

protected static void initializeContextEnvironment(ExecutionEnvironmentFactory ctx)
Sets a context environment factory that creates the context environment for running programs with pre-configured environments. When the context environment factory is set, no other environments can be explicitly used.
Parameters:
ctx - The context environment factory.

protected static void resetContextEnvironment()
Un-sets the context environment factory. After this, getExecutionEnvironment() will again return a default local execution environment, and it is possible to explicitly instantiate the LocalEnvironment and the RemoteEnvironment.

@Internal public static boolean areExplicitEnvironmentsAllowed()
Checks whether it is currently permitted to explicitly instantiate a LocalEnvironment or a RemoteEnvironment.