public class StreamExecutionEnvironment extends Object

Constructor and Description |
---|
StreamExecutionEnvironment(StreamExecutionEnvironment javaEnv) |

Modifier and Type | Method and Description |
---|---|
void |
addDefaultKryoSerializer(Class<?> type,
Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Adds a new Kryo default serializer to the Runtime.
|
<T extends com.esotericsoftware.kryo.Serializer<?>> void |
addDefaultKryoSerializer(Class<?> type,
T serializer)
Adds a new Kryo default serializer to the Runtime.
|
<T> DataStream<T> |
addSource(scala.Function1<SourceFunction.SourceContext<T>,scala.runtime.BoxedUnit> function,
TypeInformation<T> evidence$10)
Create a DataStream using a user defined source function for arbitrary
source functionality.
|
<T> DataStream<T> |
addSource(SourceFunction<T> function,
TypeInformation<T> evidence$9)
Create a DataStream using a user defined source function for arbitrary
source functionality.
|
<T> DataStream<T> |
createInput(InputFormat<T,?> inputFormat,
TypeInformation<T> evidence$8)
Generic method to create an input data stream with a specific input format.
|
static StreamExecutionEnvironment |
createLocalEnvironment(int parallelism)
Creates a local execution environment.
|
static StreamExecutionEnvironment |
createLocalEnvironmentWithWebUI(Configuration config)
Creates a
StreamExecutionEnvironment for local program execution that also starts the
web monitoring UI. |
static StreamExecutionEnvironment |
createRemoteEnvironment(String host,
int port,
int parallelism,
scala.collection.Seq<String> jarFiles)
Creates a remote execution environment.
|
static StreamExecutionEnvironment |
createRemoteEnvironment(String host,
int port,
scala.collection.Seq<String> jarFiles)
Creates a remote execution environment.
|
StreamExecutionEnvironment |
disableOperatorChaining()
Disables operator chaining for streaming operators.
|
StreamExecutionEnvironment |
enableCheckpointing()
Deprecated.
|
StreamExecutionEnvironment |
enableCheckpointing(long interval)
Enables checkpointing for the streaming job.
|
StreamExecutionEnvironment |
enableCheckpointing(long interval,
CheckpointingMode mode)
Enables checkpointing for the streaming job.
|
StreamExecutionEnvironment |
enableCheckpointing(long interval,
CheckpointingMode mode,
boolean force)
Deprecated.
|
JobExecutionResult |
execute()
Triggers the program execution.
|
JobExecutionResult |
execute(String jobName)
Triggers the program execution.
|
<T> DataStream<T> |
fromCollection(scala.collection.Iterator<T> data,
TypeInformation<T> evidence$3)
Creates a DataStream from the given
Iterator . |
<T> DataStream<T> |
fromCollection(scala.collection.Seq<T> data,
TypeInformation<T> evidence$2)
Creates a DataStream from the given non-empty
Seq . |
<T> DataStream<T> |
fromElements(scala.collection.Seq<T> data,
TypeInformation<T> evidence$1)
Creates a DataStream that contains the given elements.
|
<T> DataStream<T> |
fromParallelCollection(SplittableIterator<T> data,
TypeInformation<T> evidence$4)
Creates a DataStream from the given
SplittableIterator . |
DataStream<Object> |
generateSequence(long from,
long to)
Creates a new DataStream that contains a sequence of numbers.
|
long |
getBufferTimeout()
Gets the default buffer timeout set for this environment.
|
List<Tuple2<String,DistributedCache.DistributedCacheEntry>> |
getCachedFiles()
Gets cache files.
|
CheckpointConfig |
getCheckpointConfig()
Gets the checkpoint config, which defines values like checkpoint interval, delay between
checkpoints, etc.
|
CheckpointingMode |
getCheckpointingMode() |
ExecutionConfig |
getConfig()
Gets the config object.
|
static int |
getDefaultLocalParallelism()
Gets the default parallelism that will be used for the local execution environment created by
createLocalEnvironment() . |
static StreamExecutionEnvironment |
getExecutionEnvironment()
Creates an execution environment that represents the context in which the program is
currently executed.
|
String |
getExecutionPlan()
Creates the plan with which the system will execute the program, and
returns it as a String using a JSON representation of the execution data
flow graph.
|
StreamExecutionEnvironment |
getJavaEnv() |
int |
getMaxParallelism()
Returns the maximum degree of parallelism defined for the program.
|
int |
getNumberOfExecutionRetries()
Deprecated.
This method will be replaced by
getRestartStrategy . The
FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
int |
getParallelism()
Returns the default parallelism for this execution environment.
|
RestartStrategies.RestartStrategyConfiguration |
getRestartStrategy()
Returns the specified restart strategy configuration.
|
AbstractStateBackend |
getStateBackend()
Returns the state backend that defines how to store and checkpoint state.
|
StreamGraph |
getStreamGraph()
Getter of the
StreamGraph of the streaming job. |
TimeCharacteristic |
getStreamTimeCharacteristic()
Gets the time characteristic.
|
StreamExecutionEnvironment |
getWrappedStreamExecutionEnvironment()
Getter of the wrapped
StreamExecutionEnvironment |
<T> DataStream<T> |
readFile(FileInputFormat<T> inputFormat,
String filePath,
FileProcessingMode watchType,
long interval,
FilePathFilter filter,
TypeInformation<T> evidence$6)
Deprecated.
Use
FileInputFormat#setFilesFilter(FilePathFilter) to set a filter and
StreamExecutionEnvironment#readFile(FileInputFormat, String, FileProcessingMode, long) instead. |
<T> DataStream<T> |
readFile(FileInputFormat<T> inputFormat,
String filePath,
FileProcessingMode watchType,
long interval,
TypeInformation<T> evidence$7)
Reads the contents of the user-specified path based on the given
FileInputFormat . |
<T> DataStream<T> |
readFile(FileInputFormat<T> inputFormat,
String filePath,
TypeInformation<T> evidence$5)
Reads the given file with the given input format.
|
DataStream<String> |
readFileStream(String StreamPath,
long intervalMillis,
FileMonitoringFunction.WatchType watchType)
Creates a DataStream that contains the contents of files created while the
system watches the given path.
|
DataStream<String> |
readTextFile(String filePath)
Creates a DataStream that represents the Strings produced by reading the
given file line wise.
|
DataStream<String> |
readTextFile(String filePath,
String charsetName)
Creates a data stream that represents the Strings produced by reading the given file
line wise.
|
void |
registerCachedFile(String filePath,
String name)
Registers a file at the distributed cache under the given name.
|
void |
registerCachedFile(String filePath,
String name,
boolean executable)
Registers a file at the distributed cache under the given name.
|
void |
registerType(Class<?> typeClass)
Registers the given type with the serialization stack.
|
void |
registerTypeWithKryoSerializer(Class<?> clazz,
Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializer)
Registers the given type with the serializer at the
KryoSerializer . |
<T extends com.esotericsoftware.kryo.Serializer<?>> void |
registerTypeWithKryoSerializer(Class<?> clazz,
T serializer)
Registers the given type with the serializer at the
KryoSerializer . |
<F> F |
scalaClean(F f)
Returns a "closure-cleaned" version of the given function.
|
StreamExecutionEnvironment |
setBufferTimeout(long timeoutMillis)
Sets the maximum time frequency (milliseconds) for the flushing of the
output buffers.
|
static void |
setDefaultLocalParallelism(int parallelism)
Sets the default parallelism that will be used for the local execution
environment created by
createLocalEnvironment() . |
void |
setMaxParallelism(int maxParallelism)
Sets the maximum degree of parallelism defined for the program.
|
void |
setNumberOfExecutionRetries(int numRetries)
Deprecated.
This method will be replaced by
setRestartStrategy() . The
FixedDelayRestartStrategyConfiguration contains the number of execution retries. |
void |
setParallelism(int parallelism)
Sets the parallelism for operations executed through this environment.
|
void |
setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration)
Sets the restart strategy configuration.
|
StreamExecutionEnvironment |
setStateBackend(AbstractStateBackend backend)
Sets the state backend that describes how to store and checkpoint operator state.
|
void |
setStreamTimeCharacteristic(TimeCharacteristic characteristic)
Sets the time characteristic for all streams created from this environment, e.g., processing
time, event time, or ingestion time.
|
DataStream<String> |
socketTextStream(String hostname,
int port,
char delimiter,
long maxRetry)
Creates a new DataStream that contains the strings received infinitely
from socket.
|
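The methods summarized above combine into a small streaming job. The sketch below is illustrative, not part of the documented API surface: it assumes the Flink 1.x Scala API (`flink-streaming-scala`) on the classpath, and the word-splitting logic is invented for the example.

```scala
// Minimal job sketch: obtain an environment, build a non-parallel source
// from a Seq, transform it, and trigger execution. Assumes Flink 1.x.
import org.apache.flink.streaming.api.scala._

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Environment matching the current context (local JVM or cluster).
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // fromCollection(Seq) creates a data source with parallelism one.
    val text: DataStream[String] = env.fromCollection(Seq("to be", "or not to be"))

    val counts = text
      .flatMap(_.toLowerCase.split("\\W+"))  // split lines into words
      .map((_, 1))                           // pair each word with a count
      .keyBy(0)                              // key by the word
      .sum(1)                                // running sum per word

    counts.print()

    // Nothing runs until execute() is called.
    env.execute("wordcount-sketch")
  }
}
```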
public StreamExecutionEnvironment(StreamExecutionEnvironment javaEnv)

public static void setDefaultLocalParallelism(int parallelism)
Sets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().
parallelism - The default parallelism to use for local execution.

public static int getDefaultLocalParallelism()
Gets the default parallelism that will be used for the local execution environment created by createLocalEnvironment().

public static StreamExecutionEnvironment getExecutionEnvironment()
Creates an execution environment that represents the context in which the program is currently executed. If the program is invoked standalone, this method returns a local execution environment. If the program is invoked from within the command line client to be submitted to a cluster, this method returns the execution environment of this cluster.

public static StreamExecutionEnvironment createLocalEnvironment(int parallelism)
Creates a local execution environment. This method sets the environment's default parallelism to the given parameter, which defaults to the value set via setDefaultLocalParallelism(Int).
parallelism - (undocumented)

public static StreamExecutionEnvironment createLocalEnvironmentWithWebUI(Configuration config)
Creates a StreamExecutionEnvironment for local program execution that also starts the web monitoring UI.
The local execution environment will run the program in a multi-threaded fashion in the same JVM as the environment was created in. It will use the parallelism specified in the parameter.
If the configuration key 'jobmanager.web.port' was set in the configuration, that particular port will be used for the web UI. Otherwise, the default port (8081) will be used.
config - optional config for the local execution

public static StreamExecutionEnvironment createRemoteEnvironment(String host, int port, scala.collection.Seq<String> jarFiles)
Creates a remote execution environment. The execution will use the cluster's default parallelism, unless the parallelism is set explicitly via StreamExecutionEnvironment.setParallelism().
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.

public static StreamExecutionEnvironment createRemoteEnvironment(String host, int port, int parallelism, scala.collection.Seq<String> jarFiles)
Creates a remote execution environment.
host - The host name or address of the master (JobManager), where the program should be executed.
port - The port of the master (JobManager), where the program should be executed.
parallelism - The parallelism to use during the execution.
jarFiles - The JAR files with code that needs to be shipped to the cluster. If the program uses user-defined functions, user-defined input formats, or any libraries, those must be provided in the JAR files.

public StreamExecutionEnvironment getJavaEnv()
public ExecutionConfig getConfig()
Gets the config object.

public List<Tuple2<String,DistributedCache.DistributedCacheEntry>> getCachedFiles()
Gets cache files.

public void setParallelism(int parallelism)
Sets the parallelism for operations executed through this environment. This value can be overridden by specific operations using DataStream.setParallelism(int).
parallelism - (undocumented)

public void setMaxParallelism(int maxParallelism)
Sets the maximum degree of parallelism defined for the program.
maxParallelism - (undocumented)

public int getParallelism()
Returns the default parallelism for this execution environment. Note that this value can be overridden by individual operations using DataStream.setParallelism(int).

public int getMaxParallelism()
Returns the maximum degree of parallelism defined for the program. The maximum degree of parallelism specifies the upper limit for dynamic scaling. It also defines the number of key groups used for partitioned state.

public StreamExecutionEnvironment setBufferTimeout(long timeoutMillis)
Sets the maximum time frequency (milliseconds) for the flushing of the output buffers.
timeoutMillis - (undocumented)

public long getBufferTimeout()
Gets the default buffer timeout set for this environment.

public StreamExecutionEnvironment disableOperatorChaining()
Disables operator chaining for streaming operators.

public CheckpointConfig getCheckpointConfig()
Gets the checkpoint config, which defines values like checkpoint interval, delay between checkpoints, etc.
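The parallelism and buffer-timeout setters above can be sketched as follows; this is a hedged example assuming the Flink 1.x Scala API on the classpath, with the concrete values chosen purely for illustration.

```scala
// Sketch: configuring environment-wide parallelism and buffer flushing.
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

env.setParallelism(4)        // default parallelism for all operators
env.setMaxParallelism(128)   // upper limit for rescaling; also the key-group count
env.setBufferTimeout(100L)   // flush output buffers at least every 100 ms

// Individual operators may still override the environment default, e.g.:
// stream.map(...).setParallelism(1)
```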
public StreamExecutionEnvironment enableCheckpointing(long interval, CheckpointingMode mode, boolean force)
Deprecated.
Enables checkpointing for the streaming job. The job draws checkpoints periodically, in the given interval. The state will be stored in the configured state backend.
NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. If the "force" parameter is set to true, the system will execute the job nonetheless.
interval - Time interval between state checkpoints in millis.
mode - The checkpointing mode, selecting between "exactly once" and "at least once" guarantees.
force - If true checkpointing will be enabled for iterative jobs as well.

public StreamExecutionEnvironment enableCheckpointing(long interval, CheckpointingMode mode)
Enables checkpointing for the streaming job. The job draws checkpoints periodically, in the given interval. The system uses the given CheckpointingMode for the checkpointing ("exactly once" vs "at least once"). The state will be stored in the configured state backend.
NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing. To override this mechanism, use the enableCheckpointing(long, CheckpointingMode, boolean) method.
interval - Time interval between state checkpoints in milliseconds.
mode - The checkpointing mode, selecting between "exactly once" and "at least once" guarantees.

public StreamExecutionEnvironment enableCheckpointing(long interval)
Enables checkpointing for the streaming job. The job draws checkpoints periodically, in the given interval. The program will use CheckpointingMode.EXACTLY_ONCE mode. The state will be stored in the configured state backend.
NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing. To override this mechanism, use the enableCheckpointing(long, CheckpointingMode, boolean) method.
interval - Time interval between state checkpoints in milliseconds.

public StreamExecutionEnvironment enableCheckpointing()
Deprecated.
Enables checkpointing for the streaming job. Setting this option assumes that the job is used in production; unless stated otherwise by calling the setRestartStrategy method, the job will be resubmitted to the cluster indefinitely in case of failure.
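The enableCheckpointing overloads above can be sketched like this; a hedged example assuming Flink 1.x on the classpath, with the 10-second interval chosen only for illustration.

```scala
// Sketch: periodic checkpoints with an explicit checkpointing mode.
import org.apache.flink.streaming.api.CheckpointingMode
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Draw a checkpoint every 10 seconds with exactly-once guarantees.
env.enableCheckpointing(10000L, CheckpointingMode.EXACTLY_ONCE)

// Finer-grained settings live on the CheckpointConfig:
env.getCheckpointConfig.setMinPauseBetweenCheckpoints(500L)
```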
public CheckpointingMode getCheckpointingMode()
public StreamExecutionEnvironment setStateBackend(AbstractStateBackend backend)
Sets the state backend that describes how to store and checkpoint operator state. It defines how the key/value state of a KeyedStream is maintained (heap, managed memory, externally), and where state snapshots/checkpoints are stored, both for the key/value state and for checkpointed functions (implementing the interface Checkpointed).
The MemoryStateBackend for example maintains the state in heap memory, as objects. It is lightweight without extra dependencies, but can checkpoint only small states (some counters).
In contrast, the FsStateBackend stores checkpoints of the state (also maintained as heap objects) in files. When using a replicated file system (like HDFS, S3, MapR FS, Tachyon, etc.) this will guarantee that state is not lost upon failures of individual nodes and that the entire streaming program can be executed highly available and strongly consistent (assuming that Flink is run in high-availability mode).
backend - (undocumented)

public AbstractStateBackend getStateBackend()
Returns the state backend that defines how to store and checkpoint state.
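Choosing a state backend and a restart strategy together might look like the sketch below. FsStateBackend and RestartStrategies are real Flink 1.x classes, but the checkpoint path and the retry numbers are placeholders, not recommendations.

```scala
// Sketch: file-system state backend plus a fixed-delay restart strategy.
import org.apache.flink.api.common.restartstrategy.RestartStrategies
import org.apache.flink.runtime.state.filesystem.FsStateBackend
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Store checkpoints in a (replicated) file system so state survives node failures.
env.setStateBackend(new FsStateBackend("hdfs://namenode:9000/flink/checkpoints"))

// Retry the job up to 3 times, waiting 10 s between attempts.
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 10000L))
```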
public void setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration)
Sets the restart strategy configuration. The configuration specifies which restart strategy will be used for the execution graph in case of a restart.
restartStrategyConfiguration - Restart strategy configuration to be set

public RestartStrategies.RestartStrategyConfiguration getRestartStrategy()
Returns the specified restart strategy configuration.

public void setNumberOfExecutionRetries(int numRetries)
Deprecated.
This method will be replaced by setRestartStrategy(). The FixedDelayRestartStrategyConfiguration contains the number of execution retries.
numRetries - (undocumented)

public int getNumberOfExecutionRetries()
Deprecated.
This method will be replaced by getRestartStrategy. The FixedDelayRestartStrategyConfiguration contains the number of execution retries.

public <T extends com.esotericsoftware.kryo.Serializer<?>> void addDefaultKryoSerializer(Class<?> type, T serializer)
Adds a new Kryo default serializer to the Runtime.
type - The class of the types serialized with the given serializer.
serializer - The serializer to use.

public void addDefaultKryoSerializer(Class<?> type, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializerClass)
Adds a new Kryo default serializer to the Runtime.
type - The class of the types serialized with the given serializer.
serializerClass - The class of the serializer to use.

public <T extends com.esotericsoftware.kryo.Serializer<?>> void registerTypeWithKryoSerializer(Class<?> clazz, T serializer)
Registers the given type with the serializer at the KryoSerializer.
Note that the serializer instance must be serializable (as defined by java.io.Serializable), because it may be distributed to the worker nodes by java serialization.
clazz - (undocumented)
serializer - (undocumented)

public void registerTypeWithKryoSerializer(Class<?> clazz, Class<? extends com.esotericsoftware.kryo.Serializer<?>> serializer)
Registers the given type with the serializer at the KryoSerializer.
clazz - (undocumented)
serializer - (undocumented)

public void registerType(Class<?> typeClass)
Registers the given type with the serialization stack.
typeClass - (undocumented)

public void setStreamTimeCharacteristic(TimeCharacteristic characteristic)
Sets the time characteristic for all streams created from this environment, e.g., processing time, event time, or ingestion time.
If you set the characteristic to IngestionTime or EventTime this will set a default watermark update interval of 200 ms. If this is not applicable for your application you should change it using ExecutionConfig.setAutoWatermarkInterval(long).
characteristic - The time characteristic.

public TimeCharacteristic getStreamTimeCharacteristic()
Gets the time characteristic.
See also: setStreamTimeCharacteristic(org.apache.flink.streaming.api.TimeCharacteristic)

public DataStream<Object> generateSequence(long from, long to)
Creates a new DataStream that contains a sequence of numbers. This is a parallel source; if you manually set the parallelism to 1 the emitted elements are in order.
from - (undocumented)
to - (undocumented)

public <T> DataStream<T> fromElements(scala.collection.Seq<T> data, TypeInformation<T> evidence$1)
Creates a DataStream that contains the given elements.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
data - (undocumented)
evidence$1 - (undocumented)

public <T> DataStream<T> fromCollection(scala.collection.Seq<T> data, TypeInformation<T> evidence$2)
Creates a DataStream from the given non-empty Seq. The elements need to be serializable because the framework may move the elements into the cluster if needed.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
data - (undocumented)
evidence$2 - (undocumented)

public <T> DataStream<T> fromCollection(scala.collection.Iterator<T> data, TypeInformation<T> evidence$3)
Creates a DataStream from the given Iterator.
Note that this operation will result in a non-parallel data source, i.e. a data source with a parallelism of one.
data - (undocumented)
evidence$3 - (undocumented)

public <T> DataStream<T> fromParallelCollection(SplittableIterator<T> data, TypeInformation<T> evidence$4)
Creates a DataStream from the given SplittableIterator.
data - (undocumented)
evidence$4 - (undocumented)

public DataStream<String> readTextFile(String filePath)
Creates a DataStream that represents the Strings produced by reading the given file line wise.
filePath - (undocumented)

public DataStream<String> readTextFile(String filePath, String charsetName)
Creates a data stream that represents the Strings produced by reading the given file line wise.
filePath - (undocumented)
charsetName - (undocumented)

public <T> DataStream<T> readFile(FileInputFormat<T> inputFormat, String filePath, TypeInformation<T> evidence$5)
Reads the given file with the given input format.
inputFormat - (undocumented)
filePath - (undocumented)
evidence$5 - (undocumented)

public DataStream<String> readFileStream(String StreamPath, long intervalMillis, FileMonitoringFunction.WatchType watchType)
Creates a DataStream that contains the contents of files created while the system watches the given path.
StreamPath - (undocumented)
intervalMillis - (undocumented)
watchType - (undocumented)

public <T> DataStream<T> readFile(FileInputFormat<T> inputFormat, String filePath, FileProcessingMode watchType, long interval, FilePathFilter filter, TypeInformation<T> evidence$6)
Deprecated.
Use FileInputFormat#setFilesFilter(FilePathFilter) to set a filter and StreamExecutionEnvironment#readFile(FileInputFormat, String, FileProcessingMode, long) instead.
Reads the contents of the user-specified path based on the given FileInputFormat. Depending on the provided FileProcessingMode, the source may periodically monitor the path for new data, or process the data currently in the path once and exit.
inputFormat - The input format used to create the data stream
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path")
watchType - The mode in which the source should operate, i.e. monitor path and react to new data, or process once and exit
interval - In the case of periodic path monitoring, this specifies the interval (in millis) between consecutive path scans
filter - The files to be excluded from the processing
evidence$6 - (undocumented)

public <T> DataStream<T> readFile(FileInputFormat<T> inputFormat, String filePath, FileProcessingMode watchType, long interval, TypeInformation<T> evidence$7)
Reads the contents of the user-specified path based on the given FileInputFormat. Depending on the provided FileProcessingMode, the source may periodically monitor (every interval ms) the path for new data (FileProcessingMode.PROCESS_CONTINUOUSLY), or process once the data currently in the path and exit (FileProcessingMode.PROCESS_ONCE). In addition, if the path contains files not to be processed, the user can specify a custom FilePathFilter. As a default implementation you can use FilePathFilter.createDefaultFilter().
** NOTES ON CHECKPOINTING: ** If the watchType is set to FileProcessingMode#PROCESS_ONCE, the source monitors the path ** once **, creates the FileInputSplits to be processed, forwards them to the downstream readers to read the actual data, and exits, without waiting for the readers to finish reading. This implies that no more checkpoint barriers are going to be forwarded after the source exits, thus having no checkpoints after that point.
inputFormat - The input format used to create the data stream
filePath - The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path")
watchType - The mode in which the source should operate, i.e. monitor path and react to new data, or process once and exit
interval - In the case of periodic path monitoring, this specifies the interval (in millis) between consecutive path scans
evidence$7 - (undocumented)

public DataStream<String> socketTextStream(String hostname, int port, char delimiter, long maxRetry)
Creates a new DataStream that contains the strings received infinitely from the socket.
hostname - (undocumented)
port - (undocumented)
delimiter - (undocumented)
maxRetry - (undocumented)

public <T> DataStream<T> createInput(InputFormat<T,?> inputFormat, TypeInformation<T> evidence$8)
Generic method to create an input data stream with a specific input format.
inputFormat - (undocumented)
evidence$8 - (undocumented)

public <T> DataStream<T> addSource(SourceFunction<T> function, TypeInformation<T> evidence$9)
Create a DataStream using a user defined source function for arbitrary source functionality.
function - (undocumented)
evidence$9 - (undocumented)

public <T> DataStream<T> addSource(scala.Function1<SourceFunction.SourceContext<T>,scala.runtime.BoxedUnit> function, TypeInformation<T> evidence$10)
Create a DataStream using a user defined source function for arbitrary source functionality.
function - (undocumented)
evidence$10 - (undocumented)

public JobExecutionResult execute()
Triggers the program execution. The program execution will be logged and displayed with a generated default name.

public JobExecutionResult execute(String jobName)
Triggers the program execution. The program execution will be logged and displayed with the provided name.
jobName - (undocumented)

public String getExecutionPlan()
Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph.

public StreamGraph getStreamGraph()
Getter of the StreamGraph of the streaming job.

public StreamExecutionEnvironment getWrappedStreamExecutionEnvironment()
Getter of the wrapped StreamExecutionEnvironment.
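The function-based addSource overload above takes a SourceContext and can be sketched as follows; a hedged example assuming Flink 1.x on the classpath, with the counting logic invented for illustration.

```scala
// Sketch: a custom source built from a plain function over SourceContext[Long].
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Emit the numbers 0..99; the function receives a SourceContext[Long]
// and pushes elements downstream via collect().
val nums = env.addSource[Long] { ctx =>
  var i = 0L
  while (i < 100) {
    ctx.collect(i)
    i += 1
  }
}

nums.print()
env.execute("custom-source-sketch")
```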
public <F> F scalaClean(F f)
Returns a "closure-cleaned" version of the given function. Cleans only if closure cleaning is not disabled in the ExecutionConfig.
f - (undocumented)

public void registerCachedFile(String filePath, String name)
Registers a file at the distributed cache under the given name.
The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.

public void registerCachedFile(String filePath, String name, boolean executable)
Registers a file at the distributed cache under the given name.
The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access to the DistributedCache via RuntimeContext.getDistributedCache().
filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
name - The name under which the file is registered.
executable - flag indicating whether the file should be executable

Copyright © 2014–2018 The Apache Software Foundation. All rights reserved.