Class DummyStreamExecutionEnvironment
- java.lang.Object
-
- org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
-
- org.apache.flink.table.planner.utils.DummyStreamExecutionEnvironment
-
- All Implemented Interfaces:
AutoCloseable
public class DummyStreamExecutionEnvironment extends StreamExecutionEnvironment
This is dummyStreamExecutionEnvironment
, which holds a realStreamExecutionEnvironment
, shares all configurations of the real environment, and disables all configuration setting methods.When translating relational plan to execution plan in the
Planner
, the generatedTransformation
s will be added into StreamExecutionEnvironment's buffer, and they will be cleared only whenStreamExecutionEnvironment.execute()
method is called. EachTableEnvironment
instance holds an immutable StreamExecutionEnvironment instance. If there are multiple translations (not all for `execute`, e.g. `explain` and then `execute`) in one TableEnvironment instance, the transformation buffer is dirty, and execution result may be incorrect.This dummy StreamExecutionEnvironment is only used for buffering the transformations generated in the planner. A new dummy StreamExecutionEnvironment instance should be created for each translation, and this could avoid dirty the transformation buffer of the real StreamExecutionEnvironment instance.
All set methods (e.g. `setXX`, `enableXX`, `disableXX`, etc) are disabled to prohibit changing configuration, all get methods (e.g. `getXX`, `isXX`, etc) will be delegated to the real StreamExecutionEnvironment. `execute`, `getStreamGraph`, `getExecutionPlan` methods are also disabled, while `addOperator` method is enabled to allow the planner to add the generated transformations to the dummy StreamExecutionEnvironment.
This class could be removed once the
StreamTableSource
interface andStreamTableSink
interface are reworked.NOTE: Please remove
com.esotericsoftware.kryo
item in the whitelist of checkCodeDependencies() method intest_table_shaded_dependencies.sh
end-to-end test when this class is removed.
-
-
Field Summary
-
Fields inherited from class org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
cacheFile, checkpointCfg, config, configuration, transformations
-
-
Constructor Summary
Constructors Constructor Description DummyStreamExecutionEnvironment(StreamExecutionEnvironment realExecEnv)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description StreamExecutionEnvironment
disableOperatorChaining()
Disables operator chaining for streaming operators.StreamExecutionEnvironment
enableCheckpointing(long interval)
Enables checkpointing for the streaming job.StreamExecutionEnvironment
enableCheckpointing(long interval, CheckpointingMode mode)
Enables checkpointing for the streaming job.StreamExecutionEnvironment
enableCheckpointing(long interval, CheckpointingMode mode)
Enables checkpointing for the streaming job.JobExecutionResult
execute()
Triggers the program execution.JobExecutionResult
execute(String jobName)
Triggers the program execution.JobExecutionResult
execute(StreamGraph streamGraph)
Triggers the program execution.long
getBufferTimeout()
Gets the maximum time frequency (milliseconds) for the flushing of the output buffers.List<Tuple2<String,DistributedCache.DistributedCacheEntry>>
getCachedFiles()
Get the list of cached files that were registered for distribution among the task managers.CheckpointConfig
getCheckpointConfig()
Gets the checkpoint config, which defines values like checkpoint interval, delay between checkpoints, etc.CheckpointingMode
getCheckpointingConsistencyMode()
Returns the checkpointing consistency mode (exactly-once vs. at-least-once).CheckpointingMode
getCheckpointingMode()
Returns the checkpointing mode (exactly-once vs. at-least-once).long
getCheckpointInterval()
Returns the checkpointing interval or -1 if checkpointing is disabled.ExecutionConfig
getConfig()
Gets the config object.ReadableConfig
getConfiguration()
Gives read-only access to the underlying configuration of this environment.String
getExecutionPlan()
Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph.int
getMaxParallelism()
Gets the maximum degree of parallelism defined for the program.int
getParallelism()
Gets the parallelism with which operation are executed by default.StreamGraph
getStreamGraph()
Getter of theStreamGraph
of the streaming job.boolean
isChainingEnabled()
Returns whether operator chaining is enabled.void
registerCachedFile(String filePath, String name)
Registers a file at the distributed cache under the given name.void
registerCachedFile(String filePath, String name, boolean executable)
Registers a file at the distributed cache under the given name.StreamExecutionEnvironment
setBufferTimeout(long timeoutMillis)
Sets the maximum time frequency (milliseconds) for the flushing of the output buffers.StreamExecutionEnvironment
setMaxParallelism(int maxParallelism)
Sets the maximum degree of parallelism defined for the program.StreamExecutionEnvironment
setParallelism(int parallelism)
Sets the parallelism for operations executed through this environment.-
Methods inherited from class org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
addOperator, addSource, addSource, addSource, addSource, areExplicitEnvironmentsAllowed, clean, clearJobListeners, close, configure, configure, createInput, createInput, createLocalEnvironment, createLocalEnvironment, createLocalEnvironment, createLocalEnvironment, createLocalEnvironmentWithWebUI, createRemoteEnvironment, createRemoteEnvironment, createRemoteEnvironment, enableChangelogStateBackend, executeAsync, executeAsync, executeAsync, fromCollection, fromCollection, fromCollection, fromCollection, fromData, fromData, fromData, fromData, fromData, fromElements, fromElements, fromParallelCollection, fromParallelCollection, fromSequence, fromSource, fromSource, generateSequence, generateStreamGraph, getDefaultLocalParallelism, getDefaultSavepointDirectory, getExecutionEnvironment, getExecutionEnvironment, getJobListeners, getStreamGraph, getTransformations, getUserClassloader, initializeContextEnvironment, invalidateClusterDataset, isChainingOfOperatorsWithDifferentMaxParallelismEnabled, isChangelogStateBackendEnabled, isForceUnalignedCheckpoints, isUnalignedCheckpointsEnabled, listCompletedClusterDatasets, readFile, readFile, readFile, readFile, readFileStream, registerCacheTransformation, registerCollectIterator, registerJobListener, registerSlotSharingGroup, resetContextEnvironment, setDefaultLocalParallelism, setDefaultSavepointDirectory, setDefaultSavepointDirectory, setDefaultSavepointDirectory, setRuntimeMode, socketTextStream, socketTextStream, socketTextStream, socketTextStream, socketTextStream
-
-
-
-
Constructor Detail
-
DummyStreamExecutionEnvironment
public DummyStreamExecutionEnvironment(StreamExecutionEnvironment realExecEnv)
-
-
Method Detail
-
getConfig
public ExecutionConfig getConfig()
Description copied from class:StreamExecutionEnvironment
Gets the config object.- Overrides:
getConfig
in classStreamExecutionEnvironment
-
getConfiguration
public ReadableConfig getConfiguration()
Description copied from class:StreamExecutionEnvironment
Gives read-only access to the underlying configuration of this environment.Note that the returned configuration might not be complete. It only contains options that have initialized the environment via
StreamExecutionEnvironment(Configuration)
or options that are not represented in dedicated configuration classes such asExecutionConfig
orCheckpointConfig
.Use
StreamExecutionEnvironment.configure(ReadableConfig, ClassLoader)
to set options that are specific to this environment.- Overrides:
getConfiguration
in classStreamExecutionEnvironment
-
getCachedFiles
public List<Tuple2<String,DistributedCache.DistributedCacheEntry>> getCachedFiles()
Description copied from class:StreamExecutionEnvironment
Get the list of cached files that were registered for distribution among the task managers.- Overrides:
getCachedFiles
in classStreamExecutionEnvironment
-
setParallelism
public StreamExecutionEnvironment setParallelism(int parallelism)
Description copied from class:StreamExecutionEnvironment
Sets the parallelism for operations executed through this environment. Setting a parallelism of x here will cause all operators (such as map, batchReduce) to run with x parallel instances. This method overrides the default parallelism for this environment. TheLocalStreamEnvironment
uses by default a value equal to the number of hardware contexts (CPU cores / threads). When executing the program via the command line client from a JAR file, the default degree of parallelism is the one configured for that setup.- Overrides:
setParallelism
in classStreamExecutionEnvironment
- Parameters:
parallelism
- The parallelism
-
setMaxParallelism
public StreamExecutionEnvironment setMaxParallelism(int maxParallelism)
Description copied from class:StreamExecutionEnvironment
Sets the maximum degree of parallelism defined for the program. The upper limit (inclusive) is Short.MAX_VALUE + 1.The maximum degree of parallelism specifies the upper limit for dynamic scaling. It also defines the number of key groups used for partitioned state.
- Overrides:
setMaxParallelism
in classStreamExecutionEnvironment
- Parameters:
maxParallelism
- Maximum degree of parallelism to be used for the program., with0 < maxParallelism <= 2^15
.
-
getParallelism
public int getParallelism()
Description copied from class:StreamExecutionEnvironment
Gets the parallelism with which operation are executed by default. Operations can individually override this value to use a specific parallelism.- Overrides:
getParallelism
in classStreamExecutionEnvironment
- Returns:
- The parallelism used by operations, unless they override that value.
-
getMaxParallelism
public int getMaxParallelism()
Description copied from class:StreamExecutionEnvironment
Gets the maximum degree of parallelism defined for the program.The maximum degree of parallelism specifies the upper limit for dynamic scaling. It also defines the number of key groups used for partitioned state.
- Overrides:
getMaxParallelism
in classStreamExecutionEnvironment
- Returns:
- Maximum degree of parallelism
-
setBufferTimeout
public StreamExecutionEnvironment setBufferTimeout(long timeoutMillis)
Description copied from class:StreamExecutionEnvironment
Sets the maximum time frequency (milliseconds) for the flushing of the output buffers. By default the output buffers flush frequently to provide low latency and to aid smooth developer experience. Setting the parameter can result in three logical modes:- A positive integer triggers flushing periodically by that integer
- 0 triggers flushing after every record thus minimizing latency
- -1 triggers flushing only when the output buffer is full thus maximizing throughput
- Overrides:
setBufferTimeout
in classStreamExecutionEnvironment
- Parameters:
timeoutMillis
- The maximum time between two output flushes.
-
getBufferTimeout
public long getBufferTimeout()
Description copied from class:StreamExecutionEnvironment
Gets the maximum time frequency (milliseconds) for the flushing of the output buffers. For clarification on the extremal values seeStreamExecutionEnvironment.setBufferTimeout(long)
.- Overrides:
getBufferTimeout
in classStreamExecutionEnvironment
- Returns:
- The timeout of the buffer.
-
disableOperatorChaining
public StreamExecutionEnvironment disableOperatorChaining()
Description copied from class:StreamExecutionEnvironment
Disables operator chaining for streaming operators. Operator chaining allows non-shuffle operations to be co-located in the same thread fully avoiding serialization and de-serialization.- Overrides:
disableOperatorChaining
in classStreamExecutionEnvironment
- Returns:
- StreamExecutionEnvironment with chaining disabled.
-
isChainingEnabled
public boolean isChainingEnabled()
Description copied from class:StreamExecutionEnvironment
Returns whether operator chaining is enabled.- Overrides:
isChainingEnabled
in classStreamExecutionEnvironment
- Returns:
true
if chaining is enabled, false otherwise.
-
getCheckpointConfig
public CheckpointConfig getCheckpointConfig()
Description copied from class:StreamExecutionEnvironment
Gets the checkpoint config, which defines values like checkpoint interval, delay between checkpoints, etc.- Overrides:
getCheckpointConfig
in classStreamExecutionEnvironment
- Returns:
- The checkpoint config.
-
enableCheckpointing
public StreamExecutionEnvironment enableCheckpointing(long interval)
Description copied from class:StreamExecutionEnvironment
Enables checkpointing for the streaming job. The distributed state of the streaming dataflow will be periodically snapshotted. In case of a failure, the streaming dataflow will be restarted from the latest completed checkpoint. This method selectsCheckpointingMode.EXACTLY_ONCE
guarantees.The job draws checkpoints periodically, in the given interval. The state will be stored in the configured state backend.
NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing.
- Overrides:
enableCheckpointing
in classStreamExecutionEnvironment
- Parameters:
interval
- Time interval between state checkpoints in milliseconds.
-
enableCheckpointing
public StreamExecutionEnvironment enableCheckpointing(long interval, CheckpointingMode mode)
Description copied from class:StreamExecutionEnvironment
Enables checkpointing for the streaming job. The distributed state of the streaming dataflow will be periodically snapshotted. In case of a failure, the streaming dataflow will be restarted from the latest completed checkpoint.The job draws checkpoints periodically, in the given interval. The system uses the given
CheckpointingMode
for the checkpointing ("exactly once" vs "at least once"). The state will be stored in the configured state backend.NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing.
- Overrides:
enableCheckpointing
in classStreamExecutionEnvironment
- Parameters:
interval
- Time interval between state checkpoints in milliseconds.mode
- The checkpointing mode, selecting between "exactly once" and "at least once" guaranteed.
-
enableCheckpointing
public StreamExecutionEnvironment enableCheckpointing(long interval, CheckpointingMode mode)
Description copied from class:StreamExecutionEnvironment
Enables checkpointing for the streaming job. The distributed state of the streaming dataflow will be periodically snapshotted. In case of a failure, the streaming dataflow will be restarted from the latest completed checkpoint.The job draws checkpoints periodically, in the given interval. The system uses the given
CheckpointingMode
for the checkpointing ("exactly once" vs "at least once"). The state will be stored in the configured state backend.NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing.
- Overrides:
enableCheckpointing
in classStreamExecutionEnvironment
- Parameters:
interval
- Time interval between state checkpoints in milliseconds.mode
- The checkpointing mode, selecting between "exactly once" and "at least once" guaranteed.
-
getCheckpointInterval
public long getCheckpointInterval()
Description copied from class:StreamExecutionEnvironment
Returns the checkpointing interval or -1 if checkpointing is disabled.Shorthand for
getCheckpointConfig().getCheckpointInterval()
.- Overrides:
getCheckpointInterval
in classStreamExecutionEnvironment
- Returns:
- The checkpointing interval or -1
-
getCheckpointingMode
public CheckpointingMode getCheckpointingMode()
Description copied from class:StreamExecutionEnvironment
Returns the checkpointing mode (exactly-once vs. at-least-once).Shorthand for
getCheckpointConfig().getCheckpointingMode()
.- Overrides:
getCheckpointingMode
in classStreamExecutionEnvironment
- Returns:
- The checkpoint mode
-
getCheckpointingConsistencyMode
public CheckpointingMode getCheckpointingConsistencyMode()
Description copied from class:StreamExecutionEnvironment
Returns the checkpointing consistency mode (exactly-once vs. at-least-once).Shorthand for
getCheckpointConfig().getCheckpointingConsistencyMode()
.- Overrides:
getCheckpointingConsistencyMode
in classStreamExecutionEnvironment
- Returns:
- The checkpoint mode
-
execute
public JobExecutionResult execute() throws Exception
Description copied from class:StreamExecutionEnvironment
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results or forwarding them to a message queue.The program execution will be logged and displayed with a generated default name.
- Overrides:
execute
in classStreamExecutionEnvironment
- Returns:
- The result of the job execution, containing elapsed time and accumulators.
- Throws:
Exception
- which occurs during job execution.
-
execute
public JobExecutionResult execute(String jobName) throws Exception
Description copied from class:StreamExecutionEnvironment
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results or forwarding them to a message queue.The program execution will be logged and displayed with the provided name
- Overrides:
execute
in classStreamExecutionEnvironment
- Parameters:
jobName
- Desired name of the job- Returns:
- The result of the job execution, containing elapsed time and accumulators.
- Throws:
Exception
- which occurs during job execution.
-
execute
public JobExecutionResult execute(StreamGraph streamGraph) throws Exception
Description copied from class:StreamExecutionEnvironment
Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results or forwarding them to a message queue.- Overrides:
execute
in classStreamExecutionEnvironment
- Parameters:
streamGraph
- the stream graph representing the transformations- Returns:
- The result of the job execution, containing elapsed time and accumulators.
- Throws:
Exception
- which occurs during job execution.
-
registerCachedFile
public void registerCachedFile(String filePath, String name)
Description copied from class:StreamExecutionEnvironment
Registers a file at the distributed cache under the given name. The file will be accessible from any user-defined function in the (distributed) runtime under a local path. Files may be local files (which will be distributed via BlobServer), or files in a distributed file system. The runtime will copy the files temporarily to a local cache, if needed.The
RuntimeContext
can be obtained inside UDFs viaRichFunction.getRuntimeContext()
and provides accessDistributedCache
viaRuntimeContext.getDistributedCache()
.- Overrides:
registerCachedFile
in classStreamExecutionEnvironment
- Parameters:
filePath
- The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")name
- The name under which the file is registered.
-
registerCachedFile
public void registerCachedFile(String filePath, String name, boolean executable)
Description copied from class:StreamExecutionEnvironment
Registers a file at the distributed cache under the given name. The file will be accessible from any user-defined function in the (distributed) runtime under a local path. Files may be local files (which will be distributed via BlobServer), or files in a distributed file system. The runtime will copy the files temporarily to a local cache, if needed.The
RuntimeContext
can be obtained inside UDFs viaRichFunction.getRuntimeContext()
and provides accessDistributedCache
viaRuntimeContext.getDistributedCache()
.- Overrides:
registerCachedFile
in classStreamExecutionEnvironment
- Parameters:
filePath
- The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")name
- The name under which the file is registered.executable
- flag indicating whether the file should be executable
-
getStreamGraph
public StreamGraph getStreamGraph()
Description copied from class:StreamExecutionEnvironment
Getter of theStreamGraph
of the streaming job. This call clears previously registeredtransformations
.- Overrides:
getStreamGraph
in classStreamExecutionEnvironment
- Returns:
- The stream graph representing the transformations
-
getExecutionPlan
public String getExecutionPlan()
Description copied from class:StreamExecutionEnvironment
Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph. Note that this needs to be called, before the plan is executed.- Overrides:
getExecutionPlan
in classStreamExecutionEnvironment
- Returns:
- The execution plan of the program, as a JSON String.
-
-