Class DummyStreamExecutionEnvironment

  • All Implemented Interfaces:
    AutoCloseable

    public class DummyStreamExecutionEnvironment
    extends StreamExecutionEnvironment
    This is dummy StreamExecutionEnvironment, which holds a real StreamExecutionEnvironment, shares all configurations of the real environment, and disables all configuration setting methods.

    When translating relational plan to execution plan in the Planner, the generated Transformations will be added into StreamExecutionEnvironment's buffer, and they will be cleared only when StreamExecutionEnvironment.execute() method is called. Each TableEnvironment instance holds an immutable StreamExecutionEnvironment instance. If there are multiple translations (not all for `execute`, e.g. `explain` and then `execute`) in one TableEnvironment instance, the transformation buffer is dirty, and execution result may be incorrect.

    This dummy StreamExecutionEnvironment is only used for buffering the transformations generated in the planner. A new dummy StreamExecutionEnvironment instance should be created for each translation, and this could avoid dirty the transformation buffer of the real StreamExecutionEnvironment instance.

    All set methods (e.g. `setXX`, `enableXX`, `disableXX`, etc) are disabled to prohibit changing configuration, all get methods (e.g. `getXX`, `isXX`, etc) will be delegated to the real StreamExecutionEnvironment. `execute`, `getStreamGraph`, `getExecutionPlan` methods are also disabled, while `addOperator` method is enabled to allow the planner to add the generated transformations to the dummy StreamExecutionEnvironment.

    This class could be removed once the StreamTableSource interface and StreamTableSink interface are reworked.

    NOTE: Please remove com.esotericsoftware.kryo item in the whitelist of checkCodeDependencies() method in test_table_shaded_dependencies.sh end-to-end test when this class is removed.

    • Method Detail

      • setParallelism

        public StreamExecutionEnvironment setParallelism​(int parallelism)
        Description copied from class: StreamExecutionEnvironment
        Sets the parallelism for operations executed through this environment. Setting a parallelism of x here will cause all operators (such as map, batchReduce) to run with x parallel instances. This method overrides the default parallelism for this environment. The LocalStreamEnvironment uses by default a value equal to the number of hardware contexts (CPU cores / threads). When executing the program via the command line client from a JAR file, the default degree of parallelism is the one configured for that setup.
        Overrides:
        setParallelism in class StreamExecutionEnvironment
        Parameters:
        parallelism - The parallelism
      • setMaxParallelism

        public StreamExecutionEnvironment setMaxParallelism​(int maxParallelism)
        Description copied from class: StreamExecutionEnvironment
        Sets the maximum degree of parallelism defined for the program. The upper limit (inclusive) is Short.MAX_VALUE + 1.

        The maximum degree of parallelism specifies the upper limit for dynamic scaling. It also defines the number of key groups used for partitioned state.

        Overrides:
        setMaxParallelism in class StreamExecutionEnvironment
        Parameters:
        maxParallelism - Maximum degree of parallelism to be used for the program., with 0 < maxParallelism <= 2^15.
      • getParallelism

        public int getParallelism()
        Description copied from class: StreamExecutionEnvironment
        Gets the parallelism with which operation are executed by default. Operations can individually override this value to use a specific parallelism.
        Overrides:
        getParallelism in class StreamExecutionEnvironment
        Returns:
        The parallelism used by operations, unless they override that value.
      • getMaxParallelism

        public int getMaxParallelism()
        Description copied from class: StreamExecutionEnvironment
        Gets the maximum degree of parallelism defined for the program.

        The maximum degree of parallelism specifies the upper limit for dynamic scaling. It also defines the number of key groups used for partitioned state.

        Overrides:
        getMaxParallelism in class StreamExecutionEnvironment
        Returns:
        Maximum degree of parallelism
      • setBufferTimeout

        public StreamExecutionEnvironment setBufferTimeout​(long timeoutMillis)
        Description copied from class: StreamExecutionEnvironment
        Sets the maximum time frequency (milliseconds) for the flushing of the output buffers. By default the output buffers flush frequently to provide low latency and to aid smooth developer experience. Setting the parameter can result in three logical modes:
        • A positive integer triggers flushing periodically by that integer
        • 0 triggers flushing after every record thus minimizing latency
        • -1 triggers flushing only when the output buffer is full thus maximizing throughput
        Overrides:
        setBufferTimeout in class StreamExecutionEnvironment
        Parameters:
        timeoutMillis - The maximum time between two output flushes.
      • enableCheckpointing

        public StreamExecutionEnvironment enableCheckpointing​(long interval)
        Description copied from class: StreamExecutionEnvironment
        Enables checkpointing for the streaming job. The distributed state of the streaming dataflow will be periodically snapshotted. In case of a failure, the streaming dataflow will be restarted from the latest completed checkpoint. This method selects CheckpointingMode.EXACTLY_ONCE guarantees.

        The job draws checkpoints periodically, in the given interval. The state will be stored in the configured state backend.

        NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing.

        Overrides:
        enableCheckpointing in class StreamExecutionEnvironment
        Parameters:
        interval - Time interval between state checkpoints in milliseconds.
      • enableCheckpointing

        public StreamExecutionEnvironment enableCheckpointing​(long interval,
                                                              CheckpointingMode mode)
        Description copied from class: StreamExecutionEnvironment
        Enables checkpointing for the streaming job. The distributed state of the streaming dataflow will be periodically snapshotted. In case of a failure, the streaming dataflow will be restarted from the latest completed checkpoint.

        The job draws checkpoints periodically, in the given interval. The system uses the given CheckpointingMode for the checkpointing ("exactly once" vs "at least once"). The state will be stored in the configured state backend.

        NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing.

        Overrides:
        enableCheckpointing in class StreamExecutionEnvironment
        Parameters:
        interval - Time interval between state checkpoints in milliseconds.
        mode - The checkpointing mode, selecting between "exactly once" and "at least once" guaranteed.
      • enableCheckpointing

        public StreamExecutionEnvironment enableCheckpointing​(long interval,
                                                              CheckpointingMode mode)
        Description copied from class: StreamExecutionEnvironment
        Enables checkpointing for the streaming job. The distributed state of the streaming dataflow will be periodically snapshotted. In case of a failure, the streaming dataflow will be restarted from the latest completed checkpoint.

        The job draws checkpoints periodically, in the given interval. The system uses the given CheckpointingMode for the checkpointing ("exactly once" vs "at least once"). The state will be stored in the configured state backend.

        NOTE: Checkpointing iterative streaming dataflows is not properly supported at the moment. For that reason, iterative jobs will not be started if used with enabled checkpointing.

        Overrides:
        enableCheckpointing in class StreamExecutionEnvironment
        Parameters:
        interval - Time interval between state checkpoints in milliseconds.
        mode - The checkpointing mode, selecting between "exactly once" and "at least once" guaranteed.
      • getCheckpointInterval

        public long getCheckpointInterval()
        Description copied from class: StreamExecutionEnvironment
        Returns the checkpointing interval or -1 if checkpointing is disabled.

        Shorthand for getCheckpointConfig().getCheckpointInterval().

        Overrides:
        getCheckpointInterval in class StreamExecutionEnvironment
        Returns:
        The checkpointing interval or -1
      • execute

        public JobExecutionResult execute()
                                   throws Exception
        Description copied from class: StreamExecutionEnvironment
        Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results or forwarding them to a message queue.

        The program execution will be logged and displayed with a generated default name.

        Overrides:
        execute in class StreamExecutionEnvironment
        Returns:
        The result of the job execution, containing elapsed time and accumulators.
        Throws:
        Exception - which occurs during job execution.
      • execute

        public JobExecutionResult execute​(String jobName)
                                   throws Exception
        Description copied from class: StreamExecutionEnvironment
        Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results or forwarding them to a message queue.

        The program execution will be logged and displayed with the provided name

        Overrides:
        execute in class StreamExecutionEnvironment
        Parameters:
        jobName - Desired name of the job
        Returns:
        The result of the job execution, containing elapsed time and accumulators.
        Throws:
        Exception - which occurs during job execution.
      • execute

        public JobExecutionResult execute​(StreamGraph streamGraph)
                                   throws Exception
        Description copied from class: StreamExecutionEnvironment
        Triggers the program execution. The environment will execute all parts of the program that have resulted in a "sink" operation. Sink operations are for example printing results or forwarding them to a message queue.
        Overrides:
        execute in class StreamExecutionEnvironment
        Parameters:
        streamGraph - the stream graph representing the transformations
        Returns:
        The result of the job execution, containing elapsed time and accumulators.
        Throws:
        Exception - which occurs during job execution.
      • registerCachedFile

        public void registerCachedFile​(String filePath,
                                       String name,
                                       boolean executable)
        Description copied from class: StreamExecutionEnvironment
        Registers a file at the distributed cache under the given name. The file will be accessible from any user-defined function in the (distributed) runtime under a local path. Files may be local files (which will be distributed via BlobServer), or files in a distributed file system. The runtime will copy the files temporarily to a local cache, if needed.

        The RuntimeContext can be obtained inside UDFs via RichFunction.getRuntimeContext() and provides access DistributedCache via RuntimeContext.getDistributedCache().

        Overrides:
        registerCachedFile in class StreamExecutionEnvironment
        Parameters:
        filePath - The path of the file, as a URI (e.g. "file:///some/path" or "hdfs://host:port/and/path")
        name - The name under which the file is registered.
        executable - flag indicating whether the file should be executable
      • getExecutionPlan

        public String getExecutionPlan()
        Description copied from class: StreamExecutionEnvironment
        Creates the plan with which the system will execute the program, and returns it as a String using a JSON representation of the execution data flow graph. Note that this needs to be called, before the plan is executed.
        Overrides:
        getExecutionPlan in class StreamExecutionEnvironment
        Returns:
        The execution plan of the program, as a JSON String.