pyflink.datastream.data_stream.CachedDataStream.execute_and_collect#

CachedDataStream.execute_and_collect(job_execution_name: Optional[str] = None, limit: Optional[int] = None) → Union[pyflink.datastream.data_stream.CloseableIterator, list]#

Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream.

The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink’s REST API.

The returned iterator must be closed to free all cluster resources.

Parameters

job_execution_name – The name of the job execution.
limit – The limit for the collected elements.