pyflink.datastream.connectors.kafka.Semantic
- class Semantic(value)
Semantics that can be chosen.
- Data
EXACTLY_ONCE:
The Flink producer will write all messages in a Kafka transaction that is committed to Kafka on a checkpoint. In this mode FlinkKafkaProducer sets up a pool of internal Kafka producers. Between checkpoints a new Kafka transaction is created, which is committed on FlinkKafkaProducer.notifyCheckpointComplete(long). If checkpoint-complete notifications are running late, FlinkKafkaProducer can run out of producers in the pool. In that case any subsequent FlinkKafkaProducer.snapshotState() requests will fail and FlinkKafkaProducer will keep using the producer from the previous checkpoint. To decrease the chances of failing checkpoints there are four options (a configuration sketch follows this list):
- decrease the number of max concurrent checkpoints
- make checkpoints more reliable (so that they complete faster)
- increase the delay between checkpoints
- increase the size of the FlinkKafkaProducers pool
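A minimal sketch of how these options map onto a PyFlink job that uses Semantic.EXACTLY_ONCE, assuming a local broker at localhost:9092; the topic name, config values, and intervals are illustrative placeholders, and the Kafka connector JAR must be on the job's classpath for this to run.

```python
from pyflink.common.serialization import SimpleStringSchema
from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.kafka import FlinkKafkaProducer, Semantic

env = StreamExecutionEnvironment.get_execution_environment()

# Checkpoint tuning along the lines of the options above (values are illustrative):
env.enable_checkpointing(60 * 1000)                              # checkpoint interval in ms
checkpoint_config = env.get_checkpoint_config()
checkpoint_config.set_max_concurrent_checkpoints(1)              # fewer concurrent checkpoints
checkpoint_config.set_min_pause_between_checkpoints(30 * 1000)   # more delay between checkpoints

producer = FlinkKafkaProducer(
    topic='example-topic',                        # hypothetical topic name
    serialization_schema=SimpleStringSchema(),
    producer_config={
        'bootstrap.servers': 'localhost:9092',    # assumed broker address
        # the transaction timeout should cover the longest expected checkpoint duration
        'transaction.timeout.ms': '900000',
    },
    kafka_producer_pool_size=10,                  # larger producer pool
    semantic=Semantic.EXACTLY_ONCE)

env.from_collection(['a', 'b', 'c'], type_info=Types.STRING()).add_sink(producer)
env.execute('exactly_once_kafka_sink_example')
```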
- Data
AT_LEAST_ONCE:
The Flink producer will wait for all outstanding messages in the Kafka buffers to be acknowledged by the Kafka producer on a checkpoint.
- Data
NONE:
Means that nothing will be guaranteed. Messages can be lost and/or duplicated in case of failure.
Attributes
EXACTLY_ONCE
AT_LEAST_ONCE
NONE