pyflink.datastream.connectors.kafka.KafkaSourceBuilder#
- class KafkaSourceBuilder[source]#
The builder class for
KafkaSource
to make it easier for the users to construct aKafkaSource
.The following example shows the minimum setup to create a KafkaSource that reads the String values from a Kafka topic.
>>> source = KafkaSource.builder() \ ... .set_bootstrap_servers('MY_BOOTSTRAP_SERVERS') \ ... .set_topics('TOPIC1', 'TOPIC2') \ ... .set_value_only_deserializer(SimpleStringSchema()) \ ... .build()
The bootstrap servers, topics/partitions to consume, and the record deserializer are required fields that must be set.
To specify the starting offsets of the KafkaSource, one can call
set_starting_offsets()
.By default, the KafkaSource runs in an CONTINUOUS_UNBOUNDED mode and never stops until the Flink job is canceled or fails. To let the KafkaSource run in CONTINUOUS_UNBOUNDED but stops at some given offsets, one can call
set_stopping_offsets()
. For example the following KafkaSource stops after it consumes up to the latest partition offsets at the point when the Flink started.>>> source = KafkaSource.builder() \ ... .set_bootstrap_servers('MY_BOOTSTRAP_SERVERS') \ ... .set_topics('TOPIC1', 'TOPIC2') \ ... .set_value_only_deserializer(SimpleStringSchema()) \ ... .set_unbounded(KafkaOffsetsInitializer.latest()) \ ... .build()
New in version 1.16.0.
Methods
build
()set_bootstrap_servers
(bootstrap_servers)Sets the bootstrap servers for the KafkaConsumer of the KafkaSource.
set_bounded
(stopping_offsets_initializer)By default, the KafkaSource is set to run in CONTINUOUS_UNBOUNDED manner and thus never stops until the Flink job fails or is canceled.
set_client_id_prefix
(prefix)Sets the client id prefix of this KafkaSource.
set_group_id
(group_id)Sets the consumer group id of the KafkaSource.
set_partitions
(partitions)Set a set of partitions to consume from.
set_properties
(props)Set arbitrary properties for the KafkaSource and KafkaConsumer.
set_property
(key, value)Set an arbitrary property for the KafkaSource and KafkaConsumer.
set_starting_offsets
(...)Specify from which offsets the KafkaSource should start consume from by providing an
KafkaOffsetsInitializer
.set_topic_pattern
(topic_pattern)Set a topic pattern to consume from use the java Pattern.
set_topics
(*topics)Set a list of topics the KafkaSource should consume from.
set_unbounded
(stopping_offsets_initializer)By default, the KafkaSource is set to run in CONTINUOUS_UNBOUNDED manner and thus never stops until the Flink job fails or is canceled.
set_value_only_deserializer
(...)Sets the
DeserializationSchema
for deserializing the value of Kafka's ConsumerRecord.