pyflink.datastream.connectors.pulsar.PulsarSourceBuilder
class PulsarSourceBuilder
The builder class for PulsarSource to make it easier for the users to construct a PulsarSource.
The following example shows the minimum setup to create a PulsarSource that reads the String values from a Pulsar topic.
Example:
>>> source = PulsarSource.builder() \
...     .set_service_url(PULSAR_BROKER_URL) \
...     .set_admin_url(PULSAR_BROKER_HTTP_URL) \
...     .set_subscription_name("flink-source-1") \
...     .set_topics([TOPIC1, TOPIC2]) \
...     .set_deserialization_schema(
...         PulsarDeserializationSchema.flink_schema(SimpleStringSchema())) \
...     .build()
The service URL, admin URL, subscription name, topics to consume, and record deserializer are required and must be set before calling build().
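The required fields can be checked before the builder is invoked. The following is a minimal sketch, not part of PyFlink: the setting names in REQUIRED_SETTINGS and the helpers missing_settings and build_source are hypothetical and exist only for this illustration. Only the builder methods listed on this page are called.

```python
from typing import Dict, List

# Hypothetical labels for the five required settings named above; they are
# keys of a plain dict in this sketch, not PyFlink identifiers.
REQUIRED_SETTINGS = (
    "service_url",
    "admin_url",
    "subscription_name",
    "topics",
    "deserialization_schema",
)


def missing_settings(settings: Dict[str, object]) -> List[str]:
    """Return the required settings that are absent or empty."""
    return [name for name in REQUIRED_SETTINGS if not settings.get(name)]


def build_source(settings: Dict[str, object]):
    """Build a PulsarSource, failing early if a required setting is missing."""
    missing = missing_settings(settings)
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    # Imported here so the validation helper can be used without PyFlink.
    from pyflink.datastream.connectors.pulsar import PulsarSource

    return (
        PulsarSource.builder()
        .set_service_url(settings["service_url"])
        .set_admin_url(settings["admin_url"])
        .set_subscription_name(settings["subscription_name"])
        .set_topics(settings["topics"])
        .set_deserialization_schema(settings["deserialization_schema"])
        .build()
    )
```

Checking on the Python side gives a readable error message before any call reaches the Java builder.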
To specify the starting position of PulsarSource, one can call set_start_cursor(StartCursor).
By default, the PulsarSource runs in Boundedness.CONTINUOUS_UNBOUNDED mode and never stops until the Flink job is canceled or fails. To let the PulsarSource run in Boundedness.CONTINUOUS_UNBOUNDED mode but stop at a given position, call set_unbounded_stop_cursor(StopCursor).
For example, the following PulsarSource uses set_bounded_stop_cursor to stop once it has consumed the messages published up to the time at which the source was created.
Example:
>>> import time
>>> source = PulsarSource.builder() \
...     .set_service_url(PULSAR_BROKER_URL) \
...     .set_admin_url(PULSAR_BROKER_HTTP_URL) \
...     .set_subscription_name("flink-source-1") \
...     .set_topics([TOPIC1, TOPIC2]) \
...     .set_deserialization_schema(
...         PulsarDeserializationSchema.flink_schema(SimpleStringSchema())) \
...     .set_bounded_stop_cursor(StopCursor.at_publish_time(int(time.time() * 1000))) \
...     .build()
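The stop cursor in the example takes a timestamp in milliseconds, while time.time() returns seconds. The sketch below separates that conversion from the builder call and shows how the same cursor could be attached in either bounded or unbounded mode. The helper names millis_since_epoch and with_stop_at_publish_time are hypothetical. The builder argument is assumed to be a PulsarSourceBuilder obtained from PulsarSource.builder().

```python
import time
from typing import Optional


def millis_since_epoch(seconds: float) -> int:
    """Convert a Unix timestamp in seconds to whole milliseconds."""
    return int(seconds * 1000)


def with_stop_at_publish_time(builder, stop_seconds: Optional[float] = None,
                              bounded: bool = True):
    """Attach a publish-time StopCursor to an existing PulsarSourceBuilder.

    If stop_seconds is None, the current wall-clock time is used, as in the
    example above.
    """
    # Imported here so the conversion helper works without PyFlink installed.
    from pyflink.datastream.connectors.pulsar import StopCursor

    if stop_seconds is None:
        stop_seconds = time.time()
    cursor = StopCursor.at_publish_time(millis_since_epoch(stop_seconds))
    if bounded:
        # Bounded mode, as in the example above.
        return builder.set_bounded_stop_cursor(cursor)
    # Stay in CONTINUOUS_UNBOUNDED mode but stop at the cursor.
    return builder.set_unbounded_stop_cursor(cursor)
```

Passing bounded=False selects set_unbounded_stop_cursor, the method described in the text above.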
Methods
build()
    Build the PulsarSource.
set_admin_url(admin_url)
    Sets the admin endpoint for the PulsarAdmin of the PulsarSource.
set_bounded_stop_cursor(stop_cursor)
    By default the PulsarSource runs in Boundedness.CONTINUOUS_UNBOUNDED mode and thus never stops until the Flink job fails or is canceled. Use this method to run in bounded mode and stop at the given StopCursor.
set_config(key, value)
    Sets an arbitrary property for the PulsarSource and PulsarConsumer.
set_config_with_dict(config)
    Sets arbitrary properties for the PulsarSource and PulsarConsumer.
set_deserialization_schema(...)
    Required. The deserialization schema provides the Schema used to deserialize messages from Pulsar and the TypeInformation used to serialize messages in Flink.
set_properties(config)
    Sets arbitrary properties for the PulsarSource and PulsarConsumer.
set_service_url(service_url)
    Sets the service URL for the PulsarConsumer of the PulsarSource.
set_start_cursor(start_cursor)
    Specifies the position from which the PulsarSource should start consuming, given as a StartCursor.
set_subscription_name(subscription_name)
    Sets the name for this Pulsar subscription.
set_subscription_type(subscription_type)
    SubscriptionType defines the consuming behavior for Pulsar; different splits are generated depending on the given subscription type.
set_topic_pattern(topic_pattern)
    Sets the topic pattern to consume from, given as a Java regex string.
set_topics(topics)
    Sets the list of Pulsar topics for the Flink source.
set_topics_pattern(topics_pattern)
    Sets the topic pattern to consume from, given as a Java regex string.
set_unbounded_stop_cursor(stop_cursor)
    By default the PulsarSource runs in Boundedness.CONTINUOUS_UNBOUNDED mode and thus never stops until the Flink job fails or is canceled. Use this method to keep that mode but stop at the given StopCursor.
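As an illustration of set_config_with_dict, the sketch below renders option values as strings before handing them to the builder. It rests on several assumptions. The helpers stringify_config and apply_config are hypothetical. Passing string values is a conservative choice, not a documented requirement. The option key shown in the usage comment is assumed to be valid for the connector version in use and should be checked against the PulsarSourceOptions reference.

```python
from typing import Dict, Mapping


def stringify_config(options: Mapping[str, object]) -> Dict[str, str]:
    """Render option values as strings: True -> "true", 30000 -> "30000"."""
    rendered = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # Lowercase booleans rather than Python's "True"/"False".
            rendered[key] = "true" if value else "false"
        else:
            rendered[key] = str(value)
    return rendered


def apply_config(builder, options: Mapping[str, object]):
    """Pass the rendered options to a PulsarSourceBuilder and return it."""
    builder.set_config_with_dict(stringify_config(options))
    return builder


# Usage, assuming `builder` came from PulsarSource.builder() and that the key
# below exists in the connector version in use:
# apply_config(builder, {"pulsar.source.partitionDiscoveryIntervalMs": 60000})
```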