public class KafkaSourceBuilder<OUT> extends Object

The builder class for KafkaSource, to make it easier for the users to construct a KafkaSource.
The following example shows the minimum setup to create a KafkaSource that reads String values from a Kafka topic.
KafkaSource<String> source = KafkaSource
.<String>builder()
.setBootstrapServers(MY_BOOTSTRAP_SERVERS)
.setTopics(Arrays.asList(TOPIC1, TOPIC2))
.setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
.build();
The bootstrap servers, topics/partitions to consume, and the record deserializer are required fields that must be set.
To specify the starting offsets of the KafkaSource, one can call setStartingOffsets(OffsetsInitializer).
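For instance, a sketch of starting from the consumer group's committed offsets (the group id is hypothetical; the other constants are assumed to be the same as in the example above):

```java
// Illustrative sketch, not from the original docs: start from the consumer
// group's committed offsets, falling back to the earliest offsets if the
// group has no committed offsets yet.
KafkaSource<String> source = KafkaSource
        .<String>builder()
        .setBootstrapServers(MY_BOOTSTRAP_SERVERS)
        .setTopics(Arrays.asList(TOPIC1, TOPIC2))
        .setGroupId("my-consumer-group")
        .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
        .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
        .build();
```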
By default the KafkaSource runs in Boundedness.CONTINUOUS_UNBOUNDED mode and never stops until the Flink job is canceled or fails. To let the KafkaSource run in Boundedness.CONTINUOUS_UNBOUNDED mode but stop at some given offsets, one can call setUnbounded(OffsetsInitializer). For example, the following KafkaSource stops after it consumes up to the latest partition offsets at the point when the Flink job started.
KafkaSource<String> source = KafkaSource
.<String>builder()
.setBootstrapServers(MY_BOOTSTRAP_SERVERS)
.setTopics(Arrays.asList(TOPIC1, TOPIC2))
.setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
.setUnbounded(OffsetsInitializer.latest())
.build();
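For a true batch-style job, one could instead call setBounded(OffsetsInitializer), so that KafkaSource.getBoundedness() reports Boundedness.BOUNDED; a sketch under the same assumed constants:

```java
// Illustrative sketch: a bounded source that reads everything up to the
// latest offsets at the moment the job starts, then finishes.
KafkaSource<String> source = KafkaSource
        .<String>builder()
        .setBootstrapServers(MY_BOOTSTRAP_SERVERS)
        .setTopics(Arrays.asList(TOPIC1, TOPIC2))
        .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
        .setBounded(OffsetsInitializer.latest())
        .build();
```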
Check the Java docs of each individual method to learn more about the settings used to build a KafkaSource.
Modifier and Type | Field and Description
---|---
protected Properties | props
Modifier and Type | Method and Description
---|---
KafkaSource<OUT> | build() Build the KafkaSource.
KafkaSourceBuilder<OUT> | setBootstrapServers(String bootstrapServers) Sets the bootstrap servers for the KafkaConsumer of the KafkaSource.
KafkaSourceBuilder<OUT> | setBounded(OffsetsInitializer stoppingOffsetsInitializer) By default the KafkaSource is set to run in Boundedness.CONTINUOUS_UNBOUNDED manner and thus never stops until the Flink job fails or is canceled.
KafkaSourceBuilder<OUT> | setClientIdPrefix(String prefix) Sets the client id prefix of this KafkaSource.
KafkaSourceBuilder<OUT> | setDeserializer(KafkaRecordDeserializationSchema<OUT> recordDeserializer) Sets the deserializer of the ConsumerRecord for KafkaSource.
KafkaSourceBuilder<OUT> | setGroupId(String groupId) Sets the consumer group id of the KafkaSource.
KafkaSourceBuilder<OUT> | setPartitions(Set<org.apache.kafka.common.TopicPartition> partitions) Set a set of partitions to consume from.
KafkaSourceBuilder<OUT> | setProperties(Properties props) Set arbitrary properties for the KafkaSource and KafkaConsumer.
KafkaSourceBuilder<OUT> | setProperty(String key, String value) Set an arbitrary property for the KafkaSource and KafkaConsumer.
KafkaSourceBuilder<OUT> | setStartingOffsets(OffsetsInitializer startingOffsetsInitializer) Specify from which offsets the KafkaSource should start consuming by providing an OffsetsInitializer.
KafkaSourceBuilder<OUT> | setTopicPattern(Pattern topicPattern) Set a topic pattern to consume from, using the Java Pattern.
KafkaSourceBuilder<OUT> | setTopics(List<String> topics) Set a list of topics the KafkaSource should consume from.
KafkaSourceBuilder<OUT> | setTopics(String... topics) Set a list of topics the KafkaSource should consume from.
KafkaSourceBuilder<OUT> | setUnbounded(OffsetsInitializer stoppingOffsetsInitializer) By default the KafkaSource is set to run in Boundedness.CONTINUOUS_UNBOUNDED manner and thus never stops until the Flink job fails or is canceled.
KafkaSourceBuilder<OUT> | setValueOnlyDeserializer(DeserializationSchema<OUT> deserializationSchema) Sets the deserializer of the ConsumerRecord for KafkaSource.
protected Properties props
public KafkaSourceBuilder<OUT> setBootstrapServers(String bootstrapServers)
Sets the bootstrap servers for the KafkaConsumer of the KafkaSource.
Parameters:
bootstrapServers - the bootstrap servers of the Kafka cluster.

public KafkaSourceBuilder<OUT> setGroupId(String groupId)
Sets the consumer group id of the KafkaSource.
Parameters:
groupId - the group id of the KafkaSource.

public KafkaSourceBuilder<OUT> setTopics(List<String> topics)
Set a list of topics the KafkaSource should consume from. The topics in the list are expected to exist in the Kafka cluster; to allow topics to be created lazily, use setTopicPattern(Pattern) instead.
Parameters:
topics - the list of topics to consume from.
See Also:
KafkaConsumer.subscribe(Collection)
public KafkaSourceBuilder<OUT> setTopics(String... topics)
Set a list of topics the KafkaSource should consume from. The given topics are expected to exist in the Kafka cluster; to allow topics to be created lazily, use setTopicPattern(Pattern) instead.
Parameters:
topics - the list of topics to consume from.
See Also:
KafkaConsumer.subscribe(Collection)
public KafkaSourceBuilder<OUT> setTopicPattern(Pattern topicPattern)
Set a topic pattern to consume from, using the Java Pattern.
Parameters:
topicPattern - the pattern of the topic name to consume from.
See Also:
KafkaConsumer.subscribe(Pattern)
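Since the argument is a plain java.util.regex.Pattern, its matching behavior can be checked independently of Kafka. A small sketch with hypothetical topic names:

```java
import java.util.regex.Pattern;

public class TopicPatternDemo {
    public static void main(String[] args) {
        // Hypothetical pattern subscribing to every topic named "orders-<suffix>".
        Pattern topicPattern = Pattern.compile("orders-.*");

        System.out.println(topicPattern.matcher("orders-2024").matches());   // true
        System.out.println(topicPattern.matcher("payments-2024").matches()); // false

        // The same pattern would then be handed to the builder:
        // builder.setTopicPattern(topicPattern);
    }
}
```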
public KafkaSourceBuilder<OUT> setPartitions(Set<org.apache.kafka.common.TopicPartition> partitions)
Set a set of partitions to consume from.
Parameters:
partitions - the set of partitions to consume from.
See Also:
KafkaConsumer.assign(Collection)
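A sketch of pinning explicit partitions; the topic name and partition numbers are hypothetical, and the other constants are assumed from the class-level examples:

```java
// Illustrative sketch: subscribe only to partitions 0 and 1 of "my-topic".
Set<TopicPartition> partitions = new HashSet<>(Arrays.asList(
        new TopicPartition("my-topic", 0),
        new TopicPartition("my-topic", 1)));

KafkaSource<String> source = KafkaSource
        .<String>builder()
        .setBootstrapServers(MY_BOOTSTRAP_SERVERS)
        .setPartitions(partitions)
        .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
        .build();
```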
public KafkaSourceBuilder<OUT> setStartingOffsets(OffsetsInitializer startingOffsetsInitializer)
Specify from which offsets the KafkaSource should start consuming by providing an OffsetsInitializer.
The following OffsetsInitializers are commonly used and provided out of the box. Users can also implement their own OffsetsInitializer for custom behaviors.
- OffsetsInitializer.earliest() - starting from the earliest offsets. This is also the default OffsetsInitializer of the KafkaSource for starting offsets.
- OffsetsInitializer.latest() - starting from the latest offsets.
- OffsetsInitializer.committedOffsets() - starting from the committed offsets of the consumer group.
- OffsetsInitializer.committedOffsets(org.apache.kafka.clients.consumer.OffsetResetStrategy) - starting from the committed offsets of the consumer group. If there are no committed offsets, starting from the offsets specified by the OffsetResetStrategy.
- OffsetsInitializer.offsets(Map) - starting from the specified offsets for each partition.
- OffsetsInitializer.timestamp(long) - starting from the specified timestamp for each partition. Note that the guarantee here is that all the records in Kafka whose ConsumerRecord.timestamp() is greater than the given starting timestamp will be consumed. However, it is possible that some consumer records whose timestamp is smaller than the given starting timestamp are also consumed.
Parameters:
startingOffsetsInitializer - the OffsetsInitializer setting the starting offsets for the Source.

public KafkaSourceBuilder<OUT> setUnbounded(OffsetsInitializer stoppingOffsetsInitializer)
By default the KafkaSource is set to run in Boundedness.CONTINUOUS_UNBOUNDED manner and thus never stops until the Flink job fails or is canceled. To let the KafkaSource run as a streaming source but still stop at some point, one can set an OffsetsInitializer to specify the stopping offsets for each partition. When all the partitions have reached their stopping offsets, the KafkaSource will then exit.
This method is different from setBounded(OffsetsInitializer) in that after setting the stopping offsets with this method, KafkaSource.getBoundedness() will still return Boundedness.CONTINUOUS_UNBOUNDED even though it will stop at the stopping offsets specified by the stopping OffsetsInitializer.
The following OffsetsInitializers are commonly used and provided out of the box. Users can also implement their own OffsetsInitializer for custom behaviors.
- OffsetsInitializer.latest() - stops at the latest offsets of the partitions when the KafkaSource starts to run.
- OffsetsInitializer.committedOffsets() - stops at the committed offsets of the consumer group.
- OffsetsInitializer.offsets(Map) - stops at the specified offsets for each partition.
- OffsetsInitializer.timestamp(long) - stops at the specified timestamp for each partition. The guarantee of setting the stopping timestamp is that no Kafka records whose ConsumerRecord.timestamp() is greater than the given stopping timestamp will be consumed. However, it is possible that some records whose timestamp is smaller than the specified stopping timestamp are not consumed.
Parameters:
stoppingOffsetsInitializer - the OffsetsInitializer to specify the stopping offsets.
See Also:
setBounded(OffsetsInitializer)
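A sketch of a streaming job that nevertheless stops once records pass a fixed timestamp; the epoch-millis value is hypothetical, and the other constants are assumed from the class-level examples:

```java
// Illustrative sketch: getBoundedness() still reports CONTINUOUS_UNBOUNDED,
// but each partition stops once it reaches records at or beyond the timestamp.
KafkaSource<String> source = KafkaSource
        .<String>builder()
        .setBootstrapServers(MY_BOOTSTRAP_SERVERS)
        .setTopics(Arrays.asList(TOPIC1, TOPIC2))
        .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
        .setUnbounded(OffsetsInitializer.timestamp(1_700_000_000_000L))
        .build();
```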
public KafkaSourceBuilder<OUT> setBounded(OffsetsInitializer stoppingOffsetsInitializer)
By default the KafkaSource is set to run in Boundedness.CONTINUOUS_UNBOUNDED manner and thus never stops until the Flink job fails or is canceled. To let the KafkaSource run in Boundedness.BOUNDED manner and stop at some point, one can set an OffsetsInitializer to specify the stopping offsets for each partition. When all the partitions have reached their stopping offsets, the KafkaSource will then exit.
This method is different from setUnbounded(OffsetsInitializer) in that after setting the stopping offsets with this method, KafkaSource.getBoundedness() will return Boundedness.BOUNDED instead of Boundedness.CONTINUOUS_UNBOUNDED.
The following OffsetsInitializers are commonly used and provided out of the box. Users can also implement their own OffsetsInitializer for custom behaviors.
- OffsetsInitializer.latest() - stops at the latest offsets of the partitions when the KafkaSource starts to run.
- OffsetsInitializer.committedOffsets() - stops at the committed offsets of the consumer group.
- OffsetsInitializer.offsets(Map) - stops at the specified offsets for each partition.
- OffsetsInitializer.timestamp(long) - stops at the specified timestamp for each partition. The guarantee of setting the stopping timestamp is that no Kafka records whose ConsumerRecord.timestamp() is greater than the given stopping timestamp will be consumed. However, it is possible that some records whose timestamp is smaller than the specified stopping timestamp are not consumed.
Parameters:
stoppingOffsetsInitializer - the OffsetsInitializer to specify the stopping offsets.
See Also:
setUnbounded(OffsetsInitializer)
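A sketch of stopping at explicit per-partition offsets via OffsetsInitializer.offsets(Map); the topic name, offsets, and the Map value type are hypothetical assumptions for illustration:

```java
// Illustrative sketch: bounded read up to offset 1000 on each of two
// partitions of a hypothetical "my-topic".
Map<TopicPartition, Long> stoppingOffsets = new HashMap<>();
stoppingOffsets.put(new TopicPartition("my-topic", 0), 1000L);
stoppingOffsets.put(new TopicPartition("my-topic", 1), 1000L);

KafkaSource<String> source = KafkaSource
        .<String>builder()
        .setBootstrapServers(MY_BOOTSTRAP_SERVERS)
        .setTopics(Collections.singletonList("my-topic"))
        .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
        .setBounded(OffsetsInitializer.offsets(stoppingOffsets))
        .build();
```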
public KafkaSourceBuilder<OUT> setDeserializer(KafkaRecordDeserializationSchema<OUT> recordDeserializer)
Sets the deserializer of the ConsumerRecord for KafkaSource.
Parameters:
recordDeserializer - the deserializer for Kafka ConsumerRecord.

public KafkaSourceBuilder<OUT> setValueOnlyDeserializer(DeserializationSchema<OUT> deserializationSchema)
Sets the deserializer of the ConsumerRecord for KafkaSource. The given DeserializationSchema will be used to deserialize the value of the ConsumerRecord. The other information (e.g. key) in a ConsumerRecord will be ignored.
Parameters:
deserializationSchema - the DeserializationSchema to use for deserialization.

public KafkaSourceBuilder<OUT> setClientIdPrefix(String prefix)
Sets the client id prefix of this KafkaSource.
Parameters:
prefix - the client id prefix to use for this KafkaSource.

public KafkaSourceBuilder<OUT> setProperty(String key, String value)
Set an arbitrary property for the KafkaSource and KafkaConsumer. The valid keys can be found in ConsumerConfig and KafkaSourceOptions.
Note that the following keys will be overridden by the builder when the KafkaSource is created.
- key.deserializer is always set to ByteArrayDeserializer.
- value.deserializer is always set to ByteArrayDeserializer.
- auto.offset.reset.strategy is overridden by OffsetsInitializer.getAutoOffsetResetStrategy() for the starting offsets, which is by default OffsetsInitializer.earliest().
- partition.discovery.interval.ms is overridden to -1 when setBounded(OffsetsInitializer) has been invoked.
Parameters:
key - the key of the property.
value - the value of the property.

public KafkaSourceBuilder<OUT> setProperties(Properties props)
Set arbitrary properties for the KafkaSource and KafkaConsumer. The valid keys can be found in ConsumerConfig and KafkaSourceOptions.
Note that the following keys will be overridden by the builder when the KafkaSource is created.
- key.deserializer is always set to ByteArrayDeserializer.
- value.deserializer is always set to ByteArrayDeserializer.
- auto.offset.reset.strategy is overridden by OffsetsInitializer.getAutoOffsetResetStrategy() for the starting offsets, which is by default OffsetsInitializer.earliest().
- partition.discovery.interval.ms is overridden to -1 when setBounded(OffsetsInitializer) has been invoked.
- client.id is overridden to "client.id.prefix-RANDOM_LONG", or "group.id-RANDOM_LONG" if the client id prefix is not set.
Parameters:
props - the properties to set for the KafkaSource.

public KafkaSource<OUT> build()
Build the KafkaSource.

Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.