E
- The type of the records (the raw type that typically contains checkpointing
information).T
- The final type of the records emitted by the source.SplitT
- The type of the splits processed by the source.SplitStateT
- The type of the mutable state per split.public abstract class SingleThreadMultiplexSourceReaderBase<E,T,SplitT extends SourceSplit,SplitStateT> extends SourceReaderBase<E,T,SplitT,SplitStateT>
SourceReader
s that read splits with one thread using one SplitReader
.
The splits can be read either one after the other (like in a file source) or concurrently by
changing the subscription in the split reader (like in the Kafka Source).
To implement a source reader based on this class, implementors need to supply the following:
SplitReader
, which connects to the source and reads/polls data. The split reader
gets notified whenever there is a new split. The split reader would read files, contain a
Kafka or other source client, etc.
RecordEmitter
that takes a record from the Split Reader and updates the
checkpointing state and converts it into the final form. For example for Kafka, the Record
Emitter takes a ConsumerRecord
, puts the offset information into state, transforms
the records with the deserializers into the final type, and emits the record.
SplitT
) and the mutable split state representation (SplitStateT
).
SourceReaderBase.start()
) or when a
split is finished (#onSplitFinished(Collection)
).
config, context, options, recordEmitter, splitFetcherManager
addSplits, close, getNumberOfCurrentlyAssignedSplits, handleSourceEvents, initializedState, isAvailable, notifyNoMoreSplits, onSplitFinished, pollNext, snapshotState, start, toSplitType
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
notifyCheckpointComplete
notifyCheckpointAborted
public SingleThreadMultiplexSourceReaderBase(java.util.function.Supplier<SplitReader<E,SplitT>> splitReaderSupplier, RecordEmitter<E,T,SplitStateT> recordEmitter, Configuration config, SourceReaderContext context)
The reader will use a handover queue sized as configured via SourceReaderOptions.ELEMENT_QUEUE_CAPACITY
.
public SingleThreadMultiplexSourceReaderBase(FutureCompletingBlockingQueue<RecordsWithSplitIds<E>> elementsQueue, java.util.function.Supplier<SplitReader<E,SplitT>> splitReaderSupplier, RecordEmitter<E,T,SplitStateT> recordEmitter, Configuration config, SourceReaderContext context)
SingleThreadMultiplexSourceReaderBase(Supplier,
RecordEmitter, Configuration, SourceReaderContext)
, but accepts a specific FutureCompletingBlockingQueue
.public SingleThreadMultiplexSourceReaderBase(FutureCompletingBlockingQueue<RecordsWithSplitIds<E>> elementsQueue, SingleThreadFetcherManager<E,SplitT> splitFetcherManager, RecordEmitter<E,T,SplitStateT> recordEmitter, Configuration config, SourceReaderContext context)
SingleThreadMultiplexSourceReaderBase(Supplier,
RecordEmitter, Configuration, SourceReaderContext)
, but accepts a specific FutureCompletingBlockingQueue
and SingleThreadFetcherManager
.Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.