Class DynamicFileSplitEnumerator<SplitT extends FileSourceSplit>
- java.lang.Object
- org.apache.flink.connector.file.src.impl.DynamicFileSplitEnumerator<SplitT>
- All Implemented Interfaces:
AutoCloseable, CheckpointListener, SplitEnumerator<SplitT,PendingSplitsCheckpoint<SplitT>>, SupportsHandleExecutionAttemptSourceEvent
@Internal public class DynamicFileSplitEnumerator<SplitT extends FileSourceSplit> extends Object implements SplitEnumerator<SplitT,PendingSplitsCheckpoint<SplitT>>, SupportsHandleExecutionAttemptSourceEvent
A SplitEnumerator implementation that supports dynamic filtering.
This enumerator handles DynamicFilteringEvent and selects the desired input splits with the support of the DynamicFileEnumerator.
If the enumerator receives the first split request before any dynamic filtering data has arrived, it enumerates all splits. If a DynamicFilteringEvent is received while that full enumeration is in progress, the remaining splits are filtered accordingly.
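The behavior described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Flink implementation: the class DynamicFilteringSketch, its methods nextSplit and onFilter, and the use of plain strings as splits are all hypothetical stand-ins chosen so that the example runs without Flink on the classpath.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical stand-in for the enumerator; splits are modeled as plain strings. */
class DynamicFilteringSketch {
    private final List<String> allSplits; // every split the source could read
    private Deque<String> pending;        // created lazily, on first request or first filter

    DynamicFilteringSketch(List<String> allSplits) {
        this.allSplits = new ArrayList<>(allSplits);
    }

    /** Plays the role of a split request: with no filter data yet, enumerate everything. */
    Optional<String> nextSplit() {
        if (pending == null) {
            pending = new ArrayDeque<>(allSplits);
        }
        return Optional.ofNullable(pending.poll());
    }

    /** Plays the role of receiving dynamic filtering data. */
    void onFilter(Predicate<String> keep) {
        if (pending == null) {
            // Filter arrived before any request: enumerate only the matching splits.
            pending = new ArrayDeque<>();
            for (String split : allSplits) {
                if (keep.test(split)) {
                    pending.add(split);
                }
            }
        } else {
            // Full enumeration already started: filter only the remaining splits.
            pending.removeIf(keep.negate());
        }
    }

    public static void main(String[] args) {
        DynamicFilteringSketch sketch =
                new DynamicFilteringSketch(List.of("p=1/a", "p=2/b", "p=1/c"));
        System.out.println(sketch.nextSplit()); // handed out before any filter data
        sketch.onFilter(s -> s.startsWith("p=1"));
        System.out.println(sketch.nextSplit()); // only matching splits remain
    }
}
```

Note that a split already handed out before the filter arrives is not recalled in this sketch; only the remaining splits are affected, which mirrors the description above.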
-
-
Constructor Summary
Constructors
DynamicFileSplitEnumerator(SplitEnumeratorContext<SplitT> context, DynamicFileEnumerator.Provider fileEnumeratorFactory, FileSplitAssigner.Provider splitAssignerFactory)
-
Method Summary
Modifier and Type, Method: Description
- void addReader(int subtaskId): Add a new source reader with the given subtask ID.
- void addSplitsBack(List<SplitT> splits, int subtaskId): Add splits back to the split enumerator.
- void close(): Called to close the enumerator, in case it holds on to any resources, like threads or network connections.
- void handleSourceEvent(int subtaskId, int attemptNumber, SourceEvent sourceEvent): Handles a custom source event from the source reader.
- void handleSourceEvent(int subtaskId, SourceEvent sourceEvent): Handles a custom source event from the source reader.
- void handleSplitRequest(int subtask, String hostname): Handles the request for a split.
- PendingSplitsCheckpoint<SplitT> snapshotState(long checkpointId): Creates a snapshot of the state of this split enumerator, to be stored in a checkpoint.
- void start(): Start the split enumerator.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted
-
Methods inherited from interface org.apache.flink.api.connector.source.SplitEnumerator
notifyCheckpointComplete
-
Constructor Detail
-
DynamicFileSplitEnumerator
public DynamicFileSplitEnumerator(SplitEnumeratorContext<SplitT> context, DynamicFileEnumerator.Provider fileEnumeratorFactory, FileSplitAssigner.Provider splitAssignerFactory)
-
-
Method Detail
-
start
public void start()
Description copied from interface: SplitEnumerator
Start the split enumerator. The default behavior does nothing.
- Specified by:
start in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
-
close
public void close() throws IOException
Description copied from interface: SplitEnumerator
Called to close the enumerator, in case it holds on to any resources, like threads or network connections.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Throws:
IOException
-
addReader
public void addReader(int subtaskId)
Description copied from interface: SplitEnumerator
Add a new source reader with the given subtask ID.
- Specified by:
addReader in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
subtaskId - the subtask ID of the new source reader.
-
handleSplitRequest
public void handleSplitRequest(int subtask, @Nullable String hostname)
Description copied from interface: SplitEnumerator
Handles the request for a split. This method is called when the reader with the given subtask id calls the SourceReaderContext.sendSplitRequest() method.
- Specified by:
handleSplitRequest in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
subtask - the subtask id of the source reader who sent the source event.
hostname - Optional, the hostname where the requesting task is running. This can be used to make split assignments locality-aware.
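As an illustration of how the optional hostname could make split assignment locality-aware, here is a minimal sketch. It does not reproduce this class or any Flink assigner: LocalityAwareSketch, its nested Split type, and the rule "prefer a split stored on the requesting host, otherwise take the next pending one" are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

/** Hypothetical assigner that prefers splits stored on the requesting host. */
class LocalityAwareSketch {

    /** Hypothetical split: a file path plus the host that stores it. */
    static final class Split {
        final String path;
        final String host;

        Split(String path, String host) {
            this.path = path;
            this.host = host;
        }
    }

    private final List<Split> pending;

    LocalityAwareSketch(List<Split> splits) {
        this.pending = new ArrayList<>(splits);
    }

    /** The hostname may be null, matching the optional parameter described above. */
    Optional<Split> handleSplitRequest(String hostname) {
        if (hostname != null) {
            Iterator<Split> it = pending.iterator();
            while (it.hasNext()) {
                Split candidate = it.next();
                if (hostname.equals(candidate.host)) {
                    it.remove(); // local split found: hand it out first
                    return Optional.of(candidate);
                }
            }
        }
        // No hostname, or no local split: fall back to the next pending split.
        return pending.isEmpty() ? Optional.empty() : Optional.of(pending.remove(0));
    }

    public static void main(String[] args) {
        LocalityAwareSketch assigner = new LocalityAwareSketch(List.of(
                new Split("f1", "hostA"), new Split("f2", "hostB")));
        System.out.println(assigner.handleSplitRequest("hostB").get().path);
        System.out.println(assigner.handleSplitRequest(null).get().path);
    }
}
```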
-
handleSourceEvent
public void handleSourceEvent(int subtaskId, SourceEvent sourceEvent)
Description copied from interface: SplitEnumerator
Handles a custom source event from the source reader.
This method has a default implementation that does nothing, because it is only required to be implemented by some sources, which have a custom event protocol between reader and enumerator. The common events for reader registration and split requests are not dispatched to this method, but rather invoke the SplitEnumerator.addReader(int) and SplitEnumerator.handleSplitRequest(int, String) methods.
- Specified by:
handleSourceEvent in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
subtaskId - the subtask id of the source reader who sent the source event.
sourceEvent - the source event from the source reader.
-
addSplitsBack
public void addSplitsBack(List<SplitT> splits, int subtaskId)
Description copied from interface: SplitEnumerator
Add splits back to the split enumerator. This will only happen when a SourceReader fails and there are splits assigned to it after the last successful checkpoint.
- Specified by:
addSplitsBack in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
splits - The splits to add back to the enumerator for reassignment.
subtaskId - The id of the subtask to which the returned splits belong.
-
snapshotState
public PendingSplitsCheckpoint<SplitT> snapshotState(long checkpointId)
Description copied from interface: SplitEnumerator
Creates a snapshot of the state of this split enumerator, to be stored in a checkpoint.
The snapshot should contain the latest state of the enumerator: It should assume that all operations that happened before the snapshot have successfully completed. For example, all splits assigned to readers via SplitEnumeratorContext.assignSplit(SourceSplit, int) and SplitEnumeratorContext.assignSplits(SplitsAssignment) don't need to be included in the snapshot anymore.
This method takes the ID of the checkpoint for which the state is snapshotted. Most implementations should be able to ignore this parameter, because for the contents of the snapshot, it doesn't matter for which checkpoint it gets created. This parameter can be interesting for source connectors with external systems where those systems are themselves aware of checkpoints; for example in cases where the enumerator notifies that system about a specific checkpoint being triggered.
- Specified by:
snapshotState in interface SplitEnumerator<SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
checkpointId - The ID of the checkpoint for which the snapshot is created.
- Returns:
an object containing the state of the split enumerator.
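The snapshot contract above, together with addSplitsBack, can be illustrated with a small sketch. It is a simplified model under stated assumptions, not the actual enumerator: PendingSplitsSketch and assignNext are hypothetical names, splits are plain strings, and the snapshot is a plain list rather than a PendingSplitsCheckpoint.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of an enumerator that tracks only the splits not yet assigned. */
class PendingSplitsSketch {
    private final List<String> pending = new ArrayList<>();

    PendingSplitsSketch(List<String> initialSplits) {
        pending.addAll(initialSplits);
    }

    /** Assigning a split removes it from the enumerator's state; returns null when empty. */
    String assignNext() {
        return pending.isEmpty() ? null : pending.remove(0);
    }

    /** The checkpoint ID is ignored here; the snapshot holds only unassigned splits. */
    List<String> snapshotState(long checkpointId) {
        return new ArrayList<>(pending);
    }

    /** Splits of a failed reader come back and become pending again. */
    void addSplitsBack(List<String> splits) {
        pending.addAll(splits);
    }

    public static void main(String[] args) {
        PendingSplitsSketch sketch = new PendingSplitsSketch(List.of("s1", "s2", "s3"));
        String assigned = sketch.assignNext();
        System.out.println(sketch.snapshotState(1L)); // assigned split is no longer included
        sketch.addSplitsBack(List.of(assigned));
        System.out.println(sketch.snapshotState(2L)); // returned split is pending again
    }
}
```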
-
handleSourceEvent
public void handleSourceEvent(int subtaskId, int attemptNumber, SourceEvent sourceEvent)
Description copied from interface: SupportsHandleExecutionAttemptSourceEvent
Handles a custom source event from the source reader. It is similar to SplitEnumerator.handleSourceEvent(int, SourceEvent) but is aware of the subtask execution attempt that sent this event.
- Specified by:
handleSourceEvent in interface SupportsHandleExecutionAttemptSourceEvent
- Parameters:
subtaskId - the subtask id of the source reader who sent the source event.
attemptNumber - the attempt number of the source reader who sent the source event.
sourceEvent - the source event from the source reader.
-