Class AbstractFileSource<T,SplitT extends FileSourceSplit>
- java.lang.Object
-
- org.apache.flink.connector.file.src.AbstractFileSource<T,SplitT>
-
- Type Parameters:
T
- The type of the events/records produced by this source.SplitT
- The subclass type of the FileSourceSplit used by the source implementation.
- All Implemented Interfaces:
Serializable
,Source<T,SplitT,PendingSplitsCheckpoint<SplitT>>
,SourceReaderFactory<T,SplitT>
,ResultTypeQueryable<T>
- Direct Known Subclasses:
FileSource
@PublicEvolving public abstract class AbstractFileSource<T,SplitT extends FileSourceSplit> extends Object implements Source<T,SplitT,PendingSplitsCheckpoint<SplitT>>, ResultTypeQueryable<T>
The base class for File Sources. The main implementation to use is theFileSource
, which also has the majority of the documentation.To read new formats, one commonly does NOT need to extend this class, but should implement a new Format Reader (like
StreamFormat
,BulkFormat
and use it with theFileSource
.The only reason to extend this class is when a source needs a different type of split, meaning an extension of the
FileSourceSplit
to carry additional information.- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description protected static class
AbstractFileSource.AbstractFileSourceBuilder<T,SplitT extends FileSourceSplit,SELF extends AbstractFileSource.AbstractFileSourceBuilder<T,SplitT,SELF>>
The generic base builder.
-
Constructor Summary
Constructors Modifier Constructor Description protected
AbstractFileSource(Path[] inputPaths, FileEnumerator.Provider fileEnumerator, FileSplitAssigner.Provider splitAssigner, BulkFormat<T,SplitT> readerFormat, ContinuousEnumerationSettings continuousEnumerationSettings)
-
Method Summary
-
-
-
Constructor Detail
-
AbstractFileSource
protected AbstractFileSource(Path[] inputPaths, FileEnumerator.Provider fileEnumerator, FileSplitAssigner.Provider splitAssigner, BulkFormat<T,SplitT> readerFormat, @Nullable ContinuousEnumerationSettings continuousEnumerationSettings)
-
-
Method Detail
-
getEnumeratorFactory
protected FileEnumerator.Provider getEnumeratorFactory()
-
getAssignerFactory
public FileSplitAssigner.Provider getAssignerFactory()
-
getContinuousEnumerationSettings
@Nullable public ContinuousEnumerationSettings getContinuousEnumerationSettings()
-
getBoundedness
public Boundedness getBoundedness()
Description copied from interface:Source
Get the boundedness of this source.- Specified by:
getBoundedness
in interfaceSource<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Returns:
- the boundedness of this source.
-
createReader
public SourceReader<T,SplitT> createReader(SourceReaderContext readerContext)
Description copied from interface:SourceReaderFactory
Creates a new reader to read data from the splits it gets assigned. The reader starts fresh and does not have any state to resume.- Specified by:
createReader
in interfaceSourceReaderFactory<T,SplitT extends FileSourceSplit>
- Parameters:
readerContext
- Thecontext
for the source reader.- Returns:
- A new SourceReader.
-
createEnumerator
public SplitEnumerator<SplitT,PendingSplitsCheckpoint<SplitT>> createEnumerator(SplitEnumeratorContext<SplitT> enumContext)
Description copied from interface:Source
Creates a new SplitEnumerator for this source, starting a new input.- Specified by:
createEnumerator
in interfaceSource<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
enumContext
- Thecontext
for the split enumerator.- Returns:
- A new SplitEnumerator.
-
restoreEnumerator
public SplitEnumerator<SplitT,PendingSplitsCheckpoint<SplitT>> restoreEnumerator(SplitEnumeratorContext<SplitT> enumContext, PendingSplitsCheckpoint<SplitT> checkpoint)
Description copied from interface:Source
Restores an enumerator from a checkpoint.- Specified by:
restoreEnumerator
in interfaceSource<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Parameters:
enumContext
- Thecontext
for the restored split enumerator.checkpoint
- The checkpoint to restore the SplitEnumerator from.- Returns:
- A SplitEnumerator restored from the given checkpoint.
-
getSplitSerializer
public abstract SimpleVersionedSerializer<SplitT> getSplitSerializer()
Description copied from interface:Source
Creates a serializer for the source splits. Splits are serialized when sending them from enumerator to reader, and when checkpointing the reader's current state.- Specified by:
getSplitSerializer
in interfaceSource<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Returns:
- The serializer for the split type.
-
getEnumeratorCheckpointSerializer
public SimpleVersionedSerializer<PendingSplitsCheckpoint<SplitT>> getEnumeratorCheckpointSerializer()
Description copied from interface:Source
Creates the serializer for theSplitEnumerator
checkpoint. The serializer is used for the result of theSplitEnumerator.snapshotState(long)
method.- Specified by:
getEnumeratorCheckpointSerializer
in interfaceSource<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
- Returns:
- The serializer for the SplitEnumerator checkpoint.
-
getProducedType
public TypeInformation<T> getProducedType()
Description copied from interface:ResultTypeQueryable
Gets the data type (as aTypeInformation
) produced by this function or input format.- Specified by:
getProducedType
in interfaceResultTypeQueryable<T>
- Returns:
- The data type produced by this function or input format.
-
-