T
- The type of the events/records produced by this source.SplitT
- The subclass type of the FileSourceSplit used by the source implementation.@PublicEvolving public abstract class AbstractFileSource<T,SplitT extends FileSourceSplit> extends Object implements Source<T,SplitT,PendingSplitsCheckpoint<SplitT>>, ResultTypeQueryable<T>
FileSource
, which
also has the majority of the documentation.
To read new formats, one commonly does NOT need to extend this class, but should implement a
new Format Reader (like StreamFormat
, BulkFormat
and use it with the FileSource
.
The only reason to extend this class is when a source needs a different type of split,
meaning an extension of the FileSourceSplit
to carry additional information.
Modifier and Type | Class and Description |
---|---|
protected static class |
AbstractFileSource.AbstractFileSourceBuilder<T,SplitT extends FileSourceSplit,SELF extends AbstractFileSource.AbstractFileSourceBuilder<T,SplitT,SELF>>
The generic base builder.
|
Modifier | Constructor and Description |
---|---|
protected |
AbstractFileSource(Path[] inputPaths,
FileEnumerator.Provider fileEnumerator,
FileSplitAssigner.Provider splitAssigner,
BulkFormat<T,SplitT> readerFormat,
ContinuousEnumerationSettings continuousEnumerationSettings) |
protected AbstractFileSource(Path[] inputPaths, FileEnumerator.Provider fileEnumerator, FileSplitAssigner.Provider splitAssigner, BulkFormat<T,SplitT> readerFormat, @Nullable ContinuousEnumerationSettings continuousEnumerationSettings)
protected FileEnumerator.Provider getEnumeratorFactory()
public FileSplitAssigner.Provider getAssignerFactory()
@Nullable public ContinuousEnumerationSettings getContinuousEnumerationSettings()
public Boundedness getBoundedness()
Source
getBoundedness
in interface Source<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
public SourceReader<T,SplitT> createReader(SourceReaderContext readerContext)
SourceReaderFactory
createReader
in interface SourceReaderFactory<T,SplitT extends FileSourceSplit>
readerContext
- The context
for the source reader.public SplitEnumerator<SplitT,PendingSplitsCheckpoint<SplitT>> createEnumerator(SplitEnumeratorContext<SplitT> enumContext)
Source
createEnumerator
in interface Source<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
enumContext
- The context
for the split enumerator.public SplitEnumerator<SplitT,PendingSplitsCheckpoint<SplitT>> restoreEnumerator(SplitEnumeratorContext<SplitT> enumContext, PendingSplitsCheckpoint<SplitT> checkpoint)
Source
restoreEnumerator
in interface Source<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
enumContext
- The context
for the restored split
enumerator.checkpoint
- The checkpoint to restore the SplitEnumerator from.public abstract SimpleVersionedSerializer<SplitT> getSplitSerializer()
Source
getSplitSerializer
in interface Source<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
public SimpleVersionedSerializer<PendingSplitsCheckpoint<SplitT>> getEnumeratorCheckpointSerializer()
Source
SplitEnumerator
checkpoint. The serializer is used for
the result of the SplitEnumerator.snapshotState(long)
method.getEnumeratorCheckpointSerializer
in interface Source<T,SplitT extends FileSourceSplit,PendingSplitsCheckpoint<SplitT extends FileSourceSplit>>
public TypeInformation<T> getProducedType()
ResultTypeQueryable
TypeInformation
) produced by this function or input format.getProducedType
in interface ResultTypeQueryable<T>
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.