Class NonSplittingRecursiveEnumerator
- java.lang.Object
-
- org.apache.flink.connector.file.src.enumerate.NonSplittingRecursiveEnumerator
-
- All Implemented Interfaces:
FileEnumerator
- Direct Known Subclasses:
BlockSplittingRecursiveEnumerator, NonSplittingRecursiveAllDirEnumerator
@PublicEvolving public class NonSplittingRecursiveEnumerator extends Object implements FileEnumerator
This FileEnumerator enumerates all files under the given paths recursively. Each file becomes one split; this enumerator does not split files into smaller "block" units.
The default instantiation of this enumerator filters out files whose names start with the common hidden-file prefixes '.' and '_'. A custom file filter can be specified instead.
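The behaviour described above can be illustrated with a minimal standalone sketch. It does not use Flink's classes: `RecursiveEnumerationSketch`, `DEFAULT_FILTER`, `enumerate`, and `addForPath` are hypothetical names, and `java.nio.file` stands in for Flink's `FileSystem` and `FileStatus`. It assumes the filter is tested on every path before descending, so a filtered directory hides its whole subtree.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

/** Hypothetical stand-in (not a Flink class): recursive, non-splitting enumeration on the local file system. */
class RecursiveEnumerationSketch {

    /** Assumed default: reject names starting with the hidden-file prefixes '.' and '_'. */
    public static final Predicate<Path> DEFAULT_FILTER = path -> {
        Path name = path.getFileName();
        if (name == null) {
            return true; // a file system root has no name to test
        }
        String s = name.toString();
        return !s.startsWith(".") && !s.startsWith("_");
    };

    private final Predicate<Path> fileFilter;

    public RecursiveEnumerationSketch() {
        this(DEFAULT_FILTER);
    }

    public RecursiveEnumerationSketch(Predicate<Path> fileFilter) {
        this.fileFilter = fileFilter;
    }

    /** Returns one entry per accepted file; files are never cut into smaller units. */
    public List<Path> enumerate(Path... roots) throws IOException {
        List<Path> target = new ArrayList<>();
        for (Path root : roots) {
            addForPath(root, target);
        }
        return target;
    }

    private void addForPath(Path path, List<Path> target) throws IOException {
        if (!fileFilter.test(path)) {
            return; // filtered out; for a directory this skips everything beneath it
        }
        if (!Files.isDirectory(path)) {
            target.add(path); // a regular file becomes exactly one entry
            return;
        }
        try (Stream<Path> children = Files.list(path)) {
            for (Path child : (Iterable<Path>) children::iterator) {
                addForPath(child, target);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("enum-sketch");
        Files.createDirectories(root.resolve("sub"));
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(root.resolve("sub").resolve("b.txt"));
        Files.createFile(root.resolve(".hidden"));
        Files.createFile(root.resolve("_SUCCESS"));

        // Default filter: only a.txt and sub/b.txt are accepted.
        System.out.println(new RecursiveEnumerationSketch().enumerate(root).size());
        // Custom filter that accepts everything: all four files are returned.
        System.out.println(new RecursiveEnumerationSketch(p -> true).enumerate(root).size());
    }
}
```

With the real class, the equivalent choice is between the no-argument constructor and the constructor that takes a `Predicate<Path>`, both listed in the Constructor Summary.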
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.flink.connector.file.src.enumerate.FileEnumerator
FileEnumerator.Provider
-
-
Field Summary
Fields
protected Predicate<Path> fileFilter
The filter predicate to filter out unwanted files.
-
Constructor Summary
Constructors
NonSplittingRecursiveEnumerator()
Creates a NonSplittingRecursiveEnumerator that enumerates all files except hidden files.
NonSplittingRecursiveEnumerator(Predicate<Path> fileFilter)
Creates a NonSplittingRecursiveEnumerator that uses the given predicate as a filter for file paths.
-
Method Summary
Methods
protected void addSplitsForPath(FileStatus fileStatus, FileSystem fs, ArrayList<FileSourceSplit> target)
protected void convertToSourceSplits(FileStatus file, FileSystem fs, List<FileSourceSplit> target)
Collection<FileSourceSplit> enumerateSplits(Path[] paths, int minDesiredSplits)
Generates all file splits for the relevant files under the given paths.
protected String getNextId()
-
Method Detail
-
enumerateSplits
public Collection<FileSourceSplit> enumerateSplits(Path[] paths, int minDesiredSplits) throws IOException
Description copied from interface: FileEnumerator
Generates all file splits for the relevant files under the given paths. The minDesiredSplits is an optional hint indicating how many splits would be necessary to exploit parallelism properly.
- Specified by:
enumerateSplits in interface FileEnumerator
- Throws:
IOException
-
addSplitsForPath
protected void addSplitsForPath(FileStatus fileStatus, FileSystem fs, ArrayList<FileSourceSplit> target) throws IOException
- Throws:
IOException
-
convertToSourceSplits
protected void convertToSourceSplits(FileStatus file, FileSystem fs, List<FileSourceSplit> target) throws IOException
- Throws:
IOException
-
getNextId
protected final String getNextId()