@PublicEvolving public interface FileEnumerator
The FileEnumerator's task is to discover all files to be read and to split them into a set of FileSourceSplit.
This may include path traversal, file filtering (by name or other patterns), and deciding whether and how to split files into multiple splits.
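The traversal and filtering steps described above can be sketched in plain Java. This is a minimal illustration only: it uses java.nio.file instead of Flink's own Path and FileSystem classes, and the class name FileDiscoverySketch, the method discover, and the VISIBLE filter (which skips names starting with '.' or '_') are hypothetical names chosen for this example, not part of the FileEnumerator API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative stand-in for the discovery step; not a Flink class. */
final class FileDiscoverySketch {

    /** Assumed filter policy: ignore names that start with '.' or '_'. */
    static final Predicate<Path> VISIBLE = p -> {
        String name = p.getFileName().toString();
        return !name.isEmpty() && name.charAt(0) != '.' && name.charAt(0) != '_';
    };

    /** Recursively collects all regular files under the given roots that pass the filter. */
    static List<Path> discover(Path[] roots, Predicate<Path> filter) throws IOException {
        List<Path> result = new ArrayList<>();
        for (Path root : roots) {
            collect(root, filter, result); // roots themselves are never filtered
        }
        return result;
    }

    private static void collect(Path path, Predicate<Path> filter, List<Path> out)
            throws IOException {
        if (Files.isDirectory(path)) {
            List<Path> children;
            try (Stream<Path> listing = Files.list(path)) {
                // Sort for a deterministic enumeration order.
                children = listing.sorted().collect(Collectors.toList());
            }
            for (Path child : children) {
                // A rejected directory is skipped together with everything below it.
                if (filter.test(child)) {
                    collect(child, filter, out);
                }
            }
        } else if (Files.isRegularFile(path)) {
            out.add(path);
        }
    }
}
```

Calling FileDiscoverySketch.discover(roots, FileDiscoverySketch.VISIBLE) returns the files that a later step could turn into splits. An actual enumerator would produce FileSourceSplit instances instead of bare paths.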
Modifier and Type | Interface and Description |
---|---|
static interface | FileEnumerator.Provider: Factory for the FileEnumerator, to allow the FileEnumerator to be eagerly initialized and to not be serializable. |
Modifier and Type | Method and Description |
---|---|
Collection<FileSourceSplit> | enumerateSplits(Path[] paths, int minDesiredSplits): Generates all file splits for the relevant files under the given paths. |
Collection<FileSourceSplit> enumerateSplits(Path[] paths, int minDesiredSplits) throws IOException
The minDesiredSplits parameter is an optional hint indicating how many splits would be necessary to exploit parallelism properly.
Throws: IOException
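How an implementation uses the minDesiredSplits hint is left open by the interface. The following is one possible policy, sketched in plain Java under stated assumptions: the class SplitHintSketch, the Range type, the plan method, and the maxSplitBytes cap are all hypothetical and only stand in for real FileSourceSplit creation. The sketch derives a target split size from the total byte count and the hint, then cuts each file into byte ranges of at most that size.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative split-sizing policy; not a Flink class. */
final class SplitHintSketch {

    /** Hypothetical stand-in for a split: a byte range within one file. */
    static final class Range {
        final String file;
        final long offset;
        final long length;

        Range(String file, long offset, long length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Cuts files into ranges, aiming for roughly minDesiredSplits ranges overall.
     * The hint is best effort: the result may contain more or fewer ranges.
     */
    static List<Range> plan(String[] files, long[] sizes, int minDesiredSplits, long maxSplitBytes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        int wanted = Math.max(1, minDesiredSplits); // treat a missing hint (<= 0) as 1
        long evenShare = (total + wanted - 1) / wanted; // ceiling division
        long target = Math.max(1, Math.min(maxSplitBytes, evenShare));

        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < files.length; i++) {
            // Empty files produce no ranges in this sketch.
            for (long offset = 0; offset < sizes[i]; offset += target) {
                ranges.add(new Range(files[i], offset, Math.min(target, sizes[i] - offset)));
            }
        }
        return ranges;
    }
}
```

For example, two files of 100 and 50 bytes with a hint of 3 give a target size of 50 bytes and therefore three ranges. Because ranges never span files, the final count is only an approximation of the hint.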
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.