@Internal public class FileSystemTableSource extends Object implements ScanTableSource, SupportsProjectionPushDown, SupportsLimitPushDown, SupportsPartitionPushDown, SupportsFilterPushDown, SupportsReadingMetadata, SupportsStatisticReport
Nested classes/interfaces inherited from interface ScanTableSource: ScanTableSource.ScanContext, ScanTableSource.ScanRuntimeProvider
Nested classes/interfaces inherited from interface DynamicTableSource: DynamicTableSource.Context, DynamicTableSource.DataStructureConverter
Nested classes/interfaces inherited from interface SupportsFilterPushDown: SupportsFilterPushDown.Result
| Constructor and Description |
|---|
| FileSystemTableSource(ObjectIdentifier tableIdentifier, DataType physicalRowDataType, List<String> partitionKeys, ReadableConfig tableOptions, DecodingFormat<BulkFormat<RowData,FileSourceSplit>> bulkReaderFormat, DecodingFormat<DeserializationSchema<RowData>> deserializationFormat) |
| Modifier and Type | Method and Description |
|---|---|
| SupportsFilterPushDown.Result | applyFilters(List<ResolvedExpression> filters): Provides a list of filters in conjunctive form. |
| void | applyLimit(long limit): Provides the expected maximum number of produced records for limiting on a best-effort basis. |
| void | applyPartitions(List<Map<String,String>> remainingPartitions): Provides a list of remaining partitions. |
| void | applyProjection(int[][] projectedFields, DataType producedDataType): Provides the field index paths that should be used for a projection. |
| void | applyReadableMetadata(List<String> metadataKeys, DataType producedDataType): Provides a list of metadata keys that the produced RowData must contain as appended metadata columns. |
| String | asSummaryString(): Returns a string that summarizes this source for printing to a console or log. |
| FileSystemTableSource | copy(): Creates a copy of this instance during planning. |
| ChangelogMode | getChangelogMode(): Returns the set of changes that the planner can expect during runtime. |
| ScanTableSource.ScanRuntimeProvider | getScanRuntimeProvider(ScanTableSource.ScanContext scanContext): Returns a provider of runtime implementation for reading the data. |
| Optional<List<Map<String,String>>> | listPartitions(): Returns a list of all partitions that a source can read if available. |
| Map<String,DataType> | listReadableMetadata(): Returns the map of metadata keys and their corresponding data types that can be produced by this table source for reading. |
| TableStats | reportStatistics(): Returns the estimated statistics of this DynamicTableSource, else TableStats.UNKNOWN if some situations are not supported or cannot be handled. |
| boolean | supportsNestedProjection(): Returns whether this source supports nested projection. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface SupportsProjectionPushDown: applyProjection
Methods inherited from interface SupportsReadingMetadata: supportsMetadataProjection
public FileSystemTableSource(ObjectIdentifier tableIdentifier, DataType physicalRowDataType, List<String> partitionKeys, ReadableConfig tableOptions, @Nullable DecodingFormat<BulkFormat<RowData,FileSourceSplit>> bulkReaderFormat, @Nullable DecodingFormat<DeserializationSchema<RowData>> deserializationFormat)
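This class is annotated @Internal and is normally created by the filesystem connector factory rather than instantiated directly. For orientation only, a hedged sketch of the kind of table definition that leads the planner to build a filesystem scan source; the table name, path, columns, and format below are made-up placeholders:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class FileSystemSourceExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // A filesystem-backed table; the planner creates the scan source when the
        // table is queried. Path and format are illustrative only.
        tEnv.executeSql(
                "CREATE TABLE orders ("
                        + "  order_id BIGINT,"
                        + "  amount DOUBLE,"
                        + "  dt STRING"
                        + ") PARTITIONED BY (dt) WITH ("
                        + "  'connector' = 'filesystem',"
                        + "  'path' = 'file:///tmp/orders',"
                        + "  'format' = 'csv'"
                        + ")");

        tEnv.executeSql("SELECT order_id, amount FROM orders WHERE dt = '2024-01-01'").print();
    }
}
```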
public ScanTableSource.ScanRuntimeProvider getScanRuntimeProvider(ScanTableSource.ScanContext scanContext)

Description copied from interface: ScanTableSource

Returns a provider of runtime implementation for reading the data.

There might exist different interfaces for runtime implementation, which is why ScanTableSource.ScanRuntimeProvider serves as the base interface. Concrete ScanTableSource.ScanRuntimeProvider interfaces might be located in other Flink modules.

Independent of the provider interface, the table runtime expects that a source implementation emits internal data structures (see RowData for more information).

The given ScanTableSource.ScanContext offers utilities by the planner for creating runtime implementations with minimal dependencies to internal data structures.

SourceProvider is the recommended core interface. SourceFunctionProvider in flink-table-api-java-bridge and InputFormatProvider are available for backwards compatibility.

Specified by: getScanRuntimeProvider in interface ScanTableSource

See Also: SourceProvider
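As a rough illustration of this contract (not FileSystemTableSource's actual logic), a ScanTableSource can wrap a FileSource in the recommended SourceProvider. The bulkFormat field and the path below are hypothetical placeholders:

```java
// Minimal sketch inside a hypothetical ScanTableSource implementation.
// Assumes a field BulkFormat<RowData, FileSourceSplit> bulkFormat and imports for
// org.apache.flink.connector.file.src.FileSource, org.apache.flink.core.fs.Path,
// and org.apache.flink.table.connector.source.SourceProvider.
@Override
public ScanTableSource.ScanRuntimeProvider getScanRuntimeProvider(
        ScanTableSource.ScanContext scanContext) {
    FileSource<RowData> fileSource =
            FileSource.forBulkFileFormat(bulkFormat, new Path("file:///tmp/orders"))
                    .build();
    // SourceProvider is the recommended way to hand a FLIP-27 Source to the table runtime.
    return SourceProvider.of(fileSource);
}
```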
public ChangelogMode getChangelogMode()

Description copied from interface: ScanTableSource

Returns the set of changes that the planner can expect during runtime.

Specified by: getChangelogMode in interface ScanTableSource

See Also: RowKind
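A bounded file scan only produces insertions, so a typical implementation of this method looks like the following sketch (assumed for illustration, not necessarily the exact body used by FileSystemTableSource):

```java
@Override
public ChangelogMode getChangelogMode() {
    // Files are read as append-only data: every emitted RowData has RowKind.INSERT.
    return ChangelogMode.insertOnly();
}
```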
public SupportsFilterPushDown.Result applyFilters(List<ResolvedExpression> filters)

Description copied from interface: SupportsFilterPushDown

Provides a list of filters in conjunctive form.

See the documentation of SupportsFilterPushDown for more information.

Specified by: applyFilters in interface SupportsFilterPushDown
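The returned Result tells the planner which predicates the source will evaluate itself (accepted) and which the runtime must still apply (remaining). A conservative sketch that accepts nothing and leaves all filtering to the runtime, which is always correct even if unoptimized:

```java
// Inside a hypothetical source that implements SupportsFilterPushDown
// (uses java.util.ArrayList, java.util.Collections, java.util.List).
@Override
public SupportsFilterPushDown.Result applyFilters(List<ResolvedExpression> filters) {
    // The filters arrive in conjunctive form: filter1 AND filter2 AND ...
    // Accepting none and returning everything as "remaining" is always safe; a real
    // source would inspect the expressions and keep the ones it can evaluate itself,
    // e.g. against partition values or file statistics.
    List<ResolvedExpression> accepted = Collections.emptyList();
    return SupportsFilterPushDown.Result.of(accepted, new ArrayList<>(filters));
}
```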
public void applyLimit(long limit)

Description copied from interface: SupportsLimitPushDown

Provides the expected maximum number of produced records for limiting on a best-effort basis.

Specified by: applyLimit in interface SupportsLimitPushDown
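A sketch of how a source typically reacts to this call: it remembers the limit and the reader created later stops producing records once the count is reached (best effort). The field name and sentinel value are hypothetical:

```java
// Inside a hypothetical source that implements SupportsLimitPushDown.
private long limit = -1; // -1 means "no limit pushed down" (hypothetical convention)

@Override
public void applyLimit(long limit) {
    // Remember the limit; the reader can stop early once it has emitted this many
    // records. Producing more than the limit is not an error (best effort).
    this.limit = limit;
}
```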
public Optional<List<Map<String,String>>> listPartitions()

Description copied from interface: SupportsPartitionPushDown

Returns a list of all partitions that a source can read if available.

A single partition maps each partition key to a partition value.

If Optional.empty() is returned, the list of partitions is queried from the catalog.

Specified by: listPartitions in interface SupportsPartitionPushDown
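Each entry of the returned list is one partition, expressed as a map from partition key to partition value. A sketch of what such a return value could look like for a table partitioned by dt and region (the keys and values are made up); returning Optional.empty() instead makes the planner ask the catalog:

```java
// Inside a hypothetical source that implements SupportsPartitionPushDown
// (uses java.util.Arrays, java.util.LinkedHashMap, java.util.List, java.util.Map, java.util.Optional).
@Override
public Optional<List<Map<String, String>>> listPartitions() {
    // Two example partitions of a table PARTITIONED BY (dt, region).
    Map<String, String> p1 = new LinkedHashMap<>();
    p1.put("dt", "2024-01-01");
    p1.put("region", "eu");

    Map<String, String> p2 = new LinkedHashMap<>();
    p2.put("dt", "2024-01-01");
    p2.put("region", "us");

    return Optional.of(Arrays.asList(p1, p2));
    // return Optional.empty(); // alternatively: let the planner query the catalog
}
```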
public void applyPartitions(List<Map<String,String>> remainingPartitions)

Description copied from interface: SupportsPartitionPushDown

Provides a list of remaining partitions.

See the documentation of SupportsPartitionPushDown for more information.

Specified by: applyPartitions in interface SupportsPartitionPushDown
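After partition pruning, the planner calls this method with the partitions that survived. A sketch of an implementation that records them so the later scan reads only those partitions (the field name is hypothetical):

```java
// Inside a hypothetical source that implements SupportsPartitionPushDown.
private List<Map<String, String>> remainingPartitions;

@Override
public void applyPartitions(List<Map<String, String>> remainingPartitions) {
    // Only these partitions may be read at runtime; the others were pruned away,
    // e.g. by a WHERE dt = '2024-01-01' predicate on a partition column.
    this.remainingPartitions = new ArrayList<>(remainingPartitions);
}
```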
public boolean supportsNestedProjection()

Description copied from interface: SupportsProjectionPushDown

Returns whether this source supports nested projection.

Specified by: supportsNestedProjection in interface SupportsProjectionPushDown
public TableStats reportStatistics()

Description copied from interface: SupportsStatisticReport

Returns the estimated statistics of this DynamicTableSource, else TableStats.UNKNOWN if some situations are not supported or cannot be handled.

Specified by: reportStatistics in interface SupportsStatisticReport
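A sketch of the contract: return statistics when they can be derived cheaply (for example from file sizes or file footers) and TableStats.UNKNOWN otherwise. The estimateRowCount() helper below is hypothetical:

```java
@Override
public TableStats reportStatistics() {
    // estimateRowCount() is a hypothetical helper, e.g. reading Parquet/ORC footers
    // or dividing total file size by an average record size.
    long rowCount = estimateRowCount();
    if (rowCount < 0) {
        // Signal to the planner that no usable estimate is available.
        return TableStats.UNKNOWN;
    }
    return new TableStats(rowCount);
}
```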
public FileSystemTableSource copy()

Description copied from interface: DynamicTableSource

Creates a copy of this instance during planning.

Specified by: copy in interface DynamicTableSource
public String asSummaryString()

Description copied from interface: DynamicTableSource

Returns a string that summarizes this source for printing to a console or log.

Specified by: asSummaryString in interface DynamicTableSource
public void applyProjection(int[][] projectedFields, DataType producedDataType)

Description copied from interface: SupportsProjectionPushDown

Provides the field index paths that should be used for a projection. The indices are 0-based and support fields within (possibly nested) structures if this is enabled via SupportsProjectionPushDown.supportsNestedProjection().

In the example mentioned in SupportsProjectionPushDown, this method would receive:

- [[2], [1]] which is equivalent to [["s"], ["r"]] if SupportsProjectionPushDown.supportsNestedProjection() returns false.
- [[2], [1, 0]] which is equivalent to [["s"], ["r", "d"]] if SupportsProjectionPushDown.supportsNestedProjection() returns true.

Note: Use the passed data type instead of ResolvedSchema.toPhysicalRowDataType() for describing the final output data type when creating TypeInformation.
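To make the index paths concrete, here is a hedged sketch of how they relate to a schema. The row type in the comment is made up to mirror the shape of the interface's example (a top-level field r with a nested field d, and a top-level field s), and the fields being assigned are hypothetical:

```java
// Hypothetical physical row type:
//   0: id BIGINT
//   1: r  ROW<d DOUBLE, e STRING>
//   2: s  STRING
//
// SELECT s, r.d FROM t leads the planner to call applyProjection with:
//   supportsNestedProjection() == false -> [[2], [1]]     ("s", whole "r")
//   supportsNestedProjection() == true  -> [[2], [1, 0]]  ("s", "r.d")
@Override
public void applyProjection(int[][] projectedFields, DataType producedDataType) {
    // Remember both pieces: the index paths drive which columns the reader extracts,
    // and producedDataType is the type the planner expects the source to emit after
    // the projection has been applied.
    this.projectedFields = projectedFields;
    this.producedDataType = producedDataType;
}
```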
Specified by: applyProjection in interface SupportsProjectionPushDown

Parameters:
projectedFields - field index paths of all fields that must be present in the physically produced data
producedDataType - the final output type of the source, with the projection applied

public void applyReadableMetadata(List<String> metadataKeys, DataType producedDataType)
Description copied from interface: SupportsReadingMetadata

Provides a list of metadata keys that the produced RowData must contain as appended metadata columns.

Implementations of this method must be idempotent. The planner might call this method multiple times.

Note: Use the passed data type instead of ResolvedSchema.toPhysicalRowDataType() for describing the final output data type when creating TypeInformation. If the source implements SupportsProjectionPushDown, the projection is already considered in the given output data type; use the producedDataType provided by this method instead of the producedDataType provided by SupportsProjectionPushDown.applyProjection(int[][], DataType).

Specified by: applyReadableMetadata in interface SupportsReadingMetadata

Parameters:
metadataKeys - a subset of the keys returned by SupportsReadingMetadata.listReadableMetadata(), ordered by the iteration order of the returned map
producedDataType - the final output type of the source, it is intended to be only forwarded and the planner will decide on the field names to avoid collisions

See Also: DecodingFormat.applyReadableMetadata(List)
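A sketch of the expected behavior in a hypothetical source: the method records which keys were requested and switches to the planner-provided output type. The field names are illustrative, and plain field assignments keep the method idempotent:

```java
// Inside a hypothetical source that implements SupportsReadingMetadata.
private List<String> metadataKeys = Collections.emptyList();
private DataType producedDataType;

@Override
public void applyReadableMetadata(List<String> metadataKeys, DataType producedDataType) {
    // The keys are a subset of listReadableMetadata(), in that map's iteration order.
    // The reader must append one column per key, in this order, after the physical
    // (and possibly projected) columns.
    this.metadataKeys = metadataKeys;
    // Use this type, not the one from applyProjection, when building serializers or
    // TypeInformation: it already includes the appended metadata columns.
    this.producedDataType = producedDataType;
}
```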
public Map<String,DataType> listReadableMetadata()

Description copied from interface: SupportsReadingMetadata

Returns the map of metadata keys and their corresponding data types that can be produced by this table source for reading.

The returned map will be used by the planner for validation and insertion of explicit casts (see LogicalTypeCasts.supportsExplicitCast(LogicalType, LogicalType)) if necessary.

The iteration order of the returned map determines the order of metadata keys in the list passed in SupportsReadingMetadata.applyReadableMetadata(List, DataType). Therefore, it might be beneficial to return a LinkedHashMap if a strict metadata column order is required.

If a source forwards metadata from one or more formats, we recommend the following column order for consistency:

KEY FORMAT METADATA COLUMNS + VALUE FORMAT METADATA COLUMNS + SOURCE METADATA COLUMNS

Metadata key names follow the same pattern as mentioned in Factory. In case of duplicate names in format and source keys, format keys shall have higher precedence.

Regardless of the returned DataTypes, a metadata column is always represented using internal data structures (see RowData).

Specified by: listReadableMetadata in interface SupportsReadingMetadata

See Also: DecodingFormat.listReadableMetadata()
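To illustrate the ordering requirement, a sketch that uses a LinkedHashMap. The metadata keys and types shown are examples of per-file metadata a filesystem-style source could expose, not a statement of what FileSystemTableSource actually offers:

```java
// Inside a hypothetical source that implements SupportsReadingMetadata
// (uses java.util.LinkedHashMap, java.util.Map and org.apache.flink.table.api.DataTypes).
@Override
public Map<String, DataType> listReadableMetadata() {
    // LinkedHashMap so the key order here is exactly the order used later in
    // applyReadableMetadata(List, DataType).
    Map<String, DataType> metadata = new LinkedHashMap<>();
    metadata.put("file.path", DataTypes.STRING().notNull());
    metadata.put("file.modification-time", DataTypes.TIMESTAMP_LTZ(3).notNull());
    return metadata;
}
```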