public class FileSystemTableSource extends Object implements StreamTableSource<RowData>, PartitionableTableSource, ProjectableTableSource<RowData>, LimitableTableSource<RowData>, FilterableTableSource<RowData>
Constructor and Description |
---|
FileSystemTableSource(TableSchema schema,
Path path,
List<String> partitionKeys,
String defaultPartName,
Map<String,String> properties)
Construct a file system table source.
|
Modifier and Type | Method and Description |
---|---|
FileSystemTableSource |
applyLimit(long limit)
Check and push down the limit to the table source.
|
FileSystemTableSource |
applyPartitionPruning(List<Map<String,String>> remainingPartitions)
Applies the remaining partitions to the table source.
|
FileSystemTableSource |
applyPredicate(List<Expression> predicates)
Check and pick all predicates this table source can support.
|
String |
explainSource()
Describes the table source.
|
DataStream<RowData> |
getDataStream(StreamExecutionEnvironment execEnv)
Returns the data of the table as a
DataStream . |
List<Map<String,String>> |
getPartitions()
Returns all the partitions of this
PartitionableTableSource . |
DataType |
getProducedDataType()
Returns the
DataType for the produced data of the TableSource . |
TableSchema |
getTableSchema()
Returns the schema of the produced table.
|
boolean |
isBounded()
Returns true if this is a bounded source, false if this is an unbounded source.
|
boolean |
isFilterPushedDown()
Return the flag to indicate whether filter push down has been tried.
|
boolean |
isLimitPushedDown()
Return the flag to indicate whether limit push down has been tried.
|
FileSystemTableSource |
projectFields(int[] fields)
Creates a copy of the
TableSource that projects its output to the given field
indexes. |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getReturnType
public FileSystemTableSource(TableSchema schema, Path path, List<String> partitionKeys, String defaultPartName, Map<String,String> properties)
schema
- schema of the table.path
- directory path of the file system table.partitionKeys
- partition keys of the table.defaultPartName
- The default partition name in case the dynamic partition column value
is null/empty string.properties
- table properties.public DataStream<RowData> getDataStream(StreamExecutionEnvironment execEnv)
StreamTableSource
DataStream
.
NOTE: This method is for internal use only for defining a TableSource
. Do not use
it in Table API programs.
getDataStream
in interface StreamTableSource<RowData>
public boolean isBounded()
StreamTableSource
isBounded
in interface StreamTableSource<RowData>
public List<Map<String,String>> getPartitions()
PartitionableTableSource
PartitionableTableSource
.getPartitions
in interface PartitionableTableSource
public FileSystemTableSource applyPartitionPruning(List<Map<String,String>> remainingPartitions)
PartitionableTableSource
remainingPartitions
is the
remaining partitions of PartitionableTableSource.getPartitions()
after partition pruning applied.
After trying to apply partition pruning, we should return a new TableSource
instance which holds all pruned-partitions.
applyPartitionPruning
in interface PartitionableTableSource
remainingPartitions
- Remaining partitions after partition pruning applied.TableSource
holds all pruned-partitions.public FileSystemTableSource projectFields(int[] fields)
ProjectableTableSource
TableSource
that projects its output to the given field
indexes. The field indexes relate to the physical poduced data type (TableSource.getProducedDataType()
) and not to the table schema (TableSource.getTableSchema()
of the TableSource
.
The table schema (TableSource.getTableSchema()
of the TableSource
copy must
not be modified by this method, but only the produced data type (TableSource.getProducedDataType()
) and the produced DataSet
(BatchTableSource#getDataSet(
) or DataStream
(StreamTableSource#getDataStream
).
If the TableSource
implements the DefinedFieldMapping
interface, it might
be necessary to adjust the mapping as well.
IMPORTANT: This method must return a true copy and must not modify the original table source object.
projectFields
in interface ProjectableTableSource<RowData>
fields
- The indexes of the fields to return.TableSource
that projects its output.public FileSystemTableSource applyLimit(long limit)
LimitableTableSource
applyLimit
in interface LimitableTableSource<RowData>
limit
- the value which limit the number of records.TableSource
.public boolean isLimitPushedDown()
LimitableTableSource
LimitableTableSource.applyLimit(long)
.isLimitPushedDown
in interface LimitableTableSource<RowData>
public FileSystemTableSource applyPredicate(List<Expression> predicates)
FilterableTableSource
WARNING: Flink planner will push down PlannerExpressions (which are
defined in flink-table-planner module), while Blink planner will push down Expression
s. So the implementation for Flink planner and Blink planner should be different
and incompatible. PlannerExpression will be removed in the future.
After trying to push predicates down, we should return a new TableSource
instance
which holds all pushed down predicates. Even if we actually pushed nothing down, it is
recommended that we still return a new TableSource
instance since we will mark the
returned instance as filter push down has been tried.
We also should note to not changing the form of the predicates passed in. It has been organized in CNF conjunctive form, and we should only take or leave each element from the list. Don't try to reorganize the predicates if you are absolutely confident with that.
applyPredicate
in interface FilterableTableSource<RowData>
predicates
- A list contains conjunctive predicates, you should pick and remove all
expressions that can be pushed down. The remaining elements of this list will further
evaluated by framework.TableSource
with or without any filters been pushed
into it.public boolean isFilterPushedDown()
FilterableTableSource
FilterableTableSource.applyPredicate(java.util.List<org.apache.flink.table.expressions.Expression>)
.isFilterPushedDown
in interface FilterableTableSource<RowData>
public DataType getProducedDataType()
TableSource
DataType
for the produced data of the TableSource
.getProducedDataType
in interface TableSource<RowData>
DataSet
or DataStream
.public TableSchema getTableSchema()
TableSource
getTableSchema
in interface TableSource<RowData>
TableSchema
of the produced table.public String explainSource()
TableSource
explainSource
in interface TableSource<RowData>
TableSource
.Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.