public class HiveTableSource extends Object implements ScanTableSource, SupportsPartitionPushDown, SupportsProjectionPushDown, SupportsLimitPushDown, SupportsStatisticReport, SupportsDynamicFiltering
Modifier and Type | Class and Description |
---|---|
static class |
HiveTableSource.HiveContinuousPartitionFetcherContext<T extends Comparable<T>>
PartitionFetcher.Context for
ContinuousPartitionFetcher . |
ScanTableSource.ScanContext, ScanTableSource.ScanRuntimeProvider
DynamicTableSource.Context, DynamicTableSource.DataStructureConverter
Modifier and Type | Field and Description |
---|---|
protected ResolvedCatalogTable |
catalogTable |
protected List<String> |
dynamicFilterPartitionKeys |
protected ReadableConfig |
flinkConf |
protected HiveShim |
hiveShim |
protected String |
hiveVersion |
protected org.apache.hadoop.mapred.JobConf |
jobConf |
protected Long |
limit |
protected DataType |
producedDataType |
protected int[] |
projectedFields |
protected List<Map<String,String>> |
remainingPartitions |
protected ObjectPath |
tablePath |
Constructor and Description |
---|
HiveTableSource(org.apache.hadoop.mapred.JobConf jobConf,
ReadableConfig flinkConf,
ObjectPath tablePath,
ResolvedCatalogTable catalogTable) |
Modifier and Type | Method and Description |
---|---|
void |
applyDynamicFiltering(List<String> candidateFilterFields)
Applies the candidate filter fields into the table source.
|
void |
applyLimit(long limit)
Provides the expected maximum number of produced records for limiting on a best-effort basis.
|
void |
applyPartitions(List<Map<String,String>> remainingPartitions)
Provides a list of remaining partitions.
|
void |
applyProjection(int[][] projectedFields,
DataType producedDataType)
Provides the field index paths that should be used for a projection.
|
String |
asSummaryString()
Returns a string that summarizes this source for printing to a console or log.
|
DynamicTableSource |
copy()
Creates a copy of this instance during planning.
|
ChangelogMode |
getChangelogMode()
Returns the set of changes that the planner can expect during runtime.
|
protected DataStream<RowData> |
getDataStream(ProviderContext providerContext,
StreamExecutionEnvironment execEnv) |
org.apache.hadoop.mapred.JobConf |
getJobConf() |
ScanTableSource.ScanRuntimeProvider |
getScanRuntimeProvider(ScanTableSource.ScanContext runtimeProviderContext)
Returns a provider of runtime implementation for reading the data.
|
protected boolean |
isStreamingSource() |
List<String> |
listAcceptedFilterFields()
Return the filter fields this partition table source supported.
|
Optional<List<Map<String,String>>> |
listPartitions()
Returns a list of all partitions that a source can read if available.
|
TableStats |
reportStatistics()
Returns the estimated statistics of this
DynamicTableSource , else TableStats.UNKNOWN if some situations are not supported or cannot be handled. |
boolean |
supportsNestedProjection()
Returns whether this source supports nested projection.
|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
applyProjection
protected final org.apache.hadoop.mapred.JobConf jobConf
protected final ReadableConfig flinkConf
protected final ObjectPath tablePath
protected final ResolvedCatalogTable catalogTable
protected final String hiveVersion
protected final HiveShim hiveShim
protected int[] projectedFields
protected DataType producedDataType
protected Long limit
public HiveTableSource(org.apache.hadoop.mapred.JobConf jobConf, ReadableConfig flinkConf, ObjectPath tablePath, ResolvedCatalogTable catalogTable)
public ScanTableSource.ScanRuntimeProvider getScanRuntimeProvider(ScanTableSource.ScanContext runtimeProviderContext)
ScanTableSource
There might exist different interfaces for runtime implementation which is why ScanTableSource.ScanRuntimeProvider
serves as the base interface. Concrete ScanTableSource.ScanRuntimeProvider
interfaces might be located in other Flink modules.
Independent of the provider interface, the table runtime expects that a source
implementation emits internal data structures (see RowData
for more information).
The given ScanTableSource.ScanContext
offers utilities by the planner for creating runtime
implementation with minimal dependencies to internal data structures.
SourceProvider
is the recommended core interface. SourceFunctionProvider
in flink-table-api-java-bridge
and InputFormatProvider
are available for
backwards compatibility.
getScanRuntimeProvider
in interface ScanTableSource
SourceProvider
@VisibleForTesting protected DataStream<RowData> getDataStream(ProviderContext providerContext, StreamExecutionEnvironment execEnv)
protected boolean isStreamingSource()
public void applyLimit(long limit)
SupportsLimitPushDown
applyLimit
in interface SupportsLimitPushDown
public Optional<List<Map<String,String>>> listPartitions()
SupportsPartitionPushDown
A single partition maps each partition key to a partition value.
If Optional.empty()
is returned, the list of partitions is queried from the
catalog.
listPartitions
in interface SupportsPartitionPushDown
public void applyPartitions(List<Map<String,String>> remainingPartitions)
SupportsPartitionPushDown
See the documentation of SupportsPartitionPushDown
for more information.
applyPartitions
in interface SupportsPartitionPushDown
public List<String> listAcceptedFilterFields()
SupportsDynamicFiltering
listAcceptedFilterFields
in interface SupportsDynamicFiltering
public void applyDynamicFiltering(List<String> candidateFilterFields)
SupportsDynamicFiltering
NOTE: the candidate filter fields are always from the result of SupportsDynamicFiltering.listAcceptedFilterFields()
.
applyDynamicFiltering
in interface SupportsDynamicFiltering
public boolean supportsNestedProjection()
SupportsProjectionPushDown
supportsNestedProjection
in interface SupportsProjectionPushDown
public void applyProjection(int[][] projectedFields, DataType producedDataType)
SupportsProjectionPushDown
SupportsProjectionPushDown.supportsNestedProjection()
.
In the example mentioned in SupportsProjectionPushDown
, this method would receive:
[[2], [1]]
which is equivalent to [["s"], ["r"]]
if SupportsProjectionPushDown.supportsNestedProjection()
returns false.
[[2], [1, 0]]
which is equivalent to [["s"], ["r", "d"]]]
if SupportsProjectionPushDown.supportsNestedProjection()
returns true.
Note: Use the passed data type instead of ResolvedSchema.toPhysicalRowDataType()
for describing the final output data type when creating TypeInformation
.
applyProjection
in interface SupportsProjectionPushDown
projectedFields
- field index paths of all fields that must be present in the physically
produced dataproducedDataType
- the final output type of the source, with the projection appliedpublic String asSummaryString()
DynamicTableSource
asSummaryString
in interface DynamicTableSource
public ChangelogMode getChangelogMode()
ScanTableSource
getChangelogMode
in interface ScanTableSource
RowKind
public DynamicTableSource copy()
DynamicTableSource
copy
in interface DynamicTableSource
public TableStats reportStatistics()
SupportsStatisticReport
DynamicTableSource
, else TableStats.UNKNOWN
if some situations are not supported or cannot be handled.reportStatistics
in interface SupportsStatisticReport
@VisibleForTesting public org.apache.hadoop.mapred.JobConf getJobConf()
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.