Class HiveTableSource
- java.lang.Object
-
- org.apache.flink.connectors.hive.HiveTableSource
-
- All Implemented Interfaces:
SupportsDynamicFiltering
,SupportsLimitPushDown
,SupportsPartitionPushDown
,SupportsProjectionPushDown
,SupportsStatisticReport
,DynamicTableSource
,ScanTableSource
- Direct Known Subclasses:
HiveLookupTableSource
public class HiveTableSource extends Object implements ScanTableSource, SupportsPartitionPushDown, SupportsProjectionPushDown, SupportsLimitPushDown, SupportsStatisticReport, SupportsDynamicFiltering
A TableSource implementation to read data from Hive tables.
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
HiveTableSource.HiveContinuousPartitionFetcherContext<T extends Comparable<T>>
PartitionFetcher.Context forContinuousPartitionFetcher
.-
Nested classes/interfaces inherited from interface org.apache.flink.table.connector.source.DynamicTableSource
DynamicTableSource.Context, DynamicTableSource.DataStructureConverter
-
Nested classes/interfaces inherited from interface org.apache.flink.table.connector.source.ScanTableSource
ScanTableSource.ScanContext, ScanTableSource.ScanRuntimeProvider
-
-
Field Summary
Fields Modifier and Type Field Description protected ResolvedCatalogTable
catalogTable
protected List<String>
dynamicFilterPartitionKeys
protected ReadableConfig
flinkConf
protected HiveShim
hiveShim
protected String
hiveVersion
protected org.apache.hadoop.mapred.JobConf
jobConf
protected Long
limit
protected DataType
producedDataType
protected int[]
projectedFields
protected List<Map<String,String>>
remainingPartitions
protected ObjectPath
tablePath
-
Constructor Summary
Constructors Constructor Description HiveTableSource(org.apache.hadoop.mapred.JobConf jobConf, ReadableConfig flinkConf, ObjectPath tablePath, ResolvedCatalogTable catalogTable)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description void
applyDynamicFiltering(List<String> candidateFilterFields)
Applies the candidate filter fields into the table source.void
applyLimit(long limit)
Provides the expected maximum number of produced records for limiting on a best-effort basis.void
applyPartitions(List<Map<String,String>> remainingPartitions)
Provides a list of remaining partitions.void
applyProjection(int[][] projectedFields, DataType producedDataType)
Provides the field index paths that should be used for a projection.String
asSummaryString()
Returns a string that summarizes this source for printing to a console or log.DynamicTableSource
copy()
Creates a copy of this instance during planning.ChangelogMode
getChangelogMode()
Returns the set of changes that the planner can expect during runtime.protected DataStream<RowData>
getDataStream(ProviderContext providerContext, StreamExecutionEnvironment execEnv)
org.apache.hadoop.mapred.JobConf
getJobConf()
ScanTableSource.ScanRuntimeProvider
getScanRuntimeProvider(ScanTableSource.ScanContext runtimeProviderContext)
Returns a provider of runtime implementation for reading the data.protected boolean
isStreamingSource()
List<String>
listAcceptedFilterFields()
Return the filter fields this partition table source supported.Optional<List<Map<String,String>>>
listPartitions()
Returns a list of all partitions that a source can read if available.TableStats
reportStatistics()
Returns the estimated statistics of thisDynamicTableSource
, elseTableStats.UNKNOWN
if some situations are not supported or cannot be handled.boolean
supportsNestedProjection()
Returns whether this source supports nested projection.-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.table.connector.source.abilities.SupportsProjectionPushDown
applyProjection
-
-
-
-
Field Detail
-
jobConf
protected final org.apache.hadoop.mapred.JobConf jobConf
-
flinkConf
protected final ReadableConfig flinkConf
-
tablePath
protected final ObjectPath tablePath
-
catalogTable
protected final ResolvedCatalogTable catalogTable
-
hiveVersion
protected final String hiveVersion
-
hiveShim
protected final HiveShim hiveShim
-
projectedFields
protected int[] projectedFields
-
producedDataType
protected DataType producedDataType
-
limit
protected Long limit
-
-
Constructor Detail
-
HiveTableSource
public HiveTableSource(org.apache.hadoop.mapred.JobConf jobConf, ReadableConfig flinkConf, ObjectPath tablePath, ResolvedCatalogTable catalogTable)
-
-
Method Detail
-
getScanRuntimeProvider
public ScanTableSource.ScanRuntimeProvider getScanRuntimeProvider(ScanTableSource.ScanContext runtimeProviderContext)
Description copied from interface:ScanTableSource
Returns a provider of runtime implementation for reading the data.There might exist different interfaces for runtime implementation which is why
ScanTableSource.ScanRuntimeProvider
serves as the base interface. ConcreteScanTableSource.ScanRuntimeProvider
interfaces might be located in other Flink modules.Independent of the provider interface, the table runtime expects that a source implementation emits internal data structures (see
RowData
for more information).The given
ScanTableSource.ScanContext
offers utilities by the planner for creating runtime implementation with minimal dependencies to internal data structures.SourceProvider
is the recommended core interface.SourceFunctionProvider
inflink-table-api-java-bridge
andInputFormatProvider
are available for backwards compatibility.- Specified by:
getScanRuntimeProvider
in interfaceScanTableSource
- See Also:
SourceProvider
-
getDataStream
@VisibleForTesting protected DataStream<RowData> getDataStream(ProviderContext providerContext, StreamExecutionEnvironment execEnv)
-
isStreamingSource
protected boolean isStreamingSource()
-
applyLimit
public void applyLimit(long limit)
Description copied from interface:SupportsLimitPushDown
Provides the expected maximum number of produced records for limiting on a best-effort basis.- Specified by:
applyLimit
in interfaceSupportsLimitPushDown
-
listPartitions
public Optional<List<Map<String,String>>> listPartitions()
Description copied from interface:SupportsPartitionPushDown
Returns a list of all partitions that a source can read if available.A single partition maps each partition key to a partition value.
If
Optional.empty()
is returned, the list of partitions is queried from the catalog.- Specified by:
listPartitions
in interfaceSupportsPartitionPushDown
-
applyPartitions
public void applyPartitions(List<Map<String,String>> remainingPartitions)
Description copied from interface:SupportsPartitionPushDown
Provides a list of remaining partitions. After those partitions are applied, a source must not read the data of other partitions during runtime.See the documentation of
SupportsPartitionPushDown
for more information.- Specified by:
applyPartitions
in interfaceSupportsPartitionPushDown
-
listAcceptedFilterFields
public List<String> listAcceptedFilterFields()
Description copied from interface:SupportsDynamicFiltering
Return the filter fields this partition table source supported. This method is can tell the planner which fields can be used as dynamic filtering fields, the planner will pick some fields from the returned fields based on the query, and create dynamic filtering operator.- Specified by:
listAcceptedFilterFields
in interfaceSupportsDynamicFiltering
-
applyDynamicFiltering
public void applyDynamicFiltering(List<String> candidateFilterFields)
Description copied from interface:SupportsDynamicFiltering
Applies the candidate filter fields into the table source. The data corresponding the filter fields will be provided in runtime, which can be used to filter the partitions or the input data.NOTE: the candidate filter fields are always from the result of
SupportsDynamicFiltering.listAcceptedFilterFields()
.- Specified by:
applyDynamicFiltering
in interfaceSupportsDynamicFiltering
-
supportsNestedProjection
public boolean supportsNestedProjection()
Description copied from interface:SupportsProjectionPushDown
Returns whether this source supports nested projection.- Specified by:
supportsNestedProjection
in interfaceSupportsProjectionPushDown
-
applyProjection
public void applyProjection(int[][] projectedFields, DataType producedDataType)
Description copied from interface:SupportsProjectionPushDown
Provides the field index paths that should be used for a projection. The indices are 0-based and support fields within (possibly nested) structures if this is enabled viaSupportsProjectionPushDown.supportsNestedProjection()
.In the example mentioned in
SupportsProjectionPushDown
, this method would receive:[[2], [1]]
which is equivalent to[["s"], ["r"]]
ifSupportsProjectionPushDown.supportsNestedProjection()
returns false.[[2], [1, 0]]
which is equivalent to[["s"], ["r", "d"]]]
ifSupportsProjectionPushDown.supportsNestedProjection()
returns true.
Note: Use the passed data type instead of
ResolvedSchema.toPhysicalRowDataType()
for describing the final output data type when creatingTypeInformation
.- Specified by:
applyProjection
in interfaceSupportsProjectionPushDown
- Parameters:
projectedFields
- field index paths of all fields that must be present in the physically produced dataproducedDataType
- the final output type of the source, with the projection applied
-
asSummaryString
public String asSummaryString()
Description copied from interface:DynamicTableSource
Returns a string that summarizes this source for printing to a console or log.- Specified by:
asSummaryString
in interfaceDynamicTableSource
-
getChangelogMode
public ChangelogMode getChangelogMode()
Description copied from interface:ScanTableSource
Returns the set of changes that the planner can expect during runtime.- Specified by:
getChangelogMode
in interfaceScanTableSource
- See Also:
RowKind
-
copy
public DynamicTableSource copy()
Description copied from interface:DynamicTableSource
Creates a copy of this instance during planning. The copy should be a deep copy of all mutable members.- Specified by:
copy
in interfaceDynamicTableSource
-
reportStatistics
public TableStats reportStatistics()
Description copied from interface:SupportsStatisticReport
Returns the estimated statistics of thisDynamicTableSource
, elseTableStats.UNKNOWN
if some situations are not supported or cannot be handled.- Specified by:
reportStatistics
in interfaceSupportsStatisticReport
-
getJobConf
@VisibleForTesting public org.apache.hadoop.mapred.JobConf getJobConf()
-
-