public class HiveTableInputFormat extends HadoopInputFormatCommonBase<RowData,HiveTableInputSplit> implements CheckpointableInputFormat<HiveTableInputSplit,Long>
Modifier and Type | Field and Description |
---|---|
protected SplitReader |
reader |
credentials
Constructor and Description |
---|
HiveTableInputFormat(org.apache.hadoop.mapred.JobConf jobConf,
CatalogTable catalogTable,
List<HiveTablePartition> partitions,
int[] projectedFields,
long limit,
String hiveVersion,
boolean useMapRedReader) |
Modifier and Type | Method and Description |
---|---|
void |
close()
Method that marks the end of the life-cycle of an input split.
|
void |
configure(Configuration parameters)
Configures this input format.
|
HiveTableInputSplit[] |
createInputSplits(int minNumSplits)
Computes the input splits.
|
static HiveTableInputSplit[] |
createInputSplits(int minNumSplits,
List<HiveTablePartition> partitions,
org.apache.hadoop.mapred.JobConf jobConf) |
Long |
getCurrentState()
Returns the split currently being read, along with its current state.
|
InputSplitAssigner |
getInputSplitAssigner(HiveTableInputSplit[] inputSplits)
Returns the assigner for the input splits.
|
org.apache.hadoop.mapred.JobConf |
getJobConf() |
BaseStatistics |
getStatistics(BaseStatistics cachedStats)
Gets the basic statistics from the input described by this format.
|
RowData |
nextRecord(RowData reuse)
Reads the next record from the input.
|
void |
open(HiveTableInputSplit split)
Opens a parallel instance of the input format to work on a split.
|
boolean |
reachedEnd()
Method used to check if the end of the input is reached.
|
void |
reopen(HiveTableInputSplit split,
Long state)
Restores the state of a parallel instance reading from an
InputFormat . |
getCredentialsFromUGI, read, write
closeInputFormat, getRuntimeContext, openInputFormat, setRuntimeContext
@VisibleForTesting protected transient SplitReader reader
public HiveTableInputFormat(org.apache.hadoop.mapred.JobConf jobConf, CatalogTable catalogTable, List<HiveTablePartition> partitions, int[] projectedFields, long limit, String hiveVersion, boolean useMapRedReader)
public org.apache.hadoop.mapred.JobConf getJobConf()
public void configure(Configuration parameters)
InputFormat
This method is always called first on a newly instantiated input format.
configure
in interface InputFormat<RowData,HiveTableInputSplit>
parameters
- The configuration with all parameters (note: not the Flink config but the
TaskConfig).public void open(HiveTableInputSplit split) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
open
in interface InputFormat<RowData,HiveTableInputSplit>
split
- The split to be opened.IOException
- Thrown, if the spit could not be opened due to an I/O problem.public void reopen(HiveTableInputSplit split, Long state) throws IOException
CheckpointableInputFormat
InputFormat
. This is
necessary when recovering from a task failure. When this method is called, the input format
it guaranteed to be configured.
NOTE: The caller has to make sure that the provided split is the one to whom the state belongs.
reopen
in interface CheckpointableInputFormat<HiveTableInputSplit,Long>
split
- The split to be opened.state
- The state from which to start from. This can contain the offset, but also other
data, depending on the input format.IOException
public Long getCurrentState()
CheckpointableInputFormat
getCurrentState
in interface CheckpointableInputFormat<HiveTableInputSplit,Long>
public boolean reachedEnd() throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
reachedEnd
in interface InputFormat<RowData,HiveTableInputSplit>
IOException
- Thrown, if an I/O error occurred.public RowData nextRecord(RowData reuse) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
nextRecord
in interface InputFormat<RowData,HiveTableInputSplit>
reuse
- Object that may be reused.IOException
- Thrown, if an I/O error occurred.public void close() throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
close
in interface InputFormat<RowData,HiveTableInputSplit>
IOException
- Thrown, if the input could not be closed properly.public HiveTableInputSplit[] createInputSplits(int minNumSplits) throws IOException
InputSplitSource
createInputSplits
in interface InputFormat<RowData,HiveTableInputSplit>
createInputSplits
in interface InputSplitSource<HiveTableInputSplit>
minNumSplits
- Number of minimal input splits, as a hint.IOException
public static HiveTableInputSplit[] createInputSplits(int minNumSplits, List<HiveTablePartition> partitions, org.apache.hadoop.mapred.JobConf jobConf) throws IOException
IOException
public BaseStatistics getStatistics(BaseStatistics cachedStats)
InputFormat
When this method is called, the input format is guaranteed to be configured.
getStatistics
in interface InputFormat<RowData,HiveTableInputSplit>
cachedStats
- The statistics that were cached. May be null.public InputSplitAssigner getInputSplitAssigner(HiveTableInputSplit[] inputSplits)
InputSplitSource
getInputSplitAssigner
in interface InputFormat<RowData,HiveTableInputSplit>
getInputSplitAssigner
in interface InputSplitSource<HiveTableInputSplit>
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.