public class HBaseRowDataInputFormat extends RichInputFormat<T,org.apache.flink.connector.hbase.source.TableInputSplit>
InputFormat
subclass that wraps the access for HTables. Returns the result as RowData
Modifier and Type | Field and Description |
---|---|
protected byte[] |
currentRow |
protected boolean |
endReached |
protected org.apache.hadoop.hbase.client.ResultScanner |
resultScanner
HBase iterator wrapper.
|
protected org.apache.hadoop.hbase.client.Scan |
scan |
protected long |
scannedRows |
protected byte[] |
serializedConfig |
protected org.apache.hadoop.hbase.client.HTable |
table |
Constructor and Description |
---|
HBaseRowDataInputFormat(Configuration conf,
String tableName,
HBaseTableSchema schema,
String nullStringLiteral) |
Modifier and Type | Method and Description |
---|---|
void |
close()
Method that marks the end of the life-cycle of an input split.
|
void |
closeInputFormat()
Closes this InputFormat instance.
|
void |
configure(Configuration parameters)
Creates a
Scan object and opens the HTable connection. |
org.apache.flink.connector.hbase.source.TableInputSplit[] |
createInputSplits(int minNumSplits)
Computes the input splits.
|
protected Configuration |
getHadoopConfiguration() |
InputSplitAssigner |
getInputSplitAssigner(org.apache.flink.connector.hbase.source.TableInputSplit[] inputSplits)
Returns the assigner for the input splits.
|
protected org.apache.hadoop.hbase.client.Scan |
getScanner()
Returns an instance of Scan that retrieves the required subset of records from the HBase
table.
|
BaseStatistics |
getStatistics(BaseStatistics cachedStatistics)
Gets the basic statistics from the input described by this format.
|
String |
getTableName()
What table is to be read.
|
protected boolean |
includeRegionInScan(byte[] startKey,
byte[] endKey)
Test if the given region is to be included in the scan while splitting the regions of a
table.
|
protected RowData |
mapResultToOutType(org.apache.hadoop.hbase.client.Result res)
HBase returns an instance of
Result . |
T |
nextRecord(T reuse)
Reads the next record from the input.
|
void |
open(org.apache.flink.connector.hbase.source.TableInputSplit split)
Opens a parallel instance of the input format to work on a split.
|
boolean |
reachedEnd()
Method used to check if the end of the input is reached.
|
getRuntimeContext, openInputFormat, setRuntimeContext
protected boolean endReached
protected transient org.apache.hadoop.hbase.client.HTable table
protected transient org.apache.hadoop.hbase.client.Scan scan
protected org.apache.hadoop.hbase.client.ResultScanner resultScanner
protected byte[] currentRow
protected long scannedRows
protected byte[] serializedConfig
public HBaseRowDataInputFormat(Configuration conf, String tableName, HBaseTableSchema schema, String nullStringLiteral)
public void configure(Configuration parameters)
Scan
object and opens the HTable
connection.
These are opened here because they are needed in the createInputSplits which is called before the openInputFormat method.
The connection is opened in this method and closed in closeInputFormat()
.
configure
in interface InputFormat<RowData,org.apache.flink.connector.hbase.source.TableInputSplit>
parameters
- The configuration that is to be usedConfiguration
protected org.apache.hadoop.hbase.client.Scan getScanner()
public String getTableName()
Per instance of a TableInputFormat derivative only a single table name is possible.
protected RowData mapResultToOutType(org.apache.hadoop.hbase.client.Result res)
Result
.
This method maps the returned Result
instance into the output type T
.
res
- The Result instance from HBase that needs to be convertedT
that contains the data of Result.protected Configuration getHadoopConfiguration()
public void open(org.apache.flink.connector.hbase.source.TableInputSplit split) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
split
- The split to be opened.IOException
- Thrown, if the spit could not be opened due to an I/O problem.public T nextRecord(T reuse) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
reuse
- Object that may be reused.IOException
- Thrown, if an I/O error occurred.public boolean reachedEnd() throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
IOException
- Thrown, if an I/O error occurred.public void close() throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
IOException
- Thrown, if the input could not be closed properly.public void closeInputFormat() throws IOException
RichInputFormat
RichInputFormat.openInputFormat()
should be closed in this method.closeInputFormat
in class RichInputFormat<T,org.apache.flink.connector.hbase.source.TableInputSplit>
IOException
- in case closing the resources failedInputFormat
public org.apache.flink.connector.hbase.source.TableInputSplit[] createInputSplits(int minNumSplits) throws IOException
InputSplitSource
minNumSplits
- Number of minimal input splits, as a hint.IOException
protected boolean includeRegionInScan(byte[] startKey, byte[] endKey)
startKey
- Start key of the regionendKey
- End key of the regionpublic InputSplitAssigner getInputSplitAssigner(org.apache.flink.connector.hbase.source.TableInputSplit[] inputSplits)
InputSplitSource
public BaseStatistics getStatistics(BaseStatistics cachedStatistics)
InputFormat
When this method is called, the input format is guaranteed to be configured.
cachedStatistics
- The statistics that were cached. May be null.Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.