public abstract class TableInputFormat<T extends Tuple> extends RichInputFormat<T,TableInputSplit>
InputFormat subclass that wraps the access for HTables.
Modifier and Type | Field and Description |
---|---|
protected org.apache.hadoop.hbase.client.Scan | scan |
protected org.apache.hadoop.hbase.client.HTable | table |
Constructor and Description |
---|
TableInputFormat() |
Modifier and Type | Method and Description |
---|---|
void | close(): Method that marks the end of the life-cycle of an input split. |
void | configure(Configuration parameters): Creates a Scan object and an HTable connection. |
TableInputSplit[] | createInputSplits(int minNumSplits): Creates the different splits of the input that can be processed in parallel. |
InputSplitAssigner | getInputSplitAssigner(TableInputSplit[] inputSplits): Gets the assigner that hands the given input splits to the parallel instances of this input format. |
protected abstract org.apache.hadoop.hbase.client.Scan | getScanner() |
BaseStatistics | getStatistics(BaseStatistics cachedStatistics): Gets the basic statistics from the input described by this format. |
protected abstract String | getTableName() |
protected abstract T | mapResultToTuple(org.apache.hadoop.hbase.client.Result r) |
T | nextRecord(T reuse): Reads the next record from the input. |
void | open(TableInputSplit split): Opens a parallel instance of the input format to work on a split. |
boolean | reachedEnd(): Method used to check if the end of the input is reached. |
Methods inherited from class RichInputFormat: getRuntimeContext, setRuntimeContext
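The method descriptions state when each call is legal: the format is configured before splits are created or opened, and it is opened before reachedEnd, nextRecord, or close are called. The sketch below shows one call order that respects those guarantees. It does not use Flink or HBase: RowRange, InMemoryFormat, and LifecycleDemo are hypothetical stand-ins that only borrow the method names, and the loop reads all splits sequentially, whereas a real job would spread them over parallel instances.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a split: a half-open range of row indexes.
class RowRange {
    final int start;
    final int end;

    RowRange(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

// Hypothetical in-memory format that mirrors the method names in the summary.
class InMemoryFormat {
    private String[] rows; // set by configure()
    private int position;  // set by open()
    private int limit;     // set by open()

    void configure() {
        rows = new String[] {"a", "b", "c", "d", "e"};
    }

    RowRange[] createInputSplits(int minNumSplits) {
        int wanted = Math.max(1, minNumSplits);
        int size = (rows.length + wanted - 1) / wanted; // rows per split, rounded up
        List<RowRange> splits = new ArrayList<>();
        for (int start = 0; start < rows.length; start += size) {
            splits.add(new RowRange(start, Math.min(start + size, rows.length)));
        }
        return splits.toArray(new RowRange[0]);
    }

    void open(RowRange split) {
        position = split.start;
        limit = split.end;
    }

    boolean reachedEnd() {
        return position >= limit;
    }

    String nextRecord(String reuse) {
        return rows[position++]; // the reuse object is ignored in this sketch
    }

    void close() {
        position = limit = 0;
    }
}

class LifecycleDemo {
    // Drives the format in an order that satisfies the documented guarantees.
    static List<String> readAll(InMemoryFormat format, int minNumSplits) {
        List<String> out = new ArrayList<>();
        format.configure(); // first: everything else assumes a configured format
        for (RowRange split : format.createInputSplits(minNumSplits)) {
            format.open(split); // once per split
            while (!format.reachedEnd()) { // only legal while the split is open
                out.add(format.nextRecord(null));
            }
            format.close(); // end of this split's life cycle
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readAll(new InMemoryFormat(), 2));
    }
}
```

With five rows and a requested minimum of two splits, the stand-in produces the ranges [0, 3) and [3, 5), and the driver returns the rows in their original order.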
protected transient org.apache.hadoop.hbase.client.HTable table
protected transient org.apache.hadoop.hbase.client.Scan scan
protected abstract org.apache.hadoop.hbase.client.Scan getScanner()
protected abstract String getTableName()
protected abstract T mapResultToTuple(org.apache.hadoop.hbase.client.Result r)
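The three abstract methods above are the hooks a subclass fills in: which table to read, what to scan, and how to turn one result row into a record. The sketch below illustrates that division of labour without Flink or HBase on the classpath. ScanSpec, SketchTableFormat, UserFormat, and TemplateDemo are hypothetical stand-ins, a Map.Entry replaces the HBase Result, and String[] replaces the Tuple type that the real class requires. How the real base class calls these hooks internally is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a scan: row keys in [startRow, stopRow).
class ScanSpec {
    final String startRow;
    final String stopRow;

    ScanSpec(String startRow, String stopRow) {
        this.startRow = startRow;
        this.stopRow = stopRow;
    }
}

// Hypothetical stand-in for the base class; only the three hook names mirror the documentation.
abstract class SketchTableFormat<T> {
    protected abstract ScanSpec getScanner();

    protected abstract String getTableName();

    protected abstract T mapResultToTuple(Map.Entry<String, String> row);

    // Reads the named table from an in-memory "cluster" and maps every scanned row.
    List<T> read(Map<String, TreeMap<String, String>> cluster) {
        TreeMap<String, String> table = cluster.get(getTableName());
        ScanSpec scan = getScanner();
        List<T> out = new ArrayList<>();
        for (Map.Entry<String, String> row : table.subMap(scan.startRow, scan.stopRow).entrySet()) {
            out.add(mapResultToTuple(row));
        }
        return out;
    }
}

// A concrete format: table "users", keys from "u1" up to but excluding "u3".
class UserFormat extends SketchTableFormat<String[]> {
    @Override
    protected ScanSpec getScanner() {
        return new ScanSpec("u1", "u3");
    }

    @Override
    protected String getTableName() {
        return "users";
    }

    @Override
    protected String[] mapResultToTuple(Map.Entry<String, String> row) {
        return new String[] {row.getKey(), row.getValue()};
    }
}

class TemplateDemo {
    // Builds a one-table in-memory "cluster" with three rows.
    static Map<String, TreeMap<String, String>> sampleCluster() {
        TreeMap<String, String> users = new TreeMap<>();
        users.put("u1", "ada");
        users.put("u2", "bob");
        users.put("u3", "cy");
        Map<String, TreeMap<String, String>> cluster = new TreeMap<>();
        cluster.put("users", users);
        return cluster;
    }

    public static void main(String[] args) {
        for (String[] tuple : new UserFormat().read(sampleCluster())) {
            System.out.println(tuple[0] + " -> " + tuple[1]);
        }
    }
}
```

The subclass contains no I/O or iteration logic of its own, which is the point of the design: the base class owns the connection and the scan loop, and the subclass only describes the data.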
public void configure(Configuration parameters)
Creates a Scan object and an HTable connection.
Parameters:
parameters - Configuration
public boolean reachedEnd() throws IOException
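Method used to check if the end of the input is reached.
When this method is called, the input format is guaranteed to be opened.
Specified by:
reachedEnd in interface InputFormat
Throws:
IOException - Thrown if an I/O error occurred.
public T nextRecord(T reuse) throws IOException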
Reads the next record from the input.
When this method is called, the input format is guaranteed to be opened.
Specified by:
nextRecord in interface InputFormat
Parameters:
reuse - Object that may be reused.
Throws:
IOException - Thrown if an I/O error occurred.
public void open(TableInputSplit split) throws IOException
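Opens a parallel instance of the input format to work on a split.
When this method is called, the input format is guaranteed to be configured.
Specified by:
open in interface InputFormat
Parameters:
split - The split to be opened.
Throws:
IOException - Thrown if the split could not be opened due to an I/O problem.
public void close() throws IOException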
Method that marks the end of the life-cycle of an input split.
When this method is called, the input format is guaranteed to be opened.
Specified by:
close in interface InputFormat
Throws:
IOException - Thrown if the input could not be closed properly.
public TableInputSplit[] createInputSplits(int minNumSplits) throws IOException
Creates the different splits of the input that can be processed in parallel.
When this method is called, the input format is guaranteed to be configured.
Specified by:
createInputSplits in interface InputFormat
Parameters:
minNumSplits - The minimum desired number of splits. If fewer are created, some parallel instances may remain idle.
Throws:
IOException - Thrown if the splits could not be created.
public InputSplitAssigner getInputSplitAssigner(TableInputSplit[] inputSplits)
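Gets the assigner that hands the given input splits to the parallel instances of this input format.
Specified by:
getInputSplitAssigner in interface InputFormat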
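The note on minNumSplits says that parallel instances may remain idle when fewer splits are created than requested. The sketch below makes that concrete with a deliberately simple assigner. SimpleSplitAssigner and AssignDemo are hypothetical names, splits are reduced to plain numbers, and the first-come, first-served policy is an assumption chosen for brevity; an actual assigner may also take data locality into account.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical assigner: hands out split numbers first come, first served.
class SimpleSplitAssigner {
    private final Deque<Integer> pending = new ArrayDeque<>();

    SimpleSplitAssigner(int numSplits) {
        for (int i = 0; i < numSplits; i++) {
            pending.add(i);
        }
    }

    // Returns the next split number, or -1 when no splits are left.
    synchronized int getNextSplit() {
        Integer next = pending.poll();
        return next == null ? -1 : next;
    }
}

class AssignDemo {
    // Every parallel instance asks for one split; counts the instances that get none.
    static int idleInstances(int numSplits, int parallelism) {
        SimpleSplitAssigner assigner = new SimpleSplitAssigner(numSplits);
        int idle = 0;
        for (int instance = 0; instance < parallelism; instance++) {
            if (assigner.getNextSplit() < 0) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        System.out.println(idleInstances(3, 4)); // prints 1
    }
}
```

With three splits and four parallel instances, one instance receives nothing to read, which is the situation the parameter description warns about.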
public BaseStatistics getStatistics(BaseStatistics cachedStatistics)
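Gets the basic statistics from the input described by this format.
When this method is called, the input format is guaranteed to be configured.
Specified by:
getStatistics in interface InputFormat
Parameters:
cachedStatistics - The statistics that were cached. May be null.
Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.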