@Internal public abstract class HadoopInputFormatBase<K,V,T> extends HadoopInputFormatCommonBase<T,HadoopInputSplit>
Modifier and Type | Field and Description |
---|---|
protected boolean |
fetched |
protected boolean |
hasNext |
protected Class<K> |
keyClass |
protected org.apache.hadoop.mapreduce.RecordReader<K,V> |
recordReader |
protected Class<V> |
valueClass |
credentials
Constructor and Description |
---|
HadoopInputFormatBase(org.apache.hadoop.mapreduce.InputFormat<K,V> mapreduceInputFormat,
Class<K> key,
Class<V> value,
org.apache.hadoop.mapreduce.Job job) |
Modifier and Type | Method and Description |
---|---|
void |
close()
Method that marks the end of the life-cycle of an input split.
|
void |
configure(Configuration parameters)
Configures this input format.
|
HadoopInputSplit[] |
createInputSplits(int minNumSplits)
Creates the different splits of the input that can be processed in parallel.
|
protected void |
fetchNext() |
org.apache.hadoop.conf.Configuration |
getConfiguration() |
InputSplitAssigner |
getInputSplitAssigner(HadoopInputSplit[] inputSplits)
Gets the type of the input splits that are processed by this input format.
|
BaseStatistics |
getStatistics(BaseStatistics cachedStats)
Gets the basic statistics from the input described by this format.
|
void |
open(HadoopInputSplit split)
Opens a parallel instance of the input format to work on a split.
|
boolean |
reachedEnd()
Method used to check if the end of the input is reached.
|
getCredentialsFromUGI, read, write
closeInputFormat, getRuntimeContext, openInputFormat, setRuntimeContext
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
nextRecord
public org.apache.hadoop.conf.Configuration getConfiguration()
public void configure(Configuration parameters)
InputFormat
This method is always called first on a newly instantiated input format.
parameters
- The configuration with all parameters.public BaseStatistics getStatistics(BaseStatistics cachedStats) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
cachedStats
- The statistics that were cached. May be null.IOException
public HadoopInputSplit[] createInputSplits(int minNumSplits) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
minNumSplits
- The minimum desired number of splits. If fewer are created, some parallel
instances may remain idle.IOException
- Thrown, when the creation of the splits was erroneous.public InputSplitAssigner getInputSplitAssigner(HadoopInputSplit[] inputSplits)
InputFormat
public void open(HadoopInputSplit split) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
split
- The split to be opened.IOException
- Thrown, if the spit could not be opened due to an I/O problem.public boolean reachedEnd() throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
IOException
- Thrown, if an I/O error occurred.protected void fetchNext() throws IOException
IOException
public void close() throws IOException
InputFormat
When this method is called, the input format it guaranteed to be opened.
IOException
- Thrown, if the input could not be closed properly.Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.