@Internal public abstract class HadoopOutputFormatBase<K,V,T> extends HadoopOutputFormatCommonBase<T> implements FinalizeOnMaster
FinalizeOnMaster.FinalizationContext
OutputFormat.InitializationContext
Modifier and Type | Field and Description |
---|---|
protected static Object | CLOSE_MUTEX |
protected org.apache.hadoop.conf.Configuration | configuration |
protected static Object | CONFIGURE_MUTEX |
protected org.apache.hadoop.mapreduce.TaskAttemptContext | context |
protected org.apache.hadoop.mapreduce.OutputFormat<K,V> | mapreduceOutputFormat |
protected static Object | OPEN_MUTEX |
protected org.apache.hadoop.mapreduce.OutputCommitter | outputCommitter |
protected org.apache.hadoop.mapreduce.RecordWriter<K,V> | recordWriter |
protected int | taskNumber |
credentials
Constructor and Description |
---|
HadoopOutputFormatBase(org.apache.hadoop.mapreduce.OutputFormat<K,V> mapreduceOutputFormat, org.apache.hadoop.mapreduce.Job job) |
Modifier and Type | Method and Description |
---|---|
void | close(): Commits the task by moving the output file out of the temporary directory. |
void | configure(Configuration parameters): Configures this output format. |
void | finalizeGlobal(int parallelism): Invoked on the master (JobManager) after all (parallel) instances of an OutputFormat have finished. |
org.apache.hadoop.conf.Configuration | getConfiguration() |
void | open(int taskNumber, int numTasks): Creates the temporary output file for the Hadoop RecordWriter. |
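Methods inherited from class HadoopOutputFormatCommonBase: read, write
Methods inherited from class RichOutputFormat: getRuntimeContext, setRuntimeContext
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface FinalizeOnMaster: finalizeGlobal
Methods inherited from interface OutputFormat: open, writeRecord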
protected static final Object OPEN_MUTEX
protected static final Object CONFIGURE_MUTEX
protected static final Object CLOSE_MUTEX
protected org.apache.hadoop.conf.Configuration configuration
protected transient org.apache.hadoop.mapreduce.OutputCommitter outputCommitter
protected transient org.apache.hadoop.mapreduce.TaskAttemptContext context
protected transient int taskNumber
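The field list declares three static mutex objects but this page does not say how they are used. One plausible reading, stated here as an assumption rather than taken from the source, is that they serialize calls into code that is not thread-safe when several parallel instances run in the same JVM. The sketch below illustrates only that locking pattern: `MutexGuardedFormat`, `openCount`, and the shared counter are hypothetical stand-ins, and only the name `OPEN_MUTEX` comes from the field list.

```java
// Hypothetical class for illustration; not part of Flink or Hadoop.
class MutexGuardedFormat {
    // Static, so all instances in the same JVM share one lock object.
    protected static final Object OPEN_MUTEX = new Object();

    // Shared state standing in for setup code that is not thread-safe (assumption).
    private static int openCount = 0;

    void open() {
        // Only one instance at a time may run the guarded section.
        synchronized (OPEN_MUTEX) {
            openCount++;
        }
    }

    static int openCount() {
        synchronized (OPEN_MUTEX) {
            return openCount;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            // Each thread gets its own instance, as parallel tasks would.
            MutexGuardedFormat format = new MutexGuardedFormat();
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    format.open();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(openCount());
    }
}
```

Because the lock is static rather than per instance, the increments from separate instances cannot interleave, so no update is lost even though each thread works on its own object.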
public org.apache.hadoop.conf.Configuration getConfiguration()
public void configure(Configuration parameters)
OutputFormat
This method is always called first on a newly instantiated output format.
configure
in interface OutputFormat<T>
parameters - The configuration with all parameters.
public void open(int taskNumber, int numTasks) throws IOException
open
in interface OutputFormat<T>
taskNumber - The number of the parallel instance.
numTasks - The number of parallel tasks.
IOException
public void close() throws IOException
close
in interface OutputFormat<T>
IOException
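The summaries for open and close describe a write-then-commit pattern: output goes to a temporary location first and is moved into place when the task closes. The following is a minimal sketch of that pattern on the local file system using only java.nio.file. It does not use the Hadoop OutputCommitter or RecordWriter; the class `TempCommitWriter`, the `_temporary` directory name, and the `part-%05d` file naming are assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for one parallel writer instance; not a Flink or Hadoop class.
class TempCommitWriter {
    private final Path outputDir;
    private Path tempFile;
    private Path finalFile;
    private BufferedWriter writer;

    TempCommitWriter(Path outputDir) {
        this.outputDir = outputDir;
    }

    // Mirrors open(taskNumber, numTasks): create a per-task file in a temporary directory.
    // numTasks is accepted to match the documented signature but is unused in this sketch.
    void open(int taskNumber, int numTasks) throws IOException {
        Path tempDir = outputDir.resolve("_temporary"); // assumed directory name
        Files.createDirectories(tempDir);
        String name = String.format("part-%05d", taskNumber); // assumed naming scheme
        tempFile = tempDir.resolve(name);
        finalFile = outputDir.resolve(name);
        writer = Files.newBufferedWriter(tempFile);
    }

    // Writes one tab-separated key/value record to the temporary file.
    void writeRecord(String key, String value) throws IOException {
        writer.write(key + "\t" + value);
        writer.newLine();
    }

    // Mirrors close(): finish writing, then move the file out of the temporary directory.
    void close() throws IOException {
        writer.close();
        Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A caller would invoke `open`, then `writeRecord` for each record, then `close`. Until `close` runs, the output directory itself contains no partial file for that task, which is the property the temporary directory is meant to provide.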
public void finalizeGlobal(int parallelism) throws IOException
FinalizeOnMaster
finalizeGlobal
in interface FinalizeOnMaster
parallelism - The parallelism with which the format or functions were run.
IOException - The finalization may throw exceptions, which may cause the job to abort.
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.