Package org.apache.flink.api.common.io
Class FileOutputFormat<IT>
- java.lang.Object
-
- org.apache.flink.api.common.io.RichOutputFormat<IT>
-
- org.apache.flink.api.common.io.FileOutputFormat<IT>
-
- All Implemented Interfaces:
Serializable
,CleanupWhenUnsuccessful
,InitializeOnMaster
,OutputFormat<IT>
- Direct Known Subclasses:
AvroOutputFormat
,BinaryOutputFormat
,TextOutputFormat
@Public public abstract class FileOutputFormat<IT> extends RichOutputFormat<IT> implements InitializeOnMaster, CleanupWhenUnsuccessful
The abstract base class for all Rich output formats that are file based. Contains the logic to open/close the target file streams.- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
FileOutputFormat.OutputDirectoryMode
Behavior for creating output directories.-
Nested classes/interfaces inherited from interface org.apache.flink.api.common.io.OutputFormat
OutputFormat.InitializationContext
-
-
Field Summary
Fields Modifier and Type Field Description protected Path
outputFilePath
The path of the file to be written.protected FSDataOutputStream
stream
The stream to which the data is written;
-
Constructor Summary
Constructors Constructor Description FileOutputFormat()
FileOutputFormat(Path outputPath)
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description void
close()
Method that marks the end of the life-cycle of parallel output instance.void
configure(Configuration parameters)
Configures this output format.protected String
getDirectoryFileName(int taskNumber)
FileOutputFormat.OutputDirectoryMode
getOutputDirectoryMode()
Path
getOutputFilePath()
FileSystem.WriteMode
getWriteMode()
static void
initDefaultsFromConfiguration(Configuration configuration)
Initialize defaults for output format.void
initializeGlobal(int parallelism)
Initialization of the distributed file system if it is used.void
open(OutputFormat.InitializationContext context)
Opens a parallel instance of the output format to store the result of its parallel instance.void
setOutputDirectoryMode(FileOutputFormat.OutputDirectoryMode mode)
void
setOutputFilePath(Path path)
void
setWriteMode(FileSystem.WriteMode mode)
void
tryCleanupOnError()
Hook that is called upon an unsuccessful execution.-
Methods inherited from class org.apache.flink.api.common.io.RichOutputFormat
getRuntimeContext, setRuntimeContext
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.io.OutputFormat
writeRecord
-
-
-
-
Field Detail
-
outputFilePath
protected Path outputFilePath
The path of the file to be written.
-
stream
protected transient FSDataOutputStream stream
The stream to which the data is written;
-
-
Constructor Detail
-
FileOutputFormat
public FileOutputFormat()
-
FileOutputFormat
public FileOutputFormat(Path outputPath)
-
-
Method Detail
-
initDefaultsFromConfiguration
public static void initDefaultsFromConfiguration(Configuration configuration)
Initialize defaults for output format. Needs to be a static method because it is configured for local cluster execution.- Parameters:
configuration
- The configuration to load defaults from
-
setOutputFilePath
public void setOutputFilePath(Path path)
-
getOutputFilePath
public Path getOutputFilePath()
-
setWriteMode
public void setWriteMode(FileSystem.WriteMode mode)
-
getWriteMode
public FileSystem.WriteMode getWriteMode()
-
setOutputDirectoryMode
public void setOutputDirectoryMode(FileOutputFormat.OutputDirectoryMode mode)
-
getOutputDirectoryMode
public FileOutputFormat.OutputDirectoryMode getOutputDirectoryMode()
-
configure
public void configure(Configuration parameters)
Description copied from interface:OutputFormat
Configures this output format. Since output formats are instantiated generically and hence parameterless, this method is the place where the output formats set their basic fields based on configuration values.This method is always called first on a newly instantiated output format.
- Specified by:
configure
in interfaceOutputFormat<IT>
- Parameters:
parameters
- The configuration with all parameters.
-
open
public void open(OutputFormat.InitializationContext context) throws IOException
Description copied from interface:OutputFormat
Opens a parallel instance of the output format to store the result of its parallel instance.When this method is called, the output format it guaranteed to be configured.
- Specified by:
open
in interfaceOutputFormat<IT>
- Parameters:
context
- The context to get task parallel infos.- Throws:
IOException
- Thrown, if the output could not be opened due to an I/O problem.
-
getDirectoryFileName
protected String getDirectoryFileName(int taskNumber)
-
close
public void close() throws IOException
Description copied from interface:OutputFormat
Method that marks the end of the life-cycle of parallel output instance. Should be used to close channels and streams and release resources. After this method returns without an error, the output is assumed to be correct.When this method is called, the output format it guaranteed to be opened.
- Specified by:
close
in interfaceOutputFormat<IT>
- Throws:
IOException
- Thrown, if the input could not be closed properly.
-
initializeGlobal
public void initializeGlobal(int parallelism) throws IOException
Initialization of the distributed file system if it is used.- Specified by:
initializeGlobal
in interfaceInitializeOnMaster
- Parameters:
parallelism
- The task parallelism.- Throws:
IOException
- The initialization may throw exceptions, which may cause the job to abort.
-
tryCleanupOnError
public void tryCleanupOnError()
Description copied from interface:CleanupWhenUnsuccessful
Hook that is called upon an unsuccessful execution.- Specified by:
tryCleanupOnError
in interfaceCleanupWhenUnsuccessful
-
-