@PublicEvolving public class S3RecoverableWriter extends Object implements RecoverableWriter
RecoverableWriter
against S3.
This implementation makes heavy use of MultiPart Uploads in S3 to persist intermediate data as soon as possible.
This class partially reuses utility classes and implementations from the Hadoop project, specifically around configuring S3 requests and handling retries.
RecoverableWriter.CommitRecoverable, RecoverableWriter.ResumeRecoverable
Modifier and Type | Method and Description |
---|---|
boolean |
cleanupRecoverableState(RecoverableWriter.ResumeRecoverable resumable)
Frees up any resources that were previously occupied in order to be able to
recover from a (potential) failure.
|
SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> |
getCommitRecoverableSerializer()
The serializer for the CommitRecoverable types created in this writer.
|
SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> |
getResumeRecoverableSerializer()
The serializer for the ResumeRecoverable types created in this writer.
|
RecoverableFsDataOutputStream |
open(Path path)
Opens a new recoverable stream to write to the given path.
|
S3RecoverableFsDataOutputStream |
recover(RecoverableWriter.ResumeRecoverable recoverable)
Resumes a recoverable stream consistently at the point indicated by the given ResumeRecoverable.
|
RecoverableFsDataOutputStream.Committer |
recoverForCommit(RecoverableWriter.CommitRecoverable recoverable)
Recovers a recoverable stream consistently at the point indicated by the given CommitRecoverable
for finalizing and committing.
|
boolean |
requiresCleanupOfRecoverableState()
Marks if the writer requires to do any additional cleanup/freeing of resources occupied
as part of a
RecoverableWriter.ResumeRecoverable , e.g. |
boolean |
supportsResume()
Checks whether the writer and its streams support resuming (appending to) files after
recovery (via the
RecoverableWriter.recover(ResumeRecoverable) method). |
static S3RecoverableWriter |
writer(org.apache.hadoop.fs.FileSystem fs,
FunctionWithException<File,RefCountedFile,IOException> tempFileCreator,
S3AccessHelper s3AccessHelper,
Executor uploadThreadPool,
long userDefinedMinPartSize,
int maxConcurrentUploadsPerStream) |
public RecoverableFsDataOutputStream open(Path path) throws IOException
RecoverableWriter
open
in interface RecoverableWriter
path
- The path of the file/object to write to.IOException
- Thrown if the stream could not be opened/initialized.public RecoverableFsDataOutputStream.Committer recoverForCommit(RecoverableWriter.CommitRecoverable recoverable) throws IOException
RecoverableWriter
recoverForCommit
in interface RecoverableWriter
recoverable
- The opaque handle with the recovery information.IOException
- Thrown, if recovery fails.public S3RecoverableFsDataOutputStream recover(RecoverableWriter.ResumeRecoverable recoverable) throws IOException
RecoverableWriter
This method is optional and whether it is supported is indicated through the
RecoverableWriter.supportsResume()
method.
recover
in interface RecoverableWriter
recoverable
- The opaque handle with the recovery information.IOException
- Thrown, if resuming fails.public boolean requiresCleanupOfRecoverableState()
RecoverableWriter
RecoverableWriter.ResumeRecoverable
, e.g. temporarily files created or objects uploaded
to external systems.
In case cleanup is required, then RecoverableWriter.cleanupRecoverableState(ResumeRecoverable)
should
be called.
requiresCleanupOfRecoverableState
in interface RecoverableWriter
true
if cleanup is required, false
otherwise.public boolean cleanupRecoverableState(RecoverableWriter.ResumeRecoverable resumable) throws IOException
RecoverableWriter
NOTE: This operation should not throw an exception if the resumable has already
been cleaned up and the resources have been freed. But the contract is that it will throw
an UnsupportedOperationException
if it is called for a RecoverableWriter
whose RecoverableWriter.requiresCleanupOfRecoverableState()
returns false
.
cleanupRecoverableState
in interface RecoverableWriter
resumable
- The RecoverableWriter.ResumeRecoverable
whose state we want to clean-up.true
if the resources were successfully freed, false
otherwise
(e.g. the file to be deleted was not there for any reason - already deleted or never created).IOException
public SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> getCommitRecoverableSerializer()
RecoverableWriter
getCommitRecoverableSerializer
in interface RecoverableWriter
public SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> getResumeRecoverableSerializer()
RecoverableWriter
getResumeRecoverableSerializer
in interface RecoverableWriter
public boolean supportsResume()
RecoverableWriter
RecoverableWriter.recover(ResumeRecoverable)
method).
If true, then this writer supports the RecoverableWriter.recover(ResumeRecoverable)
method.
If false, then that method may not be supported and streams can only be recovered via
RecoverableWriter.recoverForCommit(CommitRecoverable)
.
supportsResume
in interface RecoverableWriter
public static S3RecoverableWriter writer(org.apache.hadoop.fs.FileSystem fs, FunctionWithException<File,RefCountedFile,IOException> tempFileCreator, S3AccessHelper s3AccessHelper, Executor uploadThreadPool, long userDefinedMinPartSize, int maxConcurrentUploadsPerStream)
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.