@PublicEvolving public class S3RecoverableWriter extends Object implements RecoverableWriter
An implementation of the RecoverableWriter against S3.
This implementation makes heavy use of MultiPart Uploads in S3 to persist intermediate data as soon as possible.
This class partially reuses utility classes and implementations from the Hadoop project, specifically around configuring S3 requests and handling retries.
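To illustrate what persisting intermediate data early through multipart uploads means, here is a minimal in-memory sketch. It is not the Flink implementation: ToyMultipartUpload, ToyRecoverableStream, and the part size of 4 bytes are hypothetical stand-ins, and no S3 or Hadoop API is called.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a multipart upload; not a real S3 client.
class ToyMultipartUpload {
    final List<byte[]> uploadedParts = new ArrayList<>();

    void uploadPart(byte[] part) {
        uploadedParts.add(part);
    }

    // Completing the upload concatenates all parts into the final object.
    byte[] complete() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : uploadedParts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }
}

// Hypothetical stream that ships a part as soon as the buffer reaches minPartSize.
class ToyRecoverableStream {
    private final ToyMultipartUpload upload;
    private final int minPartSize;
    private ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    ToyRecoverableStream(ToyMultipartUpload upload, int minPartSize) {
        this.upload = upload;
        this.minPartSize = minPartSize;
    }

    void write(byte[] data) {
        buffer.write(data, 0, data.length);
        if (buffer.size() >= minPartSize) {
            flushPart();
        }
    }

    private void flushPart() {
        upload.uploadPart(buffer.toByteArray());
        buffer = new ByteArrayOutputStream();
    }

    // Uploads any buffered tail, then completes the upload.
    byte[] closeAndCommit() {
        if (buffer.size() > 0) {
            flushPart();
        }
        return upload.complete();
    }
}

class MultipartSketch {
    public static void main(String[] args) {
        ToyMultipartUpload upload = new ToyMultipartUpload();
        ToyRecoverableStream stream = new ToyRecoverableStream(upload, 4);
        stream.write("abcd".getBytes(StandardCharsets.UTF_8)); // reaches part size, uploaded at once
        stream.write("ef".getBytes(StandardCharsets.UTF_8));   // stays buffered
        System.out.println(upload.uploadedParts.size());       // prints 1
        byte[] result = stream.closeAndCommit();
        System.out.println(new String(result, StandardCharsets.UTF_8)); // prints abcdef
    }
}
```

In this toy model, data already shipped as a part survives a failure of the writing process, which is the motivation for uploading parts eagerly rather than only at commit time.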
Nested classes/interfaces inherited from interface RecoverableWriter: RecoverableWriter.CommitRecoverable, RecoverableWriter.ResumeRecoverable
Modifier and Type | Method and Description |
---|---|
boolean | cleanupRecoverableState(RecoverableWriter.ResumeRecoverable resumable): Frees up any resources that were previously occupied in order to be able to recover from a (potential) failure. |
SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> | getCommitRecoverableSerializer(): The serializer for the CommitRecoverable types created in this writer. |
SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> | getResumeRecoverableSerializer(): The serializer for the ResumeRecoverable types created in this writer. |
RecoverableFsDataOutputStream | open(Path path): Opens a new recoverable stream to write to the given path. |
S3RecoverableFsDataOutputStream | recover(RecoverableWriter.ResumeRecoverable recoverable): Resumes a recoverable stream consistently at the point indicated by the given ResumeRecoverable. |
RecoverableFsDataOutputStream.Committer | recoverForCommit(RecoverableWriter.CommitRecoverable recoverable): Recovers a recoverable stream consistently at the point indicated by the given CommitRecoverable for finalizing and committing. |
boolean | requiresCleanupOfRecoverableState(): Marks whether the writer requires any additional cleanup/freeing of resources occupied as part of a RecoverableWriter.ResumeRecoverable, e.g. temporary files created or objects uploaded to external systems. |
boolean | supportsResume(): Checks whether the writer and its streams support resuming (appending to) files after recovery (via the RecoverableWriter.recover(ResumeRecoverable) method). |
static S3RecoverableWriter | writer(org.apache.hadoop.fs.FileSystem fs, FunctionWithException<File,RefCountedFileWithStream,IOException> tempFileCreator, S3AccessHelper s3AccessHelper, Executor uploadThreadPool, long userDefinedMinPartSize, int maxConcurrentUploadsPerStream) |
public RecoverableFsDataOutputStream open(Path path) throws IOException
Description copied from interface: RecoverableWriter
Opens a new recoverable stream to write to the given path.
Specified by: open in interface RecoverableWriter
Parameters: path - The path of the file/object to write to.
Throws: IOException - Thrown if the stream could not be opened/initialized.
public RecoverableFsDataOutputStream.Committer recoverForCommit(RecoverableWriter.CommitRecoverable recoverable) throws IOException
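Description copied from interface: RecoverableWriter
Recovers a recoverable stream consistently at the point indicated by the given CommitRecoverable for finalizing and committing.
Specified by: recoverForCommit in interface RecoverableWriter
Parameters: recoverable - The opaque handle with the recovery information.
Throws: IOException - Thrown if recovery fails.
public S3RecoverableFsDataOutputStream recover(RecoverableWriter.ResumeRecoverable recoverable) throws IOException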
Description copied from interface: RecoverableWriter
Resumes a recoverable stream consistently at the point indicated by the given ResumeRecoverable.
This method is optional; whether it is supported is indicated by the RecoverableWriter.supportsResume() method.
Specified by: recover in interface RecoverableWriter
Parameters: recoverable - The opaque handle with the recovery information.
Throws: IOException - Thrown if resuming fails.
public boolean requiresCleanupOfRecoverableState()
Description copied from interface: RecoverableWriter
Marks whether the writer requires any additional cleanup/freeing of resources occupied as part of a RecoverableWriter.ResumeRecoverable, e.g. temporary files created or objects uploaded to external systems.
If cleanup is required, RecoverableWriter.cleanupRecoverableState(ResumeRecoverable) should be called.
Specified by: requiresCleanupOfRecoverableState in interface RecoverableWriter
Returns: true if cleanup is required, false otherwise.
public boolean cleanupRecoverableState(RecoverableWriter.ResumeRecoverable resumable) throws IOException
Description copied from interface: RecoverableWriter
Frees up any resources that were previously occupied in order to be able to recover from a (potential) failure.
NOTE: This operation should not throw an exception, but return false if the cleanup did not happen for any reason.
Specified by: cleanupRecoverableState in interface RecoverableWriter
Parameters: resumable - The RecoverableWriter.ResumeRecoverable whose state we want to clean up.
Returns: true if the resources were successfully freed, false otherwise (e.g. the file to be deleted was not there for any reason: already deleted or never created).
Throws: IOException
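The contract between requiresCleanupOfRecoverableState() and cleanupRecoverableState(ResumeRecoverable) can be sketched as follows. This is a self-contained illustration only: ToyHandle, CleanupCapableWriter, and InMemoryWriter are hypothetical stand-ins that mirror the two method names from this page, not the Flink types.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical handle naming temporary state; stands in for a ResumeRecoverable.
final class ToyHandle {
    final String tempObjectName;

    ToyHandle(String tempObjectName) {
        this.tempObjectName = tempObjectName;
    }
}

// Hypothetical interface mirroring only the two cleanup-related method names.
interface CleanupCapableWriter {
    boolean requiresCleanupOfRecoverableState();

    boolean cleanupRecoverableState(ToyHandle resumable);
}

// Toy writer that tracks temporary objects in a set instead of an external system.
class InMemoryWriter implements CleanupCapableWriter {
    final Set<String> tempObjects = new HashSet<>();

    @Override
    public boolean requiresCleanupOfRecoverableState() {
        return true;
    }

    // Returns false rather than throwing when there is nothing to delete.
    @Override
    public boolean cleanupRecoverableState(ToyHandle resumable) {
        return tempObjects.remove(resumable.tempObjectName);
    }
}

class CleanupSketch {
    // Caller-side pattern: attempt cleanup only when the writer asks for it.
    static boolean cleanupIfRequired(CleanupCapableWriter writer, ToyHandle handle) {
        if (!writer.requiresCleanupOfRecoverableState()) {
            return false;
        }
        return writer.cleanupRecoverableState(handle);
    }

    public static void main(String[] args) {
        InMemoryWriter writer = new InMemoryWriter();
        writer.tempObjects.add("tmp-part-1");
        ToyHandle handle = new ToyHandle("tmp-part-1");
        System.out.println(cleanupIfRequired(writer, handle)); // prints true
        System.out.println(cleanupIfRequired(writer, handle)); // prints false (already deleted)
    }
}
```

The second call shows the "already deleted" case from the Returns description: the method reports false instead of raising an error.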
public SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> getCommitRecoverableSerializer()
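Description copied from interface: RecoverableWriter
The serializer for the CommitRecoverable types created in this writer.
Specified by: getCommitRecoverableSerializer in interface RecoverableWriter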
public SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> getResumeRecoverableSerializer()
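Description copied from interface: RecoverableWriter
The serializer for the ResumeRecoverable types created in this writer.
Specified by: getResumeRecoverableSerializer in interface RecoverableWriter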
public boolean supportsResume()
Description copied from interface: RecoverableWriter
Checks whether the writer and its streams support resuming (appending to) files after recovery (via the RecoverableWriter.recover(ResumeRecoverable) method).
If true, then this writer supports the RecoverableWriter.recover(ResumeRecoverable) method. If false, then that method may not be supported and streams can only be recovered via RecoverableWriter.recoverForCommit(CommitRecoverable).
Specified by: supportsResume in interface RecoverableWriter
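A caller restoring after a failure might branch on supportsResume() roughly as sketched below. ResumableWriterLike, the String handles, and the String return values are hypothetical simplifications used to keep the example self-contained; the real methods take ResumeRecoverable and CommitRecoverable handles and return stream and committer objects.

```java
// Hypothetical interface mirroring the three recovery-related method names on this page.
interface ResumableWriterLike {
    boolean supportsResume();

    String recover(String resumeHandle);          // continue appending after recovery

    String recoverForCommit(String commitHandle); // only finalize what was already written
}

class ResumeSketch {
    // Prefer resuming when supported; otherwise fall back to recovering for commit.
    static String restore(ResumableWriterLike writer, String resumeHandle, String commitHandle) {
        if (writer.supportsResume()) {
            return writer.recover(resumeHandle);
        }
        return writer.recoverForCommit(commitHandle);
    }

    // Builds a toy writer whose supportsResume() result is fixed by the argument.
    static ResumableWriterLike toyWriter(boolean resumable) {
        return new ResumableWriterLike() {
            @Override
            public boolean supportsResume() {
                return resumable;
            }

            @Override
            public String recover(String handle) {
                return "resumed:" + handle;
            }

            @Override
            public String recoverForCommit(String handle) {
                return "committing:" + handle;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(restore(toyWriter(true), "r1", "c1"));  // prints resumed:r1
        System.out.println(restore(toyWriter(false), "r1", "c1")); // prints committing:c1
    }
}
```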
public static S3RecoverableWriter writer(org.apache.hadoop.fs.FileSystem fs, FunctionWithException<File,RefCountedFileWithStream,IOException> tempFileCreator, S3AccessHelper s3AccessHelper, Executor uploadThreadPool, long userDefinedMinPartSize, int maxConcurrentUploadsPerStream)
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.