IN - The type of input elements.
BucketID - The type of the object returned by getBucketId(Object, BucketAssigner.Context). This type must implement #hashCode() and #equals(Object) correctly. In addition, the Path to the created bucket will be the result of calling #toString() on the returned object, appended to the basePath specified in the file sink.

@PublicEvolving public interface BucketAssigner<IN,BucketID> extends Serializable
The StreamingFileSink can write to many buckets at a time, and it is responsible for managing a set of active buckets. Whenever a new element arrives, it asks the BucketAssigner for the bucket the element should fall in. The BucketAssigner can, for example, determine buckets based on system time.
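
As an illustration of time-based bucketing, here is a minimal sketch that derives a bucket ID from the processing time. To stay self-contained it does not import Flink: the nested `Context` and `Assigner` interfaces are simplified local stand-ins (assumptions) for BucketAssigner.Context and BucketAssigner, and `HourlyAssigner`, `bucketPath`, and the `/data/out` base path are hypothetical names chosen for the example.

```java
import java.io.Serializable;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class HourlyBucketSketch {

    /** Simplified stand-in for BucketAssigner.Context (assumption: only processing time is modelled). */
    interface Context {
        long currentProcessingTime();
    }

    /** Simplified stand-in for BucketAssigner; getSerializer() is left out to keep the sketch short. */
    interface Assigner<IN, BucketID> extends Serializable {
        BucketID getBucketId(IN element, Context context);
    }

    /** Hypothetical assigner that groups elements by the UTC hour in which they are processed. */
    static class HourlyAssigner<IN> implements Assigner<IN, String> {
        // DateTimeFormatter is not Serializable, so it is transient and created lazily.
        private transient DateTimeFormatter formatter;

        @Override
        public String getBucketId(IN element, Context context) {
            if (formatter == null) {
                formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);
            }
            return formatter.format(Instant.ofEpochMilli(context.currentProcessingTime()));
        }
    }

    /** Applies the rule from the type parameter description: toString() of the ID appended to the base path. */
    static String bucketPath(String basePath, Object bucketId) {
        return basePath + "/" + bucketId.toString();
    }

    public static void main(String[] args) {
        HourlyAssigner<String> assigner = new HourlyAssigner<>();
        // A fixed clock (epoch start) keeps the demonstration deterministic.
        String id = assigner.getBucketId("some-record", () -> 0L);
        System.out.println(bucketPath("/data/out", id));
    }
}
```

A String bucket ID is used here because String already provides the #hashCode(), #equals(Object), and #toString() behaviour that the BucketID type requires; a custom ID class would need to implement all three itself.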
Modifier and Type | Interface and Description |
---|---|
static interface | BucketAssigner.Context: Context that the BucketAssigner can use for getting additional data about an input record. |
Modifier and Type | Method and Description |
---|---|
BucketID | getBucketId(IN element, BucketAssigner.Context context): Returns the identifier of the bucket the provided element should be put into. |
SimpleVersionedSerializer<BucketID> | getSerializer() |
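BucketID getBucketId(IN element, BucketAssigner.Context context)
Returns the identifier of the bucket the provided element should be put into.
Parameters:
element - The current element being processed.
context - The context used by the current bucket assigner.
Returns:
The identifier of the bucket the element should be put into. The path to the bucket is formed by appending this identifier to the base path provided during the initialization of the file sink.

SimpleVersionedSerializer<BucketID> getSerializer()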
Returns:
A SimpleVersionedSerializer capable of serializing/deserializing the elements of type BucketID. That is the type of the objects returned by getBucketId(Object, BucketAssigner.Context).
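
To make the serializer contract more concrete, the following is a minimal sketch of a versioned serializer for String bucket IDs. It does not import Flink: `VersionedSerializer` is a simplified local stand-in that is assumed to mirror the shape of SimpleVersionedSerializer (a version number plus serialize and deserialize methods), and `StringBucketIdSerializer` and `roundTrip` are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BucketIdSerializerSketch {

    /** Simplified stand-in, assumed to mirror the shape of SimpleVersionedSerializer. */
    interface VersionedSerializer<E> {
        int getVersion();
        byte[] serialize(E obj) throws IOException;
        E deserialize(int version, byte[] serialized) throws IOException;
    }

    /** Hypothetical serializer that stores String bucket IDs as UTF-8 bytes. */
    static class StringBucketIdSerializer implements VersionedSerializer<String> {
        private static final int VERSION = 1;

        @Override
        public int getVersion() {
            return VERSION;
        }

        @Override
        public byte[] serialize(String bucketId) {
            return bucketId.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String deserialize(int version, byte[] serialized) throws IOException {
            // Reject formats this sketch does not know how to read.
            if (version != VERSION) {
                throw new IOException("Unsupported bucket ID version: " + version);
            }
            return new String(serialized, StandardCharsets.UTF_8);
        }
    }

    /** Serializes and then deserializes a bucket ID with the current version. */
    static String roundTrip(String bucketId) {
        StringBucketIdSerializer serializer = new StringBucketIdSerializer();
        try {
            return serializer.deserialize(serializer.getVersion(), serializer.serialize(bucketId));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "2024-01-01--12";
        // The restored ID must be equal to the original one.
        System.out.println(original.equals(roundTrip(original)));
    }
}
```

Passing the version to deserialize is what would let a later format change coexist with previously written data; this sketch only recognises a single version and fails loudly on anything else.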

Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.