Interface BucketAssigner<IN,BucketID>
-
- Type Parameters:
IN
- The type of input elements.BucketID
- The type of the object returned by thegetBucketId(Object, BucketAssigner.Context)
. This has to have a correct#hashCode()
and#equals(Object)
method. In addition, thePath
to the created bucket will be the result of the#toString()
of this method, appended to thebasePath
specified in the file sink.
- All Superinterfaces:
Serializable
- All Known Implementing Classes:
BasePathBucketAssigner
,DateTimeBucketAssigner
,FileSinkProgram.KeyBucketAssigner
,FileSystemTableSink.TableBucketAssigner
,StreamSQLTestProgram.KeyBucketAssigner
@PublicEvolving public interface BucketAssigner<IN,BucketID> extends Serializable
A BucketAssigner is used with a file sink to determine the bucket each incoming element should be put into.The
StreamingFileSink
can be writing to many buckets at a time, and it is responsible for managing a set of active buckets. Whenever a new element arrives it will ask theBucketAssigner
for the bucket the element should fall in. TheBucketAssigner
can, for example, determine buckets based on system time.
-
-
Nested Class Summary
Nested Classes Modifier and Type Interface Description static interface
BucketAssigner.Context
Context that theBucketAssigner
can use for getting additional data about an input record.
-
Method Summary
All Methods Instance Methods Abstract Methods Modifier and Type Method Description BucketID
getBucketId(IN element, BucketAssigner.Context context)
Returns the identifier of the bucket the provided element should be put into.SimpleVersionedSerializer<BucketID>
getSerializer()
-
-
-
Method Detail
-
getBucketId
BucketID getBucketId(IN element, BucketAssigner.Context context)
Returns the identifier of the bucket the provided element should be put into.- Parameters:
element
- The current element being processed.context
- The context used by the current bucket assigner.- Returns:
- A string representing the identifier of the bucket the element should be put into.
The actual path to the bucket will result from the concatenation of the returned string
and the
base path
provided during the initialization of the file sink.
-
getSerializer
SimpleVersionedSerializer<BucketID> getSerializer()
- Returns:
- A
SimpleVersionedSerializer
capable of serializing/deserializing the elements of typeBucketID
. That is the type of the objects returned by thegetBucketId(Object, BucketAssigner.Context)
.
-
-