DataDistribution (Flink : 2.0-SNAPSHOT API)

All Superinterfaces: IOReadableWritable, Serializable

@PublicEvolving
public interface DataDistribution
extends IOReadableWritable, Serializable

Method Summary

Modifier and Type	Method and Description
`Object[]`	`getBucketBoundary(int bucketNum, int totalNumBuckets)` Returns the upper bound of bucket `bucketNum`, given that the distribution is to be split into `totalNumBuckets` buckets.
`TypeInformation[]`	`getKeyTypes()` Gets the types of the key fields by which the data set is partitioned.
`int`	`getNumberOfFields()` The number of fields in the (composite) key.

Methods inherited from interface org.apache.flink.core.io.IOReadableWritable
read, write

- Method Detail
  - getBucketBoundary
```
Object[] getBucketBoundary(int bucketNum,
                           int totalNumBuckets)
```
    Returns the upper bound of bucket bucketNum, given that the distribution is to be split into totalNumBuckets buckets.
    Assuming n buckets, let B_i be the result of calling getBucketBoundary(i, n). The distribution then partitions the data domain as follows:
```
 (-inf, B_1] (B_1, B_2] ... (B_{n-2}, B_{n-1}] (B_{n-1}, inf)
```
    Note: The last bucket's upper bound is actually discarded by many algorithms. The last bucket is assumed to hold all values v such that v > getBucketBoundary(n-1, n), where n is the number of buckets.
    Parameters:
    
    bucketNum - The number of the bucket for which to get the upper bound.
    
    totalNumBuckets - The number of buckets to split the data into.
    
    Returns:
    
    A record whose values act as bucket boundaries for the specified bucket.
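    The partitioning rule above can be illustrated with a minimal standalone sketch. It does not implement the Flink interface (a real implementation would also need read, write, and getKeyTypes). The class name UniformIntBoundaries, the uniform split rule, and the helper bucketFor are hypothetical, and the sketch assumes buckets are numbered 1..n as in the B_i notation above.

```java
// Standalone sketch: mirrors the shape of getBucketBoundary without depending on Flink.
// The class name, the uniform-split rule, and bucketFor are illustrative assumptions.
final class UniformIntBoundaries {
    private final int min; // lower end of the key domain
    private final int max; // upper end of the key domain

    UniformIntBoundaries(int min, int max) {
        if (max <= min) {
            throw new IllegalArgumentException("max must exceed min");
        }
        this.min = min;
        this.max = max;
    }

    // Upper bound B_i of bucket i, assuming buckets are numbered 1..n.
    Object[] getBucketBoundary(int bucketNum, int totalNumBuckets) {
        long width = (long) max - min;
        long bound = min + (width * bucketNum) / totalNumBuckets;
        return new Object[] {(int) bound};
    }

    // Single-field key, so every boundary array has length 1.
    int getNumberOfFields() {
        return 1;
    }

    // Finds the bucket (1..n) whose range (B_{i-1}, B_i] contains the value.
    int bucketFor(int value, int totalNumBuckets) {
        for (int i = 1; i < totalNumBuckets; i++) {
            int upper = (Integer) getBucketBoundary(i, totalNumBuckets)[0];
            if (value <= upper) {
                return i;
            }
        }
        // Last bucket: holds every value above B_{n-1}; its own upper bound is never consulted.
        return totalNumBuckets;
    }
}
```

    With a domain of 0 to 100 and four buckets, the boundaries are 25, 50, and 75. The value 25 falls into bucket 1 because upper bounds are inclusive, and any value above 75 falls into bucket 4.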
  - getNumberOfFields
```
int getNumberOfFields()
```
    The number of fields in the (composite) key. This determines how many fields of each record define its bucket. The value must equal the length of the array returned by getBucketBoundary(int, int).
    
    Returns:
    
    The number of fields in the (composite) key.
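    The length requirement stated above can be checked mechanically. The following sketch is one possible sanity check, not part of Flink: the class BoundaryContract and its method are hypothetical, the boundary function is passed in as a plain BiFunction so the example compiles without Flink on the classpath, and bucket numbers 1..n-1 are assumed as in the B_i notation.

```java
import java.util.function.BiFunction;

// Hypothetical helper: verifies that every boundary array has as many
// entries as the key has fields. Not part of the Flink API.
final class BoundaryContract {
    static boolean boundariesMatchFieldCount(
            BiFunction<Integer, Integer, Object[]> getBucketBoundary,
            int numberOfFields,
            int totalNumBuckets) {
        // Only B_1..B_{n-1} are checked; the last bucket has no meaningful upper bound.
        for (int i = 1; i < totalNumBuckets; i++) {
            Object[] boundary = getBucketBoundary.apply(i, totalNumBuckets);
            if (boundary == null || boundary.length != numberOfFields) {
                return false;
            }
        }
        return true;
    }
}
```

    For a distribution instance dist, the check could be invoked as BoundaryContract.boundariesMatchFieldCount(dist::getBucketBoundary, dist.getNumberOfFields(), n), for example in a unit test of a custom distribution.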
  - getKeyTypes
```
TypeInformation[] getKeyTypes()
```
    Gets the types of the key fields by which the data set is partitioned.
    
    Returns:
    
    The types of the key fields by which the data set is partitioned.
