TypeComparator (flink 1.8-SNAPSHOT API)

java.lang.Object
- org.apache.flink.api.common.typeutils.TypeComparator<T>

Type Parameters:

T - The data type that the comparator works on.

All Implemented Interfaces:

Serializable

Direct Known Subclasses:

BasicTypeComparator, BooleanValueComparator, ByteValueArrayComparator, ByteValueComparator, CharValueArrayComparator, CharValueComparator, CompositeTypeComparator, CopyableValueComparator, DoubleValueArrayComparator, DoubleValueComparator, FloatValueArrayComparator, FloatValueComparator, GenericTypeComparator, IntValueArrayComparator, IntValueComparator, LongValueArrayComparator, LongValueComparator, NullAwareComparator, NullValueArrayComparator, NullValueComparator, PrimitiveArrayComparator, ShortValueArrayComparator, ShortValueComparator, StringValueArrayComparator, StringValueComparator, ValueComparator, WritableComparator
```
@PublicEvolving
public abstract class TypeComparator<T>
extends Object
implements Serializable
```
This class describes the methods that are required for a data type to be handled by the pact runtime. Specifically, it contains the methods used for hashing, comparing, and creating auxiliary structures.
The methods in this class depend not only on the record, but also on which fields of a record are used for the comparison or hashing. That set of fields is typically a subset of a record's fields. In general, this class assumes the same contract on hash codes and equality as defined for Object.equals(Object).
Implementing classes are stateful, because several methods require setting one record as the reference for comparisons and later comparing a candidate against it. Therefore, the classes extending this class are not thread safe. The runtime ensures that no instance is used by two different threads; it creates a copy for that purpose. It is hence imperative that the copies created by the duplicate() method share no state with the instance from which they were copied: they have to be deep copies.

See Also:

Object.hashCode(), Object.equals(Object), Comparator.compare(Object, Object), Serialized Form

Constructor Summary

Constructors
Constructor and Description

TypeComparator()

Method Summary

Modifier and Type	Method and Description
`abstract int`	`compare(T first, T second)` Compares two records in object form.
`int`	`compareAgainstReference(Comparable[] keys)`
`abstract int`	`compareSerialized(DataInputView firstSource, DataInputView secondSource)` Compares two records in serialized form.
`abstract int`	`compareToReference(TypeComparator<T> referencedComparator)` This method compares the element that has been set as reference in this type accessor, to the element set as reference in the given type accessor.
`abstract TypeComparator<T>`	`duplicate()` Creates a copy of this class.
`abstract boolean`	`equalToReference(T candidate)` Checks whether the given element is equal to the element that has been set as the comparison reference in this comparator instance.
`abstract int`	`extractKeys(Object record, Object[] target, int index)` Extracts the key fields from a record.
`abstract TypeComparator[]`	`getFlatComparators()` Get the field comparators.
`abstract int`	`getNormalizeKeyLen()` Gets the number of bytes that the normalized key would maximally take.
`abstract int`	`hash(T record)` Computes a hash value for the given record.
`abstract boolean`	`invertNormalizedKey()` Flag whether normalized key comparisons should be inverted, i.e. whether the normalized key should be interpreted in descending order.
`abstract boolean`	`isNormalizedKeyPrefixOnly(int keyBytes)` Checks whether the given number of bytes for a normalized key is only a prefix to determine the order of elements of the data type for which this comparator provides the comparison methods.
`abstract void`	`putNormalizedKey(T record, MemorySegment target, int offset, int numBytes)` Writes a normalized key for the given record into the target byte array, starting at the specified position and writing exactly the given number of bytes.
`abstract T`	`readWithKeyDenormalization(T reuse, DataInputView source)` Reads the record back while de-normalizing the key fields.
`abstract void`	`setReference(T toCompare)` Sets the given element as the comparison reference for future calls to `equalToReference(Object)` and `compareToReference(TypeComparator)`.
`boolean`	`supportsCompareAgainstReference()`
`abstract boolean`	`supportsNormalizedKey()` Checks whether the data type supports the creation of a normalized key for comparison.
`abstract boolean`	`supportsSerializationWithKeyNormalization()` Checks whether this comparator supports serializing the record in a format that replaces its keys by a normalized key.
`abstract void`	`writeWithKeyNormalization(T record, DataOutputView target)` Writes the record in such a fashion that all keys are normalized and placed at the beginning of the serialized data.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - TypeComparator
```
public TypeComparator()
```
- Method Detail
  - hash
```
public abstract int hash(T record)
```
    Computes a hash value for the given record. The hash value should include all fields in the record relevant to the comparison.
    The hash code is typically not used as is in hash tables and for partitioning; it is further scrambled to make sure that a projection of the hash values to a lower-cardinality space results in a rather uniform value distribution. However, any collisions produced by this method cannot be undone. While it is NOT important to create hash codes that cover the full spectrum of bits in the integer, it IS important to avoid collisions as much as possible when combining two values.
    
    Parameters:
    
    record - The record to be hashed.
    
    Returns:
    
    A hash value for the record.
    
    See Also:
    
    Object.hashCode()
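    As a rough illustration of the contract above, the following sketch hashes only the key fields of a record and then applies a separate mixing step, similar in spirit to the further scrambling described. The names KeyHashSketch, hashKeys, and scramble, as well as the Object[] record layout, are hypothetical; the mixer is the 32-bit finalizer from MurmurHash3, chosen for illustration and not necessarily what the Flink runtime uses.

```java
import java.util.Objects;

final class KeyHashSketch {

    // Hypothetical record layout: an array of fields, of which only some positions are keys.
    static int hashKeys(Object[] record, int[] keyPositions) {
        int h = 17;
        for (int pos : keyPositions) {
            // Combine the per-field hashes; records with equal key fields hash equally.
            h = 31 * h + Objects.hashCode(record[pos]);
        }
        return h;
    }

    // Illustrative bit mixer (MurmurHash3 32-bit finalizer); an assumption, not Flink's code.
    static int scramble(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        Object[] a = {"user-1", 42, "payload A"};
        Object[] b = {"user-1", 42, "payload B"};
        int[] keys = {0, 1};
        // true: the non-key payload field does not influence the hash
        System.out.println(hashKeys(a, keys) == hashKeys(b, keys));
        // Projecting the scrambled hash onto 8 buckets
        System.out.println(scramble(hashKeys(a, keys)) & 0x7);
    }
}
```

    The mixing step cannot repair a collision that hashKeys already produced, which is why the combination of field hashes matters more than covering all 32 bits.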
  - setReference
```
public abstract void setReference(T toCompare)
```
    Sets the given element as the comparison reference for future calls to equalToReference(Object) and compareToReference(TypeComparator). This method must set the given element into this comparator instance's state. If the comparison happens on a subset of the fields from the record, this method may extract those fields.
    A typical example for checking the equality of two elements is the following:
```
 E e1 = ...;
 E e2 = ...;
 
 TypeComparator<E> acc = ...;
 
 acc.setReference(e1);
 boolean equal = acc.equalToReference(e2);
 
```
    The rationale behind this method is that elements are typically compared using certain features that are extracted from them (such as de-serializing a subset of fields). When setting the reference, this extraction happens. The extraction needs to happen only once per element, even though an element is often compared to multiple other elements, such as when finding equal elements in the process of grouping the elements.
    Parameters:
    
    toCompare - The element to set as the comparison reference.
  - equalToReference
```
public abstract boolean equalToReference(T candidate)
```
    Checks whether the given element is equal to the element that has been set as the comparison reference in this comparator instance.
    
    Parameters:
    
    candidate - The candidate to check.
    
    Returns:
    
    True, if the element is equal to the comparison reference, false otherwise.
    
    See Also:
    
    setReference(Object)
  - compareToReference
```
public abstract int compareToReference(TypeComparator<T> referencedComparator)
```
    This method compares the element that has been set as reference in this type accessor to the element set as reference in the given type accessor. Similar to comparing two elements e1 and e2 via a comparator, this method can be used in the following way:
```
 E e1 = ...;
 E e2 = ...;
 
 TypeComparator<E> acc1 = ...;
 TypeComparator<E> acc2 = ...;
 
 acc1.setReference(e1);
 acc2.setReference(e2);
 
 int comp = acc1.compareToReference(acc2);
 
```
    The rationale behind this method is that elements are typically compared using certain features that are extracted from them (such as de-serializing a subset of fields). When setting the reference, this extraction happens. The extraction needs to happen only once per element, even though an element is typically compared to many other elements when establishing a sorted order. The actual comparison performed by this method may be very cheap, as it happens on the extracted features.
    Parameters:
    
    referencedComparator - The type accessor in which the element for comparison has been set as reference.
    
    Returns:
    
    A value smaller than zero, if the reference value of referencedComparator is smaller than the reference value of this type accessor; a value greater than zero, if it is larger; zero, if both are equal.
    
    See Also:
    
    setReference(Object)
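    To make the reference pattern concrete, here is a minimal sketch of a stateful comparator that extracts the key once in setReference and then answers equalToReference and compareToReference from that extracted state. ReferenceComparatorSketch and its Rec record type are hypothetical and do not extend the real TypeComparator; the sign of compareToReference follows the convention given in the Returns description above.

```java
final class ReferenceComparatorSketch {

    // Hypothetical record type: only 'key' participates in comparisons.
    static final class Rec {
        final int key;
        final String payload;

        Rec(int key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    // State extracted once per reference element.
    private int referenceKey;

    void setReference(Rec toCompare) {
        referenceKey = toCompare.key;
    }

    boolean equalToReference(Rec candidate) {
        return candidate.key == referenceKey;
    }

    // Negative if the other comparator's reference is smaller than this one's.
    int compareToReference(ReferenceComparatorSketch other) {
        return Integer.compare(other.referenceKey, this.referenceKey);
    }

    public static void main(String[] args) {
        ReferenceComparatorSketch acc1 = new ReferenceComparatorSketch();
        ReferenceComparatorSketch acc2 = new ReferenceComparatorSketch();
        acc1.setReference(new Rec(5, "a"));
        acc2.setReference(new Rec(3, "b"));

        System.out.println(acc1.compareToReference(acc2) < 0);      // true: 3 is smaller than 5
        System.out.println(acc1.equalToReference(new Rec(5, "z"))); // true: payload is ignored
    }
}
```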
  - supportsCompareAgainstReference
```
public boolean supportsCompareAgainstReference()
```
  - compare
```
public abstract int compare(T first,
                            T second)
```
    Compares two records in object form. The return value indicates the order of the two in the same way as defined by Comparator.compare(Object, Object).
    
    Parameters:
    
    first - The first record.
    
    second - The second record.
    
    Returns:
    
    An integer defining the order among the objects in the same way as Comparator.compare(Object, Object).
    
    See Also:
    
    Comparator.compare(Object, Object)
  - compareSerialized
```
public abstract int compareSerialized(DataInputView firstSource,
                                      DataInputView secondSource)
                               throws IOException
```
    Compares two records in serialized form. The return value indicates the order of the two in the same way as defined by Comparator.compare(Object, Object).
    This method may de-serialize the records or compare them directly based on their binary representation.
    
    Parameters:
    
    firstSource - The input view containing the first record.
    
    secondSource - The input view containing the second record.
    
    Returns:
    
    An integer defining the order among the objects in the same way as Comparator.compare(Object, Object).
    
    Throws:
    
    IOException - Thrown, if any of the input views raised an exception when reading the records.
    
    See Also:
    
    Comparator.compare(Object, Object)
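    The following sketch shows one way a serialized comparison could avoid full de-serialization by reading only the leading key of each record. It uses java.io.DataInput as a stand-in for DataInputView; the class SerializedCompareSketch, its helper methods, and the wire format (a 4-byte int key followed by a UTF payload) are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SerializedCompareSketch {

    // Hypothetical wire format: a 4-byte int key followed by a UTF-encoded payload.
    static byte[] serialize(int key, String payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(key);
            out.writeUTF(payload);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads only the leading key of each record; the payload is never de-serialized.
    static int compareSerialized(DataInput firstSource, DataInput secondSource) throws IOException {
        return Integer.compare(firstSource.readInt(), secondSource.readInt());
    }

    // Convenience wrapper so that callers can pass plain byte arrays.
    static int compareBytes(byte[] first, byte[] second) {
        try {
            return compareSerialized(
                    new DataInputStream(new ByteArrayInputStream(first)),
                    new DataInputStream(new ByteArrayInputStream(second)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] a = serialize(3, "x");
        byte[] b = serialize(7, "y");
        System.out.println(compareBytes(a, b) < 0); // true: 3 sorts before 7
    }
}
```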
  - supportsNormalizedKey
```
public abstract boolean supportsNormalizedKey()
```
    Checks whether the data type supports the creation of a normalized key for comparison.
    
    Returns:
    
    True, if the data type supports the creation of a normalized key for comparison, false otherwise.
  - supportsSerializationWithKeyNormalization
```
public abstract boolean supportsSerializationWithKeyNormalization()
```
    Check whether this comparator supports to serialize the record in a format that replaces its keys by a normalized key.
    
    Returns:
    
    True, if the comparator supports that specific form of serialization, false if not.
  - getNormalizeKeyLen
```
public abstract int getNormalizeKeyLen()
```
    Gets the number of bytes that the normalized key would maximally take. A value of Integer.MAX_VALUE is interpreted as infinite.
    
    Returns:
    
    The number of bytes that the normalized key would maximally take.
  - isNormalizedKeyPrefixOnly
```
public abstract boolean isNormalizedKeyPrefixOnly(int keyBytes)
```
    Checks whether the given number of bytes for a normalized key is only a prefix to determine the order of elements of the data type for which this comparator provides the comparison methods. For example, if the data type is ordered with respect to an integer value it contains, then this method would return true if the number of key bytes is smaller than four.
    
    Returns:
    
    True, if the given number of bytes is only a prefix, false otherwise.
  - putNormalizedKey
```
public abstract void putNormalizedKey(T record,
                                      MemorySegment target,
                                      int offset,
                                      int numBytes)
```
    Writes a normalized key for the given record into the target byte array, starting at the specified position and writing exactly the given number of bytes. Note that the comparison of the bytes treats them as unsigned bytes: int byteI = bytes[i] & 0xFF;
    If the meaningful part of the normalized key takes less than the given number of bytes, then it must be padded. Padding is typically required for variable length data types, such as strings. The padding uses a special character, either 0 or 0xff, depending on whether shorter values are sorted to the beginning or the end.
    This method is similar to NormalizableKey.copyNormalizedKey(MemorySegment, int, int). In the case that multiple fields of a record contribute to the normalized key, it is crucial that the fields align on the byte field, i.e. that every field always takes up the exact same number of bytes.
    
    Parameters:
    
    record - The record for which to create the normalized key.
    
    target - The byte array into which to write the normalized key bytes.
    
    offset - The offset in the byte array, where to start writing the normalized key bytes.
    
    numBytes - The number of bytes to be written exactly.
    
    See Also:
    
    NormalizableKey.copyNormalizedKey(MemorySegment, int, int)
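    As an illustration of the normalized-key idea for an integer key, the sketch below flips the sign bit, writes the value in big-endian order, and compares the result byte by byte as unsigned values. It writes into a plain byte[] instead of a MemorySegment; IntNormalizedKeySketch and its helper methods are hypothetical names, and the zero padding for numBytes greater than four is one possible choice.

```java
final class IntNormalizedKeySketch {

    // Writes exactly numBytes bytes: the sign-flipped int in big-endian order,
    // truncated if numBytes < 4 and zero-padded if numBytes > 4.
    static void putNormalizedKey(int value, byte[] target, int offset, int numBytes) {
        // Flipping the sign bit makes negative values sort first under unsigned comparison.
        int flipped = value ^ 0x80000000;
        for (int i = 0; i < numBytes; i++) {
            target[offset + i] = i < 4 ? (byte) (flipped >>> ((3 - i) * 8)) : 0;
        }
    }

    // Lexicographic comparison that treats every byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b, int len) {
        for (int i = 0; i < len; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    // Helper that allocates a fresh array holding the normalized key.
    static byte[] key(int value, int numBytes) {
        byte[] target = new byte[numBytes];
        putNormalizedKey(value, target, 0, numBytes);
        return target;
    }

    // For a 4-byte int key, fewer than four bytes are only a prefix of the full order.
    static boolean isNormalizedKeyPrefixOnly(int keyBytes) {
        return keyBytes < 4;
    }

    public static void main(String[] args) {
        System.out.println(compareUnsigned(key(-1, 4), key(0, 4), 4) < 0); // true
        // With only two bytes, 1 and 2 share the same prefix and compare as equal.
        System.out.println(compareUnsigned(key(1, 2), key(2, 2), 2) == 0); // true
    }
}
```

    When the two-byte prefixes compare as equal, a caller would have to fall back to a full comparison, which is the situation that isNormalizedKeyPrefixOnly signals.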
  - writeWithKeyNormalization
```
public abstract void writeWithKeyNormalization(T record,
                                               DataOutputView target)
                                        throws IOException
```
    Writes the record in such a fashion that all keys are normalized and placed at the beginning of the serialized data. This must only be used when the full normalized key is used for all key fields. The supportsSerializationWithKeyNormalization() method allows checking that.
    
    Parameters:
    
    record - The record object to be written.
    
    target - The stream to which to write the data.
    
    Throws:
    
    IOException
    
    See Also:
    
    supportsSerializationWithKeyNormalization(), readWithKeyDenormalization(Object, DataInputView), NormalizableKey.copyNormalizedKey(MemorySegment, int, int)
  - readWithKeyDenormalization
```
public abstract T readWithKeyDenormalization(T reuse,
                                             DataInputView source)
                                      throws IOException
```
    Reads the record back while de-normalizing the key fields. This must only be used when the full normalized key is used for all key fields, which is indicated by the supportsSerializationWithKeyNormalization() method.
    
    Parameters:
    
    reuse - The reuse object into which to read the record data.
    
    source - The stream from which to read the data.
    
    Throws:
    
    IOException
    
    See Also:
    
    supportsSerializationWithKeyNormalization(), writeWithKeyNormalization(Object, DataOutputView), NormalizableKey.copyNormalizedKey(MemorySegment, int, int)
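    The write and read methods above form a round trip, which the following sketch imitates for a record with a single int key. It uses java.io data streams instead of DataOutputView and DataInputView; KeyNormalizedFormatSketch, its Rec type, and the chosen layout (sign-flipped big-endian key first, then the payload) are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class KeyNormalizedFormatSketch {

    // Hypothetical record type with one int key and a payload.
    static final class Rec {
        final int key;
        final String payload;

        Rec(int key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    // Writes the full normalized key first, followed by the remaining fields.
    static byte[] write(Rec record) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            // writeInt emits big-endian bytes; the sign flip makes them sort as unsigned bytes.
            out.writeInt(record.key ^ 0x80000000);
            out.writeUTF(record.payload);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the record back and undoes the key normalization.
    static Rec read(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int key = in.readInt() ^ 0x80000000;
            String payload = in.readUTF();
            return new Rec(key, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Rec back = read(write(new Rec(-5, "a")));
        System.out.println(back.key + " " + back.payload); // prints "-5 a"
    }
}
```

    Because the complete key is stored in normalized form, no information is lost and the original key can be recovered; with a truncated key this would not be possible.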
  - invertNormalizedKey
```
public abstract boolean invertNormalizedKey()
```
    Flag whether normalized key comparisons should be inverted, i.e. whether the normalized key should be interpreted in descending order.
    
    Returns:
    
    True, if all normalized key comparisons should invert the sign of the comparison result, false if the normalized key should be used as is.
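    A caller that sorts on normalized keys might honor this flag roughly as in the sketch below, which compares two keys byte by byte as unsigned values and negates the result when inversion is requested. InvertedKeySketch and compareNormalizedKeys are hypothetical names, and the sketch assumes both keys have the same length.

```java
final class InvertedKeySketch {

    // Compares two equally long normalized keys; 'invert' flips the resulting order.
    static int compareNormalizedKeys(byte[] a, byte[] b, boolean invert) {
        int result = 0;
        for (int i = 0; i < a.length && result == 0; i++) {
            // Bytes are compared as unsigned values.
            result = (a[i] & 0xFF) - (b[i] & 0xFF);
        }
        return invert ? -result : result;
    }

    public static void main(String[] args) {
        byte[] small = {0x01};
        byte[] large = {(byte) 0xFF};
        System.out.println(compareNormalizedKeys(small, large, false) < 0); // true: ascending
        System.out.println(compareNormalizedKeys(small, large, true) > 0);  // true: descending
    }
}
```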
  - duplicate
```
public abstract TypeComparator<T> duplicate()
```
    Creates a copy of this class. The copy must be deep such that no state set in the copy affects this instance of the comparator class.
    
    Returns:
    
    A deep copy of this comparator instance.
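    The deep-copy requirement can be illustrated with a comparator that keeps its reference in a mutable buffer. In the sketch below, duplicate() allocates a fresh buffer, so setting a reference on the copy leaves the original untouched. DuplicateSketch and its int[] record layout are hypothetical and only mimic the relevant methods.

```java
final class DuplicateSketch {

    // Mutable buffer that setReference overwrites in place.
    private final int[] referenceKeys;

    DuplicateSketch(int numKeys) {
        referenceKeys = new int[numKeys];
    }

    // Hypothetical record layout: the first numKeys entries of the array are the keys.
    void setReference(int[] record) {
        System.arraycopy(record, 0, referenceKeys, 0, referenceKeys.length);
    }

    boolean equalToReference(int[] candidate) {
        for (int i = 0; i < referenceKeys.length; i++) {
            if (candidate[i] != referenceKeys[i]) {
                return false;
            }
        }
        return true;
    }

    // The copy gets its own buffer; sharing referenceKeys would leak state between instances.
    DuplicateSketch duplicate() {
        return new DuplicateSketch(referenceKeys.length);
    }

    public static void main(String[] args) {
        DuplicateSketch original = new DuplicateSketch(2);
        original.setReference(new int[] {1, 2});

        DuplicateSketch copy = original.duplicate();
        copy.setReference(new int[] {9, 9});

        // true: the original reference is unaffected by the copy
        System.out.println(original.equalToReference(new int[] {1, 2}));
    }
}
```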
  - extractKeys
```
public abstract int extractKeys(Object record,
                                Object[] target,
                                int index)
```
    Extracts the key fields from a record. This is for use by the PairComparator to provide interoperability between different record types. Note that at least one key should be extracted.
    
    Parameters:
    
    record - The record that contains the key(s)
    
    target - The array to write the key(s) into.
    
    index - The offset of the target array to start writing into.
    
    Returns:
    
    the number of keys added to target.
  - getFlatComparators
```
public abstract TypeComparator[] getFlatComparators()
```
    Gets the field comparators. This is used together with extractKeys(Object, Object[], int) to provide interoperability between different record types. Note that this should return at least one Comparator and that the number of Comparators must match the number of extracted keys.
    
    Returns:
    
    An Array of Comparators for the extracted keys.
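    The interplay of extractKeys and getFlatComparators can be sketched as follows: keys are copied out of two differently shaped records into flat arrays, and one comparator per key position compares them pairwise. FlatKeysSketch, the Object[] record layouts, and the use of java.util.Comparator in place of TypeComparator are assumptions for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class FlatKeysSketch {

    // Copies the key fields into target, starting at index; returns the number of keys written.
    static int extractKeys(Object[] record, int[] keyPositions, Object[] target, int index) {
        for (int i = 0; i < keyPositions.length; i++) {
            target[index + i] = record[keyPositions[i]];
        }
        return keyPositions.length;
    }

    // One comparator per extracted key, in key order (hypothetical key schema: int, String).
    static List<Comparator<Object>> flatComparators() {
        Comparator<Object> byInt = (a, b) -> Integer.compare((Integer) a, (Integer) b);
        Comparator<Object> byString = (a, b) -> ((String) a).compareTo((String) b);
        return Arrays.asList(byInt, byString);
    }

    // Compares two flat key arrays position by position.
    static int compareKeys(Object[] left, Object[] right, List<Comparator<Object>> flat) {
        for (int i = 0; i < flat.size(); i++) {
            int c = flat.get(i).compare(left[i], right[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        Object[] order = {7, "DE", 19.99};    // keys at positions 0 and 1
        Object[] customer = {"Ada", "DE", 7}; // keys at positions 2 and 1
        Object[] orderKeys = new Object[2];
        Object[] customerKeys = new Object[2];
        extractKeys(order, new int[] {0, 1}, orderKeys, 0);
        extractKeys(customer, new int[] {2, 1}, customerKeys, 0);
        // true: both records carry the key (7, "DE") despite their different layouts
        System.out.println(compareKeys(orderKeys, customerKeys, flatComparators()) == 0);
    }
}
```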
  - compareAgainstReference
```
public int compareAgainstReference(Comparable[] keys)
```
