Class StringValueComparator
- java.lang.Object
-
- org.apache.flink.api.common.typeutils.TypeComparator<StringValue>
-
- org.apache.flink.api.common.typeutils.base.StringValueComparator
-
- All Implemented Interfaces:
Serializable
@Internal public class StringValueComparator extends TypeComparator<StringValue>
Specialized comparator for StringValue based on CopyableValueComparator.
- See Also:
- Serialized Form
-
-
Constructor Summary
Constructors
StringValueComparator(boolean ascending)
-
Method Summary
Modifier and Type, Method, Description
int compare(StringValue first, StringValue second)
    Compares two records in object form.
int compareSerialized(DataInputView firstSource, DataInputView secondSource)
    Compares two records in serialized form.
int compareToReference(TypeComparator<StringValue> referencedComparator)
    Compares the element that has been set as reference in this type accessor to the element set as reference in the given type accessor.
TypeComparator<StringValue> duplicate()
    Creates a copy of this class.
boolean equalToReference(StringValue candidate)
    Checks whether the given element is equal to the element that has been set as the comparison reference in this comparator instance.
int extractKeys(Object record, Object[] target, int index)
    Extracts the key fields from a record.
TypeComparator<?>[] getFlatComparators()
    Gets the field comparators.
int getNormalizeKeyLen()
    Gets the number of bytes that the normalized key would maximally take.
int hash(StringValue record)
    Computes a hash value for the given record.
boolean invertNormalizedKey()
    Flags whether normalized key comparisons should be inverted, i.e. whether the key should be interpreted as descending.
boolean isNormalizedKeyPrefixOnly(int keyBytes)
    Checks whether the given number of bytes for a normalized key is only a prefix to determine the order of elements of the data type for which this comparator provides the comparison methods.
void putNormalizedKey(StringValue record, MemorySegment target, int offset, int numBytes)
    Writes a normalized key for the given record into the target byte array, starting at the specified position and writing exactly the given number of bytes.
StringValue readWithKeyDenormalization(StringValue reuse, DataInputView source)
    Reads the record back while de-normalizing the key fields.
void setReference(StringValue toCompare)
    Sets the given element as the comparison reference for future calls to TypeComparator.equalToReference(Object) and TypeComparator.compareToReference(TypeComparator).
boolean supportsNormalizedKey()
    Checks whether the data type supports the creation of a normalized key for comparison.
boolean supportsSerializationWithKeyNormalization()
    Checks whether this comparator supports serializing the record in a format that replaces its keys by a normalized key.
void writeWithKeyNormalization(StringValue record, DataOutputView target)
    Writes the record in such a fashion that all keys are normalized and placed at the beginning of the serialized data.
Methods inherited from class org.apache.flink.api.common.typeutils.TypeComparator
compareAgainstReference, supportsCompareAgainstReference
-
Method Detail
-
hash
public int hash(StringValue record)
Description copied from class: TypeComparator
Computes a hash value for the given record. The hash value should include all fields in the record relevant to the comparison.
The hash code is typically not used as-is in hash tables and for partitioning; it is further scrambled to make sure that a projection of the hash values onto a lower-cardinality space results in a rather uniform value distribution. However, any collisions produced by this method cannot be undone. While it is NOT important to create hash codes that cover the full spectrum of bits in the integer, it IS important to avoid collisions when combining two values as much as possible.
- Specified by:
hash
in class TypeComparator<StringValue>
- Parameters:
record
- The record to be hashed.
- Returns:
- A hash value for the record.
- See Also:
Object.hashCode()
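The scrambling step mentioned in the description can be pictured with a small plain-Java sketch. It is not Flink's actual code: the class name HashScrambleSketch, the method partitionFor and the choice of mixing function (the 32-bit finalizer from MurmurHash3) are assumptions made for illustration only.

```java
final class HashScrambleSketch {
    // MurmurHash3-style 32-bit finalizer: mixes the input bits so that
    // low-order bits depend on all input bits. (Illustrative choice only.)
    static int scramble(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Projects a record hash onto a partition index in [0, numPartitions).
    static int partitionFor(int recordHash, int numPartitions) {
        return Math.floorMod(scramble(recordHash), numPartitions);
    }

    public static void main(String[] args) {
        // Hash codes that are all multiples of 8 would land in partition 0
        // without scrambling; with scrambling they are spread out.
        for (int i = 0; i < 4; i++) {
            System.out.println((i * 8) + " -> partition " + partitionFor(i * 8, 8));
        }
    }
}
```

Two records that already collide in hash() still collide after scrambling, which is why the description stresses avoiding collisions at this level.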
-
setReference
public void setReference(StringValue toCompare)
Description copied from class: TypeComparator
Sets the given element as the comparison reference for future calls to TypeComparator.equalToReference(Object) and TypeComparator.compareToReference(TypeComparator). This method must set the given element into this comparator instance's state. If the comparison happens on a subset of the fields from the record, this method may extract those fields.
A typical example for checking the equality of two elements is the following:
    E e1 = ...;
    E e2 = ...;
    TypeComparator<E> acc = ...;
    acc.setReference(e1);
    boolean equal = acc.equalToReference(e2);
The rationale behind this method is that elements are typically compared using certain features that are extracted from them (such as de-serializing a subset of fields). When setting the reference, this extraction happens. The extraction needs to happen only once per element, even though an element is often compared to multiple other elements, such as when finding equal elements in the process of grouping the elements.
- Specified by:
setReference
in class TypeComparator<StringValue>
- Parameters:
toCompare
- The element to set as the comparison reference.
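The "extract once, compare many times" idea can be sketched in plain Java without any Flink classes. Everything here is hypothetical: the class KeyReferenceSketch, the record layout (a String[] whose index 0 is the key) and the helper group are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyReferenceSketch {
    // Feature extracted from the reference element (here: its key field).
    private String referenceKey;

    // Hypothetical record layout: index 0 is the key, index 1 is a payload.
    void setReference(String[] record) {
        referenceKey = record[0]; // extraction happens once per reference
    }

    boolean equalToReference(String[] candidate) {
        return referenceKey.equals(candidate[0]);
    }

    // Collects all records whose key equals the key of 'probe'.
    static List<String[]> group(String[] probe, List<String[]> records) {
        KeyReferenceSketch acc = new KeyReferenceSketch();
        acc.setReference(probe);
        List<String[]> matches = new ArrayList<>();
        for (String[] record : records) {
            if (acc.equalToReference(record)) {
                matches.add(record);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"a", "1"});
        records.add(new String[] {"b", "2"});
        records.add(new String[] {"a", "3"});
        System.out.println(group(new String[] {"a", "probe"}, records).size());
    }
}
```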
-
equalToReference
public boolean equalToReference(StringValue candidate)
Description copied from class: TypeComparator
Checks whether the given element is equal to the element that has been set as the comparison reference in this comparator instance.
- Specified by:
equalToReference
in class TypeComparator<StringValue>
- Parameters:
candidate
- The candidate to check.
- Returns:
- True, if the element is equal to the comparison reference, false otherwise.
- See Also:
TypeComparator.setReference(Object)
-
compareToReference
public int compareToReference(TypeComparator<StringValue> referencedComparator)
Description copied from class: TypeComparator
This method compares the element that has been set as reference in this type accessor to the element set as reference in the given type accessor. Similar to comparing two elements e1 and e2 via a comparator, this method can be used the following way.
    E e1 = ...;
    E e2 = ...;
    TypeComparator<E> acc1 = ...;
    TypeComparator<E> acc2 = ...;
    acc1.setReference(e1);
    acc2.setReference(e2);
    int comp = acc1.compareToReference(acc2);
The rationale behind this method is that elements are typically compared using certain features that are extracted from them (such as de-serializing a subset of fields). When setting the reference, this extraction happens. The extraction needs to happen only once per element, even though an element is typically compared to many other elements when establishing a sorted order. The actual comparison performed by this method may be very cheap, as it happens on the extracted features.
- Specified by:
compareToReference
in class TypeComparator<StringValue>
- Parameters:
referencedComparator
- The type accessor where the element for comparison has been set as reference.
- Returns:
- A value smaller than zero, if the reference value of referencedComparator is smaller than the reference value of this type accessor; a value greater than zero, if it is larger; zero, if both are equal.
- See Also:
TypeComparator.setReference(Object)
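The sign convention in the Returns clause is easy to get backwards, so a minimal plain-Java sketch may help. It follows the wording above (negative when the other comparator's reference is smaller than this one's). The class ReferenceOrderSketch is hypothetical, and flipping the sign for descending order is an assumption about how an "ascending" constructor flag would typically be applied.

```java
final class ReferenceOrderSketch {
    private final boolean ascending;
    private String reference;

    ReferenceOrderSketch(boolean ascending) {
        this.ascending = ascending;
    }

    void setReference(String toCompare) {
        this.reference = toCompare;
    }

    // Negative if the reference held by 'other' is smaller than the reference
    // held by this instance; the sign is flipped for descending order (assumed).
    int compareToReference(ReferenceOrderSketch other) {
        int comp = other.reference.compareTo(this.reference);
        return ascending ? comp : -comp;
    }

    public static void main(String[] args) {
        ReferenceOrderSketch acc1 = new ReferenceOrderSketch(true);
        ReferenceOrderSketch acc2 = new ReferenceOrderSketch(true);
        acc1.setReference("b");
        acc2.setReference("a");
        System.out.println(acc1.compareToReference(acc2));
    }
}
```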
-
compare
public int compare(StringValue first, StringValue second)
Description copied from class: TypeComparator
Compares two records in object form. The return value indicates the order of the two in the same way as defined by Comparator.compare(Object, Object).
- Specified by:
compare
in class TypeComparator<StringValue>
- Parameters:
first
- The first record.
second
- The second record.
- Returns:
- An integer defining the order among the objects in the same way as Comparator.compare(Object, Object).
- See Also:
Comparator.compare(Object, Object)
-
compareSerialized
public int compareSerialized(DataInputView firstSource, DataInputView secondSource) throws IOException
Description copied from class: TypeComparator
Compares two records in serialized form. The return value indicates the order of the two in the same way as defined by Comparator.compare(Object, Object).
This method may de-serialize the records or compare them directly based on their binary representation.
- Specified by:
compareSerialized
in class TypeComparator<StringValue>
- Parameters:
firstSource
- The input view containing the first record.
secondSource
- The input view containing the second record.
- Returns:
- An integer defining the order among the objects in the same way as Comparator.compare(Object, Object).
- Throws:
IOException
- Thrown, if any of the input views raised an exception when reading the records.
- See Also:
Comparator.compare(Object, Object)
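As a rough illustration of the "de-serialize, then compare" option, the following plain-Java sketch uses java.io.DataInput in place of Flink's DataInputView and DataOutputStream.writeUTF in place of the real StringValue binary format. Both substitutions, the class SerializedCompareSketch and the ascending parameter are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SerializedCompareSketch {
    // Serializes a string with writeUTF (a stand-in format, not Flink's).
    static byte[] serialize(String value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads one record from each input, then compares them in object form.
    static int compareSerialized(DataInput first, DataInput second, boolean ascending)
            throws IOException {
        int comp = first.readUTF().compareTo(second.readUTF());
        return ascending ? comp : -comp;
    }

    // Convenience wrapper: serializes both values and compares the serialized forms.
    static int compareViaBytes(String a, String b, boolean ascending) {
        DataInput first = new DataInputStream(new ByteArrayInputStream(serialize(a)));
        DataInput second = new DataInputStream(new ByteArrayInputStream(serialize(b)));
        try {
            return compareSerialized(first, second, ascending);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compareViaBytes("apple", "banana", true));
    }
}
```

A comparator that works directly on the binary representation would skip the readUTF calls and compare bytes as they are read; that variant depends on the concrete wire format and is not shown here.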
-
supportsNormalizedKey
public boolean supportsNormalizedKey()
Description copied from class: TypeComparator
Checks whether the data type supports the creation of a normalized key for comparison.
- Specified by:
supportsNormalizedKey
in class TypeComparator<StringValue>
- Returns:
- True, if the data type supports the creation of a normalized key for comparison, false otherwise.
-
getNormalizeKeyLen
public int getNormalizeKeyLen()
Description copied from class: TypeComparator
Gets the number of bytes that the normalized key would maximally take. A value of Integer.MAX_VALUE is interpreted as infinite.
- Specified by:
getNormalizeKeyLen
in class TypeComparator<StringValue>
- Returns:
- The number of bytes that the normalized key would maximally take.
-
isNormalizedKeyPrefixOnly
public boolean isNormalizedKeyPrefixOnly(int keyBytes)
Description copied from class: TypeComparator
Checks whether the given number of bytes for a normalized key is only a prefix to determine the order of elements of the data type for which this comparator provides the comparison methods. For example, if the data type is ordered with respect to an integer value it contains, then this method would return true if the number of key bytes is smaller than four.
- Specified by:
isNormalizedKeyPrefixOnly
in class TypeComparator<StringValue>
- Returns:
- True, if the given number of bytes is only a prefix, false otherwise.
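The integer example from the description can be written out as a tiny plain-Java sketch. The class PrefixOnlySketch and its two-argument helper are hypothetical. Treating a maximum key length of Integer.MAX_VALUE as "unbounded" follows the getNormalizeKeyLen description, but applying it to variable-length types such as strings is an assumption here.

```java
final class PrefixOnlySketch {
    // An int is fully represented by four bytes.
    static final int INT_KEY_LEN = 4;

    // True if keyBytes cannot hold the full normalized key of maxKeyLen bytes,
    // i.e. the key only determines the order up to a prefix.
    static boolean isPrefixOnly(int keyBytes, int maxKeyLen) {
        return keyBytes < maxKeyLen;
    }

    public static void main(String[] args) {
        System.out.println(isPrefixOnly(2, INT_KEY_LEN));          // two of four bytes
        System.out.println(isPrefixOnly(4, INT_KEY_LEN));          // all four bytes
        System.out.println(isPrefixOnly(64, Integer.MAX_VALUE));   // unbounded key length
    }
}
```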
-
putNormalizedKey
public void putNormalizedKey(StringValue record, MemorySegment target, int offset, int numBytes)
Description copied from class: TypeComparator
Writes a normalized key for the given record into the target byte array, starting at the specified position and writing exactly the given number of bytes. Note that the comparison of the bytes is treating the bytes as unsigned bytes: int byteI = bytes[i] & 0xFF;
If the meaningful part of the normalized key takes less than the given number of bytes, then it must be padded. Padding is typically required for variable length data types, such as strings. The padding uses a special character, either 0 or 0xff, depending on whether shorter values are sorted to the beginning or the end.
This method is similar to NormalizableKey.copyNormalizedKey(MemorySegment, int, int). In the case that multiple fields of a record contribute to the normalized key, it is crucial that the fields align on the byte field, i.e. that every field always takes up the exact same number of bytes.
- Specified by:
putNormalizedKey
in class TypeComparator<StringValue>
- Parameters:
record
- The record for which to create the normalized key.
target
- The byte array into which to write the normalized key bytes.
offset
- The offset in the byte array, where to start writing the normalized key bytes.
numBytes
- The number of bytes to be written exactly.
- See Also:
NormalizableKey.copyNormalizedKey(MemorySegment, int, int)
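The padding and unsigned-comparison rules can be sketched in plain Java with a byte[] standing in for Flink's MemorySegment. This is not the encoding StringValueComparator uses: the class NormalizedKeySketch, the one-byte-per-character encoding (which assumes characters no larger than 0xFF) and the choice of 0 as padding are all assumptions made for the example.

```java
final class NormalizedKeySketch {
    // Writes exactly numBytes bytes into target, starting at offset.
    // Assumption: every character fits into one byte; short values are padded
    // with 0 so that they sort before longer values sharing the same prefix.
    static void putNormalizedKey(String record, byte[] target, int offset, int numBytes) {
        for (int i = 0; i < numBytes; i++) {
            target[offset + i] = i < record.length() ? (byte) record.charAt(i) : (byte) 0;
        }
    }

    // Builds a normalized key of the given length for one record.
    static byte[] key(String record, int numBytes) {
        byte[] target = new byte[numBytes];
        putNormalizedKey(record, target, 0, numBytes);
        return target;
    }

    // Compares two keys byte by byte, treating each byte as unsigned.
    static int compareKeys(byte[] a, byte[] b, int len) {
        for (int i = 0; i < len; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(compareKeys(key("ab", 4), key("abc", 4), 4));
        System.out.println(compareKeys(key("abcdX", 4), key("abcdY", 4), 4));
    }
}
```

The second call in main compares two values that differ only after the fourth character. Their four-byte keys are identical, which is the situation isNormalizedKeyPrefixOnly reports: a tie on the key must be resolved by a full comparison.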
-
invertNormalizedKey
public boolean invertNormalizedKey()
Description copied from class: TypeComparator
Flags whether normalized key comparisons should be inverted, i.e. whether the key should be interpreted as descending.
- Specified by:
invertNormalizedKey
in class TypeComparator<StringValue>
- Returns:
- True, if all normalized key comparisons should invert the sign of the comparison result, false if the normalized key should be used as is.
-
duplicate
public TypeComparator<StringValue> duplicate()
Description copied from class: TypeComparator
Creates a copy of this class. The copy must be deep such that no state set in the copy affects this instance of the comparator class.
- Specified by:
duplicate
in class TypeComparator<StringValue>
- Returns:
- A deep copy of this comparator instance.
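The "no shared state" requirement can be illustrated with a small plain-Java sketch. The class DuplicateSketch is hypothetical. Creating a fresh instance that carries over only the configuration (here the ascending flag) and not the current reference is one simple way to satisfy the contract, and is an assumption rather than a description of the actual implementation.

```java
final class DuplicateSketch {
    private final boolean ascending; // immutable configuration, safe to copy
    private String reference;        // mutable per-instance state

    DuplicateSketch(boolean ascending) {
        this.ascending = ascending;
    }

    void setReference(String toCompare) {
        this.reference = toCompare;
    }

    boolean equalToReference(String candidate) {
        return candidate.equals(reference);
    }

    boolean isAscending() {
        return ascending;
    }

    // Returns an independent instance: setting a reference on the copy
    // never changes the reference held by this instance.
    DuplicateSketch duplicate() {
        return new DuplicateSketch(ascending);
    }

    public static void main(String[] args) {
        DuplicateSketch original = new DuplicateSketch(true);
        original.setReference("a");
        DuplicateSketch copy = original.duplicate();
        copy.setReference("b");
        System.out.println(original.equalToReference("a"));
    }
}
```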
-
extractKeys
public int extractKeys(Object record, Object[] target, int index)
Description copied from class: TypeComparator
Extracts the key fields from a record. This is for use by the PairComparator to provide interoperability between different record types. Note that at least one key should be extracted.
- Specified by:
extractKeys
in class TypeComparator<StringValue>
- Parameters:
record
- The record that contains the key(s).
target
- The array to write the key(s) into.
index
- The offset of the target array to start writing into.
- Returns:
- the number of keys added to target.
-
getFlatComparators
public TypeComparator<?>[] getFlatComparators()
Description copied from class: TypeComparator
Gets the field comparators. This is used together with TypeComparator.extractKeys(Object, Object[], int) to provide interoperability between different record types. Note that this should return at least one comparator and that the number of comparators must match the number of extracted keys.
- Specified by:
getFlatComparators
in class TypeComparator<StringValue>
- Returns:
- An Array of Comparators for the extracted keys.
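How extractKeys and getFlatComparators fit together can be sketched in plain Java for a type whose whole value is its single key. The class FlatKeysSketch and the helper compareByKeys are hypothetical, and java.util.Comparator stands in for TypeComparator. That an atomic type extracts exactly one key (the record itself) is an assumption consistent with the "at least one key" note above.

```java
import java.util.Comparator;

final class FlatKeysSketch {
    // The record itself is the only key: store it at target[index], report one key.
    static int extractKeys(Object record, Object[] target, int index) {
        target[index] = record;
        return 1;
    }

    // One comparator per extracted key, in the same order as the keys.
    @SuppressWarnings("unchecked")
    static Comparator<Object>[] getFlatComparators() {
        Comparator<Object> byString = (a, b) -> ((String) a).compareTo((String) b);
        return new Comparator[] {byString};
    }

    // Compares two records key by key using the flat comparators.
    static int compareByKeys(Object left, Object right) {
        Comparator<Object>[] comparators = getFlatComparators();
        Object[] leftKeys = new Object[comparators.length];
        Object[] rightKeys = new Object[comparators.length];
        extractKeys(left, leftKeys, 0);
        extractKeys(right, rightKeys, 0);
        for (int i = 0; i < comparators.length; i++) {
            int comp = comparators[i].compare(leftKeys[i], rightKeys[i]);
            if (comp != 0) {
                return comp;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(compareByKeys("apple", "banana"));
    }
}
```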
-
supportsSerializationWithKeyNormalization
public boolean supportsSerializationWithKeyNormalization()
Description copied from class: TypeComparator
Checks whether this comparator supports serializing the record in a format that replaces its keys by a normalized key.
- Specified by:
supportsSerializationWithKeyNormalization
in class TypeComparator<StringValue>
- Returns:
- True, if the comparator supports that specific form of serialization, false if not.
-
writeWithKeyNormalization
public void writeWithKeyNormalization(StringValue record, DataOutputView target) throws IOException
Description copied from class: TypeComparator
Writes the record in such a fashion that all keys are normalized and placed at the beginning of the serialized data. This must only be used when the full normalized key is used for all the key fields. The method supportsSerializationWithKeyNormalization() allows checking that.
- Specified by:
writeWithKeyNormalization
in class TypeComparator<StringValue>
- Parameters:
record
- The record to write.
target
- The stream to which to write the data.
- Throws:
IOException
- See Also:
TypeComparator.supportsSerializationWithKeyNormalization(),
TypeComparator.readWithKeyDenormalization(Object, DataInputView),
NormalizableKey.copyNormalizedKey(MemorySegment, int, int)
-
readWithKeyDenormalization
public StringValue readWithKeyDenormalization(StringValue reuse, DataInputView source) throws IOException
Description copied from class: TypeComparator
Reads the record back while de-normalizing the key fields. This must only be used when the full normalized key is used for all the key fields, which is hinted at by the supportsSerializationWithKeyNormalization() method.
- Specified by:
readWithKeyDenormalization
in class TypeComparator<StringValue>
- Parameters:
reuse
- The reuse object into which to read the record data.
source
- The stream from which to read the data.
- Throws:
IOException
- See Also:
TypeComparator.supportsSerializationWithKeyNormalization(),
TypeComparator.writeWithKeyNormalization(Object, DataOutputView),
NormalizableKey.copyNormalizedKey(MemorySegment, int, int)
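The write/read pair only makes sense when the full normalized key fits into the serialized form, which is easiest to see with a fixed-length key. The plain-Java sketch below uses an int key, not a StringValue. The class KeyNormalizedIntSketch and the sign-flip encoding are assumptions made for illustration; whether this comparator supports the format at all is what supportsSerializationWithKeyNormalization() reports.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class KeyNormalizedIntSketch {
    // Writes the int so that its four big-endian bytes, compared as unsigned
    // bytes, sort in the same order as the original signed values.
    static byte[] write(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(4);
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value - Integer.MIN_VALUE); // flips the sign bit
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the value back by undoing the normalization.
    static int read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readInt() + Integer.MIN_VALUE;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Byte-wise comparison that treats every byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        System.out.println(read(write(-5)));
        System.out.println(compareUnsigned(write(-1), write(1)) < 0);
    }
}
```

Because the serialized bytes already are the normalized key, records written this way can be ordered by comparing raw bytes without de-serializing them first.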
-
-