Class BinaryStringData
- java.lang.Object
-
- org.apache.flink.table.data.binary.LazyBinaryFormat<String>
-
- org.apache.flink.table.data.binary.BinaryStringData
-
- All Implemented Interfaces: Comparable<StringData>, BinaryFormat, StringData
@Internal public final class BinaryStringData extends LazyBinaryFormat<String> implements StringData
A lazy binary implementation of StringData which is backed by MemorySegments and String.
Either MemorySegments or String must be provided when constructing BinaryStringData. The other representation will be materialized when needed.
It provides many useful methods for comparison, search, and so on.
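The lazy materialization described above can be pictured with a small stand-alone sketch. This is not Flink's code: the class LazyUtf8Sketch and its methods are hypothetical, and a plain byte[] stands in for MemorySegments. It only illustrates the idea of holding one representation and creating the other on first use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

// Hypothetical stand-in, not Flink's class: holds UTF-8 bytes and/or a String
// and creates the missing representation only when it is first requested.
final class LazyUtf8Sketch {
    private byte[] bytes; // binary form, null until provided or materialized
    private String text;  // Java object form, null until provided or materialized

    LazyUtf8Sketch(String text) {
        this.text = Objects.requireNonNull(text);
    }

    LazyUtf8Sketch(byte[] bytes) {
        this.bytes = Objects.requireNonNull(bytes);
    }

    boolean hasBytes() {
        return bytes != null;
    }

    boolean hasText() {
        return text != null;
    }

    // Materializes the binary form on demand.
    byte[] toBytes() {
        if (bytes == null) {
            bytes = text.getBytes(StandardCharsets.UTF_8);
        }
        return bytes;
    }

    // Materializes the Java String on demand.
    @Override
    public String toString() {
        if (text == null) {
            text = new String(bytes, StandardCharsets.UTF_8);
        }
        return text;
    }
}
```

In this sketch, an instance built from a String has no byte form until toBytes() is called, and an instance built from bytes has no String until toString() is called.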
-
-
Field Summary
Fields (modifier and type, field):
static BinaryStringData EMPTY_UTF8
-
Fields inherited from interface org.apache.flink.table.data.binary.BinaryFormat
HIGHEST_FIRST_BIT, HIGHEST_SECOND_TO_EIGHTH_BIT, MAX_FIX_PART_DATA_SIZE
-
-
Constructor Summary
Constructors:
BinaryStringData()
BinaryStringData(String javaObject)
BinaryStringData(MemorySegment[] segments, int offset, int sizeInBytes)
BinaryStringData(MemorySegment[] segments, int offset, int sizeInBytes, String javaObject)
-
Method Summary
Methods (modifier and type, method, description):
-
static BinaryStringData blankString(int length)
Creates a BinaryStringData instance that contains `length` spaces.
-
byte byteAt(int index)
Returns the byte value at the specified index.
-
int compareTo(StringData o)
Compares two strings lexicographically.
-
boolean contains(BinaryStringData s)
Returns true if and only if this BinaryStringData contains the specified sequence of bytes values.
-
BinaryStringData copy()
Copy a new BinaryStringData.
-
boolean endsWith(BinaryStringData suffix)
Tests if this BinaryStringData ends with the specified suffix.
-
void ensureMaterialized()
-
boolean equals(Object o)
-
static BinaryStringData fromAddress(MemorySegment[] segments, int offset, int numBytes)
Creates a BinaryStringData instance from the given address (base and offset) and length.
-
static BinaryStringData fromBytes(byte[] bytes)
Creates a BinaryStringData instance from the given UTF-8 bytes.
-
static BinaryStringData fromBytes(byte[] bytes, int offset, int numBytes)
Creates a BinaryStringData instance from the given UTF-8 bytes with offset and number of bytes.
-
static BinaryStringData fromString(String str)
Creates a BinaryStringData instance from the given Java string.
-
int getOffset()
Gets the start offset of this binary data in the MemorySegments.
-
MemorySegment[] getSegments()
Gets the underlying MemorySegments this binary format spans.
-
int getSizeInBytes()
Gets the size in bytes of this binary data.
-
int hashCode()
-
int indexOf(BinaryStringData str, int fromIndex)
Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.
-
protected BinarySection materialize(TypeSerializer<String> serializer)
Materialize java object to binary format.
-
int numChars()
Returns the number of UTF-8 code points in the string.
-
boolean startsWith(BinaryStringData prefix)
Tests if this BinaryStringData starts with the specified prefix.
-
BinaryStringData substring(int beginIndex, int endIndex)
Returns a binary string that is a substring of this binary string.
-
byte[] toBytes()
Converts this StringData object to a UTF-8 byte array.
-
BinaryStringData toLowerCase()
Converts all of the characters in this BinaryStringData to lower case.
-
String toString()
Converts this StringData object to a String.
-
BinaryStringData toUpperCase()
Converts all of the characters in this BinaryStringData to upper case.
-
BinaryStringData trim()
Returns a string whose value is this string, with any leading and trailing whitespace removed.
-
Methods inherited from class org.apache.flink.table.data.binary.LazyBinaryFormat
ensureMaterialized, getBinarySection, getJavaObject, setJavaObject
-
-
-
-
Field Detail
-
EMPTY_UTF8
public static final BinaryStringData EMPTY_UTF8
-
-
Constructor Detail
-
BinaryStringData
public BinaryStringData()
-
BinaryStringData
public BinaryStringData(String javaObject)
-
BinaryStringData
public BinaryStringData(MemorySegment[] segments, int offset, int sizeInBytes)
-
BinaryStringData
public BinaryStringData(MemorySegment[] segments, int offset, int sizeInBytes, String javaObject)
-
-
Method Detail
-
fromAddress
public static BinaryStringData fromAddress(MemorySegment[] segments, int offset, int numBytes)
Creates a BinaryStringData instance from the given address (base and offset) and length.
-
fromString
public static BinaryStringData fromString(String str)
Creates a BinaryStringData instance from the given Java string.
-
fromBytes
public static BinaryStringData fromBytes(byte[] bytes)
Creates a BinaryStringData instance from the given UTF-8 bytes.
-
fromBytes
public static BinaryStringData fromBytes(byte[] bytes, int offset, int numBytes)
Creates a BinaryStringData instance from the given UTF-8 bytes with offset and number of bytes.
-
blankString
public static BinaryStringData blankString(int length)
Creates a BinaryStringData instance that contains `length` spaces.
-
toBytes
public byte[] toBytes()
Description copied from interface: StringData
Converts this StringData object to a UTF-8 byte array.
Note: The returned byte array may be reused.
- Specified by: toBytes in interface StringData
-
toString
public String toString()
Description copied from interface: StringData
Converts this StringData object to a String.
- Specified by: toString in interface StringData
- Overrides: toString in class Object
-
compareTo
public int compareTo(@Nonnull StringData o)
Compares two strings lexicographically. The comparison is performed directly on the binary UTF-8 representation of the two strings rather than on decoded characters.
- Specified by: compareTo in interface Comparable<StringData>
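To illustrate what comparing the binary form means, here is a minimal stand-alone sketch that orders two strings by their UTF-8 bytes. It is not the class's actual implementation: Utf8CompareSketch and its methods are hypothetical, plain byte arrays replace MemorySegments, and it assumes bytes are compared as unsigned values, which the description above does not spell out.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink: lexicographic comparison on UTF-8 bytes.
final class Utf8CompareSketch {

    // Compares byte by byte, treating each byte as an unsigned value (0..255).
    // If one array is a prefix of the other, the shorter one sorts first.
    static int compareUtf8(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Convenience overload that encodes both strings as UTF-8 first.
    static int compare(String x, String y) {
        return compareUtf8(
                x.getBytes(StandardCharsets.UTF_8),
                y.getBytes(StandardCharsets.UTF_8));
    }
}
```

With unsigned bytes, this ordering matches Unicode code point order. It can differ from java.lang.String.compareTo, which compares UTF-16 code units: for example, U+FF5E sorts before U+1F600 in the sketch but after it under String.compareTo.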
-
numChars
public int numChars()
Returns the number of UTF-8 code points in the string.
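Counting code points rather than bytes or Java chars can be sketched as follows. This is an illustration only, not the method's actual implementation: Utf8CountSketch is a hypothetical helper that works on a plain byte array and assumes the input is valid UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink: counts code points in valid UTF-8 bytes.
final class Utf8CountSketch {

    // Every code point has exactly one lead byte. Continuation bytes have the
    // bit pattern 10xxxxxx, so counting all other bytes counts the code points.
    static int numCodePoints(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    // Convenience overload that encodes the string as UTF-8 first.
    static int numCodePoints(String s) {
        return numCodePoints(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

Under these assumptions the count differs from both the byte length and String.length() for non-ASCII text: "h\u00e9llo" has 6 UTF-8 bytes but 5 code points, and a single emoji such as U+1F600 is 4 bytes and 2 Java chars but 1 code point.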
-
byteAt
public byte byteAt(int index)
Returns the byte value at the specified index. An index ranges from 0 to binarySection.sizeInBytes - 1.
- Parameters:
index - the index of the byte value.
- Returns:
the byte value at the specified index of these UTF-8 bytes.
- Throws:
IndexOutOfBoundsException - if the index argument is negative or not less than the length of these UTF-8 bytes.
-
getSegments
public MemorySegment[] getSegments()
Description copied from interface: BinaryFormat
Gets the underlying MemorySegments this binary format spans.
- Specified by: getSegments in interface BinaryFormat
- Overrides: getSegments in class LazyBinaryFormat<String>
-
getOffset
public int getOffset()
Description copied from interface: BinaryFormat
Gets the start offset of this binary data in the MemorySegments.
- Specified by: getOffset in interface BinaryFormat
- Overrides: getOffset in class LazyBinaryFormat<String>
-
getSizeInBytes
public int getSizeInBytes()
Description copied from interface: BinaryFormat
Gets the size in bytes of this binary data.
- Specified by: getSizeInBytes in interface BinaryFormat
- Overrides: getSizeInBytes in class LazyBinaryFormat<String>
-
ensureMaterialized
public void ensureMaterialized()
-
materialize
protected BinarySection materialize(TypeSerializer<String> serializer)
Description copied from class: LazyBinaryFormat
Materializes the Java object into binary format. Subclasses must hold whatever information they need to do so. (For example, RawValueData needs javaObjectSerializer.)
- Specified by: materialize in class LazyBinaryFormat<String>
-
copy
public BinaryStringData copy()
Returns a new copy of this BinaryStringData.
-
substring
public BinaryStringData substring(int beginIndex, int endIndex)
Returns a binary string that is a substring of this binary string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1.
Examples:
fromString("hamburger").substring(4, 8) returns binary string "urge"
fromString("smiles").substring(1, 5) returns binary string "mile"
- Parameters:
beginIndex - the beginning index, inclusive.
endIndex - the ending index, exclusive.
- Returns:
the specified substring; returns EMPTY_UTF8 when an index is out of bounds instead of throwing StringIndexOutOfBoundsException.
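The character-indexed, non-throwing behaviour described for substring can be sketched on raw UTF-8 bytes. The code below is a stand-alone illustration, not the method's implementation: Utf8SubstringSketch is hypothetical, it assumes valid UTF-8 input, and its exact handling of out-of-range indexes (empty result, or clamping at the end of the data) is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Flink: substring by character index over UTF-8 bytes.
final class Utf8SubstringSketch {

    // Length in bytes of the sequence introduced by a lead byte (valid UTF-8 assumed).
    static int seqLen(byte lead) {
        int v = lead & 0xFF;
        if (v < 0x80) return 1;  // 0xxxxxxx: ASCII
        if (v >= 0xF0) return 4; // 11110xxx
        if (v >= 0xE0) return 3; // 1110xxxx
        return 2;                // 110xxxxx
    }

    // Returns the bytes of characters [begin, end). Out-of-range input yields an
    // empty array (or a clamped range) instead of an exception.
    static byte[] substring(byte[] utf8, int begin, int end) {
        if (begin < 0 || end <= begin) {
            return new byte[0];
        }
        int i = 0; // byte offset of character 'begin'
        int c = 0; // characters skipped so far
        while (i < utf8.length && c < begin) {
            i += seqLen(utf8[i]);
            c++;
        }
        int j = i; // byte offset just past character 'end - 1'
        while (j < utf8.length && c < end) {
            j += seqLen(utf8[j]);
            c++;
        }
        return Arrays.copyOfRange(utf8, Math.min(i, utf8.length), Math.min(j, utf8.length));
    }

    // Convenience overload working on Java strings.
    static String substring(String s, int begin, int end) {
        byte[] out = substring(s.getBytes(StandardCharsets.UTF_8), begin, end);
        return new String(out, StandardCharsets.UTF_8);
    }
}
```

In this sketch, substring("hamburger", 4, 8) gives "urge" as in the examples above, while substring("abc", 5, 9) gives an empty string rather than throwing.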
-
contains
public boolean contains(BinaryStringData s)
Returns true if and only if this BinaryStringData contains the specified sequence of byte values.
- Parameters:
s - the sequence to search for
- Returns:
true if this BinaryStringData contains s, false otherwise
-
startsWith
public boolean startsWith(BinaryStringData prefix)
Tests if this BinaryStringData starts with the specified prefix.
- Parameters:
prefix - the prefix.
- Returns:
true if the bytes represented by the argument are a prefix of the bytes represented by this string; false otherwise. Note also that true will be returned if the argument is an empty BinaryStringData or is equal to this BinaryStringData object as determined by the equals(Object) method.
-
endsWith
public boolean endsWith(BinaryStringData suffix)
Tests if this BinaryStringData ends with the specified suffix.
- Parameters:
suffix - the suffix.
- Returns:
true if the bytes represented by the argument are a suffix of the bytes represented by this object; false otherwise. Note that the result will be true if the argument is the empty string or is equal to this BinaryStringData object as determined by the equals(Object) method.
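The prefix and suffix tests are described in terms of bytes, which can be sketched with a single region-match helper. This is a hedged illustration on plain byte arrays, not the class's implementation: Utf8AffixSketch and all of its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink: byte-wise prefix and suffix tests.
final class Utf8AffixSketch {

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // True if 'part' occurs in 'data' starting exactly at byte offset 'pos'.
    static boolean matchAt(byte[] data, byte[] part, int pos) {
        if (pos < 0 || pos + part.length > data.length) {
            return false;
        }
        for (int i = 0; i < part.length; i++) {
            if (data[pos + i] != part[i]) {
                return false;
            }
        }
        return true;
    }

    static boolean startsWith(String s, String prefix) {
        return matchAt(utf8(s), utf8(prefix), 0);
    }

    static boolean endsWith(String s, String suffix) {
        byte[] data = utf8(s);
        byte[] part = utf8(suffix);
        return matchAt(data, part, data.length - part.length);
    }
}
```

As in the descriptions above, an empty argument or an argument equal to the whole string matches in both directions, and an argument longer than the string never matches.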
-
trim
public BinaryStringData trim()
Returns a string whose value is this string, with any leading and trailing whitespace removed.
- Returns:
A string whose value is this string, with any leading and trailing whitespace removed, or this string if it has no leading or trailing whitespace.
-
indexOf
public int indexOf(BinaryStringData str, int fromIndex)
Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.
- Parameters:
str - the substring to search for.
fromIndex - the index from which to start the search.
- Returns:
the index of the first occurrence of the specified substring, starting at the specified index, or -1 if there is no such occurrence.
-
toUpperCase
public BinaryStringData toUpperCase()
Converts all of the characters in this BinaryStringData to upper case.
- Returns:
the BinaryStringData, converted to uppercase.
-
toLowerCase
public BinaryStringData toLowerCase()
Converts all of the characters in this BinaryStringData to lower case.
- Returns:
the BinaryStringData, converted to lowercase.
-
-