Class AbstractBytesHashMap<K>

  • Direct Known Subclasses:
    BytesHashMap, WindowBytesHashMap

    public abstract class AbstractBytesHashMap<K>
    extends BytesMap<K,BinaryRowData>
    Bytes-based hash map. It can be used for aggregations where the aggregated values are fixed-width. Because the data is stored in contiguous memory, a variable-length AggBuffer cannot be used with this hash map. The key-value layout is designed to reduce the cost of fetching keys during lookup. The memory is divided into two areas:

    Bucket area: pointer + hashcode.

    • Bytes 0 to 4: a pointer to the record in the record area
    • Bytes 4 to 8: key's full 32-bit hashcode
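
The bucket layout above can be sketched as follows. This is a minimal illustration only, assuming a plain java.nio.ByteBuffer as backing storage; the class BucketLayoutSketch and its methods are hypothetical names and are not part of this class's API.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the 8-byte bucket entry; not the actual implementation. */
class BucketLayoutSketch {
    /** Each bucket holds a 4-byte record pointer followed by a 4-byte hash code. */
    static final int BUCKET_SIZE = 8;

    /** Writes the pointer and hash code of the bucket at the given index. */
    static void putBucket(ByteBuffer buckets, int index, int recordPointer, int hashCode) {
        int base = index * BUCKET_SIZE;
        buckets.putInt(base, recordPointer); // bytes 0 to 4: pointer into the record area
        buckets.putInt(base + 4, hashCode);  // bytes 4 to 8: key's full 32-bit hash code
    }

    /** Reads the record pointer stored in the bucket at the given index. */
    static int pointerAt(ByteBuffer buckets, int index) {
        return buckets.getInt(index * BUCKET_SIZE);
    }

    /** Reads the hash code stored in the bucket at the given index. */
    static int hashAt(ByteBuffer buckets, int index) {
        return buckets.getInt(index * BUCKET_SIZE + 4);
    }

    public static void main(String[] args) {
        ByteBuffer buckets = ByteBuffer.allocate(4 * BUCKET_SIZE);
        putBucket(buckets, 2, 128, 0xCAFEBABE);
        System.out.println(pointerAt(buckets, 2));                   // prints 128
        System.out.println(Integer.toHexString(hashAt(buckets, 2))); // prints cafebabe
    }
}
```

Keeping the full hash code next to the pointer means a lookup can compare hash codes inside the bucket area and follow the pointer into the record area only when they match.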

    Record area: the actual data, stored as linked-list records. Each record has four parts:

    • Bytes 0 to 4: len(k)
    • Bytes 4 to 4 + len(k): key data
    • Bytes 4 + len(k) to 8 + len(k): len(v)
    • Bytes 8 + len(k) to 8 + len(k) + len(v): value data
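
The record layout above can be sketched as follows. This is a rough illustration under the assumption that keys and values are already serialized to byte arrays and stored in a java.nio.ByteBuffer; RecordLayoutSketch and its methods are hypothetical names, not part of this class's API.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the four-part record layout; not the actual implementation. */
class RecordLayoutSketch {
    /** Appends len(k), key, len(v), value at the current position; returns the record's offset. */
    static int appendRecord(ByteBuffer records, byte[] key, byte[] value) {
        int start = records.position();
        records.putInt(key.length);   // bytes 0 to 4: len(k)
        records.put(key);             // bytes 4 to 4 + len(k): key data
        records.putInt(value.length); // bytes 4 + len(k) to 8 + len(k): len(v)
        records.put(value);           // bytes 8 + len(k) to 8 + len(k) + len(v): value data
        return start;
    }

    /** Copies the key bytes of the record that starts at the given offset. */
    static byte[] readKey(ByteBuffer records, int offset) {
        int keyLen = records.getInt(offset);
        byte[] key = new byte[keyLen];
        for (int i = 0; i < keyLen; i++) {
            key[i] = records.get(offset + 4 + i);
        }
        return key;
    }

    /** Copies the value bytes of the record that starts at the given offset. */
    static byte[] readValue(ByteBuffer records, int offset) {
        int keyLen = records.getInt(offset);
        int valueLenPos = offset + 4 + keyLen; // len(v) sits right after the key data
        int valueLen = records.getInt(valueLenPos);
        byte[] value = new byte[valueLen];
        for (int i = 0; i < valueLen; i++) {
            value[i] = records.get(valueLenPos + 4 + i);
        }
        return value;
    }
}
```

In this sketch a record occupies 8 + len(k) + len(v) bytes, so a 2-byte key with an 8-byte value takes 18 bytes and the next appended record starts at offset 18.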

    BytesHashMap is influenced by Apache Spark's BytesToBytesMap.