Class BinaryHashTable
- java.lang.Object
-
- org.apache.flink.table.runtime.hashtable.BaseHybridHashTable
-
- org.apache.flink.table.runtime.hashtable.BinaryHashTable
-
- All Implemented Interfaces:
MemorySegmentSource
,MemorySegmentPool
public class BinaryHashTable extends BaseHybridHashTable
An implementation of a Hybrid Hash Join. The join starts operating in memory and gradually starts spilling contents to disk, when the memory is not sufficient. It does not need to know a priority how large the input will be.The design of this class follows in many parts the design presented in "Hash joins and hash teams in Microsoft SQL Server", by Goetz Graefe et al. In its current state, the implementation lacks features like dynamic role reversal, partition tuning, or histogram guided partitioning.
-
-
Field Summary
-
Fields inherited from class org.apache.flink.table.runtime.hashtable.BaseHybridHashTable
buildRowCount, buildSpillRetBufferNumbers, buildSpillReturnBuffers, closed, compressionBlockSize, compressionCodecFactory, compressionEnabled, currentEnumerator, currentRecursionDepth, currentSpilledBuildSide, currentSpilledProbeSide, initPartitionFanOut, internalPool, ioManager, LOG, MAX_NUM_PARTITIONS, MAX_RECURSION_DEPTH, numSpillFiles, segmentSize, segmentSizeBits, segmentSizeMask, spillInBytes, totalNumBuffers, tryDistinctBuildRow
-
-
Constructor Summary
Constructors Constructor Description BinaryHashTable(Object owner, boolean compressionEnabled, int compressionBlockSize, AbstractRowDataSerializer buildSideSerializer, AbstractRowDataSerializer probeSideSerializer, Projection<RowData,BinaryRowData> buildSideProjection, Projection<RowData,BinaryRowData> probeSideProjection, MemoryManager memManager, long reservedMemorySize, IOManager ioManager, int avgRecordLen, long buildRowCount, boolean useBloomFilters, HashJoinType type, JoinCondition condFunc, boolean reverseJoin, boolean[] filterNulls, boolean tryDistinctBuildRow)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description void
clearPartitions()
This method clears all partitions currently residing (partially) in memory.void
endBuild()
End build phase.RowIterator<BinaryRowData>
getBuildSideIterator()
RowData
getCurrentProbeRow()
List<BinaryHashPartition>
getPartitionsPendingForSMJ()
RowIterator
getSpilledPartitionBuildSideIter(BinaryHashPartition p)
ProbeIterator
getSpilledPartitionProbeSideIter(BinaryHashPartition p)
boolean
nextMatching()
Next record from rebuilt spilled partition or build side outer partition.void
putBuildRow(RowData row)
Put a build side row to hash table.protected int
spillPartition()
Selects a partition and spills it.boolean
tryProbe(RowData record)
Find matched build side rows for a probe row.-
Methods inherited from class org.apache.flink.table.runtime.hashtable.BaseHybridHashTable
close, createInputView, ensureNumBuffersReturned, free, freeCurrent, freePages, getNextBuffer, getNextBuffers, getNotNullNextBuffer, getNumSpillFiles, getSpillInBytes, getUsedMemoryInBytes, hash, maxInitBufferOfBucketArea, maxNumPartition, nextSegment, pageSize, readAllBuffers, releaseMemoryCacheForSMJ, remainBuffers, returnAll, returnPage
-
-
-
-
Constructor Detail
-
BinaryHashTable
public BinaryHashTable(Object owner, boolean compressionEnabled, int compressionBlockSize, AbstractRowDataSerializer buildSideSerializer, AbstractRowDataSerializer probeSideSerializer, Projection<RowData,BinaryRowData> buildSideProjection, Projection<RowData,BinaryRowData> probeSideProjection, MemoryManager memManager, long reservedMemorySize, IOManager ioManager, int avgRecordLen, long buildRowCount, boolean useBloomFilters, HashJoinType type, JoinCondition condFunc, boolean reverseJoin, boolean[] filterNulls, boolean tryDistinctBuildRow)
-
-
Method Detail
-
putBuildRow
public void putBuildRow(RowData row) throws IOException
Put a build side row to hash table.- Throws:
IOException
-
endBuild
public void endBuild() throws IOException
End build phase.- Throws:
IOException
-
tryProbe
public boolean tryProbe(RowData record) throws IOException
Find matched build side rows for a probe row.- Returns:
- return false if the target partition has spilled, we will spill this probe row too. The row will be re-match in rebuild phase.
- Throws:
IOException
-
nextMatching
public boolean nextMatching() throws IOException
Next record from rebuilt spilled partition or build side outer partition.- Throws:
IOException
-
getCurrentProbeRow
public RowData getCurrentProbeRow()
-
getBuildSideIterator
public RowIterator<BinaryRowData> getBuildSideIterator()
-
clearPartitions
public void clearPartitions()
This method clears all partitions currently residing (partially) in memory. It releases all memory and deletes all spilled partitions.This method is intended for a hard cleanup in the case that the join is aborted.
- Specified by:
clearPartitions
in classBaseHybridHashTable
-
spillPartition
protected int spillPartition() throws IOException
Selects a partition and spills it. The number of the spilled partition is returned.- Specified by:
spillPartition
in classBaseHybridHashTable
- Returns:
- The number of the spilled partition.
- Throws:
IOException
-
getPartitionsPendingForSMJ
public List<BinaryHashPartition> getPartitionsPendingForSMJ()
-
getSpilledPartitionBuildSideIter
public RowIterator getSpilledPartitionBuildSideIter(BinaryHashPartition p) throws IOException
- Throws:
IOException
-
getSpilledPartitionProbeSideIter
public ProbeIterator getSpilledPartitionProbeSideIter(BinaryHashPartition p) throws IOException
- Throws:
IOException
-
-