public class BloomFilter extends Object
Internally, this Bloom filter implementation stores its BitSet in a MemorySegment. Both BloomFilter and BitSet are designed so that they can be switched between different MemorySegments, which lets Flink reuse the same BloomFilter/BitSet object instance for different Bloom filters.
Parts of this class are based on the implementation in the Apache Hive project: https://github.com/apache/hive/blob/master/common/src/java/org/apache/hive/common/util/BloomFilter.java
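The idea of re-pointing one object at different backing memory can be illustrated with a minimal, self-contained sketch. It is not Flink's code: `RelocatableBits` and `setLocation` are hypothetical names, and `java.nio.ByteBuffer` stands in for `MemorySegment` so that the example runs without Flink on the classpath.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a bit view that can be re-pointed at different backing memory. */
final class RelocatableBits {
    private ByteBuffer memory;      // stand-in for Flink's MemorySegment (assumption)
    private int offset;             // byte offset of this view inside the backing memory
    private final int byteLength;   // number of bytes this view covers

    RelocatableBits(int byteLength) {
        this.byteLength = byteLength;
    }

    /** Points the view at another region. No bits are copied or cleared. */
    void setLocation(ByteBuffer memory, int offset) {
        if (offset < 0 || offset + byteLength > memory.capacity()) {
            throw new IllegalArgumentException("region does not fit into the backing memory");
        }
        this.memory = memory;
        this.offset = offset;
    }

    /** Sets the bit at the given index within the current region. */
    void set(int index) {
        int pos = offset + (index >>> 3);
        memory.put(pos, (byte) (memory.get(pos) | (1 << (index & 7))));
    }

    /** Reads the bit at the given index within the current region. */
    boolean get(int index) {
        int pos = offset + (index >>> 3);
        return (memory.get(pos) & (1 << (index & 7))) != 0;
    }

    public static void main(String[] args) {
        RelocatableBits bits = new RelocatableBits(8);
        ByteBuffer first = ByteBuffer.allocate(16);
        ByteBuffer second = ByteBuffer.allocate(16);

        bits.setLocation(first, 0);
        bits.set(5);
        bits.setLocation(second, 8);   // same object, different filter contents
        System.out.println(bits.get(5));
        bits.setLocation(first, 0);    // the first region still holds its bits
        System.out.println(bits.get(5));
    }
}
```

Because the view holds no bits of its own, a single instance can serve many filters in turn, which is the reuse pattern the class description refers to.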
Modifier and Type | Field and Description |
---|---|
protected BitSet | bitSet |
protected int | numHashFunctions |
Constructor and Description |
---|
BloomFilter(int expectedEntries, int byteSize) |
Modifier and Type | Method and Description |
---|---|
void | addHash(int hash32) |
static double | estimateFalsePositiveProbability(long inputEntries, int bitSize): Computes the false positive probability for the given number of input entries and size in bits. |
static BloomFilter | fromBytes(byte[] bytes): Deserializes a byte array into a BloomFilter. |
static byte[] | mergeSerializedBloomFilters(byte[] bf1Bytes, byte[] bf2Bytes) |
static int | optimalNumOfBits(long inputEntries, double fpp): Computes the optimal number of bits for the given number of input entries and expected false positive probability. |
void | reset() |
void | setBitsLocation(MemorySegment memorySegment, int offset) |
boolean | testHash(int hash32) |
static byte[] | toBytes(BloomFilter filter): Serializes the filter to bytes. Note that only heap memory is currently supported. |
String | toString() |
protected BitSet bitSet
protected int numHashFunctions
public void setBitsLocation(MemorySegment memorySegment, int offset)
public static int optimalNumOfBits(long inputEntries, double fpp)
Parameters:
inputEntries - the number of input entries
fpp - the expected false positive probability
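For orientation, the textbook sizing rule for a Bloom filter with n entries and target false positive probability p is m = -n ln(p) / (ln 2)^2 bits. The sketch below applies that rule; it assumes, without confirmation from this page, that the method follows the standard formula, and `BloomSizing` is a hypothetical class name.

```java
/** Hypothetical sketch of the textbook Bloom filter sizing rule. */
final class BloomSizing {

    /** m = -n * ln(p) / (ln 2)^2, truncated to an int. */
    static int optimalNumOfBits(long inputEntries, double fpp) {
        if (inputEntries <= 0 || fpp <= 0.0 || fpp >= 1.0) {
            throw new IllegalArgumentException("need inputEntries > 0 and 0 < fpp < 1");
        }
        double ln2 = Math.log(2);
        // Casting a double beyond the int range saturates at Integer.MAX_VALUE in Java.
        return (int) (-inputEntries * Math.log(fpp) / (ln2 * ln2));
    }

    public static void main(String[] args) {
        // Roughly 9.6 bits per entry are needed for a 1% false positive rate.
        System.out.println(optimalNumOfBits(1_000_000L, 0.01));
    }
}
```

Halving the target probability adds about 1.44 bits per entry, so tightening fpp is cheap relative to the number of entries.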
public static double estimateFalsePositiveProbability(long inputEntries, int bitSize)
Parameters:
inputEntries - the number of input entries
bitSize - the size of the filter in bits
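The usual estimate for the false positive probability of a filter with m bits, n entries, and k hash functions is (1 - e^(-kn/m))^k, with k chosen as roughly (m/n) ln 2. The following sketch computes that estimate; whether this method uses exactly the same expression is an assumption, and `BloomFpp` and `optimalNumOfHashFunctions` are hypothetical names.

```java
/** Hypothetical sketch of the standard false positive estimate. */
final class BloomFpp {

    /** k = round(m / n * ln 2), but at least one hash function. */
    static int optimalNumOfHashFunctions(long inputEntries, long bitSize) {
        return Math.max(1, (int) Math.round((double) bitSize / inputEntries * Math.log(2)));
    }

    /** p = (1 - e^(-k * n / m))^k */
    static double estimateFalsePositiveProbability(long inputEntries, int bitSize) {
        if (inputEntries <= 0 || bitSize <= 0) {
            throw new IllegalArgumentException("inputEntries and bitSize must be positive");
        }
        int k = optimalNumOfHashFunctions(inputEntries, bitSize);
        // Probability that one given bit is set after all entries were added.
        double bitSetProbability = 1.0 - Math.exp(-(double) k * inputEntries / bitSize);
        return Math.pow(bitSetProbability, k);
    }

    public static void main(String[] args) {
        // About 9.6 bits per entry should give an estimate close to 1%.
        System.out.println(estimateFalsePositiveProbability(1_000_000L, 9_585_058));
    }
}
```

The estimate is useful as a sanity check when the available memory, rather than the target probability, dictates the filter size.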
public void addHash(int hash32)
public boolean testHash(int hash32)
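How a single 32-bit hash can drive several probe positions is easiest to see in code. The sketch below uses a common double-hashing scheme, similar in spirit to the Hive implementation cited above; it is not confirmed to match Flink's behavior, `HashProbeSketch` is a hypothetical class, and `java.util.BitSet` replaces the MemorySegment-backed BitSet.

```java
import java.util.BitSet;

/** Hypothetical sketch of addHash/testHash using double hashing on one 32-bit hash. */
final class HashProbeSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashFunctions;

    HashProbeSketch(int numBits, int numHashFunctions) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashFunctions = numHashFunctions;
    }

    /** Derives the i-th probe position from the hash and its upper 16 bits. */
    private int position(int hash32, int i) {
        int combined = hash32 + i * (hash32 >>> 16);
        if (combined < 0) {
            combined = ~combined; // flip to a non-negative value
        }
        return combined % numBits;
    }

    /** Sets every probe position for the given hash. */
    void addHash(int hash32) {
        for (int i = 1; i <= numHashFunctions; i++) {
            bits.set(position(hash32, i));
        }
    }

    /** Returns false if the hash was definitely never added, true if it may have been. */
    boolean testHash(int hash32) {
        for (int i = 1; i <= numHashFunctions; i++) {
            if (!bits.get(position(hash32, i))) {
                return false;
            }
        }
        return true;
    }

    /** Clears all bits so the filter can be reused. */
    void reset() {
        bits.clear();
    }

    public static void main(String[] args) {
        HashProbeSketch filter = new HashProbeSketch(1024, 3);
        filter.addHash("flink".hashCode());
        System.out.println(filter.testHash("flink".hashCode()));
    }
}
```

A `true` result only means "possibly present"; callers still need an exact check when false positives are not acceptable.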
public void reset()
public static byte[] toBytes(BloomFilter filter)
public static BloomFilter fromBytes(byte[] bytes)
public static byte[] mergeSerializedBloomFilters(byte[] bf1Bytes, byte[] bf2Bytes)
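Merging two Bloom filters of identical size and hash count generally amounts to a bitwise OR of their bit arrays. The sketch below shows that idea on a made-up serialized layout (a 4-byte hash function count followed by the bit bytes); the actual byte format produced by toBytes is not documented on this page, and `MergeSketch` is a hypothetical class.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: merge two serialized filters by OR-ing their bit bytes. */
final class MergeSketch {

    /** Assumed layout (not Flink's documented format): [int numHashFunctions][bit bytes...]. */
    static byte[] merge(byte[] first, byte[] second) {
        if (first.length < Integer.BYTES || first.length != second.length) {
            throw new IllegalArgumentException("filters must have the same serialized size");
        }
        int firstHashCount = ByteBuffer.wrap(first).getInt(0);
        int secondHashCount = ByteBuffer.wrap(second).getInt(0);
        if (firstHashCount != secondHashCount) {
            throw new IllegalArgumentException("filters must use the same number of hash functions");
        }
        byte[] merged = first.clone(); // keeps the header, leaves the inputs untouched
        for (int i = Integer.BYTES; i < merged.length; i++) {
            merged[i] |= second[i];    // a bit is set if it is set in either input
        }
        return merged;
    }

    public static void main(String[] args) {
        byte[] a = {0, 0, 0, 3, 0b0101, 0};
        byte[] b = {0, 0, 0, 3, 0b0011, (byte) 0x80};
        byte[] merged = merge(a, b);
        System.out.println(Integer.toBinaryString(merged[4]));
    }
}
```

The merged filter answers "possibly present" for every entry added to either input, at the cost of a higher false positive rate than each input had on its own.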
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.