Bloom Filters
How Bloom Filters Work A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is possibly in a set or definitely not in a set. It allows for false positives but never false negatives. One example of application is the membership query in Wide Column Storage, HBase. They make document lookup faster by completely skipping some HFiles. Structure and Initialization A Bloom filter consists of: A bit array of size $m$, initialized to all 0s. $k$ independent hash functions, each mapping an input element to one of the $m$ positions in the bit array. Common operations Insertion of an Element To insert an element $x$: ...