Understanding How Bitmaps Identify Distinct Values¶
A bitmap is effectively a byte array in which each bit (zero-based) represents existence (bit = 1) or absence (bit = 0) of an integer in a table. For example,
SELECT bitmap_construct_agg(val) AS bitmap FROM bitmap_test_values;
Returns a byte array bitmap
where if bitmap[i][j] = 1
, it means that value = 8 * i + j
exists in the table.
The data can be sparse and very big. To be more efficient, we group the data into “buckets”. A distinct value is represented by the combination of a “bucket number” and a bit that is set in that bitmap. To identify the bucket and the bit that represents a specific value, use the following functions:
BITMAP_BUCKET_NUMBER to find the bucket.
BITMAP_BIT_POSITION to find the zero-based position of the bit.
In the FDB Relational Layer, the fixed bitmap bucket size is 10,000 bits (1250 bytes). For example:
|
|
|
---|---|---|
|
|
|
|
|
|
|
|
|
If we create and populate the table bitmap_test_values
:
CREATE TABLE bitmap_test_values (val INT);
insert into bitmap_test_values values (1), (20003);
Then the following query:
select bitmap_bucket_number(val) as offset,
bitmap_construct_agg(bitmap_bit_position(val)) as bitmap
from bitmap_test_values
group by offset;
Returns:
|
|
---|---|
|
|
|
|