Class RTree
java.lang.Object
com.apple.foundationdb.async.rtree.RTree
An implementation of an R-tree. See this} link for a general
introduction to R-trees.
The main use-case for R-trees, while they are normally used for spatial querying of actual objects in N-dimensional space, is to function as a natural extension of regular B+-tree-based indexes in FDB, but spanning into multiple dimensions. That allows to answer queries using multiple inequalities which is not possible with 1-D indexes.
Here is a short introduction copied from the explanation at the linked wikipedia page. The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree; the "R" in R-tree is for rectangle. Since all objects lie within this bounding rectangle, a query that does not intersect the bounding rectangle also cannot intersect any of the contained objects. At the leaf level, each rectangle describes a single object; at higher levels the aggregation includes an increasing number of objects. This can also be seen as an increasingly coarse approximation of the data set.
Similar to the B-tree, the R-tree is also a balanced search tree (so all leaf nodes are at the same depth), organizes the data in pages/nodes, and is designed for storage on disk. Each node can contain a maximum number of entries, often denoted as
One of the key properties of an R-tree is that the minimum bounding rectangles (MBR) of the children of a node in the tree may overlap which may cause multiple children to intersect with a query even if that query's mbr is just a single point. An object is only stored in exactly one leaf node of the tree, however, during a search of the tree multiple paths may have to exhaustively followed in order go find all matching objects of that query.
The search performance of the tree is strongly linked to the size of the area being covered by a child (as indicated by a child's mbr) as well as the overlap among children's mbrs at each level of the tree. The key difficulty in constructing a search-efficient tree is to minimize both covered area and the overlap while keeping the tree balanced. Variants of the R-tree such as R+-trees and R*-trees employ different techniques to improve on the basic R-tree ideas and are provably superior with respect to packing of the data structure as well as search performance. These improvements are accomplished by a more complex write path; R+-trees strive to eliminate overlap altogether which becomes more problematic with higher dimensionality while R*-trees attempt to minimize both the covered area by a node and the sibling overlap by approximations as well as re-insertions in order to avoid node-splits. For more information about R+-trees see R+-tree, for more information about R*-trees see R*-tree.
All variants of R-tree mentioned so far have a fatal flaw when considered in context with FDB and specifically the record layer. None of the R-tree variants can return their elements in a stable order that is not sensitive to the physical layout of the tree at query time. That proves to be problematic for queries that are continued at a later time as the physical structure of the tree may have changed due to re-balancing. Thus, it would become necessary to encode all already returned items into the continuation which is simply not feasible.
Another variant (the one we implement here) is a Hilbert R-tree. See Hilbert R-tree for details. In short, the Hilbert R-tree, in addition to being a regular R-tree, also utilizes the Hilbert value (see
Clustering based on the Hilbert value has been proven to be superior compared to R-trees, R+-trees, and R*-trees. A disadvantage of a Hilbert R-tree is the definition of the canvas the Hilbert curve is defined over. While there are ways to define a Hilbert curve for floating point coordinates, we cannot support variable length dimensions such as strings. In fact, we only support INT32 and INT64 dimensions.
The main use-case for R-trees, while they are normally used for spatial querying of actual objects in N-dimensional space, is to function as a natural extension of regular B+-tree-based indexes in FDB, but spanning into multiple dimensions. That allows to answer queries using multiple inequalities which is not possible with 1-D indexes.
Here is a short introduction copied from the explanation at the linked wikipedia page. The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree; the "R" in R-tree is for rectangle. Since all objects lie within this bounding rectangle, a query that does not intersect the bounding rectangle also cannot intersect any of the contained objects. At the leaf level, each rectangle describes a single object; at higher levels the aggregation includes an increasing number of objects. This can also be seen as an increasingly coarse approximation of the data set.
Similar to the B-tree, the R-tree is also a balanced search tree (so all leaf nodes are at the same depth), organizes the data in pages/nodes, and is designed for storage on disk. Each node can contain a maximum number of entries, often denoted as
M
. It also guarantees a minimum fill (except for the root node).
One of the key properties of an R-tree is that the minimum bounding rectangles (MBR) of the children of a node in the tree may overlap which may cause multiple children to intersect with a query even if that query's mbr is just a single point. An object is only stored in exactly one leaf node of the tree, however, during a search of the tree multiple paths may have to exhaustively followed in order go find all matching objects of that query.
The search performance of the tree is strongly linked to the size of the area being covered by a child (as indicated by a child's mbr) as well as the overlap among children's mbrs at each level of the tree. The key difficulty in constructing a search-efficient tree is to minimize both covered area and the overlap while keeping the tree balanced. Variants of the R-tree such as R+-trees and R*-trees employ different techniques to improve on the basic R-tree ideas and are provably superior with respect to packing of the data structure as well as search performance. These improvements are accomplished by a more complex write path; R+-trees strive to eliminate overlap altogether which becomes more problematic with higher dimensionality while R*-trees attempt to minimize both the covered area by a node and the sibling overlap by approximations as well as re-insertions in order to avoid node-splits. For more information about R+-trees see R+-tree, for more information about R*-trees see R*-tree.
All variants of R-tree mentioned so far have a fatal flaw when considered in context with FDB and specifically the record layer. None of the R-tree variants can return their elements in a stable order that is not sensitive to the physical layout of the tree at query time. That proves to be problematic for queries that are continued at a later time as the physical structure of the tree may have changed due to re-balancing. Thus, it would become necessary to encode all already returned items into the continuation which is simply not feasible.
Another variant (the one we implement here) is a Hilbert R-tree. See Hilbert R-tree for details. In short, the Hilbert R-tree, in addition to being a regular R-tree, also utilizes the Hilbert value (see
RTreeHilbertCurveHelpers
) of the center of an mbr of an object (or the point itself if the object is a
point) to establish an ordering among objects and nodes stored in the tree. All traversals of the tree return objects
in Hilbert Value order. The Hilbert value usually is a BigInteger
that can be encoded into the continuation
of a query thus overcoming the fundamental problems plaguing other variants of the R-trees as mentioned above. In
addition to a stable and logical traversal order, the Hilbert value is used to naturally cluster the tree as similar
values in Hilbert space map to nearby points in N-dimensional Euclidean space. Lastly, the Hilbert value is also used
to avoid eager node-splitting during insertions as well as eager node-fusing during deletions as it defines a natural
order between siblings. A node can transfer empty slots from their siblings (for insertions) or children
(for deletions). In this way the tree is packed more tightly and costly re-balancing can be avoided while we still
do not have to resort to re-insertions of overflowing children.
Clustering based on the Hilbert value has been proven to be superior compared to R-trees, R+-trees, and R*-trees. A disadvantage of a Hilbert R-tree is the definition of the canvas the Hilbert curve is defined over. While there are ways to define a Hilbert curve for floating point coordinates, we cannot support variable length dimensions such as strings. In fact, we only support INT32 and INT64 dimensions.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic class
Configuration settings for aRTree
.static class
Builder forRTree.Config
.static class
Iterator for iterating the items contained in the leaf nodes produced by an underlyingRTree.LeafIterator
.static class
Class to capture an N-dimensional point.static class
Class to capture an N-dimensional rectangle/cube/hypercube.static enum
Different kinds of storage layouts. -
Field Summary
FieldsModifier and TypeFieldDescriptionstatic final RTree.Config
static final int
The maximum number of slots a node has.static final int
The minimum number of slots a node has (if not the root node).static final int
The magic split number.static final RTree.Storage
Default storage layout.static final boolean
Indicator if Hilbert values should be stored or not with the data (in leaf nodes).static final boolean
Indicator if we should maintain a secondary node index consisting of hilbet value and key to speed up update/deletes.static final int
-
Constructor Summary
ConstructorsConstructorDescriptionRTree
(Subspace subspace, Subspace nodeSlotIndexSubspace, Executor executor, RTree.Config config, Function<RTree.Point, BigInteger> hilbertValueFunction, Supplier<byte[]> nodeIdSupplier, OnWriteListener onWriteListener, OnReadListener onReadListener) Initialize a new R-tree.RTree
(Subspace subspace, Subspace secondarySubspace, Executor executor, Function<RTree.Point, BigInteger> hilbertValueFunction) Initialize a new R-tree with the default configuration. -
Method Summary
Modifier and TypeMethodDescriptiondelete
(TransactionContext tc, RTree.Point point, Tuple keySuffix) Method to delete from the R-tree.int
depth
(TransactionContext transactionContext) Method to compute the depth of this R-tree.Get this r-tree's configuration.Get the executer used by this r-tree.Get the on-read listener.Get the on-write listener.insertOrUpdate
(TransactionContext tc, RTree.Point point, Tuple keySuffix, Tuple value) Method to insert an object/item into the R-tree.static RTree.ConfigBuilder
Start building aRTree.Config
.scan
(ReadTransaction readTransaction, BigInteger lastHilbertValue, Tuple lastKey, Predicate<RTree.Rectangle> mbrPredicate, BiPredicate<Tuple, Tuple> suffixKeyPredicate) Perform a scan over the tree within the transaction passed in using a predicate that is also passed in to eliminate subtrees from the scan.scan
(ReadTransaction readTransaction, Predicate<RTree.Rectangle> mbrPredicate, BiPredicate<Tuple, Tuple> suffixKeyPredicate) Perform a scan over the tree within the transaction passed in using a predicate that is also passed in to eliminate subtrees from the scan.void
Method to validate the Hilbert R-tree.void
Method to validate the Hilbert R-tree.
-
Field Details
-
MAX_CONCURRENT_READS
public static final int MAX_CONCURRENT_READS- See Also:
-
DEFAULT_USE_NODE_SLOT_INDEX
public static final boolean DEFAULT_USE_NODE_SLOT_INDEXIndicator if we should maintain a secondary node index consisting of hilbet value and key to speed up update/deletes.- See Also:
-
DEFAULT_MIN_M
public static final int DEFAULT_MIN_MThe minimum number of slots a node has (if not the root node).M
should be chosen in a way that the minimum is half of the maximum. That in turn guarantees that overflow/underflow handling can be performed without causing further underflow/overflow.- See Also:
-
DEFAULT_MAX_M
public static final int DEFAULT_MAX_MThe maximum number of slots a node has. This value is derived fromDEFAULT_MIN_M
.- See Also:
-
DEFAULT_S
public static final int DEFAULT_SThe magic split number. We splitS
toS + 1
nodes while inserting data and fuseS + 1
toS
nodes while deleting data. Academically, 2-to-3 splits and 3-to-2 fuses seem to yield the best results. Please be aware of the following constraints:- When splitting
S
toS + 1
nodes, we re-distribute the children ofS
nodes intoS + 1
nodes which may cause an underflow ifS
andM
are not set carefully with respect to each other. Example:MIN_M = 25
,MAX_M = 32
,S = 2
, two nodes at already at maximum capacity containing a combined total of 64 children when a new child is inserted. We split the two nodes into three as indicated byS = 2
. We have 65 children but there is no way of distributing them among three nodes such that none of them underflows. This constraint can be formulated asS * MAX_M / (S + 1) >= MIN_M
. - When fusing
S + 1
toS
nodes, we re-distribute the children ofS + 1
nodes intoS + 1
nodes which may cause an overflow ifS
andM
are not set carefully with respect to each other. Example:MIN_M = 25
,MAX_M = 32
,S = 2
, three nodes at already at minimum capacity containing a combined total of 75 children when a child is deleted. We fuse the three nodes into two as indicated byS = 2
. We have 75 children but there is no way of distributing them among two nodes such that none of them overflows. This constraint can be formulated as(S + 1) * MIN_M / S <= MAX_M
.
MAX_M / MIN_M >= (S + 1) / S
.- See Also:
- When splitting
-
DEFAULT_STORAGE
Default storage layout. Can be eitherBY_SLOT
orBY_NODE
.BY_SLOT
encodes all information pertaining to aNodeSlot
as one key/value pair in the database;BY_NODE
encodes all information pertaining to aNode
as one key/value pair in the database. WhileBY_SLOT
avoids conflicts as most inserts/updates only need to update one slot, it is by far less compact as some information is stored in a normalized fashion and therefore repeated multiple times (i.e. node identifiers, etc.).BY_NODE
inlines slot information into the node leading to a more size-efficient layout of the data. That advantage is offset by a higher likelihood of conflicts. -
DEFAULT_STORE_HILBERT_VALUES
public static final boolean DEFAULT_STORE_HILBERT_VALUESIndicator if Hilbert values should be stored or not with the data (in leaf nodes). A Hilbert value can always be recomputed from the point.- See Also:
-
DEFAULT_CONFIG
-
-
Constructor Details
-
RTree
public RTree(@Nonnull Subspace subspace, @Nonnull Subspace secondarySubspace, @Nonnull Executor executor, @Nonnull Function<RTree.Point, BigInteger> hilbertValueFunction) Initialize a new R-tree with the default configuration.- Parameters:
subspace
- the subspace where the r-tree is storedsecondarySubspace
- the subspace where the node index (if used is stored)executor
- an executor to use when running asynchronous taskshilbertValueFunction
- function to compute the Hilbert value from aRTree.Point
-
RTree
public RTree(@Nonnull Subspace subspace, @Nonnull Subspace nodeSlotIndexSubspace, @Nonnull Executor executor, @Nonnull RTree.Config config, @Nonnull Function<RTree.Point, BigInteger> hilbertValueFunction, @Nonnull Supplier<byte[]> nodeIdSupplier, @Nonnull OnWriteListener onWriteListener, @Nonnull OnReadListener onReadListener) Initialize a new R-tree.- Parameters:
subspace
- the subspace where the r-tree is storednodeSlotIndexSubspace
- the subspace where the node index (if used is stored)executor
- an executor to use when running asynchronous tasksconfig
- configuration to usehilbertValueFunction
- function to compute the Hilbert value for aRTree.Point
nodeIdSupplier
- supplier to be invoked when new nodes are createdonWriteListener
- an on-write listener to be called after writes take placeonReadListener
- an on-read listener to be called after reads take place
-
-
Method Details
-
newConfigBuilder
Start building aRTree.Config
.- Returns:
- a new
Config
that can be altered and then built for use with aRTree
- See Also:
-
getExecutor
Get the executer used by this r-tree.- Returns:
- executor used when running asynchronous tasks
-
getConfig
Get this r-tree's configuration.- Returns:
- r-tree configuration
-
getOnWriteListener
Get the on-write listener.- Returns:
- the on-write listener
-
getOnReadListener
Get the on-read listener.- Returns:
- the on-read listener
-
scan
@Nonnull public AsyncIterator<ItemSlot> scan(@Nonnull ReadTransaction readTransaction, @Nonnull Predicate<RTree.Rectangle> mbrPredicate, @Nonnull BiPredicate<Tuple, Tuple> suffixKeyPredicate) Perform a scan over the tree within the transaction passed in using a predicate that is also passed in to eliminate subtrees from the scan. This predicate may be stateful which allows for dynamic adjustments of the queried area while the scan is active.
A scan of the tree offers all items that pass thembrPredicate
test in Hilbert Value order using anAsyncIterator
. The predicate that is passed in is applied to intermediate nodes as well as leaf nodes, but not to elements contained by a leaf node. The caller should filter out items in a downstream operation. A scan of the tree will not prefetch the next node before the items of the current node have been consumed. This guarantees that the semantics of the mbr predicate can be adapted in response to the items being consumed. (this allows for efficient scans forORDER BY x, y LIMIT n
queries).- Parameters:
readTransaction
- the transaction to usembrPredicate
- a predicate on an mbrRTree.Rectangle
suffixKeyPredicate
- a predicate on the suffix key- Returns:
- an
AsyncIterator
ofItemSlot
s.
-
scan
@Nonnull public AsyncIterator<ItemSlot> scan(@Nonnull ReadTransaction readTransaction, @Nullable BigInteger lastHilbertValue, @Nullable Tuple lastKey, @Nonnull Predicate<RTree.Rectangle> mbrPredicate, @Nonnull BiPredicate<Tuple, Tuple> suffixKeyPredicate) Perform a scan over the tree within the transaction passed in using a predicate that is also passed in to eliminate subtrees from the scan. This predicate may be stateful which allows for dynamic adjustments of the queried area while the scan is active.
A scan of the tree offers all items that pass thembrPredicate
test in Hilbert Value order using anAsyncIterator
. The predicate that is passed in is applied to intermediate nodes as well as leaf nodes, but not to elements contained in a leaf node. The caller should filter out items in a downstream operation. A scan of the tree will not prefetch the next node before the items of the current node have been consumed. This guarantees that the semantics of the mbr predicate can be adapted in response to the items being consumed. (this allows for efficient scans forORDER BY x, y LIMIT n
queries).- Parameters:
readTransaction
- the transaction to uselastHilbertValue
- the last Hilbert value that was returned by a previous call to this methodlastKey
- the last key that was returned by a previous call to this methodmbrPredicate
- a predicate on an mbrRTree.Rectangle
suffixKeyPredicate
- a predicate on the suffix key- Returns:
- an
AsyncIterator
ofItemSlot
s.
-
insertOrUpdate
@Nonnull public CompletableFuture<Void> insertOrUpdate(@Nonnull TransactionContext tc, @Nonnull RTree.Point point, @Nonnull Tuple keySuffix, @Nonnull Tuple value) Method to insert an object/item into the R-tree. The item is treated unique per its point in space as well as its additional key that is also passed in. The Hilbert value of the point is passed in as to allow the caller to compute Hilbert values themselves. Note that there is a bijective mapping between point and Hilbert value which allows us to recompute point from Hilbert value as well as Hilbert value from point. We currently treat point and Hilbert value independent, however, they are redundant and not independent at all. The implication is that we do not have to store both point and Hilbert value (but we currently do).- Parameters:
tc
- transaction contextpoint
- the point to be used in spacekeySuffix
- the additional key to be stored with the itemvalue
- the additional value to be stored with the item- Returns:
- a completable future that completes when the insert is completed
-
delete
@Nonnull public CompletableFuture<Void> delete(@Nonnull TransactionContext tc, @Nonnull RTree.Point point, @Nonnull Tuple keySuffix) Method to delete from the R-tree. The item is treated unique per its point in space as well as its additional key that is passed in.- Parameters:
tc
- transaction contextpoint
- the pointkeySuffix
- the additional key to be stored with the item- Returns:
- a completable future that completes when the delete operation is completed
-
depth
Method to compute the depth of this R-tree.- Parameters:
transactionContext
- transaction context to be used- Returns:
- the depth of the R-tree
-
validate
Method to validate the Hilbert R-tree.- Parameters:
db
- the database to use
-
validate
Method to validate the Hilbert R-tree.- Parameters:
db
- the database to usemaxNumNodesToBeValidated
- a maximum number of nodes this call should attempt to validate
-