Nearest neighbor search

by k-dimensional tree traversal

Nearest neighbor search (NNS) is a common optimization problem: finding the point in a given set that is closest, or most similar, to a given query point. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. More formally, given a set S of n data points in a metric space X, the nearest-neighbor search problem is that of pre-processing these points so that, given any query point q ∈ X, the data point nearest to q can be reported quickly.

Fig. 1: Nearest neighbors of roving query points

There are many possible choices of the metric space and the dissimilarity function, but most commonly the space is ℝd, real d-dimensional space, and dissimilarity is expressed as a Minkowski Lm distance metric. For any integer m ≥ 1, the Lm-distance between points p = (p1, p2, …, pd) and q = (q1, q2, …, qd) in ℝd is defined to be the m-th root of Σ1 ≤ i ≤ d |pi − qi|m. The L1, L2, and L∞ metrics are the well-known Manhattan, Euclidean, and max metrics, respectively.
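As a concrete illustration, here is a minimal Python sketch of these metrics; the function names are chosen for this example only and are not part of any library.

    def minkowski_distance(p, q, m=2):
        # L_m distance between points p and q in R^d; m=2 gives Euclidean.
        return sum(abs(pi - qi) ** m for pi, qi in zip(p, q)) ** (1.0 / m)

    def max_distance(p, q):
        # The limiting L-infinity (max) metric.
        return max(abs(pi - qi) for pi, qi in zip(p, q))

For example, minkowski_distance((0, 0), (3, 4)) returns the Euclidean distance 5.0, while the m=1 case gives the Manhattan distance 7.0.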

NNS has applications in many areas including knowledge discovery and data mining, pattern recognition and classification, machine learning, data compression, multimedia databases, document retrieval, and statistics. NNS was also among the first algorithms employed in solving the travelling salesman problem: starting at a random city, repeatedly visit the nearest city until all cities have been visited.
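The greedy tour just described is easy to sketch in Python; the fragment below is only an illustration, in which cities are assumed to be (x, y) tuples and a brute-force linear scan stands in for a spatial index.

    import math

    def nearest_neighbor_tour(cities, start=0):
        unvisited = set(range(len(cities)))
        tour = [start]
        unvisited.remove(start)
        while unvisited:
            last = cities[tour[-1]]
            # Greedily visit whichever unvisited city is nearest.
            nxt = min(unvisited, key=lambda i: math.dist(last, cities[i]))
            tour.append(nxt)
            unvisited.remove(nxt)
        return tour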

Here, we restrict attention to solutions using data structures of size O(dn) based on hierarchical spatial decompositions, and, in particular, the kd-tree, because of the simplicity and widespread popularity of this data structure. A kd-tree is a binary tree that hierarchically subdivides k-dimensional space with hyperplanes orthogonal to the coordinate axes. There are, however, a number of other data structures for nearest neighbor searching based on hierarchical spatial decompositions: vp-trees, R-trees, X-trees, SR-trees, TV-trees, etc.

A key issue in the design of the kd-tree is the choice of the splitting hyperplane. Friedman, Bentley, and Finkel proposed a splitting method that selects the plane orthogonal to the coordinate axis along which the points have the greatest spread, positioned at the median coordinate. Another common alternative uses the shape of the cell, rather than the distribution of the data points, and splits each cell through its midpoint by a hyperplane orthogonal to its longest side. Yet another splitting method was introduced by Mount and Arya in the ANN library for approximate nearest neighbor searching. They proposed sliding the splitting plane toward the remaining points whenever a midpoint split would otherwise leave a subcell empty. When the data is highly clustered in low-dimensional subspaces, this approach can avoid the degraded query times caused by highly elongated subcells.
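To make the first rule concrete, here is a small Python sketch of the spread-based split; the helper name choose_split is ours, and points are assumed to be equal-length coordinate tuples.

    def choose_split(points):
        # Friedman-Bentley-Finkel rule: split on the axis of greatest
        # spread, at the median coordinate along that axis.
        d = len(points[0])
        spread = lambda i: max(p[i] for p in points) - min(p[i] for p in points)
        axis = max(range(d), key=spread)
        coords = sorted(p[axis] for p in points)
        return axis, coords[len(coords) // 2]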

Fig. 2: kd-tree partitioning by median

In the canonical method of kd-tree construction, as one traverses down the tree, one cycles through the axes used to select the splitting planes. For example, in a three-dimensional kd-tree, the root may have an x-aligned splitting plane, the root's children may have y-aligned planes, the root's grandchildren may all have z-aligned planes, and the root's great-grandchildren may all have x-aligned planes. Points are then inserted by selecting the median of the points being put into the subtree, with respect to their coordinates in the axis being used to create the splitting plane. This method leads to a balanced kd-tree, in which each leaf node is approximately the same distance from the root.
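A minimal Python sketch of this canonical construction follows; the Node layout is an assumption of this example rather than a fixed API.

    from collections import namedtuple

    Node = namedtuple("Node", ["point", "axis", "left", "right"])

    def build_kdtree(points, depth=0):
        if not points:
            return None
        axis = depth % len(points[0])          # cycle through the axes
        points = sorted(points, key=lambda p: p[axis])
        mid = len(points) // 2                 # median along this axis
        return Node(point=points[mid], axis=axis,
                    left=build_kdtree(points[:mid], depth + 1),
                    right=build_kdtree(points[mid + 1:], depth + 1))

Sorting at every level makes this sketch O(n log² n); production implementations typically pre-sort the points or use linear-time median selection to reach O(n log n).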

Fig. 3: Nearest neighbor search via kd-tree traversal

Once the tree is constructed, a nearest neighbor can be found efficiently by using the tree's structure to quickly eliminate large portions of the search space. Starting at the root, the tree is traversed recursively, descending into the branch selected by comparing the query point to the current node's coordinate along its splitting dimension. Upon reaching a leaf node, the distance to the query is calculated, and that point is saved as the current best. As the recursion unwinds, the current best is updated whenever a node's point, or a point on the other side of its splitting plane, turns out to be closer.

Conceptually, this is accomplished by intersecting the splitting hyperplane with a hypersphere centered at the query point whose radius equals the current nearest distance. If the hypersphere crosses the plane, there may be nearer points on the other side, so the other branch must also be traversed. If the hypersphere fails to intersect the splitting plane, the entire opposite branch can be eliminated, and the current best propagates upward. When the recursion unwinds back to the root node, the search is complete.
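Putting the two preceding paragraphs together, here is a sketch of the search in Python. It reuses the Node layout from the construction sketch and works with squared Euclidean distances so that the hypersphere test needs no square roots; all names are illustrative.

    from collections import namedtuple

    Node = namedtuple("Node", ["point", "axis", "left", "right"])

    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def nearest(node, query, best=None, best_d=float("inf")):
        if node is None:
            return best, best_d
        # Update the current best if this node's point is closer.
        d = sq_dist(node.point, query)
        if d < best_d:
            best, best_d = node.point, d
        # Descend first into the side of the plane containing the query.
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        best, best_d = nearest(near, query, best, best_d)
        # Search the far side only if the hypersphere of squared radius
        # best_d crosses the splitting hyperplane.
        if diff * diff < best_d:
            best, best_d = nearest(far, query, best, best_d)
        return best, best_d

Combined with the construction sketch, a query is then nearest(build_kdtree(points), q), which returns the nearest point together with its squared distance.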