kiddo

A fork of kdtree. Refactored to use const generics, with some performance improvements and extra features. Thanks and kudos to mrhooray for the original kdtree library on which kiddo is based.

Differences vs kdtree@0.6.0

Usage

Add kiddo to Cargo.toml toml [dependencies] kiddo = "0.2.1"

Add points to kdtree and query nearest n points with distance function ```rust use kiddo::KdTree; use kiddo::ErrorKind; use kiddo::distance::squared_euclidean;

let a: ([f64; 2], usize) = ([0f64, 0f64], 0); let b: ([f64; 2], usize) = ([1f64, 1f64], 1); let c: ([f64; 2], usize) = ([2f64, 2f64], 2); let d: ([f64; 2], usize) = ([3f64, 3f64], 3);

let mut kdtree = KdTree::new()?;

kdtree.add(&a.0, a.1)?; kdtree.add(&b.0, b.1)?; kdtree.add(&c.0, c.1)?; kdtree.add(&d.0, d.1)?;

assert_eq!(kdtree.size(), 4);

asserteq!( kdtree.nearest(&a.0, 0, &squaredeuclidean).unwrap(), vec![] ); asserteq!( kdtree.nearest(&a.0, 1, &squaredeuclidean).unwrap(), vec![(0f64, &0)] ); asserteq!( kdtree.nearest(&a.0, 2, &squaredeuclidean).unwrap(), vec![(0f64, &0), (2f64, &1)] ); asserteq!( kdtree.nearest(&a.0, 3, &squaredeuclidean).unwrap(), vec![(0f64, &0), (2f64, &1), (8f64, &2)] ); asserteq!( kdtree.nearest(&a.0, 4, &squaredeuclidean).unwrap(), vec![(0f64, &0), (2f64, &1), (8f64, &2), (18f64, &3)] ); asserteq!( kdtree.nearest(&a.0, 5, &squaredeuclidean).unwrap(), vec![(0f64, &0), (2f64, &1), (8f64, &2), (18f64, &3)] ); asserteq!( kdtree.nearest(&b.0, 4, &squaredeuclidean).unwrap(), vec![(0f64, &1), (2f64, &0), (2f64, &2), (8f64, &3)] ); ```

Benchmarks

Comparison with kdtree@0.6.0

Criterion is used to perform a series of benchmarks. Each action is benchmarked against trees that contain 100, 1,000, 10,000, 100,000 and 1,000,000 nodes, and charted below.

The Adding Items benchmarks are repeated against 2d, 3d and 4d trees. The 3d benchmarks are ran with points that are both of type f32 and of type f64.

All of the remaining tests are only performed against 3d trees, for expediency. The trees are populated with random source data whose points are all on a unit sphere. This use case is representative of common kd-tree usages in geospatial and astronomical contexts.

The Nearest n Items tests query the tree for the nearest 1, 100 and 1,000 points at each tree size. The test for the common case of the nearest one point uses kiddo's nearest_one() method, which is an optimised method for this specific common use case.

Methodology

The results and charts below were created via the following process:

bash cargo criterion --message-format json > criterion-kdtree.ndjson

bash cargo criterion --message-format json --all-features > criterion-kiddo.ndjson

bash python ./generate_benchmark_charts.py

Results

The following results were obtained with the above methodology on a machine with these specs:

The results are stored inside this repo as criterion-kiddo.ndjson and criterion-kdtree.ndjson, should you wish to perform your own analysis.

Adding items to the tree

Kiddo generally has a very small performance lead over kdtree@0.6.0 at larger tree sizes, with their performance being similar on smaller trees.

Charts showing benchmark results for adding items

Retrieving the nearest n items

Kiddo's optimised nearest_one() method gives a huge performance advantage for single item queries, with up to 9x faster performance. Kiddo's standard nearest() method also outperforms kdtree@0.6.0.

Charts showing benchmark results for retrieving the nearest n items

Retrieving all items within a distance, sorted

Things look closer here at first glance but the logarithmic nature of the charted data may obscure the fact that Kiddo is often up to twice as fast as kdtree@0.6.0 here.

Charts showing benchmark results for retrieving all items within a specified distance

Retrieving all items within a distance, unsorted

kdtree@0.6.0 does not have a within_unsorted() method, so we are comparing kiddo's within_unsorted() to kdtree@0.6.0's within() here, with kiddo up to 5x faster on the million-item tree.

Charts showing benchmark results for retrieving all items within a specified distance

Retrieving the best n items within a specified distance

Kiddo's performance advantage here ranges from twice as fast for hundred-item trees up to as much as 20x faster for trees with a million items.

Charts showing benchmark results for retrieving the best n items within a specified distance

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.