Rust implementation of the approximate version of DBSCAN introduced by Gan and Tao in this paper
Accepted data files should contain one data point per line and nothing else. Each line should contain the components of the point separated by whitespace.
Each component of a point is read and stored as a 64-bit floating-point value (`f64`).
```text
1.0 1.1 0.5
2.3 3.4 6.2
...
```
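To make the format concrete, here is a minimal sketch of how text in this layout could be parsed into points. It is not the library's own parser; `parse_points` is a hypothetical helper, and it assumes the input contains no blank lines.

```rust
use std::num::ParseFloatError;

/// Hypothetical helper (not part of the library): parses one point per line,
/// with components separated by whitespace and read as `f64`.
fn parse_points(text: &str) -> Result<Vec<Vec<f64>>, ParseFloatError> {
    text.lines()
        .map(|line| {
            line.split_whitespace()
                .map(|component| component.parse::<f64>())
                .collect::<Result<Vec<f64>, _>>()
        })
        .collect()
}

fn main() {
    let points = parse_points("1.0 1.1 0.5\n2.3 3.4 6.2\n")
        .expect("every component should be a valid f64");
    for point in &points {
        println!("{:?}", point);
    }
}
```

Any component that cannot be read as an `f64` makes the whole parse fail, which matches the "nothing else" requirement stated above.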
This library has four main functions that differ in the kind of input they accept.
Each function below expects all points in the data file to have the same dimensionality and panics otherwise.
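Because a dimensionality mismatch causes a panic, it can be useful to validate the points beforehand. The following is a minimal sketch of such a check; `common_dimensionality` is a hypothetical helper and not part of this library.

```rust
/// Hypothetical helper (not part of the library): returns the shared
/// dimensionality of `points`, or `None` if the slice is empty or the
/// points do not all have the same number of components.
fn common_dimensionality(points: &[Vec<f64>]) -> Option<usize> {
    let first = points.first()?.len();
    if points.iter().all(|p| p.len() == first) {
        Some(first)
    } else {
        None
    }
}

fn main() {
    let consistent = vec![vec![1.0, 1.1, 0.5], vec![2.3, 3.4, 6.2]];
    let inconsistent = vec![vec![1.0, 1.1, 0.5], vec![2.3, 3.4]];
    println!("{:?}", common_dimensionality(&consistent)); // prints "Some(3)"
    println!("{:?}", common_dimensionality(&inconsistent)); // prints "None"
}
```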
If the dimensionality of the points is known at compile time (that is, it is not the result of another calculation), use this function:
```rust
pub fn do_appr_dbscan_file<P, const D: usize>(
    filename: P,
    epsilon: f64,
    rho: f64,
    min_pts: usize
) -> DBSCANResult<D>
where
    P: AsRef<Path>,
```
```rust
extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_file;
use appr_dbscan::utils::DBSCANResult;

let res: DBSCANResult<2> = do_appr_dbscan_file("./datasets/outtest1.txt", 0.3, 0.1, 10);
// Index 0 holds the noise points; every other entry is a cluster.
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
```
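The example above reads the result as a list in which index 0 holds the noise points and every other entry is a cluster. The sketch below wraps that convention in a small helper. It uses a plain `Vec<Vec<[f64; D]>>` as a stand-in for the library's result type, which is an assumption made so the snippet compiles on its own; `summarize` is a hypothetical name.

```rust
/// Hypothetical helper: returns (cluster count, noise point count), following
/// the convention from the example that index 0 holds the noise points.
/// The slice of `Vec<[f64; D]>` is an assumed stand-in for the result type.
fn summarize<const D: usize>(res: &[Vec<[f64; D]>]) -> (usize, usize) {
    let noise_points_count = res.first().map_or(0, |noise| noise.len());
    // saturating_sub avoids an underflow if the result is empty.
    let clusters_count = res.len().saturating_sub(1);
    (clusters_count, noise_points_count)
}

fn main() {
    // Hand-written data in the assumed layout: noise first, then two clusters.
    let res: Vec<Vec<[f64; 2]>> = vec![
        vec![[9.0, 9.0]],
        vec![[0.0, 0.0], [0.1, 0.1]],
        vec![[5.0, 5.0], [5.1, 5.0]],
    ];
    let (clusters, noise) = summarize(&res);
    println!("clusters: {}, noise points: {}", clusters, noise);
}
```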
If the dimensionality of the data points is not known at compile time (for example, when looping over multiple files with different dimensionalities), use this function:
```rust
pub fn do_appr_dbscan_auto_dimensionality_file<P>(
    filename: P,
    epsilon: f64,
    rho: f64,
    min_pts: usize
) -> (VectorDBSCANResult, usize)
where
    P: AsRef<Path>,
```
```rust
extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_auto_dimensionality_file;

let (res, dimensionality) = do_appr_dbscan_auto_dimensionality_file("./datasets/outtest1.txt", 0.3, 0.1, 10);
println!("Points dimensionality: {}", dimensionality);
// Index 0 holds the noise points; every other entry is a cluster.
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
```
If you have a vector of points of type `Vec<[f64; D]>`, use this function:
```rust
pub fn do_appr_dbscan_points<const D: usize>(
    points: Vec<Point<D>>,
    epsilon: f64,
    rho: f64,
    min_pts: usize
) -> DBSCANResult<D>
```
```rust
extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_points;
use appr_dbscan::utils::DBSCANResult;

let points = vec![[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [0.0, 2.0], [2.0, 1.0], [1.0, 1.0]];
let res: DBSCANResult<2> = do_appr_dbscan_points(points, 0.3, 0.1, 10);
// Index 0 holds the noise points; every other entry is a cluster.
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
```
If you have a vector of points of type `Vec<Vec<f64>>` (for example, when clustering different vectors in a loop), use this function:
```rust
pub fn do_appr_dbscan_auto_dimensionality_points(
    points: Vec<VectorPoint>,
    epsilon: f64,
    rho: f64,
    min_pts: usize
) -> (VectorDBSCANResult, usize)
```
```rust
extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_auto_dimensionality_points;

let points = vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![0.0, 1.0], vec![1.0, 0.0], vec![2.0, 1.0], vec![0.0, 2.0], vec![2.0, 1.0], vec![1.0, 1.0]];
let (res, dimensionality) = do_appr_dbscan_auto_dimensionality_points(points, 0.3, 0.1, 10);
println!("Points dimensionality: {}", dimensionality);
// Index 0 holds the noise points; every other entry is a cluster.
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
```
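If points are held as `Vec<Vec<f64>>` but their dimensionality matches a value known at compile time, they can be converted to the fixed-size form `Vec<[f64; D]>` and passed to the statically sized function instead. The sketch below shows one way to do that with the standard library only; `to_fixed` is a hypothetical helper, not something this library provides.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of the library): converts dynamically sized
/// points into fixed-size arrays. Returns `None` if any point does not have
/// exactly `D` components.
fn to_fixed<const D: usize>(points: Vec<Vec<f64>>) -> Option<Vec<[f64; D]>> {
    points
        .into_iter()
        .map(|p| <[f64; D]>::try_from(p).ok())
        .collect()
}

fn main() {
    let dynamic = vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![0.0, 1.0]];
    let fixed: Option<Vec<[f64; 2]>> = to_fixed(dynamic);
    println!("{:?}", fixed);
}
```

Returning `None` on a length mismatch lets the caller decide how to handle inconsistent data before any clustering function is called.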