This is a CLI tool written in Rust that indexes a 'directory' with a multi-threaded crawler. It is designed to be generic, although only a filesystem implementation exists at present. In the future, anything that can be decomposed into 'directories' (items to be crawled) and 'files' (contents of 'directories') can be added by implementing the `Resource` interface:
```rust
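// Variants are not shown in this excerpt.
pub enum Response { /* ... */ }

// `T` is assumed to be a type parameter of the trait, and `Result<T>` a
// single-parameter alias defined elsewhere in the crate.
pub trait Resource<T> {
    /// Get the path representation of the resource.
    fn get_path(&self) -> Result<T>;
}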
```
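To make the extension point more concrete, here is a minimal sketch of what implementing such a trait might look like. It is an illustration only: the `DirResource` type, the `String` error type, and the local `Result` alias are assumptions made for this example and are not taken from the project's source.

```rust
use std::path::PathBuf;

// Assumption: stands in for whatever single-parameter `Result` alias the crate uses.
type Result<T> = std::result::Result<T, String>;

/// Mirrors the single method shown in the excerpt above.
trait Resource<T> {
    /// Get the path representation of the resource.
    fn get_path(&self) -> Result<T>;
}

/// Hypothetical resource that wraps a filesystem pattern given as text.
struct DirResource {
    raw: String,
}

impl Resource<PathBuf> for DirResource {
    fn get_path(&self) -> Result<PathBuf> {
        // Accept only patterns starting with '/', as the filesystem backend does.
        if self.raw.starts_with('/') {
            Ok(PathBuf::from(&self.raw))
        } else {
            Err(format!("unsupported pattern: {}", self.raw))
        }
    }
}
```

With this sketch, `DirResource { raw: "/usr/lib".to_string() }.get_path()` yields the path, while a relative pattern yields an error. A different backend would supply its own type for `T`.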
A filesystem implementation of this interface exists in `fileresource.rs` and handles input patterns starting with '/'.
The tool uses crossbeam for channels, the threadpool library for thread pool support, and tantivy for full-text indexing.
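The overall shape of such a crawler can be sketched with the standard library alone. The example below is not the project's code: it swaps crossbeam channels and the threadpool crate for `std::sync::mpsc` and scoped threads, crawls a hypothetical in-memory tree instead of the filesystem, and collects results from the channel where the real tool would hand them to tantivy. The names `Entry` and `crawl` are invented for this illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Mutex};
use std::thread;

/// An entry inside a "directory": something to crawl further, or a leaf to index.
enum Entry {
    Dir(String),
    File(String),
}

/// Crawl an in-memory tree with `workers` threads; return every file found, sorted.
fn crawl(tree: &HashMap<String, Vec<Entry>>, root: &str, workers: usize) -> Vec<String> {
    // Shared work queue of directories still to be visited.
    let queue = Mutex::new(vec![root.to_string()]);
    // Number of directories that are queued or currently being processed.
    let pending = AtomicUsize::new(1);
    // Channel carrying discovered files to the consumer (the "indexer" side).
    let (tx, rx) = mpsc::channel::<String>();

    thread::scope(|s| {
        for _ in 0..workers {
            let tx = tx.clone();
            let (queue, pending) = (&queue, &pending);
            s.spawn(move || loop {
                // The lock is released at the end of this statement.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some(dir) => {
                        for entry in tree.get(&dir).into_iter().flatten() {
                            match entry {
                                Entry::Dir(d) => {
                                    // Count the child before the parent is marked done.
                                    pending.fetch_add(1, Ordering::SeqCst);
                                    queue.lock().unwrap().push(d.clone());
                                }
                                Entry::File(f) => tx.send(f.clone()).unwrap(),
                            }
                        }
                        pending.fetch_sub(1, Ordering::SeqCst);
                    }
                    // Queue empty and nothing in flight: crawling is finished.
                    None if pending.load(Ordering::SeqCst) == 0 => break,
                    // Queue empty but another worker may still add directories.
                    None => thread::yield_now(),
                }
            });
        }
    });

    // Drop the last sender so the receiver's iterator terminates.
    drop(tx);
    let mut files: Vec<String> = rx.iter().collect();
    files.sort();
    files
}
```

The `pending` counter is what lets workers tell "the queue is momentarily empty" apart from "there is no work left", which is the main subtlety when workers both consume and produce work items.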
```sh
git clone https://github.com/ronin13/finde-rs && cd finde-rs
cargo build --release

./target/release/finde-rs --help
finde-rs 0.1.3
CLI finder tool

USAGE:
    finde-rs [FLAGS] [OPTIONS]

FLAGS:
    -h, --help
            Prints help information
    -q, --quiet
            Pass many times for less log output
    -V, --version
            Prints version information
    -v, --verbose
            Pass many times for more log output
            By default, it'll only report errors. Passing `-v` one time also prints warnings, `-vv` enables info
            logging, `-vvv` debug, and `-vvvv` trace.

OPTIONS:
    -I, --index-dir
    -i, --initial-threads <initial-threads>
            Initial number of threads to spawn
    -m, --max-threads <max-threads>
            Maximum number of threads that threadpool can scale upto. Defaults to number of cpus
    -p, --path <path>
            Root path to crawl from [default: /usr/lib]
```sh
./target/release/finde-rs -p $HOME/repo -v -i 6 -m 12 --index-dir /tmp
2020-02-15 14:10:59,683 INFO [finders] Crawling /Users/raghu/repo
2020-02-15 14:10:59,684 INFO [finders::indexer] Starting indexer
2020-02-15 14:10:59,684 INFO [finders::crawler] Waiting on upto 12 crawler threads
2020-02-15 14:10:59,684 INFO [finders::indexer] Index directory created in /tmp/5ryH1
2020-02-15 14:10:59,684 INFO [tantivy::indexer::segmentupdater] save metas
2020-02-15 14:10:59,687 INFO [finders::indexer] Iterating over results
2020-02-15 14:10:59,785 INFO [finders::scheduler] Updating number of threads to 7, length of work queue 3818, pool size 6
2020-02-15 14:10:59,886 INFO [finders::scheduler] Updating number of threads to 8, length of work queue 6883, pool size 6
2020-02-15 14:10:59,988 INFO [finders::scheduler] Updating number of threads to 9, length of work queue 11192, pool size 6
2020-02-15 14:11:00,089 INFO [finders::scheduler] Updating number of threads to 10, length of work queue 12956, pool size 6
2020-02-15 14:11:00,190 INFO [finders::scheduler] Updating number of threads to 11, length of work queue 12857, pool size 6
2020-02-15 14:11:00,290 INFO [finders::scheduler] Updating number of threads to 12, length of work queue 12607, pool size 6
2020-02-15 14:11:04,834 INFO [finders::scheduler] Updating number of threads to 6, length of work queue 0, pool size 6
2020-02-15 14:11:05,739 INFO [finders::fileresource] Crawling done in ThreadId(5), leaving, bye!
2020-02-15 14:11:05,740 INFO [finders::fileresource] Crawling done in ThreadId(4), leaving, bye!
2020-02-15 14:11:05,740 INFO [finders::fileresource] Crawling done in ThreadId(2), leaving, bye!
2020-02-15 14:11:05,740 INFO [finders::fileresource] Crawling done in ThreadId(7), leaving, bye!
2020-02-15 14:11:05,740 INFO [finders::fileresource] Crawling done in ThreadId(6), leaving, bye!
2020-02-15 14:11:05,740 INFO [finders::fileresource] Crawling done in ThreadId(3), leaving, bye!
2020-02-15 14:11:05,740 INFO [finders::indexer] Commiting the index
2020-02-15 14:11:05,740 INFO [tantivy::indexer::indexwriter] Preparing commit
2020-02-15 14:11:05,757 INFO [finders::scheduler] No more threads to schedule, I am done. Bye!
2020-02-15 14:11:05,899 INFO [tantivy::indexer::segmentupdater] Starting merge - [Seg("8cc31b4d"), Seg("97576eb1"), Seg("2b7bcba3"), Seg("f1bbcb09"), Seg("4c3cf582"), Seg("699c0c3b"), Seg("4e08a0dd"), Seg("1e6b5009")]
2020-02-15 14:11:05,904 INFO [tantivy::indexer::indexwriter] Prepared commit 500530
2020-02-15 14:11:05,904 INFO [tantivy::indexer::preparedcommit] committing 500530
2020-02-15 14:11:05,904 INFO [tantivy::indexer::segmentupdater] save metas
2020-02-15 14:11:05,905 INFO [tantivy::indexer::segmentupdater] Running garbage collection
2020-02-15 14:11:05,905 INFO [tantivy::directory::manageddirectory] Garbage collect
2020-02-15 14:11:05,905 INFO [finders::indexer] Index created in "/tmp/"
2020-02-15 14:11:05,905 INFO [finders::indexer] Index has 12 segments
2020-02-15 14:11:05,906 INFO [finde_rs] Finished crawling /Users/raghu/repo, took 6s
./target/release/finde-rs -p $HOME/repo -v -i 6 -m 12 --index-dir  12.81s user 26.84s system 636% cpu 6.232 total
```
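The scheduler lines in the log above suggest a simple scaling rule: while the work queue is non-empty, the thread count grows by one per tick up to `--max-threads`, and it drops back to `--initial-threads` once the queue is empty. The sketch below encodes that reading of the log. It is an inference from the output, not the project's actual scheduler, and the names `next_thread_count` and `replay` are hypothetical.

```rust
/// Hypothetical scaling rule inferred from the log output; the real scheduler
/// may use different thresholds or step sizes.
fn next_thread_count(current: usize, queue_len: usize, initial: usize, max: usize) -> usize {
    if queue_len == 0 {
        // Nothing left to crawl: fall back to the initial pool size.
        initial
    } else {
        // Work is backing up: add one thread per tick, never exceeding `max`.
        (current + 1).min(max)
    }
}

/// Apply the rule to a sequence of observed queue lengths, starting from `initial`.
fn replay(initial: usize, max: usize, queue_lens: &[usize]) -> Vec<usize> {
    let mut current = initial;
    queue_lens
        .iter()
        .map(|&len| {
            current = next_thread_count(current, len, initial, max);
            current
        })
        .collect()
}
```

Replaying the queue lengths from the run above (`3818, 6883, 11192, 12956, 12857, 12607, 0`) with `initial = 6` and `max = 12` reproduces the logged sequence of thread counts: 7, 8, 9, 10, 11, 12, then back to 6.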
```
cargo test
   Compiling finde-rs v0.1.1 (/Users/raghu/repo/finde-rs)
    Finished test [unoptimized] target(s) in 1.22s
     Running target/debug/deps/finde_rs-c62a74cfdff79a3e

running 3 tests
test scheduler::test::testscalewithbounds ... ok
test crawler::test::testrootfromdisconnectedchannel ... ok
test crawler::test::testrootfromchannel ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```
Lint the code with `cargo clippy`.