tantivy_warc_indexer builds a tantivy index from Common Crawl warc.wet files.
Install Rust (e.g. via rustup), then build with:
make
```
./target/release/tantivy_warc_indexer --help
WARC Indexer

Usage:
  warcparser [-t

Options:
  -h --help  Show this help
  -t
```
Where
./target/release/tantivy_warc_indexer ../common_crawl_tantivy_index ../wet
To create an index:
mkdir ../common_crawl_tantivy_index
cp template/meta.json ../common_crawl_tantivy_index/
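
The setup and run steps can be wrapped in a small script. This is only a sketch: the paths mirror the examples in this README, the function names `prepare_index` and `run_indexer` are hypothetical helpers that are not part of tantivy_warc_indexer, and `mkdir -p` is used instead of plain `mkdir` (an assumption) so the script can be re-run safely.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths follow the README examples.

# Create the index directory and seed it with the template meta.json.
# $1 = index directory, $2 = template file (defaults to template/meta.json)
prepare_index() {
    index_dir="$1"
    template="${2:-template/meta.json}"
    if [ ! -f "$template" ]; then
        echo "missing template: $template" >&2
        return 1
    fi
    mkdir -p "$index_dir" && cp "$template" "$index_dir/"
}

# Run the indexer with the same two positional arguments as the example
# invocation in this README: index directory, then WET directory.
run_indexer() {
    index_dir="$1"
    wet_dir="$2"
    ./target/release/tantivy_warc_indexer "$index_dir" "$wet_dir"
}

# Example (uncomment and run from the repository root after `make`):
# prepare_index ../common_crawl_tantivy_index && \
#     run_indexer ../common_crawl_tantivy_index ../wet
```

Checking for the template first means a missing `template/meta.json` is reported before an empty index directory is created.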
Best,
Andreas