The Common Index File Format (CIFF) is an inverted index exchange format defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative. The primary idea is to allow indexes to be dumped from Lucene via Anserini and then ingested by other search engines. This repository contains the code needed to read CIFF data into a format that PISA can use for building (and then searching) indexes.
We currently provide a Rust binary for converting CIFF data to a PISA canonical index.
To build the binary, run:
cargo build --release
It can then be executed:
./target/release/common-index-format
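For orientation, here is a minimal sketch of the kind of output a PISA canonical index contains. It assumes the layout described in the PISA documentation, where a `.docs` file is a series of length-prefixed sequences of little-endian 32-bit unsigned integers: first a singleton sequence holding the document count, then one sequence of document IDs per term. The function names `write_sequence` and `encode_docs` and the toy postings are hypothetical and used only for illustration; this is not the code in this repository.

```rust
use std::io::{self, Write};

/// Writes one sequence: a little-endian u32 element count followed by
/// the elements themselves as little-endian u32 values.
fn write_sequence<W: Write>(out: &mut W, values: &[u32]) -> io::Result<()> {
    if values.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "sequence too long for a u32 length prefix",
        ));
    }
    out.write_all(&(values.len() as u32).to_le_bytes())?;
    for v in values {
        out.write_all(&v.to_le_bytes())?;
    }
    Ok(())
}

/// Encodes a toy `.docs` stream (assumed layout): a singleton header with
/// the number of documents, then one document-ID sequence per term.
fn encode_docs(num_docs: u32, postings: &[Vec<u32>]) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    write_sequence(&mut buf, &[num_docs])?;
    for list in postings {
        write_sequence(&mut buf, list)?;
    }
    Ok(buf)
}

fn main() -> io::Result<()> {
    // Two hypothetical terms over a 3-document collection.
    let postings = vec![vec![0, 2], vec![1]];
    let bytes = encode_docs(3, &postings)?;
    println!("{} bytes", bytes.len());
    Ok(())
}
```

In the same assumed layout, term frequencies and document lengths go into separate files using the same sequence encoding, so a helper like `write_sequence` could be reused for them.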