gnverify

Takes a name or a list of names and verifies them against a variety of biodiversity Data Sources.

Features

Installation

Compile

```bash
cargo install gnverify
```

MS Windows

Download the latest release from [github] and unzip it.

One possible way would be to create a default folder for executables and place gnverify there.

Press the Windows+R key combination and type "cmd". In the terminal window that appears, type:

```cmd
mkdir C:\Users\your_username\bin
copy path_to\gnverify.exe C:\Users\your_username\bin
```

Add the C:\Users\your_username\bin directory to your PATH environment variable.

Another, simpler way is to run the cd C:\Users\your_username\bin command in the cmd terminal window. Windows will then find the gnverify program automatically whenever you run its commands from that directory.

Linux and Mac

Download the latest release from [github], untar it, and install the binary somewhere in your PATH.

```bash
tar xvf gnverify-linux-0.2.0.tar.xz
# or
tar xvf gnverify-mac-0.2.0.tar.gz

sudo mv gnverify /usr/local/bin
```
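After moving the binary, it can be worth confirming that the shell actually finds it. The following is a minimal sketch using the standard `command -v` builtin; the helper name `on_path` is made up for this example and is not part of gnverify.

```bash
# Report whether a program can be found through PATH.
# "on_path" is a hypothetical helper name used only in this sketch.
on_path() {
  command -v "$1" >/dev/null 2>&1
}

if on_path gnverify; then
  echo "gnverify found at: $(command -v gnverify)"
else
  echo "gnverify is not on PATH yet" >&2
fi
```

If the second message appears, check that /usr/local/bin (or whichever directory you chose) is listed in your PATH.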

Usage

gnverify takes one name-string, or a tab-delimited file with many name-strings, as an argument. It sends a query with these data to a remote gnindex server, which matches the name-strings against many different biodiversity databases, and returns the results to STDOUT in either JSON or CSV format.

One name-string

```bash
gnverify "Monohamus galloprovincialis"
```

Many name-strings in a file

```bash
gnverify /path/to/names.tsv
```

The app assumes that a file contains either a simple list of names, one per line, or a tab-separated list where the first column is an id associated with a name-string and the second is the name-string itself. You can find examples of such files in the project's [test directory].
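As an illustration of the two accepted layouts, the sketch below writes one small file of each kind. The file names and the second name-string are arbitrary choices for this example, not files shipped with the project.

```bash
# Hypothetical file names, chosen only for this example.

# Layout 1: a simple list, one name-string per line.
printf '%s\n' \
  'Monohamus galloprovincialis' \
  'Bubo bubo' > names.txt

# Layout 2: tab-separated, id in the first column, name-string in the second.
# printf reuses the format for each pair of arguments, giving one row per pair.
printf '%s\t%s\n' \
  '1' 'Monohamus galloprovincialis' \
  '2' 'Bubo bubo' > names.tsv
```

Either file can then be passed as the argument, for instance `gnverify names.tsv`.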

It is also possible to feed data via STDIN:

```bash
cat /path/to/names.txt | gnverify
```

Options and flags

Flags and options can be given either before or after the name-string or file name.

help

```bash
gnverify -h
# or
gnverify --help
# or
gnverify
```

version

```bash
gnverify -V
# or
gnverify --version
```

format

Lets you pick the output format, for example compact or pretty:

```bash
gnverify -f compact file.txt
# or
gnverify --format="pretty" file.csv
```

Note that a separate JSON "document" is returned for each record, instead of one big JSON document for all records. For large lists this significantly speeds up parsing of the JSON on the user side.
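Because every record is its own JSON document, the output can be consumed one record at a time rather than loaded whole. The sketch below assumes each document sits on a single line, which the compact format suggests but this README does not spell out. The function name `process_records` and the sample field `name` are invented for illustration and do not reflect the real output schema.

```bash
# Read a stream of JSON documents, one per line, and handle each separately.
# Assumption: each record occupies exactly one line of output.
# "process_records" is a hypothetical helper name.
process_records() {
  count=0
  while IFS= read -r record; do
    [ -n "$record" ] || continue   # skip blank lines
    count=$((count + 1))
    # Replace this line with real per-record handling, e.g. a JSON tool.
    printf 'record %d: %d characters\n' "$count" "${#record}"
  done
  printf 'total: %d\n' "$count"
}

# Stand-in input; the "name" field is a placeholder, not gnverify's schema.
printf '%s\n' '{"name":"a"}' '{"name":"b"}' | process_records
```

With real data the same function could be fed from the tool itself, for example `gnverify -f compact file.txt | process_records`.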

sources

By default gnverify returns only one "best" result of a match. If you have a particular interest in a data set, you can specify it with this option, and all matches that exist for this source will be returned as well. You need to provide the data source id for the data set. Ids can be found at the following url. Some of them are also listed in the gnverify help output.

Data from such sources will be returned in the preferred_results section of the JSON output, or in CSV rows that start with the "PreferredMatch" string.

```bash
gnverify file.csv -s "1,11,172"
# or
gnverify file.tsv --sources="12"
# or
cat file.txt | gnverify -s '1,12'
```
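When the output is CSV, the rows coming from the requested sources can be separated out by their "PreferredMatch" prefix. This is a minimal sketch using plain `grep`; the helper name `preferred_rows` and the sample rows are placeholders made up for this example, not real gnverify output.

```bash
# Keep only CSV rows that start with the "PreferredMatch" string.
# "preferred_rows" is a hypothetical helper name.
preferred_rows() {
  # grep exits with status 1 when nothing matches; treat that as empty output.
  grep '^PreferredMatch' || true
}

# Stand-in rows; the columns after the first are invented placeholders.
printf '%s\n' \
  'PreferredMatch,placeholder-a' \
  'SomeOtherRow,placeholder-b' \
  'PreferredMatch,placeholder-c' | preferred_rows
```

In practice the function would sit at the end of a pipeline such as `gnverify file.csv -s "1,11,172" | preferred_rows`.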

preferred_only

Sometimes all a user wants is to map a list of names to one DataSource, without caring whether a name matched anywhere else. In that case, use the preferred_only flag.

```bash
gnverify -p -s '12' file.txt
# or
gnverify --preferred_only --sources='1,12' file.tsv
```

Copyright

Authors: Dmitry Mozzherin

Copyright (c) 2020 Dmitry Mozzherin. See LICENSE for further details.