Crates.io | gnverify |
lib.rs | gnverify |
version | 0.3.1 |
source | src |
created_at | 2020-04-17 00:07:24.048173 |
updated_at | 2020-04-23 09:59:51.064837 |
description | Takes a scientific name or a list of names and verifies them against a variety of biodiversity Data Sources |
homepage | https://github.com/gnames/gnverify |
repository | https://github.com/gnames/gnverify |
max_upload_size | |
id | 231002 |
size | 654,419 |
Takes a name or a list of names and verifies them against a variety of biodiversity Data Sources
On Windows, download the latest release from GitHub and unzip it.
One possible way to install it is to create a default folder for executables and place gnverify there.
Press the Windows+R key combination and type "cmd". In the terminal window that appears, type:
mkdir C:\Users\your_username\bin
copy path_to\gnverify.exe C:\Users\your_username\bin
Add the C:\Users\your_username\bin directory to your PATH environment variable.
Another, simpler way is to run the cd C:\Users\your_username\bin command in the cmd terminal window. Windows will then find the gnverify program automatically when you run its commands from that directory.
You can also read a more detailed guide for Windows users in a PDF document.
On Linux or macOS, download the latest release from GitHub, untar it, and install the binary somewhere in your PATH.
tar xvf gnverify-linux-0.2.0.tar.xz
# or tar xvf gnverify-mac-0.2.0.tar.gz
sudo mv gnverify /usr/local/bin
Alternatively, install Rust according to its installation instructions, then install gnverify with cargo:
cargo install gnverify
gnverify takes one name-string or a tab-delimited file with many name-strings as an argument, sends a query with these data to a remote gnindex server to match the name-strings against many different biodiversity databases, and returns the results to STDOUT in either JSON or CSV format.
gnverify "Monohamus galloprovincialis"
gnverify /path/to/names.tsv
The app assumes that a file contains either a simple list of names, one per line, or a tab-separated list where the first column is an id associated with a name-string and the second is the name-string itself. You can find examples of such files in the project's test directory.
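The two layouts can be told apart line by line. The following is a minimal Rust sketch of that interpretation, assuming a tab is the only delimiter. `parse_line` is a hypothetical helper written for illustration and is not taken from the gnverify source.

```rust
/// Splits one input line into an optional id and a name-string.
/// Hypothetical helper for illustration; not gnverify's own parser.
fn parse_line(line: &str) -> (Option<&str>, &str) {
    // Drop a trailing newline or carriage return, if any.
    let line = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
    match line.split_once('\t') {
        // Two columns: the first is the id, the rest is the name-string.
        Some((id, name)) => (Some(id), name),
        // One column: the whole line is the name-string.
        None => (None, line),
    }
}
```

Under these assumptions, a line without a tab yields no id, and a line with a tab yields the text before the first tab as the id.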
It is also possible to feed data via STDIN:
cat /path/to/names.txt | gnverify
Flags and options can be given either before or after the name-string or file name.
To display help:
gnverify -h
# or
gnverify --help
# or
gnverify
To display the version:
gnverify -V
# or
gnverify --version
If the name-string (ScientificName) field is not the first one in your data, the name-field option is very important. Set it to the position of the name-string field, counting from 1.
For example, if your file has the following fields:
"ID", "Taxon_ID", "Name", "Reference", "Notes"
and the "Name" field contains the names you want to verify, use
gnverify -n 3
# or
gnverify --name-field=3
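To make the counting explicit, here is a small Rust sketch of selecting a field by its 1-based position. It assumes a tab-delimited row, and `field_at` is a hypothetical helper rather than part of gnverify.

```rust
/// Returns the field at a 1-based position in a tab-delimited row,
/// mirroring how a value such as `-n 3` points at the third field.
/// Hypothetical helper; the tab delimiter is an assumption.
fn field_at(row: &str, position: usize) -> Option<&str> {
    // Position 0 is invalid because counting starts at 1.
    let index = position.checked_sub(1)?;
    row.split('\t').nth(index)
}
```

For a row with the five fields listed above, position 3 selects the "Name" field, while position 0 or any position past the last field selects nothing.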
The format option lets you pick an output format, for example:
gnverify -f compact file.txt
# or
gnverify --format="pretty" file.csv
Note that a separate JSON "document" is returned for each record, rather than one big JSON document for all records. For large lists this significantly speeds up parsing of the JSON on the user's side.
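Because each record is its own JSON document, a consumer can handle the output one record at a time instead of loading everything into memory. The sketch below shows that pattern in Rust. It assumes the compact format places exactly one document per line, which is an assumption rather than something verified here. `for_each_document` is a hypothetical helper, and real code would pass each line to a JSON parser of its choice.

```rust
use std::io::BufRead;

/// Calls `handle` once per non-empty line of `reader` and returns
/// how many lines were handled. Each line is assumed to hold one
/// complete JSON document (an assumption about the compact format).
fn for_each_document<R: BufRead>(
    reader: R,
    mut handle: impl FnMut(&str),
) -> std::io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        let line = line?;
        // Skip blank lines so they are not mistaken for records.
        if line.trim().is_empty() {
            continue;
        }
        handle(&line);
        count += 1;
    }
    Ok(count)
}
```

When gnverify output is piped into such a program, `std::io::stdin().lock()` can serve as the reader.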
By default gnverify returns only one "best" result of a match. If a user has a particular interest in a data set, they can set it with the sources option, and all matches that exist for this source will be returned as well. You need to provide a data source id for the dataset. Ids can be found at the following url. Some of them are provided in the gnverify help output as well.
Data from such sources will be returned in the preferred_results section of the JSON output, or in CSV rows that start with the "PreferredMatch" string.
gnverify file.csv -s "1,11,172"
# or
gnverify file.tsv --sources="12"
# or
cat file.txt | gnverify -s '1,12'
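When working with CSV output, the rows for preferred sources can be separated from the rest by their leading string. A minimal Rust sketch follows. It assumes one CSV record per line and tolerates an optional opening quote around the first column. `preferred_rows` is a hypothetical helper, not part of gnverify.

```rust
/// Returns the CSV rows whose first column starts with "PreferredMatch".
/// Hypothetical helper; assumes one CSV record per line.
fn preferred_rows(csv_output: &str) -> Vec<&str> {
    csv_output
        .lines()
        // Ignore an optional opening quote before comparing.
        .filter(|row| row.trim_start_matches('"').starts_with("PreferredMatch"))
        .collect()
}
```

The returned slices borrow from the original output, so no rows are copied.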
Sometimes all a user wants is to map one list of names to a DataSource, without caring whether a name matched anywhere else. In such a case you can use the preferred_only flag.
gnverify -p -s '12' file.txt
# or
gnverify --preferred_only --sources='1,12' file.tsv
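For scripted use, the same invocations can be assembled programmatically. The sketch below builds, but does not run, a gnverify command with Rust's standard library, using only the -s and -p flags shown in this document. `gnverify_command` is a hypothetical helper, and it assumes gnverify is reachable through PATH when the command is eventually run.

```rust
use std::process::Command;

/// Builds (but does not run) a gnverify invocation using only the
/// flags shown in this document: -s for sources and -p for
/// preferred_only. Hypothetical helper for scripting.
fn gnverify_command(input: &str, sources: &[u32], preferred_only: bool) -> Command {
    let mut cmd = Command::new("gnverify");
    if !sources.is_empty() {
        // Data source ids are passed as one comma-separated value.
        let ids: Vec<String> = sources.iter().map(|id| id.to_string()).collect();
        cmd.arg("-s").arg(ids.join(","));
    }
    if preferred_only {
        cmd.arg("-p");
    }
    // The name-string or file path is placed last in this sketch.
    cmd.arg(input);
    cmd
}
```

Calling `.output()` on the returned value would run the program and capture what it writes to STDOUT.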
Authors: Dmitry Mozzherin
Copyright (c) 2020 Dmitry Mozzherin. See LICENSE for further details.