Crates.io | umgap |
lib.rs | umgap |
version | 1.1.0 |
source | src |
created_at | 2019-08-07 14:48:37.684702 |
updated_at | 2024-02-28 09:12:12.124616 |
description | The Unipept Metagenomics Analysis Pipeline |
homepage | https://unipept.ugent.be |
repository | https://github.com/unipept/umgap |
id | 154812 |
size | 247,222 |
The Unipept Metagenomics Analysis Pipeline (UMGAP) can analyse metagenomic samples and return a frequency table of the taxa it detected across the reads. It is based on the Unipept Metaproteomics Analysis Pipeline. Both tools were developed at the Department of Applied Mathematics, Computer Science and Statistics at Ghent University.
Install Rust according to the official installation instructions, or use your favourite package manager (e.g. apt install rustc). The pipeline is developed for the latest stable release, but should work on 1.35 and higher.
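To check the version requirement before building, a minimal sketch along the following lines may help. The helper name version_at_least is made up for this example, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Minimum Rust version stated in the instructions above.
MIN_RUST="1.35.0"

# version_at_least MIN ACTUAL: succeeds when ACTUAL >= MIN.
# Hypothetical helper; relies on `sort -V` (version sort).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if command -v rustc >/dev/null 2>&1; then
    # `rustc --version` prints a line such as "rustc 1.76.0 (...)";
    # the second field is the version number.
    installed="$(rustc --version | awk '{print $2}')"
    if version_at_least "$MIN_RUST" "$installed"; then
        echo "rustc $installed is recent enough"
    else
        echo "rustc $installed is older than $MIN_RUST" >&2
    fi
else
    echo "rustc not found; install Rust first" >&2
fi
```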
Clone this repository and go to the repository root.
git clone https://github.com/unipept/umgap.git
cd umgap
Compile and install the UMGAP.
cargo build --release
cargo install --path .
For a multiuser installation, use install instead of cargo install to place the umgap program and the wrapper script where all users can reach them:
sudo install target/release/umgap scripts/umgap-analyse.sh /usr/bin
cargo install will install the umgap command to ~/.cargo/bin by default. Please ensure this directory is in your $PATH. You can check whether the installation was successful by asking for the version:
umgap -V
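If umgap -V reports that the command cannot be found, the usual cause is that ~/.cargo/bin is missing from $PATH. The sketch below checks for that; path_contains is a hypothetical helper written for this example, not part of the UMGAP.

```shell
#!/bin/sh
# path_contains DIR [PATHLIST]: succeeds when DIR is an entry of PATHLIST
# (PATHLIST defaults to the current $PATH). Hypothetical helper.
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Default install location of `cargo install`, as stated above.
cargo_bin="$HOME/.cargo/bin"

if path_contains "$cargo_bin"; then
    echo "$cargo_bin is on PATH"
else
    echo "$cargo_bin is not on PATH; add it, for example with:" >&2
    echo "  export PATH=\"$cargo_bin:\$PATH\"" >&2
fi
```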
(optional) Install FragGeneScanPlusPlus to use as gene predictor in the pipeline.
Run scripts/umgap-setup.sh to interactively configure the UMGAP and download the data files required for some steps of the pipeline.
Depending on which type of analysis you are planning, you will need the tryptic index file (less powerful, but runs on any decent laptop), the 9-mer index file (uses about 100GB of disk space for storage and as much RAM during operation; the exact size depends on the version), or both.
Run sudo scripts/umgap-setup.sh -c /etc/umgap -d <datamap> instead to share the data files between users. Make sure the <datamap> directory is accessible to the end users.
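Because the 9-mer index needs roughly 100GB of storage, it can be worth checking the free space in the data directory before starting the download. The following is a rough sketch only: free_kb and enough_space are hypothetical helpers, the data directory is passed as the first argument (defaulting to the current directory), and the RAM requirement is not checked.

```shell
#!/bin/sh
# free_kb DIR: print the free space (in KiB) on the filesystem holding DIR.
# `df -P` forces the POSIX one-line-per-filesystem layout, in which
# column 4 is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# enough_space REQUIRED_GB DIR: succeeds when DIR has at least
# REQUIRED_GB GiB available.
enough_space() {
    [ "$(free_kb "$2")" -ge $(( $1 * 1024 * 1024 )) ]
}

# Assumption: the directory that will hold the index files.
data_dir="${1:-.}"

if enough_space 100 "$data_dir"; then
    echo "enough room in $data_dir for the 9-mer index"
else
    echo "less than 100 GiB free in $data_dir; consider the tryptic index" >&2
fi
```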
(optional) Analyze some test data! Running
./scripts/umgap-analyse.sh -1 testdata/A1.fq -2 testdata/A2.fq -t tryptic-sensitivity -o - | tee output.fa
should show you a FASTA-like file with a taxon id per header. If you downloaded the 9-mer index file rather than the tryptic index file, use instead:
./scripts/umgap-analyse.sh -1 testdata/A1.fq -2 testdata/A2.fq -o - | tee output.fa
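To get a quick impression of the result without any further tooling, the taxon ids in the output can be tallied with standard utilities. The sketch below assumes each record is a header line starting with '>' followed by a single line holding the taxon id; count_taxa is a hypothetical helper and the sample records are made up.

```shell
#!/bin/sh
# count_taxa: read FASTA-like records on stdin and print
# "taxon_id<TAB>count", most frequent taxon first.
# Header lines (starting with '>') and blank lines are skipped.
count_taxa() {
    awk '!/^>/ && NF { count[$1]++ }
         END { for (t in count) printf "%s\t%d\n", t, count[t] }' |
        sort -t "$(printf '\t')" -k2,2nr -k1,1n
}

# Made-up sample records, for illustration only.
count_taxa <<'EOF'
>read1
9606
>read2
562
>read3
9606
EOF
```

On the real output, the same helper would be called as count_taxa < output.fa.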
(optional) NOT YET INTEGRATED - Visualize some test data! Running
./scripts/umgap-visualize.sh output.fa output.html
will give you an HTML file, which shows a visualization of the test data in your favorite browser.
A source install can be updated by pulling the repository to get the latest changes and running in the repository root:
cargo install --force --path .
The UMGAP offers individual tools which integrate into a pipeline. Running umgap help will get you to the documentation of each tool, and the short metagenomics case study at the Unipept website displays their usage.
This repository also offers 6 preconfigured pipelines (scripts/umgap-analyse.sh) which should cover most use cases. Running the script without any arguments should get you started. The preconfigured pipelines are:
high-precision: the default, focusses on high precision with very decent sensitivity.
max-precision: focusses on very high precision, at the cost of sensitivity.
high-sensitivity: focusses on high sensitivity, with decent precision.
max-sensitivity: focusses on very high sensitivity, at the cost of precision.
tryptic-precision: focusses on high precision, using a much smaller index file, which makes it usable on a laptop.
tryptic-sensitivity: focusses on high sensitivity, using a much smaller index file, which makes it usable on a laptop.

Another script, scripts/umgap-visualize.sh, will help you to visualize the output of the pipeline. Again, running the script without any arguments prints the usage instructions.
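When wrapping scripts/umgap-analyse.sh in your own scripts, it can help to validate the pipeline name before starting a long run. The sketch below uses the six names listed above; is_known_pipeline and uses_tryptic_index are hypothetical helpers, and the mapping of the non-tryptic pipelines to the 9-mer index is an assumption based on the descriptions above.

```shell
#!/bin/sh
# The six preconfigured pipeline names listed above.
PIPELINES="high-precision max-precision high-sensitivity max-sensitivity tryptic-precision tryptic-sensitivity"

# is_known_pipeline NAME: succeeds when NAME is a preconfigured pipeline.
is_known_pipeline() {
    for p in $PIPELINES; do
        [ "$p" = "$1" ] && return 0
    done
    return 1
}

# uses_tryptic_index NAME: succeeds for the laptop-friendly pipelines.
# Assumption: every other pipeline needs the 9-mer index.
uses_tryptic_index() {
    case "$1" in
        tryptic-*) return 0 ;;
        *) return 1 ;;
    esac
}

# high-precision is the documented default.
pipeline="${1:-high-precision}"

if is_known_pipeline "$pipeline"; then
    if uses_tryptic_index "$pipeline"; then index="tryptic"; else index="9-mer"; fi
    echo "pipeline '$pipeline' needs the $index index"
else
    echo "unknown pipeline '$pipeline'; choose one of: $PIPELINES" >&2
fi
```

A validated name can then be passed to the wrapper script with the -t option, as in the test data example earlier.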
Please adhere to the editorconfig and RustFMT styles specified.
The UMGAP is released under the terms of the MIT License. See the LICENSE file for more info.