Installation#

This tutorial shows you how to install V-pipe and the dependencies required to start using it (bioconda, conda-forge, mamba and snakemake) before you continue with the other tutorials and analyse virus data.

For the impatient#

Download the install script and run it with the following parameters:

curl -O 'https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/utils/quick_install.sh'
bash quick_install.sh -p vp-analysis -w work

Requirements#

V-pipe is optimized for Linux and macOS systems, and it relies heavily on bioconda, which isn’t supported on Windows. We therefore recommend that Windows users install WSL2.
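
As a quick pre-flight check, the requirement above can be tested from a shell. The sketch below is not part of V-pipe: `platform_status` is a hypothetical helper that classifies the kernel name printed by `uname -s`. WSL2 reports `Linux`, so it counts as supported.

```bash
# Hypothetical helper, not shipped with V-pipe: classify a kernel name
# (as printed by `uname -s`) according to the requirements above.
platform_status() {
  case "$1" in
    Linux|Darwin) echo "supported" ;;   # Linux (including WSL2) and macOS
    *)            echo "unsupported" ;; # e.g. native Windows shells
  esac
}

# Report the status of the current machine.
echo "$(uname -s): $(platform_status "$(uname -s)")"
```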

Quick install V-pipe and conda#

V-pipe uses the Bioconda bioinformatics software repository for all its pipeline components. The pipeline itself is implemented using Snakemake. Although you can install all the dependencies manually, we recommend using our quick install script:

## not run
curl -O 'https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/utils/quick_install.sh'
bash quick_install.sh -p vp-analysis -w work

The script quick_install.sh has the following options:

  • using -p specifies the subdirectory where to download and install snakemake and V-pipe

  • using -w will create a working directory and populate it. It will create a boilerplate config/config.yaml, and create a handy vpipe short-cut script to invoke snakemake.

  • an additional option -b (not demonstrated above) lets you install a specific branch or tagged version. If nothing is specified, the master branch is installed.
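
To illustrate the -b option, here is a minimal sketch that assembles the install command with an optional branch or tag. `install_cmd` is a hypothetical helper and `my-branch` is a placeholder, not a real V-pipe branch name. The function only prints the command, so it can be reviewed before it is run.

```bash
# Hypothetical helper: print the quick_install.sh command line.
# Usage: install_cmd <install-dir> <work-dir> [branch-or-tag]
install_cmd() {
  cmd="bash quick_install.sh -p $1 -w $2"
  # -b is only added when a branch or tag is given; otherwise the
  # install script falls back to the master branch.
  if [ -n "${3:-}" ]; then
    cmd="$cmd -b $3"
  fi
  printf '%s\n' "$cmd"
}

install_cmd vp-analysis work            # default (master) branch
install_cmd vp-analysis work my-branch  # placeholder branch name
```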

After running the quick_install.sh script, you should have a directory structure like this:

vp-analysis
├── Mambaforge-Darwin-x86_64.sh
├── V-pipe # cloned from https://github.com/cbg-ethz/V-pipe
│   ├── CONTRIBUTING.md
│   └── ..
├── mambaforge # installation of dependencies including snakemake
│   ├── LICENSE.txt
│   └── ..
└── work # working directory
    ├── config.yaml
    └── vpipe

  • vp-analysis is the main directory where we will store everything.

  • mambaforge is the directory where conda is installed, including the dependencies needed to start using V-pipe.

  • V-pipe is the directory where V-pipe’s code is downloaded from GitHub.

  • work is the working directory. Each analysis of virus data is performed in a directory like work. To start a new analysis of a dataset, create a new directory, run init_project.sh inside it, and get started.
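
The layout described above can be checked with a few lines of shell. This is only a sketch: `check_layout` is a hypothetical helper, and it assumes the directory names used in this tutorial (`vp-analysis` from -p and `work` from -w).

```bash
# Hypothetical helper: verify that an install directory contains the
# three sub-directories listed above. Returns non-zero if any is missing.
check_layout() {
  status=0
  for sub in V-pipe mambaforge work; do
    if [ -d "$1/$sub" ]; then
      echo "ok:      $1/$sub"
    else
      echo "missing: $1/$sub"
      status=1
    fi
  done
  return "$status"
}

# Check the directory created by the quick install, if it exists here.
check_layout vp-analysis || echo "installation looks incomplete"
```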

Now you can check your installation with a small test dataset:

## not run

cd vp-analysis/work
# copy the example data from the repository to your working directory
cp -r ../V-pipe/docs/example_HIV_data/* .
# check what will be run with a dry run
./vpipe -n
# run vpipe on a small HIV test dataset
# this will install all dependencies and run the pipeline
./vpipe 

Tip

To create and populate other new working directories, you can call init_project.sh from within the new directory:

## not run
cd vp-analysis/

mkdir -p working_2
cd working_2
../V-pipe/init_project.sh
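
If you need several working directories at once, the same steps can be wrapped in a loop. The following is a sketch under stated assumptions: `init_workdirs` is a hypothetical helper, the directory names are placeholders, and it must be called from inside vp-analysis/ so that `../V-pipe/init_project.sh` resolves from each new directory.

```bash
# Hypothetical helper: create and initialise one working directory per
# argument. Assumes the current directory is vp-analysis/, i.e. the
# V-pipe checkout sits at ./V-pipe, and that names contain no slashes.
init_workdirs() {
  for name in "$@"; do
    mkdir -p "$name"
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$name" && ../V-pipe/init_project.sh ) || return 1
  done
}

# Example (directory names are placeholders):
# init_workdirs working_2 working_3
```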

## Other installation options

### Cloning the repository 

The V-pipe repository contains a snakemake pipeline. In order to run it directly with snakemake, clone the repository with:

git clone https://github.com/cbg-ethz/V-pipe.git


If you haven't already done so, install snakemake by following the [official instructions](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html); you can then run the pipeline with `snakemake --use-conda`.

Test the installation with a small dataset: 

```bash
## not run
mkdir work
cd work
cp -r ../V-pipe/docs/example_HIV_data/* .
snakemake -s ../V-pipe/workflow/Snakefile --use-conda --dry-run
snakemake -s ../V-pipe/workflow/Snakefile --use-conda --cores 4
```

Using Docker#

Note

The Docker image is only set up with the components needed to run the workflow for the HIV and SARS-CoV-2 virus base configurations. Using V-pipe with other viruses or configurations might require internet connectivity to fetch additional software components.

Create config.yaml and then populate the directory containing raw reads, typically samples/. For example, the following config file could be used:

general:
  virus_base_config: hiv

output:
  snv: true
  local: true
  global: false
  visualization: true
  QA: true
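
The same setup can be scripted. The sketch below writes the example configuration shown above and creates an empty samples/ directory for the raw reads; `write_example_config` is a hypothetical helper, not part of V-pipe.

```bash
# Hypothetical helper: write the example config.yaml shown above into a
# directory and create an empty samples/ directory next to it.
write_example_config() {
  mkdir -p "$1/samples"
  cat > "$1/config.yaml" <<'EOF'
general:
  virus_base_config: hiv

output:
  snv: true
  local: true
  global: false
  visualization: true
  QA: true
EOF
}

# Example: write_example_config .
```

The quoted heredoc delimiter keeps the shell from expanding anything inside the YAML.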

Then execute:

docker run --rm -it -v "$PWD":/work ghcr.io/cbg-ethz/v-pipe:master --jobs 4 --printshellcmds --dry-run

Using Snakedeploy#

Snakedeploy is Snakemake’s official workflow installer. Install it according to the official instructions, then deploy V-pipe into the current directory:

snakedeploy deploy-workflow https://github.com/cbg-ethz/V-pipe --tag master .
# edit config/config.yaml and provide samples/ directory
snakemake --use-conda --jobs 4 --printshellcmds --dry-run