Introduction

sci-rocket is a Snakemake workflow that processes sci-RNA-seq3 sequencing data, including barcode demultiplexing and downstream alignment and UMI counting with STARsolo.

Please see the set-up instructions below for more information on how to install and run the workflow.

Pre-requirements

Snakemake v8 does not currently support LSF. For LSF clusters (e.g., DKFZ), we recommend using Snakemake v7.32.4 instead.

- A conda-compatible package manager, e.g., conda, mamba, or micromamba.
- Snakemake and a cluster-specific Snakemake configuration for batch-job submission (see instructions below).
  - E.g., LSF or SLURM

The workflow uses predefined conda environments that contain all software dependencies (workflow/envs/). Snakemake installs these automatically when the workflow is run with --use-conda.

Set-up

Clone the repository:

git clone https://github.com/odomlab2/sci-rocket

Download and install Snakemake (e.g., using conda or micromamba):

# This will install snakemake (7.32.4) + Python 3.11.7 into a new conda environment called 'snakemake'
micromamba create -c conda-forge -c bioconda -n snakemake snakemake==7.32.4 python==3.11.7 mamba
# Switch to the 'snakemake' environment
micromamba activate snakemake
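
Because the LSF recommendation depends on the Snakemake major version, it can help to confirm which version the activated environment provides. The following is a minimal sketch: the helper name is_snakemake_v7 is hypothetical and not part of sci-rocket, and it only inspects the version string passed to it.

```shell
# Hypothetical helper (not part of sci-rocket): succeeds if the given
# version string belongs to the Snakemake 7.x series.
is_snakemake_v7() {
    case "$1" in
        7.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example, run inside the activated 'snakemake' environment:
#   is_snakemake_v7 "$(snakemake --version)" && echo "Snakemake 7.x detected"
```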

Run the workflow:

cd workflow/
snakemake --use-conda --profile <profile_name> --configfile <path_config>

Useful Snakemake parameters:

-n: Perform dry-run (generate commands without executing).

-p: Print shell commands.

--notemp: Do not remove files flagged as temporary.

--rerun-incomplete: Re-run all jobs whose output is recognized as incomplete (e.g., after an interrupted run).
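
The parameters above can be appended to the run command. One way to avoid retyping the common options is a small wrapper, sketched below. The function name run_sci_rocket is hypothetical and not part of sci-rocket, and the profile name and config path in the examples are placeholders.

```shell
# Hypothetical wrapper (not part of sci-rocket): runs the workflow with the
# given profile and config file, passing any further flags on to Snakemake.
run_sci_rocket() {
    profile="$1"
    config="$2"
    shift 2
    snakemake --use-conda --profile "$profile" --configfile "$config" "$@"
}

# Examples (run from workflow/; profile name and config path are placeholders):
#   run_sci_rocket my_profile /path/to/config.yaml -n -p
#   run_sci_rocket my_profile /path/to/config.yaml --rerun-incomplete
```

A dry-run with -n -p before the real run is a cheap way to check that the profile and configuration resolve to the expected jobs.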

Configuration

The workflow requires a configuration file (config.yaml), which can be copied from the example configuration file and adjusted to your needs.

The configuration file must specify the sample sheet (path_samples). This file contains the sample names and the paths to the raw sequencing data (BCL or FASTQ).
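
Before launching the workflow, a quick check that path_samples is set can catch an unedited configuration early. The sketch below assumes that path_samples is a top-level key in config.yaml; the helper name has_path_samples is hypothetical and not part of sci-rocket.

```shell
# Hypothetical check (not part of sci-rocket): succeeds if the given config
# file has a top-level 'path_samples:' entry with a non-empty value.
# Assumes path_samples is a top-level YAML key.
has_path_samples() {
    grep -Eq '^path_samples:[[:space:]]*[^[:space:]#]' "$1"
}

# Example (the config path is a placeholder):
#   has_path_samples /path/to/config.yaml || echo "path_samples is not set"
```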