Introduction

sci-rocket is a Snakemake workflow for processing sci-RNA-seq3 sequencing data. It performs barcode demultiplexing, followed by alignment and UMI counting with STARsolo.

Please see the set-up instructions below for more information on how to install and run the workflow.

Pre-requirements

Snakemake v8 does not currently support LSF. For LSF clusters (e.g. DKFZ), we recommend using Snakemake v7.32.4 instead.

  1. A conda-compatible package manager, e.g. conda, mamba or micromamba
  2. Snakemake and a cluster-specific Snakemake configuration for batch-job submission (see instructions below).

We use pre-defined environments (workflow/envs/) that contain all software dependencies. Snakemake installs these automatically when the workflow is run with --use-conda.

Set-up

  1. Clone the repository:

    git clone https://github.com/odomlab2/sci-rocket
    
  2. Download and install snakemake (e.g. using conda or micromamba):

    # Install Snakemake 7.32.4, Python 3.11.7 and mamba into a new conda environment called 'snakemake'
    micromamba create -c conda-forge -c bioconda -n snakemake snakemake==7.32.4 python==3.11.7 mamba
    # Switch to the 'snakemake' environment
    micromamba activate snakemake
    
  3. Run the workflow:

    cd workflow/
    snakemake --use-conda --profile <profile_name> --configfile <path_config>
    

Useful Snakemake parameters:

  • -n: Perform dry-run (generate commands without executing).
  • -p: Print shell commands.
  • --notemp: Do not remove files flagged as temporary.
  • --rerun-incomplete: Rerun all jobs whose output is recognized as incomplete (e.g. after an interrupted run).
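
As an illustration of how these flags combine with the run command, the following is a minimal sketch of a dry-run wrapper. The function name sci_rocket_dry_run is hypothetical and not part of sci-rocket; the profile name and config path are whatever you would pass to --profile and --configfile.

```shell
# Hypothetical helper (not part of sci-rocket): preview the jobs that would run.
# Usage: sci_rocket_dry_run <profile_name> <path_config>
sci_rocket_dry_run() {
    if [ "$#" -ne 2 ]; then
        echo "usage: sci_rocket_dry_run <profile_name> <path_config>" >&2
        return 2
    fi
    # -n: dry-run (nothing is executed); -p: print the shell command of each job
    snakemake -n -p --use-conda --profile "$1" --configfile "$2"
}
```

Call it from the workflow/ directory with the 'snakemake' environment activated. Once the planned jobs look right, run the same command without -n to start the workflow.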

Configuration

The workflow requires a configuration file (config.yaml) which can be copied from the example configuration file and adjusted to your needs.

The configuration file must specify the path to the sample sheet (path_samples). The sample sheet contains the sample names and the paths to the raw sequencing data (BCL or FASTQ).
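
Before starting a run, it can help to confirm that the configuration file exists and sets path_samples. The sketch below is one way to do that. The function name check_config is hypothetical, and the check assumes path_samples is written as a top-level YAML key (a line starting with "path_samples:"), which may not match every layout.

```shell
# Hypothetical pre-flight check (not part of sci-rocket).
# Assumes path_samples appears as a top-level key: "path_samples: ..."
check_config() {
    config="${1:-config.yaml}"   # defaults to config.yaml in the current directory
    if [ ! -f "$config" ]; then
        echo "Config file not found: $config" >&2
        return 1
    fi
    if ! grep -q '^path_samples:' "$config"; then
        echo "path_samples is not set in $config" >&2
        return 1
    fi
    echo "OK: $config defines path_samples"
}
```

For example, check_config <path_config> returns a non-zero exit status if the file is missing or the key is absent. It does not verify that the sample sheet itself exists or is well-formed.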