TRINITY

From HPCC Wiki

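Trinity assembles transcript sequences de novo from RNA-seq data using three modules (Inchworm, Chrysalis, and Butterfly) applied in sequence. Briefly, the process works like so: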

Inchworm assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.

Chrysalis groups the Inchworm contigs into clusters and constructs a complete de Bruijn graph for each cluster. Each cluster represents the full transcriptional complexity of a given gene (or of a set of genes that share sequences). Chrysalis then partitions the full read set among these disjoint graphs.

Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within each graph, ultimately reporting full-length transcripts for alternatively spliced isoforms and teasing apart transcripts that correspond to paralogous genes.

Additional Trinity documentation can be found at the Trinity website.

Trinity is installed on the Penzias cluster. To run it, first load the Trinity environment module:

module load trinity

Note that this also loads the module for BOWTIE, which Trinity requires. Once your environment is set, you can submit a job to the SLURM queue. Here is a typical SLURM script that submits a 4-core Trinity job:

#!/bin/bash
#SBATCH --partition production
#SBATCH --job-name trinity
#SBATCH --nodes=1
#SBATCH --ntasks=4

# set unlimited stack size
ulimit -s unlimited

# set number of cores and memory requirements
NCORES=4
JMEM=8G
export OMP_NUM_THREADS=$NCORES

# set output directory (the name must contain the string "trinity")
MY_OUT_DIR=myOutDir_trinity

echo "----------------Starting-----------------------"
cd "$SLURM_SUBMIT_DIR"
# run trinity; stdout and stderr both go to log_trinity
Trinity --CPU $NCORES --bflyCPU $NCORES --bflyGCThreads $NCORES \
    --max_memory $JMEM \
    --seqType fq --SS_lib_type RF \
    --left reads.left.fq  --right reads.right.fq \
    --output "$MY_OUT_DIR" >& log_trinity
echo "-----------------Done--------------------------"
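
The script sets the core count twice, once in "#SBATCH --ntasks=4" and once in "NCORES=4", and the two can drift apart when only one is edited. Below is a minimal sketch of one way to avoid that. It assumes SLURM exports SLURM_NTASKS inside the job, which it does when --ntasks is given. The function name resolve_ncores and the fallback value of 4 are illustrative choices, not part of the site's example script.

```shell
#!/bin/bash
# Hypothetical helper: print the task count SLURM granted to this job,
# falling back to 4 when SLURM_NTASKS is unset (e.g. outside a job).
resolve_ncores() {
    echo "${SLURM_NTASKS:-4}"
}

NCORES=$(resolve_ncores)
export OMP_NUM_THREADS=$NCORES
echo "using $NCORES cores"
```

With this in place of the fixed "NCORES=4" line, changing --ntasks in the header is enough to change the value passed to --CPU.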


  • Note that Trinity runs only in SMP (shared-memory) mode, so a job cannot span nodes. In the #SBATCH directives, request a single node (--nodes=1) with at most 12 tasks (--ntasks).
  • The name of the output directory must contain the string "trinity"; otherwise the Trinity run fails immediately.
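
Both constraints above can be checked before Trinity starts, so a misconfigured job fails with a clear message. The following is a rough sketch, not part of the site's example script. The function names check_outdir_name and check_ncores are hypothetical, and the name check matches the lowercase string "trinity" only, which is the conservative reading of the rule.

```shell
#!/bin/bash
# Hypothetical pre-flight checks for the two constraints noted above.

# Succeeds only if the directory name contains the lowercase string "trinity".
check_outdir_name() {
    case "$1" in
        *trinity*) return 0 ;;
        *) echo "error: output directory '$1' must contain 'trinity'" >&2
           return 1 ;;
    esac
}

# Succeeds only if the core count is an integer between 1 and 12.
check_ncores() {
    case "$1" in
        ''|*[!0-9]*)
            echo "error: core count '$1' is not a positive integer" >&2
            return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 12 ]; then
        echo "error: core count $1 is outside 1..12" >&2
        return 1
    fi
    return 0
}
```

In the job script, these could be called just before the Trinity command, for example: check_outdir_name "$MY_OUT_DIR" || exit 1, followed by check_ncores "$NCORES" || exit 1.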

You can find an example SLURM script and sample input files (reads.left.fq and reads.right.fq) in the "/share/apps/trinity/default/example" directory.
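
After a run finishes, whether on the sample data or your own reads, a quick sanity check is to count the assembled transcripts. The sketch below assumes the assembly is written as Trinity.fasta inside the output directory. That holds for many Trinity releases, but some newer ones write the file next to the output directory under a different name, so adjust the path to match your installation. The function name count_transcripts is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count FASTA records (header lines start with ">").
count_transcripts() {
    if [ ! -s "$1" ]; then
        echo "missing or empty assembly: $1" >&2
        return 1
    fi
    grep -c '^>' "$1"
}

# Assumed output location; adjust for your Trinity version.
ASSEMBLY=myOutDir_trinity/Trinity.fasta
if [ -f "$ASSEMBLY" ]; then
    echo "transcripts assembled: $(count_transcripts "$ASSEMBLY")"
fi
```

If the file is missing or empty, log_trinity is the first place to look for the cause.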