Subread | RCIF

See the Subread project page. To use Subread, load it with the module tool.

You can see what versions are available by using:

[me@login01 ~]$ module avail subread

------------------------------ /opt/modulefiles -------------------------------
   subread/2.0.6

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

To load a specific version, you would use:

[me@login01 ~]$ module load subread/2.0.6

Loading "subread" without a version number loads the default version instead, subread/2.0.6 in this case.
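
To confirm that the module load worked, you can check whether the Subread programs are on your PATH. The following is a minimal sketch: `check_tool` is a hypothetical helper written for this page, not part of Subread or the module tool, and it assumes the module places `subread-align` on your PATH.

```shell
# Hypothetical helper (not part of Subread or the module tool):
# report whether a command can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# After "module load subread/2.0.6" this is expected to report the tool as found;
# otherwise a hint is printed.
check_tool subread-align || echo "try: module load subread/2.0.6"
```

If the tool is reported as missing, run the module load command again in the same shell session before retrying.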

You should now be able to run Subread commands:

[me@login01 ~]$ subread-align

Version 2.0.6

Usage:

./subread-align [options] -i <index_name> -r <input> -t <type> -o <output>

## Mandatory arguments:
    
  -i <string>       Base name of the index.

  -r <string>       Name of an input read file. If paired-end, this should be
                    the first read file (typically containing "R1"in the file
                    name) and the second should be provided via "-R".
                    Acceptable formats include gzipped FASTQ, FASTQ, gzipped
                    FASTA and FASTA.
                    These formats are identified automatically.
    
  -t <int>          Type of input sequencing data. Its values include
                      0: RNA-seq data
                      1: genomic DNA-seq data.
    
## Optional arguments:
# input reads and output
    
  -o <string>       Name of an output file. By default, the output is in BAM
                    format. Omitting this option makes the output be written to
                    STDOUT.

  -R <string>       Name of the second read file in paired-end data (typically
                    containing "R2" the file name).

  --SAMinput        Input reads are in SAM format.

  --BAMinput        Input reads are in BAM format.

  --SAMoutput       Save mapping results in SAM format.

# Phred offset

  -P <3:6>          Offset value added to the Phred quality score of each read
                    base. '3' for phred+33 and '6' for phred+64. '3' by default.

# thresholds for mapping

  -n <int>          Number of selected subreads, 10 by default.

  -m <int>          Consensus threshold for reporting a hit (minimal number of
                    subreads that map in consensus) . If paired-end, this gives
                    the consensus threshold for the anchor read (anchor read
                    receives more votes than the other read in the same pair).
                    3 by default

  -p <int>          Consensus threshold for the non- anchor read in a pair. 1 by
                    default.

  -M <int>          Maximum number of mis-matched bases allowed in each reported
                    alignment. 3 by default. Mis-matched bases found in soft-
                    clipped bases are not counted.

# unique mapping and multi-mapping

  --multiMapping    Report multi-mapping reads in addition to uniquely mapped
                    reads. Use "-B" to set the maximum number of equally-best
                    alignments to be reported.

  -B <int>          Maximum number of equally-best alignments to be reported for
                    a multi-mapping read. Equally-best alignments have the same
                    number of mis-matched bases. 1 by default.

# indel detection

  -I <int>          Maximum length (in bp) of indels that can be detected. 5 by
                    default. Indels of up to 200bp long can be detected.

  --complexIndels   Detect multiple short indels that are in close proximity
                    (they can be as close as 1bp apart from each other).

# read trimming

  --trim5 <int>     Trim off <int> number of bases from 5' end of each read. 0
                    by default.

  --trim3 <int>     Trim off <int> number of bases from 3' end of each read. 0
                    by default.

# distance and orientation of paired end reads

  -d <int>          Minimum fragment/insert length, 50bp by default.

  -D <int>          Maximum fragment/insert length, 600bp by default.

  -S <ff:fr:rf>     Orientation of first and second reads, 'fr' by default (
                    forward/reverse).

# number of CPU threads

  -T <int>          Number of CPU threads used, 1 by default.

# read group

  --rg-id <string>  Add read group ID to the output.

  --rg <string>     Add <tag:value> to the read group (RG) header in the output.

# read order

  --keepReadOrder   Keep order of reads in BAM output the same as that in the
                    input file. Reads from the same pair are always placed next
                    to each other no matter this option is specified or not.

  --sortReadsByCoordinates Output location-sorted reads. This option is
                    applicable for BAM output only. A BAI index file is also
                    generated for each BAM file so the BAM files can be directly
                    loaded into a genome browser.

# color space reads

  -b                Convert color-space read bases to base-space read bases in
                    the mapping output. Note that read mapping is performed at
                    color-space.

# dynamic programming

  --DPGapOpen <int> Penalty for gap opening in short indel detection. -1 by
                    default.

  --DPGapExt <int>  Penalty for gap extension in short indel detection. 0 by
                    default.

  --DPMismatch <int> Penalty for mismatches in short indel detection. 0 by
                    default.

  --DPMatch <int>   Score for matched bases in short indel detection. 2 by
                    default.

# detect structural variants

  --sv              Detect structural variants (eg. long indel, inversion,
                    duplication and translocation) and report breakpoints. Refer
                    to Users Guide for breakpoint reporting.

# gene annotation

  -a                Name of an annotation file (gzipped file is accepted).
                    GTF/GFF format by default. See -F option for more format
                    information.

  -F                Specify format of the provided annotation file. Acceptable
                    formats include 'GTF' (or compatible GFF format) and
                    'SAF'. 'GTF' by default. For SAF format, please refer to
                    Users Guide.

  -A                Provide a chromosome name alias file to match chr names in
                    annotation with those in the reads. This should be a two-
                    column comma-delimited text file. Its first column should
                    include chr names in the annotation and its second column
                    should include chr names in the index. Chr names are case
                    sensitive. No column header should be included in the
                    file.

  --gtfFeature <string>  Specify feature type in GTF annotation. 'exon'
                    by default. Features used for read counting will be 
                    extracted from annotation using the provided value.

  --gtfAttr <string>     Specify attribute type in GTF annotation. 'gene_id'
                    by default. Meta-features used for read counting will be 
                    extracted from annotation using the provided value.

# others

  -v                Output version of the program.

Refer to Users Manual for detailed description to the arguments.
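
As an illustration of how the options listed above fit together, here is a sketch of a paired-end RNA-seq alignment. The function name `subread_align_paired` and all file and index names are hypothetical placeholders, and the index is assumed to have been built beforehand (for example with subread-buildindex). By default the function only prints the command so you can review it; it runs the command only when RUN=1 is set.

```shell
# Sketch only: subread_align_paired is a hypothetical wrapper, not part of Subread.
# Arguments: <index base name> <R1 file> <R2 file> <output file> [threads]
subread_align_paired() {
    # Flags taken from the usage text above:
    #   -t 0   RNA-seq data
    #   -i     base name of the index
    #   -r/-R  first and second read files of a pair
    #   -o     output file (BAM format by default)
    #   -T     number of CPU threads (1 if not given)
    set -- subread-align -t 0 -i "$1" -r "$2" -R "$3" -o "$4" -T "${5:-1}"
    echo "$*"
    # Only execute when explicitly requested.
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Dry run with placeholder names: prints the command without running it.
subread_align_paired my_index sample_R1.fastq.gz sample_R2.fastq.gz sample.bam 4
```

Printing the command first makes it easy to check the file names and thread count before committing compute time. To run the alignment for real, prefix the call with RUN=1.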