The aim is to first determine which alignment flags are present in a SAM formatted file. These flags can then be used to extract the associated reads from the SAM file in FASTQ format (in singlet, left and right, or interleaved modes).
npm install -g samfilter
Install with -g (global) so that the two tools in this package, countsamflags and flagfiltersamfile, can be run directly from the command line.
countsamflags takes a SAM formatted file as input. SAM alignment flags are explained here.
It outputs tab separated data: the flag (number), its count, and the flags it encodes.
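To illustrate how a numeric flag breaks down into individual flags, here is a minimal Python sketch that decodes the bits defined in the SAM specification. The function name decode_flag and the label wording are chosen for this example only; the labels printed by countsamflags may be worded differently.

```python
# SAM FLAG bits as defined in the SAM specification.
# The label texts are illustrative; countsamflags may word them differently.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the labels of all bits that are set in a numeric SAM flag."""
    return [label for bit, label in SAM_FLAG_BITS.items() if flag & bit]


print(97, decode_flag(97))
print(145, decode_flag(145))
```

Flag 97 is 1 + 32 + 64 and flag 145 is 1 + 16 + 128, so the two flags describe the first and the second read of the same kind of pair.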
To speed up the process, a set number of lines can be sampled from the SAM formatted file. In non-silent mode a progress bar is displayed.
countsamflags -i example.sam -l 1000 > counts.tab
This command processes the first 1000 alignments and returns the alignment flags and their counts. The results are redirected into the counts.tab file for viewing or further processing.
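The counting step can be pictured with the following Python sketch. It is not the implementation used by countsamflags. It assumes that header lines (starting with @) are skipped and do not count toward the limit, which the tool may handle differently. The function name count_flags and the example records are made up for illustration.

```python
from collections import Counter
from itertools import islice


def count_flags(sam_lines, limit=None):
    """Count the FLAG column (field 2) of up to `limit` alignment lines."""
    # Assumption: header lines start with "@" and are not counted toward the limit.
    alignments = (
        line for line in sam_lines if line.strip() and not line.startswith("@")
    )
    counts = Counter()
    for line in islice(alignments, limit):
        counts[int(line.split("\t")[1])] += 1
    return counts


# Hypothetical SAM records, for illustration only.
example = [
    "@HD\tVN:1.6\tSO:queryname",
    "r1\t97\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r1\t145\tchr1\t200\t60\t4M\t=\t100\t-104\tTTGA\tIIII",
    "r2\t97\tchr1\t300\t60\t4M\t=\t400\t104\tGGCA\tIIII",
    "r2\t145\tchr1\t400\t60\t4M\t=\t300\t-104\tCCAT\tIIII",
]

for flag, count in sorted(count_flags(example).items()):
    print(f"{flag}\t{count}")
```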
flagfiltersamfile requires that the SAM input file is name sorted.
Input: a SAM formatted file.
Output: FASTQ formatted files.
flagfiltersamfile -i example.sam -f 97,145 -m interleavedReads.fastq
This command extracts the reads that have either flag 97 or flag 145 from the SAM file and stores them in interleavedReads.fastq.
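As a rough picture of what extracting a read involves, the Python sketch below turns one SAM alignment line into a FASTQ record. It is a sketch under stated assumptions, not the code used by flagfiltersamfile. SAM stores the sequence of a reverse strand read (bit 0x10) reverse complemented, and the sketch restores the original orientation. Whether flagfiltersamfile does the same is not stated here. The function name sam_line_to_fastq and the example record are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line):
    """Convert one SAM alignment line into a four-line FASTQ record."""
    fields = line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    # Assumption: reads aligned to the reverse strand are returned in their
    # original orientation, as common SAM-to-FASTQ converters do.
    if flag & 0x10:
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical record with flag 145 (reverse strand, second in pair).
record = "r1\t145\tchr1\t200\t60\t4M\t=\t100\t-104\tTTGA\tIIJK"
print(sam_line_to_fastq(record), end="")
```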
Note: paired reads are only extracted as pairs, so at least two complementary flags have to be provided. Pairs are further enforced by read name (ID). Complementary flags typically have equal counts, so counting helps to determine the proper flags. The count output also lists all flags and their meanings, so counting a sample first is beneficial.
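One way to find the complementary flag for a given flag is to swap the bits that describe the read with the bits that describe its mate. The Python sketch below does this. It reads "complementary" as "the flag expected on the mate", which is an assumption, and it only holds for ordinary primary alignments of paired reads. The function name mate_flag is made up for this example.

```python
def mate_flag(flag):
    """Return the flag expected on the mate by swapping read/mate bit pairs."""
    # Bit pairs that mirror each other between a read and its mate:
    # unmapped / mate unmapped, reverse / mate reverse, first / second in pair.
    swaps = [(0x4, 0x8), (0x10, 0x20), (0x40, 0x80)]
    mate = flag
    for a, b in swaps:
        has_a, has_b = bool(flag & a), bool(flag & b)
        mate &= ~(a | b)  # clear both bits, then set them swapped
        if has_a:
            mate |= b
        if has_b:
            mate |= a
    return mate


for flag in (97, 99, 83):
    print(flag, mate_flag(flag))
```

With this mapping, flag 97 pairs with 145, which matches the -f 97,145 example above. Comparing the counts of a flag and its mate flag in the count output is a quick check that both were chosen correctly.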
You can submit errors or feature requests here: https://bitbucket.org/allmer/ios/src/master/