Filter SAM file by sequence identity and alignment ratio

install

Pre-compiled binary

wget https://github.com/jianshu93/sam_filter/releases/download/v0.1.0/sam_filter_Linux_x86-64_v01.0.zip
unzip sam_filter_Linux_x86-64_v01.0.zip
chmod a+x ./sam_filter
./sam_filter -h

Compile from source

git clone https://github.com/jianshu93/sam_filter
cd sam_filter
cargo build --release
./target/release/sam_filter -h

usage

Filters CIGAR strings by sequence identity and alignment ratio in SAM files

Usage: sam_filter [OPTIONS] --identity <min_identity> --ratio <min_ratio>

Options:
  -i, --identity <min_identity>  Minimum matching identity as a float
  -r, --ratio <min_ratio>        Minimum query alignment ratio as a float
      --reverse                  If used, print lines where identity <= min_identity (ignoring ratio), otherwise print lines where identity > min_identity and ratio > min_ratio
  -h, --help                     Print help
  -V, --version                  Print version
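
The help text above does not spell out how identity and ratio are derived, so the following is only a rough sketch under stated assumptions, not a description of what sam_filter does internally. It assumes identity = (aligned columns - NM) / aligned columns, where aligned columns are the M, =, X, I and D operations in the CIGAR string and NM is the edit-distance tag that Bowtie2 writes, and ratio = aligned query bases (M, =, X, I) / read length taken from the SEQ column. The function name `sam_identity_ratio` is hypothetical.

```shell
# Hypothetical helper: prints "read_name<TAB>identity<TAB>ratio" per SAM record on stdin.
# ASSUMPTION: identity = (aligned columns - NM) / aligned columns
# ASSUMPTION: ratio    = aligned query bases / length of SEQ (column 10)
# These definitions are illustrative and may differ from sam_filter's own.
sam_identity_ratio() {
  awk -F'\t' '
    /^@/      { next }   # skip header lines, if any
    $6 == "*" { next }   # skip records without an alignment
    {
      cigar = $6; cols = 0; qaln = 0
      # consume the CIGAR string one <length><op> pair at a time
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        n  = substr(cigar, 1, RLENGTH - 1) + 0
        op = substr(cigar, RLENGTH, 1)
        if (op == "M" || op == "=" || op == "X" || op == "I") { cols += n; qaln += n }
        else if (op == "D") { cols += n }
        cigar = substr(cigar, RLENGTH + 1)
      }
      # look for the NM:i:<edit distance> optional tag
      nm = 0
      for (i = 12; i <= NF; i++) if ($i ~ /^NM:i:/) nm = substr($i, 6) + 0
      if (cols == 0) next
      printf "%s\t%.4f\t%.4f\n", $1, (cols - nm) / cols, qaln / length($10)
    }'
}
```

Under these assumptions, a 10-base read aligned as 8M2S with NM:i:1 gets identity 7/8 and ratio 8/10, so it would fail a `-i 0.98 -r 0.90` filter on both counts.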

Generate sensitive mappings without SAM header (Bowtie2 v2.5.4 or later)

### interleaved mode
bowtie2 -p 64 -f --interleaved ./359250487_S94_L007_interleaved.fasta --seed 42 --very-sensitive -k 16 --np 1 --mp "1,1" --rdg "0,1" --rfg "0,1" --score-min "L,0,-0.05" --no-head --no-unal --no-1mm-upfront -S 359250487_S94_L007.sam -x pUC57.fasta

### R1 and R2 mode
bowtie2 -p 64 -q -1 359250487_S94_L007_R1.fastq.gz -2 359250487_S94_L007_R2.fastq.gz --seed 42 --very-sensitive -k 16 --np 1 --mp "1,1" --rdg "0,1" --rfg "0,1" --score-min "L,0,-0.05" --no-head --no-unal --no-1mm-upfront -S 359250487_S94_L007.sam -x pUC57.fasta

### R1 and R2 separate mapping
bowtie2 -p 64 -q 359250487_S94_L007_R1.fastq.gz --seed 42 --very-sensitive -k 16 --np 1 --mp "1,1" --rdg "0,1" --rfg "0,1" --score-min "L,0,-0.05" --no-head --no-unal --no-1mm-upfront -S 359250487_S94_L007_R1.sam -x pUC57.fasta

bowtie2 -p 64 -q 359250487_S94_L007_R2.fastq.gz --seed 42 --very-sensitive -k 16 --np 1 --mp "1,1" --rdg "0,1" --rfg "0,1" --score-min "L,0,-0.05" --no-head --no-unal --no-1mm-upfront -S 359250487_S94_L007_R2.sam -x pUC57.fasta
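
The Bowtie2 invocations above share the same scoring and reporting options and differ only in their inputs and output file. One way to keep them in sync is a small wrapper like the sketch below. The names `BT2_OPTS`, `bt2_sensitive` and `RUN` are hypothetical and not part of this repository; the option values are copied from the commands above. By default the wrapper only prints the command so it can be inspected first.

```shell
# Shared alignment options, copied from the commands above.
# None of the values contain spaces, so a plain string is sufficient.
BT2_OPTS='--seed 42 --very-sensitive -k 16 --np 1 --mp 1,1 --rdg 0,1 --rfg 0,1 --score-min L,0,-0.05 --no-head --no-unal --no-1mm-upfront'

# Hypothetical wrapper: prints the full bowtie2 command; set RUN=1 to execute it.
bt2_sensitive() {
  # $BT2_OPTS is intentionally unquoted so it splits into separate arguments
  set -- bowtie2 -p 64 $BT2_OPTS -x pUC57.fasta "$@"
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}
```

For example, `bt2_sensitive -q 359250487_S94_L007_R2.fastq.gz -S 359250487_S94_L007_R2.sam` prints the R2 command, and prefixing it with `RUN=1` runs it.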

git clone https://github.com/jianshu93/sam_filter.git
cd sam_filter
cargo build --release
### extract high identity matches
cat data/test.sam | ./target/release/sam_filter -i 0.98 -r 0.90 > test.filtered.sam

### extract low identity matches (written to a separate file so the high identity output is not overwritten)
cat data/test.sam | ./target/release/sam_filter -i 0.98 -r 0.90 --reverse > test.low_identity.sam


### Extract matched or unmatched reads from the original FASTQ files after the filtering above. SeqKit (v2.9.0 or later) can be used.

# directory holding the original per-sample FASTQ files
fastq_dir=/qmounts/qiita_data/per_sample_FASTQ/194282

for R1_file in *_R1.*.sam; do
    base=${R1_file%_R1.*}
    R2_file="${base}_R2.filtered.sam"

    # R1: read names from the filtered SAM, then matched and unmatched reads
    awk '{print $1}' "$R1_file" > "${base}_R1.match.txt"
    seqkit grep -f "${base}_R1.match.txt" "${fastq_dir}/${base}_R1_001.trimmed.fastq.gz" > "../filter_matched_fasta/${base}_R1.fastq"
    seqkit grep -v -f "${base}_R1.match.txt" "${fastq_dir}/${base}_R1_001.trimmed.fastq.gz" > "../filter_unmatched_fasta/${base}_R1.fastq"

    # R2: same steps for the mate file
    awk '{print $1}' "$R2_file" > "${base}_R2.match.txt"
    seqkit grep -f "${base}_R2.match.txt" "${fastq_dir}/${base}_R2_001.trimmed.fastq.gz" > "../filter_matched_fasta/${base}_R2.fastq"
    seqkit grep -v -f "${base}_R2.match.txt" "${fastq_dir}/${base}_R2_001.trimmed.fastq.gz" > "../filter_unmatched_fasta/${base}_R2.fastq"
done
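
Because the mappings above are generated with `-k 16`, one read can appear on several SAM lines, so the list of read names may contain repeats. A minimal sketch of a de-duplicating extraction step is shown here. The function name `unique_read_names` is hypothetical, and the sketch assumes tab-separated SAM records with the read name in the first column.

```shell
# Hypothetical helper: print each read name once from one or more SAM files.
# Header lines (starting with "@") are skipped in case the SAM was written with a header.
unique_read_names() {
  awk -F'\t' '!/^@/ { print $1 }' "$@" | sort -u
}
```

It could stand in for the `awk '{print $1}'` step, for example `unique_read_names "${base}_R1.filtered.sam" > "${base}_R1.match.txt"`.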
