This blog by Tommy Tang is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Thursday, June 15, 2017

bwa aln or bwa mem for short reads (36bp)

My ChIP-seq data are 36bp single-end reads. I usually map ChIP-seq reads with bowtie1, but bowtie1 does not do gapped alignment, so it cannot align reads across indels. Since I want to call mutations from the ChIP-seq reads, I have to use another aligner: BWA, one of the most popular mappers, written by Heng Li.

The BWA GitHub page says that bwa aln should be used for reads shorter than 70bp, and bwa mem otherwise.
bwa mem is the more recent algorithm, so one might expect it to be better across the board.
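
For reference, the two workflows also differ in shape: for single-end reads, bwa aln writes a .sai file that a second step (bwa samse) converts to SAM, while bwa mem writes SAM directly. Here is a minimal Python sketch that only builds the command lines. It assumes the reference has already been indexed with bwa index; the file names, output prefix, and thread count are placeholders, not the ones used for this post.

```python
import shlex


def bwa_aln_commands(ref, fastq, prefix, threads=4):
    """Build the two shell commands for single-end bwa aln followed by bwa samse."""
    ref_q, fq_q = shlex.quote(ref), shlex.quote(fastq)
    sai = shlex.quote(prefix + ".sai")
    sam = shlex.quote(prefix + ".aln.sam")
    return [
        # Step 1: find suffix-array coordinates of the reads.
        f"bwa aln -t {threads} {ref_q} {fq_q} > {sai}",
        # Step 2: turn the .sai file into SAM alignments (single-end mode).
        f"bwa samse {ref_q} {sai} {fq_q} > {sam}",
    ]


def bwa_mem_command(ref, fastq, prefix, threads=4):
    """Build the single shell command for bwa mem on single-end reads."""
    ref_q, fq_q = shlex.quote(ref), shlex.quote(fastq)
    sam = shlex.quote(prefix + ".mem.sam")
    return f"bwa mem -t {threads} {ref_q} {fq_q} > {sam}"


if __name__ == "__main__":
    # Placeholder inputs; print the commands instead of running them.
    for cmd in bwa_aln_commands("ref.fa", "reads.fq", "out"):
        print(cmd)
    print(bwa_mem_command("ref.fa", "reads.fq", "out"))
```

Both aligners are left at their default settings here, apart from the thread count.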

I searched Biostars and found the thread "When and why is bwa aln better then bwa mem?".

I ran a simulation test with Teaser, using the default settings for each aligner.

The results are shown below:

The mapping rate:


Memory usage:


Run time:

Indeed, in the simulation bwa aln is a little better than bwa mem for short reads.

For a real data set, the samtools flagstat results are shown below:

bwa aln:
282967631 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
18963259 + 0 duplicates
240660130 + 0 mapped (85.05% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

bwa mem:
282967631 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
18332921 + 0 duplicates
236558306 + 0 mapped (83.60% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
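
The mapped percentages and the gap between the two aligners can be recomputed from the counts above. Here is a small Python sketch, assuming the flagstat text layout shown in this post; parse_flagstat is a hypothetical helper written for illustration, not part of samtools.

```python
import re


def parse_flagstat(text):
    """Return (total, mapped) QC-passed read counts from samtools flagstat text."""
    total = mapped = None
    for line in text.splitlines():
        # Match only the "in total" and plain "mapped" lines, not "mate mapped".
        m = re.match(r"(\d+) \+ \d+ (in total|mapped) ", line.strip())
        if not m:
            continue
        if m.group(2) == "in total":
            total = int(m.group(1))
        else:
            mapped = int(m.group(1))
    return total, mapped


# The relevant lines copied from the two flagstat reports above.
ALN = """282967631 + 0 in total (QC-passed reads + QC-failed reads)
240660130 + 0 mapped (85.05% : N/A)"""
MEM = """282967631 + 0 in total (QC-passed reads + QC-failed reads)
236558306 + 0 mapped (83.60% : N/A)"""

aln_total, aln_mapped = parse_flagstat(ALN)
mem_total, mem_mapped = parse_flagstat(MEM)

aln_pct = 100.0 * aln_mapped / aln_total
mem_pct = 100.0 * mem_mapped / mem_total
extra_reads = aln_mapped - mem_mapped  # reads mapped by bwa aln but not counted for bwa mem

if __name__ == "__main__":
    print(f"bwa aln: {aln_pct:.2f}% mapped")
    print(f"bwa mem: {mem_pct:.2f}% mapped")
    print(f"difference: {extra_reads} reads ({aln_pct - mem_pct:.2f} percentage points)")
```

On these counts, bwa aln maps 4101824 more reads than bwa mem, about 1.45 percentage points of the total.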

Consistent with the simulation, bwa aln has a moderately higher mapping rate (85.05% vs. 83.60%) and a shorter run time for short 36bp reads.
