5/25/2016 - 7:33 AM

From http://biofinysics.blogspot.fr/2014/05/how-does-bowtie2-assign-mapq-scores.html


  • The hypothesis stated above is false (read the post).
  • In bowtie2, the best MAPQ a true multiread (AS=XS) can get is 1, regardless of how low or high its multiplicity. This occurs when there are 0 or 1 mismatches over perfect base calls in the read, i.e. when AS=XS is between 0 and -6. When there are 2-5 mismatches over perfect base calls (AS=XS <= -12, i.e. from -12 down to -30.6), the MAPQ becomes 0.
  • If someone wanted to exclude "true multireads" from their data set, using MAPQ >= 2 would work.
    • However, this would also exclude any uniquely mapping reads with >=4 mismatches over high quality bases.
    • In terms of high quality bases and unireads, MAPQ >= 3 allows up to 3 mismatches, MAPQ >= 23 allows up to 2 mismatches, MAPQ >= 40 allows up to 1 mismatch, and MAPQ >= 42 allows 0 mismatches. There will also be other "maxireads" in most or all of these sets.
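
The rules above can be sketched in a few lines of Python. This is only an illustration, not how bowtie2 itself works: it assumes a SAM file produced by bowtie2, where MAPQ is the 5th column and the AS:i / XS:i optional tags hold the best and second-best alignment scores (XS:i is absent when no second alignment was found). The function names and the two sample lines are hypothetical.

```python
def parse_alignment(sam_line):
    """Return (mapq, AS, XS) for one SAM alignment line; AS/XS are None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    mapq = int(fields[4])  # MAPQ is the 5th mandatory SAM column
    scores = {}
    for opt in fields[11:]:  # optional fields have the form TAG:TYPE:VALUE
        tag, value_type, value = opt.split(":", 2)
        # Only integer-typed AS/XS tags are alignment scores.
        if tag in ("AS", "XS") and value_type == "i":
            scores[tag] = int(value)
    return mapq, scores.get("AS"), scores.get("XS")


def is_true_multiread(best_score, second_score):
    """A 'true multiread' in the sense of the post: AS equals XS."""
    return best_score is not None and best_score == second_score


def keep_alignment(sam_line, min_mapq=2):
    """MAPQ >= 2 drops true multireads (which get MAPQ 0 or 1)."""
    mapq, _, _ = parse_alignment(sam_line)
    return mapq >= min_mapq


# Hypothetical sample alignments, made up for illustration only.
SEQ, QUAL = "A" * 50, "I" * 50
multi = f"r1\t0\tchr1\t100\t1\t50M\t*\t0\t0\t{SEQ}\t{QUAL}\tAS:i:0\tXS:i:0"
uni = f"r2\t0\tchr1\t200\t42\t50M\t*\t0\t0\t{SEQ}\t{QUAL}\tAS:i:0"

for line in (multi, uni):
    mapq, a_s, xs = parse_alignment(line)
    print(mapq, is_true_multiread(a_s, xs), keep_alignment(line))
```

Raising `min_mapq` to 3, 23, 40 or 42 would apply the stricter cutoffs listed above, with the same caveat that the filter acts on MAPQ only and does not look at mismatches directly.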

My test using bowtie2 on the uORF ribo-seq datasets

With MAPQ > 31, the number of reads kept approaches the number that bowtie2's own report gives as "uniquely mapped". However, many people say that this report is not accurate anyway.
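
A comparison like this one can be reproduced with a small counter. The sketch below is not the script used for the test above; it assumes single-end reads in an uncompressed SAM file, and the function name and demo lines are hypothetical. It skips header lines, unmapped reads (flag 0x4) and secondary or supplementary records (flags 0x100, 0x800), then counts how many remaining alignments exceed the cutoff.

```python
def count_above_mapq(sam_lines, cutoff=31):
    """Return (kept, mapped): alignments with MAPQ > cutoff, and all mapped primary ones."""
    kept = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue  # unmapped, secondary, or supplementary record
        mapped += 1
        if int(fields[4]) > cutoff:
            kept += 1
    return kept, mapped


# Hypothetical records, made up for illustration only.
demo = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t42\t30M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t200\t1\t30M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t0\tchr2\t300\t23\t30M\t*\t0\t0\t*\t*",
]
kept, mapped = count_above_mapq(demo)
print(f"{kept} of {mapped} mapped reads have MAPQ > 31")
```

On a real file, the lines would come from something like `open("aligned.sam")` (a placeholder path), and `kept` is the number to set against the uniquely mapped count in bowtie2's summary.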