Late last year I mapped random DNA sequences of various lengths back to the genome. The purpose was simply to see how long sequenced reads need to be before they can be uniquely mapped to the genome. I couldn’t find any statistics on this, so I did it myself. I didn’t dwell on the results too much.
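For anyone who wants to try something similar, here is a minimal sketch of this kind of experiment. It is not the code I used. It assumes a small random string as a stand-in for a real genome, counts a fragment as mapped if it has at least one perfect match on either strand, and uses made-up helper names (`random_dna`, `count_perfect_matches`). A real run would use an actual genome and a proper read mapper.

```python
import random

def random_dna(length, rng):
    """Return a random DNA string of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_perfect_matches(genome, fragment_length, n_fragments, rng):
    """Count random fragments that occur at least once on either strand."""
    both_strands = (genome, reverse_complement(genome))
    hits = 0
    for _ in range(n_fragments):
        fragment = random_dna(fragment_length, rng)
        if any(fragment in strand for strand in both_strands):
            hits += 1
    return hits

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a genome (assumption); far smaller than a real one.
    toy_genome = random_dna(100_000, rng)
    for k in range(6, 13):
        hits = count_perfect_matches(toy_genome, k, 1_000, rng)
        print(f"{k} nt: {hits} of 1000 random fragments have a perfect match")
```

Because the toy genome is tiny, the drop-off happens at much shorter fragment lengths than it would for a real genome, but the shape of the curve is the point.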
One day, I heard someone say that miRNAs have definitely evolved to be 22-23 nt long. As with everything in nature, there’s a reason why things are the way they are. So just now, I decided to look back at my results.
The result to look at is the number of perfect matches vs. the length of the random DNA fragment. Once the random fragments were longer than 20 nt, very few of them had a perfect match in the genome. At 21 nt, roughly 0.1% (982 out of 1_000_000) of the random DNA fragments mapped to the genome. At 22 nt, the number of mappable random DNA fragments drops to 255 out of 1_000_000, and at 23 nt, to 65 out of 1_000_000. A length of 22-23 nt therefore seems like a good compromise between sequence uniqueness and the shortest possible length, which is probably why miRNAs are this length. Even if there is a point mutation in a 22 nt miRNA, it will most likely still only bind to its intended target.
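A quick sanity check on the counts above: each extra nucleotide multiplies the number of possible sequences by 4, so the number of chance matches should fall roughly 4-fold per added nucleotide. The short script below works this out from the three reported counts only; the function name `fold_drops` is just for illustration.

```python
# Counts reported above: fragment length (nt) -> fragments with a perfect match
matches = {21: 982, 22: 255, 23: 65}
total = 1_000_000  # random fragments tested at each length

def fold_drops(counts):
    """Ratio of matches at each length to matches at the next longer length."""
    lengths = sorted(counts)
    return {(a, b): counts[a] / counts[b] for a, b in zip(lengths, lengths[1:])}

for length, n in sorted(matches.items()):
    print(f"{length} nt: {n / total:.4%} of fragments matched")

for (a, b), ratio in fold_drops(matches).items():
    print(f"{a} -> {b} nt: {ratio:.2f}-fold fewer matches")
```

Both ratios come out a little under 4, which is about what random chance alone would predict.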
This work is licensed under a Creative Commons
Attribution 4.0 International License.