Silly mnemonics

Back in first-year genetics (i.e. Genetics 101), our tutor told us a way to remember which bases are pyrimidines and which are purines. She said pyrimidines reminded her of the pyramids, and therefore of Cleopatra and Tutankhamun, and therefore of cytosine and thymine. We laughed, but to this day that’s how I remember pyrimidines. Another thing I keep forgetting…

How to deal with multi mapping reads

Eukaryotic genomes are repetitive in nature, i.e. the sequence content is not unique. When mapping high-throughput sequencing reads back to the genome, whether for de novo assembly or for RNA sequencing, a subset of reads will map to more than one location. Some people refer to these reads as multi-reads, short for multi-mapping reads…

Using blat

For many years, my multipurpose sequence aligner of choice has been blat. This is a short post on the basics of blat. To use blat, download the 64-bit Linux version of blat (or a version that matches your operating system) here. When aligning sequences to the genome, make sure you use the 64-bit…

Getting started with TopHat

Updated links for the binaries on 2015 March 2nd. TopHat is a tool that can find splice junctions without a reference annotation. By first mapping RNA-Seq reads to the genome (using Bowtie/2), TopHat identifies potential exons, since many RNA-Seq reads will align contiguously to the genome. Using this initial mapping information, TopHat builds a database…

Installing Circos

A short post about installing Circos on Ubuntu, other Linux distributions, and Windows. Note: if you are using Ubuntu, the env program is located at /usr/bin/env. The gddiag and circos programs use /bin/env, so running gddiag gives a bad interpreter error. Change the first line to #!/usr/bin/perl for both…

Equivalents in R, Python and Perl

Last update 2018 May 24th. Perl was used by many computational biologists back in the early 2000s. Its popularity may have been driven by its involvement with the Human Genome Project. An article titled "How Perl Saved the Human Genome Project" explains why Perl was a good fit for computational biology projects (as well…
