Silly mnemonics

Back in first-year genetics (i.e. genetics 101), our tutor told us a way to remember pyrimidines and purines. She said pyrimidines reminded her of the pyramids, and therefore Cleopatra and Tutankhamun, and therefore Cytosines and Thymines. We laughed, but to this day that's how I remember pyrimidines. Another thing I keep forgetting...

How to deal with multi mapping reads

Eukaryotic genomes are repetitive in nature, i.e. the sequence content is not unique. When mapping high-throughput sequencing reads back to the genome, whether for de novo assembly or for RNA sequencing, a subset of reads will map to more than one location. Some people refer to these reads as multi-reads, short for multi-mapping reads....

Using blat

My multipurpose sequence aligner of choice for many years has been blat. This is a short post on the basics of blat. To use blat, download the 64-bit Linux version of blat (or a version that matches your operating system) here. When aligning sequences to the genome, make sure you use the 64 bit...

Getting started with TopHat

Updated links for the binaries on 2015 March 2nd. TopHat is a tool that can find splice junctions without a reference annotation. By first mapping RNA-Seq reads to the genome (using Bowtie/2), TopHat identifies potential exons, since many RNA-Seq reads will contiguously align to the genome. Using this initial mapping information, TopHat builds a database...

Installing Circos

A short post about installing Circos on Ubuntu, other Linux distributions and on Windows. Note: if you are using Ubuntu, the env program is located at /usr/bin/env. The gddiag and circos programs use /bin/env, so running gddiag gives a bad interpreter error. Change the first line to #!/usr/bin/perl for both...

Equivalents in R, Python and Perl

Last update 2015 September 9th. I had been using Perl heavily for several years until I started my PhD back in 2010 (I still use it for many tasks, but much more sparingly). Perl was widely used back in the early days when the human genome was yet to be sequenced and this famous article explained...
