Getting started with Picard

Updated hyperlinks on 2015 January 26th; please comment if you find any more dead links. Picard is a suite of Java-based command-line utilities for manipulating SAM/BAM files. I'm currently analysing some paired-end libraries and wanted to calculate the average insert size from the alignments; that's how I found Picard. While reading the...

Understanding the BAM flags

I've tried to explain BAM flags to my colleagues, and I think each time I've left them more confused. So perhaps I can do a better job of explaining them in writing. For this post, I will use this BAM file from the 1000 Genomes Project: NA18553.chrom11.ILLUMINA.bwa.CHB.low_coverage.20120522.bam.
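As a rough illustration of what a BAM flag encodes, here is a minimal Python sketch that splits a flag value into its individual bits. The bit values follow the SAM format specification; the descriptions are paraphrased, and the names `SAM_FLAGS` and `decode_flag` are made up for this example rather than taken from any library.

```python
# Bit values as defined in the SAM format specification; the text
# descriptions are paraphrased for readability.
SAM_FLAGS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "read is PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """Return the description of every bit that is set in `flag`."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


# 99 = 1 + 2 + 32 + 64
for name in decode_flag(99):
    print(name)
```

A flag is just the sum of the bits that apply to an alignment, so 99 decodes to a paired read, mapped in a proper pair, whose mate is on the reverse strand, and which is the first read of the pair.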

Bowtie and multimapping reads

Updated 2014 June 8th. I first tried this with BWA. Now I'll try it with Bowtie. Consider this reference sequence, which is the sequence "ACGTACGTACGTACGTAGGTACGTAGGG" repeated 20 times:

>artificial
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG
ACGTACGTACGTACGTAGGTACGTAGGG

and this read:

>tag
ACGTACGTACGTACGTAGGTACGTA

The...
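To see why this read is a multimapper, here is a minimal Python sketch that counts its exact forward-strand occurrences in the reference. It assumes the 20 FASTA lines are concatenated into one sequence, and it uses plain string search; it says nothing about how Bowtie itself searches or reports hits. The names `UNIT`, `REFERENCE`, `READ` and `find_all` are made up for this example.

```python
UNIT = "ACGTACGTACGTACGTAGGTACGTAGGG"
REFERENCE = UNIT * 20  # assumption: FASTA lines joined into one sequence
READ = "ACGTACGTACGTACGTAGGTACGTA"


def find_all(haystack, needle):
    """Return the 0-based start of every (possibly overlapping) exact match."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are not skipped.
        start = haystack.find(needle, start + 1)
    return positions


hits = find_all(REFERENCE, READ)
print(len(hits), hits[:3])  # 20 [0, 28, 56]
```

The read matches once per repeat unit, at the start of each 28 bp copy, so there are 20 equally good exact placements on the forward strand.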

Mapping qualities

Updated 2014 December 17th. Current high-throughput sequencers produce short reads; for example, the HiSeq2000 produces millions of reads that are 50 or 100 bp long. To align such short reads with high speed and accuracy, many short-read alignment programs have been developed, such as BWA. The major limitation is the length...

Perl and SAM

Lincoln Stein has written a bunch of modules to deal with SAM/BAM files. Check out the CPAN module. If you are having trouble installing Bio::DB::Sam, you may have to recompile SAMTools with the following command: To install the Perl module on a machine where you don't have root access, follow these instructions. Using this module,...
