Using bins when comparing genomic features

Comparing two files containing genomic features is a common task, e.g. finding out whether the coordinates of your tags intersect with genes. Of course, you could use intersectBed (part of the BEDTools suite) for this purpose, but here’s how to do it using Perl. NOTE: I hard code the length of my tags…
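As a rough illustration of the binning idea named in the title, here is a minimal sketch in Python rather than the Perl used in the full post. The bin width, the tuple layout, the function names and the example coordinates are all assumptions made for illustration; none of them come from the original post. Each feature is filed under every fixed-width bin it touches, so a query only has to be compared against features that share a bin with it.

```python
from collections import defaultdict

BIN_SIZE = 100_000  # assumed bin width in bp; not taken from the original post


def bins_for(start, end, bin_size=BIN_SIZE):
    """Return every bin index touched by the half-open interval [start, end)."""
    return range(start // bin_size, (end - 1) // bin_size + 1)


def build_index(features, bin_size=BIN_SIZE):
    """Map (chrom, bin) -> list of features; a feature is (chrom, start, end, name)."""
    index = defaultdict(list)
    for feature in features:
        chrom, start, end, _name = feature
        for b in bins_for(start, end, bin_size):
            index[(chrom, b)].append(feature)
    return index


def overlapping(index, chrom, start, end, bin_size=BIN_SIZE):
    """Return the sorted names of indexed features overlapping [start, end) on chrom."""
    hits = set()  # a feature spanning several bins may be seen more than once
    for b in bins_for(start, end, bin_size):
        for _chrom, f_start, f_end, name in index.get((chrom, b), []):
            if f_start < end and start < f_end:
                hits.add(name)
    return sorted(hits)


# Made-up gene coordinates, purely for demonstration.
genes = [
    ("chr1", 1_000, 5_000, "geneA"),
    ("chr1", 99_000, 101_000, "geneB"),  # spans two bins
    ("chr2", 1_000, 5_000, "geneC"),
]
index = build_index(genes)

print(overlapping(index, "chr1", 4_990, 5_026))      # ['geneA']
print(overlapping(index, "chr1", 100_500, 100_536))  # ['geneB']
print(overlapping(index, "chr1", 50_000, 50_036))    # []
```

The trade-off is the usual one: smaller bins mean fewer comparisons per query, but long features get stored in more bins.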

Bidirectional genes

Download 5′ UTRs for all RefSeq genes using the UCSC Table Browser. Separate features according to strand. Use intersectBed to find overlapping features. Perform a GO enrichment analysis on the unique list of bidirectional genes, using all the genes as the universe list. Although this was a brief analysis, the results are somewhat similar…
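The strand-splitting and overlap steps can be sketched in a few lines of Python. This is only an illustration of the idea, not the intersectBed command used in the post: the BED6-style tuples, the function names and the example records are hypothetical, and the all-against-all comparison is far slower than BEDTools on real data.

```python
def split_by_strand(features):
    """Split BED6-style tuples (chrom, start, end, name, score, strand) by strand."""
    plus = [f for f in features if f[5] == "+"]
    minus = [f for f in features if f[5] == "-"]
    return plus, minus


def opposite_strand_overlaps(features):
    """Return sorted (plus_name, minus_name) pairs whose intervals overlap."""
    plus, minus = split_by_strand(features)
    pairs = set()
    for p_chrom, p_start, p_end, p_name, _p_score, _p_strand in plus:
        for m_chrom, m_start, m_end, m_name, _m_score, _m_strand in minus:
            # Half-open intervals overlap when each one starts before the other ends.
            if p_chrom == m_chrom and p_start < m_end and m_start < p_end:
                pairs.add((p_name, m_name))
    return sorted(pairs)


# Made-up 5' UTR records, purely for demonstration.
utrs = [
    ("chr1", 1000, 1200, "GENE_A", 0, "+"),
    ("chr1", 1100, 1300, "GENE_B", 0, "-"),
    ("chr1", 5000, 5200, "GENE_C", 0, "+"),
    ("chr2", 1100, 1300, "GENE_D", 0, "-"),
]

pairs = opposite_strand_overlaps(utrs)
print(pairs)  # [('GENE_A', 'GENE_B')]

# Unique gene list, e.g. as input for a GO enrichment tool.
bidirectional_genes = sorted({name for pair in pairs for name in pair})
print(bidirectional_genes)  # ['GENE_A', 'GENE_B']
```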

Mapping qualities

Updated 2014 December 17th

Current high-throughput sequencers produce reads that are short; for example, the HiSeq2000 produces millions of reads that are 50 or 100 bp long. To align such short reads with high speed and accuracy, many short-read alignment programs have been developed, such as BWA. The major limitation is the length…

Using Velvet

Write a script for generating random tags from a longer piece of DNA. Generate random tags and use them as input for Velvet. I don’t know why the definition line reads length = 480 (NODE_1_length_480) when the contig length is 500. BLAST the contig back to the original sequence. Score = 924 bits (500), Expect =…
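The first step, drawing random tags from a longer sequence, might look something like the Python sketch below. It is not the script from the post: the tag length, the number of tags, the random seed and the function names are all assumptions, and the 500 bp reference is generated at random here only so that the example is self-contained.

```python
import random


def random_sequence(length, rng):
    """Build a random DNA string, used here as a stand-in for a real reference."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def random_tags(sequence, tag_length, n_tags, rng):
    """Sample n_tags substrings of tag_length from uniformly random start positions."""
    max_start = len(sequence) - tag_length
    if max_start < 0:
        raise ValueError("tag_length is longer than the sequence")
    tags = []
    for _ in range(n_tags):
        start = rng.randint(0, max_start)  # randint includes both endpoints
        tags.append(sequence[start:start + tag_length])
    return tags


def to_fasta(tags):
    """Format tags as FASTA text with sequential identifiers."""
    return "\n".join(f">tag_{i}\n{tag}" for i, tag in enumerate(tags, start=1))


rng = random.Random(42)                 # fixed seed so the run is repeatable
reference = random_sequence(500, rng)   # 500 bp, matching the contig length quoted above
tags = random_tags(reference, tag_length=36, n_tags=200, rng=rng)  # assumed values
fasta = to_fasta(tags)
print(fasta.splitlines()[0])  # first FASTA header line
```

The resulting FASTA text can be written to a file and passed to the assembler; tags are taken from the forward strand only, which is a simplification.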
