Finding junctions with TopHat

For setting up TopHat, see my previous post. Here, I wanted to test whether TopHat can find junctions with single-end 27 bp reads. The reference sequence I used was the test_ref.fa provided by the TopHat authors (see my previous post for the link), where the A’s mark the intron regions:

>test_chromosome
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
ACTACTATCTGACTAGACTGGAGGCGCTTGCGACTGAGCTAGGACGTGCC
ACTACGGGGATGACGACTAGGACTACGGACGGACTTAGAGCGTCAGATGC
AGCGACTGGACTATTTAGGACGATCGGACTGAGGAGGGCAGTAGGACGCT…
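The idea behind this test can be sketched in a few lines of Python: treat long runs of A's as introns, splice the flanking exons together, and collect every 27 bp read that crosses an exon-exon boundary. This is only a rough illustration. The toy reference, the minimum run length of 20, and the helper names `split_exons` and `junction_reads` are all assumptions made for the example; none of them come from test_ref.fa or from TopHat itself.

```python
import re

READ_LEN = 27        # single-end read length tested in the post
MIN_INTRON_RUN = 20  # assumption: a run of at least 20 A's counts as an intron


def split_exons(ref, min_run=MIN_INTRON_RUN):
    """Return 0-based, half-open (start, end) exon intervals between long A-runs.

    Caveat: if an exon itself starts or ends with an A next to an intron,
    that base is absorbed into the run, so the boundary is only approximate.
    """
    exons = []
    pos = 0
    for match in re.finditer("A{%d,}" % min_run, ref):
        if match.start() > pos:
            exons.append((pos, match.start()))
        pos = match.end()
    if pos < len(ref):
        exons.append((pos, len(ref)))
    return exons


def junction_reads(ref, read_len=READ_LEN):
    """Return all reads of length read_len that span an exon-exon junction."""
    exons = split_exons(ref)
    transcript = "".join(ref[start:end] for start, end in exons)

    # Junction positions expressed in transcript coordinates.
    boundaries = []
    offset = 0
    for start, end in exons[:-1]:
        offset += end - start
        boundaries.append(offset)

    reads = []
    for i in range(len(transcript) - read_len + 1):
        # Keep the window only if a junction falls strictly inside it.
        if any(i < b < i + read_len for b in boundaries):
            reads.append(transcript[i:i + read_len])
    return reads


# Hypothetical toy reference: two made-up exons around one A-only intron.
exon1 = "CTGACTAGACTGGAGGCGCTTGCGACTGAGC"
exon2 = "GGACGTGCCTCTACGGGGATGACGACTAGG"
toy_ref = exon1 + "A" * 30 + exon2

reads = junction_reads(toy_ref)
print(len(reads), "junction-spanning reads")
```

Reads produced this way could be written out as FASTA or FASTQ and given to TopHat to see which junctions it recovers.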

Annotating RNA-Seq data

After mapping your reads from an RNA-Seq experiment, the next task is usually to identify the transcripts that the reads came from (i.e. annotating RNA-Seq data), and there are many ways of doing so. Here I describe a rather crude method whereby I download sequence coordinates of hg19 RefSeqs as a BED12 file from the…
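For readers unfamiliar with BED12, the sketch below shows how the exon coordinates of a transcript can be recovered from a single BED12 line and how a mapped read might be checked against them. It is one plausible way to use such a file, not necessarily the method used in the full post. The example line, the transcript name `NM_HYPOTHETICAL`, and the function names are made up for illustration.

```python
def bed12_exons(line):
    """Parse one BED12 line into (chrom, name, strand, exon intervals).

    Exon intervals are 0-based and half-open, as in the BED format:
    each block start is relative to chromStart.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, chrom_start, name, strand = fields[0], int(fields[1]), fields[3], fields[5]
    # blockSizes and blockStarts are comma-separated, often with a trailing comma.
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    exons = [(chrom_start + s, chrom_start + s + z) for s, z in zip(starts, sizes)]
    return chrom, name, strand, exons


def overlaps_exon(read_start, read_end, exons):
    """True if the half-open read interval overlaps any exon interval."""
    return any(read_start < end and start < read_end for start, end in exons)


# Hypothetical two-exon transcript (not a real RefSeq record).
line = "\t".join([
    "chr1", "1000", "2000", "NM_HYPOTHETICAL", "0", "+",
    "1000", "2000", "0", "2", "100,200,", "0,800,",
])

chrom, name, strand, exons = bed12_exons(line)
print(chrom, name, strand, exons)
print(overlaps_exon(1090, 1117, exons))  # 27 bp read starting inside exon 1
```

A linear scan like this is fine for a handful of transcripts; for a whole genome, an interval index or a dedicated tool would be the more practical choice.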

Getting started with TopHat

Updated links for the binaries on 2015 March 2nd.

TopHat is a tool that can find splice junctions without a reference annotation. By first mapping RNA-Seq reads to the genome (using Bowtie/2), TopHat identifies potential exons, since many RNA-Seq reads will contiguously align to the genome. Using this initial mapping information, TopHat builds a database…
