Learning to use biomaRt

In the past I’ve been manually downloading tables of annotation data and parsing them with Perl. I guess it’s time to do things more elegantly. Below is code taken from the biomaRt vignette: Note: if you are using Ubuntu and getting a “Cannot find xml2-config” problem while installing XML, a prerequisite to biomaRt, try installing…


Making a barplot in R

Just a short post on making a barplot in R after reading in data via the read.table() function. I created a file with two rows, the first row containing the header and the second row containing the data values.

a b c d e
10 20 30 20 10

Horizontal barplot sorted by values
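As a rough illustration of the data handling described above, here is a minimal sketch in Python (the post itself works in R with read.table(), and its R code is not reproduced here). The function names `read_two_row_table` and `sorted_bars` are hypothetical, the ascending sort order is an assumption since the excerpt only says "sorted by values", and the text bars merely stand in for a real plot.

```python
import io

# The two-row file from the excerpt: a header row, then a row of values.
EXAMPLE_FILE = "a b c d e\n10 20 30 20 10\n"


def read_two_row_table(handle):
    """Read a whitespace-separated header row and a single row of numbers."""
    header = handle.readline().split()
    values = [float(v) for v in handle.readline().split()]
    if len(header) != len(values):
        raise ValueError("header and data rows differ in length")
    return dict(zip(header, values))


def sorted_bars(table):
    """Return (name, value) pairs sorted by value, smallest first (assumed order)."""
    return sorted(table.items(), key=lambda item: item[1])


table = read_two_row_table(io.StringIO(EXAMPLE_FILE))
bars = sorted_bars(table)

# Crude horizontal bars: one '#' per 10 units.
for name, value in bars:
    print(f"{name} {'#' * int(value // 10)} {value:g}")
```

Because Python's sort is stable, columns with equal values keep their original left-to-right order.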


Comparing different distributions

Updated 2017 September 7th. The Kolmogorov-Smirnov test can be used to test whether two underlying one-dimensional probability distributions differ. As noted in the Wikipedia article: Note that the two-sample test checks whether the two data samples come from the same distribution. This does not specify what that common distribution is (e.g. whether it’s normal or…
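To make the two-sample idea concrete, here is a minimal sketch in Python of the Kolmogorov-Smirnov statistic, the largest absolute gap between the two empirical cumulative distribution functions. This is an illustration only, not the code from the post; the function name `ks_statistic` is hypothetical, and the sketch computes the statistic alone, not a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: D = max over x of |F_a(x) - F_b(x)|.

    F_a and F_b are the empirical CDFs of the two samples. Both are step
    functions that change only at observed values, so checking every
    observed value is enough to find the maximum gap.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    largest_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


identical = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
disjoint = ks_statistic([1, 2, 3, 4], [11, 12, 13, 14])
print(identical, disjoint)
```

Identical samples give a statistic of 0 and completely separated samples give 1; neither value says anything about which distribution the samples came from, which is the point made in the quoted note.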


I’ve joined Twitter

Today, while reading a paper, I found some interesting one-liner facts. They are far too short to write a post about, but I would like to keep a repository of them. What better place to store these facts than Twitter! You can follow me on Twitter for a list of facts on molecular biology and…


Variance in RNA-Seq data

Updated 2014 April 18th. For this post I will use data from this study, which has already been nicely summarised, to examine the variance in RNA-Seq data. Briefly, the study used LNCaP cells, which are androgen-sensitive human prostate adenocarcinoma cells, and treated the cells with DHT, with a mock treatment as the control. The…


Creating a matrix of scatter plots in R

Scatter plots are two-dimensional plots that show the relationship between two variables. Here I demonstrate how we can create a matrix of scatter plots in R for datasets that have more than two variables. This is particularly useful when we want to visually inspect whether there are associations between variables. To display correlations on…


Installing R on Ubuntu

The following information is available from CRAN but for my convenience, I’ve collated the crux here for installing R on Ubuntu. First find your closest mirror here. Currently I am in Japan, so mine is http://cran.ism.ac.jp/. To find out which Ubuntu version you’re using: Now edit the file /etc/apt/sources.list and add this line: deb http://cran.ism.ac.jp//bin/linux/ubuntu…


DESeq vs. edgeR vs. baySeq using pnas_expression.txt

Following the instructions from a previous post, I filtered the pnas_expression.txt dataset and saved the results in “pnas_expression_filtered.tsv” and then performed the differential gene expression analyses using the respective packages. To run the Perl scripts below, just save the code into a file and name it “something.pl”. Then make the file executable by running “chmod…
