Variance in RNA-Seq data

Updated 2014 April 18th. For this post I will use data from this study, which has already been nicely summarised, to examine the variance in RNA-Seq data. Briefly, the study used LNCaP cells, which are androgen-sensitive human prostate adenocarcinoma cells, and treated them with DHT or with a mock treatment as the control. The...


DESeq vs. edgeR vs. baySeq using pnas_expression.txt

Following the instructions from a previous post, I filtered the pnas_expression.txt dataset and saved the results in "pnas_expression_filtered.tsv" and then performed the differential gene expression analyses using the respective packages. To run the Perl scripts below, just save the code into a file and name it "something.pl". Then make the file executable by running "chmod...


edgeR vs. SAMSeq

A while ago I received a comment on comparing parametric methods against nonparametric methods for calling differential expression in count data. Here I compare SAMSeq (Jun Li and Robert Tibshirani (2011) Finding consistent patterns: a nonparametric approach for identifying differential expression in RNA-Seq data. Statistical Methods in Medical Research, in press.) with edgeR. For more information...


The DGEList object in R

I've updated this post (2013 June 29th) to use the latest versions of R, Bioconductor and edgeR. I also demonstrate how the results of edgeR can be saved and written out as one useful table. The DGEList object holds the dataset to be analysed by edgeR and the subsequent calculations performed on the dataset. Specifically it contains:...


edgeR vs. DESeq using pnas_expression.txt

First, download the file pnas_expression.txt from Davis's homepage. For more information on the dataset please refer to the edgeR manual and this paper. The latest R version at the time of writing is R 2.13.1. You can download it from here. Install Bioconductor and the required packages: source("http://www.bioconductor.org/biocLite.R"); biocLite(); biocLite("DESeq"); biocLite("edgeR") A filtering criterion of...


edgeR's common dispersion

Updated: 2017 September 7th. When I was first learning how to conduct a differential expression (DE) analysis with RNA-seq data, I found it very difficult to understand the statistical procedures implemented in the various R packages that perform DE analysis. This really bugged me. However, it was not difficult to carry out the analysis, since the...


Normalisation methods implemented in edgeR

Updated 2014 December 12th. A short post on the different normalisation methods implemented within edgeR. To see the normalisation methods, type: From the documentation: method="TMM" is the weighted trimmed mean of M-values (to the reference) proposed by Robinson and Oshlack (2010), where the weights are from the delta method on Binomial data. If refColumn is...
