Pooling technical replicates in edgeR

This post is very old and should just be ignored. But if you came across this, here's a thread on the Bioconductor mailing list that may be relevant

4 libraries each with technical replicates and 2 conditions. Technical replicates are the same samples performed identically. First treat technical replicates separately:

library(edgeR)
targets <- read.delim(file = "targets.txt", stringsAsFactors = F)
d <- readDGE(targets, comment.char="#")
d <- estimateCommonDisp(d)

d$samples$lib.size
#[1] 258680 733173 334908 687753 321877 320562 582971 539963 570570 484404
#[11] 408124 467288

d$common.lib.size
#[1] 453057

>colSums(d$pseudo.alt)
#1        2        3        4        5        6        7        8
#451962.0 453242.9 452553.6 454109.2 451432.0 451713.3 455116.5 453668.7
#9       10       11       12
#453444.9 453177.2 453137.0 453183.9

d$common.dispersion
#[1] 0.4648173

sqrt(d$common.dispersion)
#[1] 0.6817751

de.com <- exactTest(d)

options(digits=4)

topTags(de.com)

detags.com <- rownames(topTags(de.com)$table)

sum(de.com$table$p.value < 0.01)
#[1] 1981

top1981 < topTags(de.com, n=1981)

sum(top1981$table$logFC > 0)
#[1] 1123

sum(top1981$table$logFC &lt; 0)
#[1] 858

sum(p.adjust(de.com$table$p.value,method="BH") < 0.05)
#[1] 335

Now pooling everything together, so that we are comparing 2 x 2 and following the same pipeline:

targets = read.delim(file = "targets2.txt", stringsAsFactors=F)

d <- estimateCommonDisp(d)
d$samples$lib.size
#[1] 1326914 1330453 1693521 1359966

d$common.lib.size
#[1] 1420006

colSums(d$pseudo.alt)
#1       2       3       4
#1420267 1419859 1420363 1420132

d$common.dispersion
#[1] 0.2876606

sqrt(d$common.dispersion)

de.com <- exactTest(d)

options(digits=4)

topTags(de.com)

detags.com = rownames(topTags(de.com)$table)

sum(de.com$table$p.value < 0.01)
#[1] 1024

top1024 = topTags(de.com, n=1024)

sum(top1024$table$logFC > 0)
#[1] 795

&gt;sum(top1024$table$logFC < 0)
#[1] 229

sum(p.adjust(de.com$table$p.value,method="BH") < 0.05)
#[1] 89

None of the 89 from the adjusted set in the pooled analysis overlap the 335 from the unpooled analysis.




Creative Commons License
This work is licensed under a Creative Commons
Attribution 4.0 International License
.
Posted in RTagged
One comment Add yours
  1. Additionally, using d = calcNormFactors(d) for the unpooled dataset

    > sum(p.adjust(de.com$table$p.value,method=”BH”) < 0.05) [1] 411 Increased differentially expressed number from 335 to 411

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.