The code, which allowed me to use the Spearman's rank correlation coefficient, was kindly provided to me by the developer of pvclust.
Firstly download the unofficial package or just source it from my DropBox account. Start up R and follow:
#load the package #if you are having trouble sourcing from https run #install.packages('devtools') #library(devtools) #source_url("https://dl.dropboxusercontent.com/u/15251811/pvclust/pvclust.R") #source_url("https://dl.dropboxusercontent.com/u/15251811/pvclust/pvclust-internal.R") source("https://dl.dropboxusercontent.com/u/15251811/pvclust/pvclust.R") source("https://dl.dropboxusercontent.com/u/15251811/pvclust/pvclust-internal.R") #use a test dataset from DESeq #install DESeq if necessary source("http://bioconductor.org/biocLite.R") biocLite("DESeq") #load DESeq library("DESeq") example_file <- system.file ("extra/TagSeqExample.tab", package="DESeq") data <- read.delim(example_file, header=T, row.names="gene") head(data) T1a T1b T2 T3 N1 N2 Gene_00001 0 0 2 0 0 1 Gene_00002 20 8 12 5 19 26 Gene_00003 3 0 2 0 0 0 Gene_00004 75 84 241 149 271 257 Gene_00005 10 16 4 0 4 10 Gene_00006 129 126 451 223 243 149 # Define a distance function spearman <- function(x, ...) { x <- as.matrix(x) res <- as.dist(1 - cor(x, method = "spearman", use = "everything")) res <- as.dist(res) attr(res, "method") <- "spearman" return(res) } result <- pvclust(data, method.dist=spearman, nboot=100) result Cluster method: average Distance : spearman Estimates on edges: au bp se.au se.bp v c pchi 1 1 1 0 0 0 0 0 2 1 1 0 0 0 0 0 3 1 1 0 0 0 0 0 4 1 1 0 0 0 0 0 5 1 1 0 0 0 0 0 #pvclust classed object class(result) [1] "pvclust" names(result) [1] "hclust" "edges" "count" "msfit" "nboot" "r" "store" plot(result) pvrect(result, alpha=0.95)
This work is licensed under a Creative Commons
Attribution 4.0 International License.
this is wrong, its giving error in R. the error is something like this x not in data frame
Thanks for letting me know; the code got mangled up; try again now.
what does “level2_for_paper_norm_id.tsv” look like? I’d like to try this.
Hi Alex,
The file was a tab delimited file, where the rows are the genes and the columns the different samples.
I’ve updated this post to use a dataset that is available from the DESeq package. You can try the example I’ve provided to see if it’s useful for you.
Hope that helps,
Dave
Dear Davo,
many thanks for the code. I searched a lot for some examples about how to use pvclust with alternative distance measures. It seems many people are also having the same question, and it seems your code just hit the spot !
All the best.
Tiago
Hi Tiago,
Yeah the code was kindly written for me when I emailed the author of pvclust, so all credits to him.
But glad you could find it and that it was helpful.
Cheers,
Dave
Thanks a lot. I was looking for exactly how make pvclust use the spearman’s correlation!
No problems. I merely just provided the code the author of pvclust wrote for me 🙂
Hi, I wanted to use this function but when I run this code I got an error.
Erreur in as.character(x) :
cannot coerce type ‘closure’ to vector of type ‘character’
Thank you
Hi Patricia,
are you following the example in the post, or using your own data? It’s a bit hard to troubleshoot just from the error.
Cheers,
Dave
Hi Dave,
So I am using my own data,
when I run the commands to create the function I don’t get an error but just after when I run the pvclust code.
Best regards.
Patricia
Hi Patricia,
without knowing the exact commands you ran and without a copy of your data, it’s quite hard to know what is wrong.
Cheers,
Dave
Hi,
when I run the code above I get the same error:
Error in as.character(x) :
cannot coerce type ‘closure’ to vector of type ‘character’
This happens with the example data as well as with my own data.
R version 3.0.2 (2013-09-25)
pvclust_1.2-2
I found the reason – or at least I now can run the example:
I also downloaded the 2 R scripts from the dropbox urls and sourced both files. Using this functions I was able tun run the examples.
It seems that there are differences in the code from tha package I used (pvclust_1.2-2) and this code from dropbox.
Strange
Yes the code from DropBox is the unofficial code (from the same author of pvclust); use that code if you want Spearman’s correlation instead of Pearson.
I tested this script and it is working without errors.
However the result is wrong while I am using exactly the same data:
au bp se.au se.bp v c pchi
1 1 1 0 0 0 0 0
2 1 1 0 0 0 0 0
3 1 1 0 0 0 0 0
4 1 1 0 0 0 0 0
5 1 1 0 0 0 0 0
Any idea what the reason might be? Is it possible that this is related to my R-version (R3.1.3.)?
I’m not sure what steps you performed or how you came to the conclusion that the result is incorrect.
If I use exactly the same script, shouldn’t I have the same outcome, or at least more or less the same outcome?
Oh right, my bad; didn’t realise you meant you followed the code exactly. You’re right; the results on this page are wrong. I must have copied and pasted the wrong results, since the dendrogram values are all 100%. I updated the results; thanks for letting me know!
Hello all, this seems to be most promising post online for using pvclust() in R. I may just be getting too frustrated and not seeing a simple correction, but I keep getting this error as well:
Error in cor(x, method = “pearson”, use = use.cor) : ‘x’ must be numeric
I am using my own data. It has the following columns: Individual ID, followed by 4 criteria columns that I used in clustering (i.e. # of sightings, # of gap events, etc.). I have conducted 2 hierarchical cluster analyses: (1) complete linkage, (2) average linkage. And now i want to use pvclust() to test the uncertainty in the clustering (2 cluster for complete, 3 for average). I have tried to replicate the code above, and have the following:
result <- pvclust(data, method.dist="correlation", method.hclust="average", nboot=1000, r=seq(0.7,1.4,by=.1)) #error right away
plot(result)
pvrect(result, alpha=0.95)
I think the issue is my "data". Do I use the original data (with the 5 columns i stated above?), or do i make a data.frame? I also tried first making as.matrix(data), and then putting that as x in pvclust(x…). But i still get an error.
Any help would be great!! Thanks so much!