Making a heatmap in R with the superheat package

I discovered the superheat package a few days ago and the adjacent plots feature was really attractive.

I'm going to test out this heatmap package using some single cell data that's available on Bioconductor.

source("https://bioconductor.org/biocLite.R")
biocLite("HSMMSingleCell")

Install the superheat package.

# install devtools if necessary
install.packages("devtools")
# use devtools to install superheat
devtools::install_github("rlbarter/superheat")

This dataset has four time points: 0, 24, 48, and 72 hours and each time point has 69, 74, 79, and 49 cells respectively (n = 271).

data(HSMM_expr_matrix)
# easier to type my_mat
my_mat <- HSMM_expr_matrix

# create a data frame as well
df <- as.data.frame(HSMM_expr_matrix)

dim(df)
47192   271

I made some completely arbitrary filters to bring the number of genes down to a plottable number. The expression values are in FPKM units, as calculated with the Cufflinks package.

# calculate median from cells at each time point
df$t0_median  <- apply(df[,grep(pattern = "^T0_", x = colnames(df))], 1, median)
df$t24_median <- apply(df[,grep(pattern = "^T24_", x = colnames(df))], 1, median)
df$t48_median <- apply(df[,grep(pattern = "^T48_", x = colnames(df))], 1, median)
df$t72_median <- apply(df[,grep(pattern = "^T72_", x = colnames(df))], 1, median)

# boolean vector where TRUE is
# median expression of a time point > 5 FPKM
min_5 <- apply(df[,grep('median',colnames(df))], 1, min)>5
# subset genes
df_subset <- df[min_5,]

dim(df_subset)
[1] 2589  275

# fold change between time points
df_subset$t0_t24 <- df_subset$t24_median / df_subset$t0_median
df_subset$t24_48 <- df_subset$t48_median / df_subset$t24_median
df_subset$t48_72 <- df_subset$t72_median / df_subset$t48_median

# find genes that go up, up, and up!
my_fold_change <- 1.1
my_wanted <- row.names(subset(df_subset, t0_t24 > my_fold_change & t24_48 > my_fold_change & t48_72 > my_fold_change))

# remove outlier
my_wanted <- my_wanted[-grep(pattern = 'ENSG00000269028.2', my_wanted)]

Plot heatmap.

library(superheat)
superheat(my_mat[my_wanted,],
          scale=TRUE,
          pretty.order.rows = TRUE,
          heat.col.scheme = "red",
          n.clusters.rows = 4,
          yr = rowSums(my_mat[my_wanted,]),
          yr.axis.name = 'Row expression',
          yr.plot.type = "bar",
          yt.plot.type = "scattersmooth",
          yt = colSums(my_mat[my_wanted,]),
          yt.axis.name = "Column expression",
          yt.line.col = 'tomato3')

Plot heatmap with smoothing.

# in case you didn't run the code above
library(superheat)
superheat(my_mat[my_wanted,],
          scale=TRUE,
          pretty.order.rows = TRUE,
          heat.col.scheme = "red",
          n.clusters.rows = 4,
          yr = rowSums(my_mat[my_wanted,]),
          yr.axis.name = 'Row expression',
          yr.plot.type = "bar",
          yt.plot.type = "scattersmooth",
          yt = colSums(my_mat[my_wanted,]),
          yt.axis.name = "Column expression",
          yt.line.col = 'tomato3',
          smooth.heat = TRUE)

Further reading

See the official documentation.

Print Friendly, PDF & Email



Creative Commons License
This work is licensed under a Creative Commons
Attribution 4.0 International License
.
3 comments Add yours

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.