Interactive plots, as the name suggests, are plots that users can interact with. In my last post, I mentioned that I use the d3heatmap package for interactive heatmaps. To get started, I'll recreate the heatmap from my last post, this time using d3heatmap.
# install packages if you haven't already
install.packages("d3heatmap")
install.packages("RColorBrewer")
source("https://bioconductor.org/biocLite.R")
biocLite("DESeq")

# load libraries
library("DESeq")
library("RColorBrewer")
library("d3heatmap")

example_file <- system.file("extra/TagSeqExample.tab", package = "DESeq")
data <- read.delim(example_file, header = TRUE, row.names = "gene")
data_subset <- as.matrix(data[rowSums(data) > 50000, ])

# using the same colour scheme as pheatmap
d3heatmap(data_subset, colors = colorRampPalette(rev(brewer.pal(n = 7, name = "RdYlBu")))(100))
If you hover over the cells, you can see the corresponding gene and sample. You can drag and select to zoom into specific cells too (clicking once on the zoomed in area will bring you back to the full heatmap).
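To make the filtering and palette steps in the heatmap code above easier to follow, here is a minimal base R sketch on a toy table. The gene names, sample names, and counts are invented for illustration, and the three-colour ramp is a stand-in for the RColorBrewer palette so that the snippet runs without extra packages.

```r
# Toy count table (invented values): rows are genes, columns are samples
counts <- data.frame(
  sample_a = c(10, 30000, 5),
  sample_b = c(20, 40000, 1),
  row.names = c("gene1", "gene2", "gene3")
)

# Keep only genes whose total count across samples exceeds the threshold
keep <- rowSums(counts) > 50000

# drop = FALSE keeps a two-dimensional result even if the input
# had only a single column
subset_mat <- as.matrix(counts[keep, , drop = FALSE])
print(subset_mat)

# colorRampPalette() returns a function; calling it with 100
# interpolates 100 colours between the anchor colours
pal <- colorRampPalette(c("blue", "yellow", "red"))(100)
print(length(pal))
```

Only gene2 passes the filter here, since its row sum (70000) is the only one above 50000.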
The next example uses the plotly package to make ggplot2 plots interactive. We'll make some plots using my latest web traffic for this blog, which I have saved as a csv file.
my_csv <- "https://davetang.org/site_stat/blog_20180517.csv"
d <- read.csv(my_csv)
d$date <- as.Date(d$date)
d$day <- factor(weekdays(d$date),
                levels = c('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'))
d$weekend <- grepl(pattern = "^S", x = d$day)
d$month <- factor(months(d$date), levels = month.name)
d$quarter <- factor(quarters(d$date))
d$year <- format(d$date, "%Y")
d$cumsum <- cumsum(d$views)
head(d)

        date views       day weekend   month quarter year cumsum
1 2013-01-22   130   Tuesday   FALSE January      Q1 2013    130
2 2013-01-23   269 Wednesday   FALSE January      Q1 2013    399
3 2013-01-24   258  Thursday   FALSE January      Q1 2013    657
4 2013-01-25   146    Friday   FALSE January      Q1 2013    803
5 2013-01-26    52  Saturday    TRUE January      Q1 2013    855
6 2013-01-27    53    Sunday    TRUE January      Q1 2013    908
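One caveat about the day and weekend columns above: weekdays() returns day names in the current locale, so the English factor levels and the "^S" pattern only work in an English locale. A possible alternative, sketched here on a toy data frame with dates taken from the head(d) output, is to derive the day from the numeric weekday given by format(date, "%u") (1 is Monday, 7 is Sunday). The column name wday_num is my own and not part of the original data.

```r
# Toy data frame covering the first six dates shown in head(d)
toy <- data.frame(date = as.Date("2013-01-22") + 0:5)

# "%u" gives the weekday as a number: 1 = Monday ... 7 = Sunday,
# independent of the locale's day names
toy$wday_num <- as.integer(format(toy$date, "%u"))

# Weekend is Saturday (6) or Sunday (7)
toy$weekend <- toy$wday_num >= 6

# Map the number back to a fixed set of English labels
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
toy$day <- factor(day_names[toy$wday_num], levels = day_names)

print(toy)
```

The result should agree with the day and weekend columns shown above, whatever the locale.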
I will also make use of the ggbeeswarm, ggthemes, ggplot2, and plotly packages. The ggbeeswarm package produces a nice visualisation: it plots every observation and arranges the points according to their density.
# install packages if you haven't already
install.packages("ggbeeswarm")
install.packages("ggthemes")
install.packages("ggplot2")
install.packages("plotly")

# load libraries
library("ggbeeswarm")
library("ggthemes")
library("ggplot2")
library("plotly")

p <- ggplot(d, aes(x = day, y = views, colour = day, text = date)) +
  ggbeeswarm::geom_quasirandom() +
  theme_tufte() +
  theme(legend.title = element_blank(),
        axis.title.x = element_blank(),
        panel.border = element_rect(fill = NA)) +
  ylab("Views")

ggplotly(p)
Please wait patiently while the plot loads. Once loaded, you can hover over the points to see the view count on a particular date. For example, the most traffic I have ever gotten for a single day was just two days ago. You can also click on the days in the legend to hide points for a particular day (not that useful here but useful for scatter plots with different groups).
The plot shows that traffic depends on the day of the week. Since most of my blog posts are work related, I get a lot more visitors on weekdays than on weekends, and of the weekdays, Friday gets the least traffic.
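The hover text comes from the text aesthetic passed to aes(). As a rough sketch of how it can be customised, the snippet below builds a label column on invented toy data and asks ggplotly() to show only that label through its tooltip argument. The label column and all values are assumptions for illustration, and the plotting part only runs if ggplot2 and plotly are installed.

```r
# Toy data (invented values): one week of daily view counts
toy <- data.frame(
  date  = as.Date("2018-05-07") + 0:6,
  views = c(900, 950, 1000, 980, 800, 400, 420)
)

# Hypothetical label column combining the date and the view count
toy$label <- paste0(format(toy$date, "%Y-%m-%d"), ": ", toy$views, " views")

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("plotly", quietly = TRUE)) {
  p_toy <- ggplot2::ggplot(toy, ggplot2::aes(x = date, y = views, text = label)) +
    ggplot2::geom_point()
  # tooltip = "text" restricts the hover box to the custom label
  w <- plotly::ggplotly(p_toy, tooltip = "text")
  print(class(w))
}
```

In an interactive session, printing w itself would display the widget.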
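I'll use dplyr to separate the traffic per day and conduct a pairwise Wilcoxon rank sum test between all days, adjusting the p-values with the Benjamini-Hochberg (BH) method. The adjusted p-values suggest that the traffic distributions differ between most pairs of days; the five pairs that do not reach the conventional 0.05 level are all among Monday to Thursday.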
# install packages if you haven't already
install.packages("dplyr")
library("dplyr")

# to ensure that I have equal lengths of each day
# I start on a Monday and end on a Sunday
my_monday <- d %>%
  filter(day == "Monday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_tuesday <- d %>%
  filter(day == "Tuesday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_wednesday <- d %>%
  filter(day == "Wednesday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_thursday <- d %>%
  filter(day == "Thursday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_friday <- d %>%
  filter(day == "Friday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_saturday <- d %>%
  filter(day == "Saturday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_sunday <- d %>%
  filter(day == "Sunday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull

my_view <- c(my_monday, my_tuesday, my_wednesday, my_thursday, my_friday, my_saturday, my_sunday)
my_factor <- factor(rep(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"),
                        c(length(my_monday), length(my_tuesday), length(my_wednesday), length(my_thursday),
                          length(my_friday), length(my_saturday), length(my_sunday))),
                    levels = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))

pairwise.wilcox.test(my_view, my_factor, p.adjust.method = "BH")

	Pairwise comparisons using Wilcoxon rank sum test

data:  my_view and my_factor

          Monday  Tuesday Wednesday Thursday Friday  Saturday
Tuesday   0.0143  -       -         -        -       -
Wednesday 0.0617  0.5160  -         -        -       -
Thursday  0.2887  0.1608  0.4072    -        -       -
Friday    0.0038  7.5e-08 1.0e-06   4.2e-05  -       -
Saturday  < 2e-16 < 2e-16 < 2e-16   < 2e-16  < 2e-16 -
Sunday    < 2e-16 < 2e-16 < 2e-16   < 2e-16  < 2e-16 0.0155

P value adjustment method: BH
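Since pairwise.wilcox.test() only needs a vector of values and a grouping factor, the test can also be run directly on two columns of a long data frame, without building one vector per day. The sketch below uses simulated data, not my real traffic: the means, standard deviation, and number of weeks are invented so that weekdays and weekends clearly differ.

```r
set.seed(1)

day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
n_weeks <- 40

# Simulated long-format data (invented numbers): five weekdays with a
# higher mean, followed by two weekend days with a lower mean
sim <- data.frame(
  day   = factor(rep(day_names, each = n_weeks), levels = day_names),
  views = c(rnorm(5 * n_weeks, mean = 1000, sd = 100),
            rnorm(2 * n_weeks, mean = 500, sd = 100))
)

# Values and grouping factor come straight from the data frame columns
res <- pairwise.wilcox.test(sim$views, sim$day, p.adjust.method = "BH")
print(res)

# The adjusted p-values are stored as a lower-triangular matrix
print(dim(res$p.value))
```

With the real data, the equivalent call would take the views and day columns of d after filtering to whole weeks, which I assume gives the same result as the longer code above.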
Finally, I'll use dygraphs to plot the web traffic. Since my web traffic is a time-series, I'll use the zoo and xts packages to create time-series objects; xts objects are compatible with dygraphs.
# install packages if you haven't already
install.packages("dygraphs")
install.packages("zoo")
install.packages("xts")

# load libraries
library("dygraphs")
library("zoo")
library("xts")

# I only plotted the weekdays here, since the weekend traffic is too different
# since I have irregular intervals (I removed the weekends) I used the zoo package
# and converted the zoo object to an xts object for use with dygraph
my_weekday_date <- d %>% filter(weekend == FALSE) %>% select(date) %>% pull
my_weekday_view <- d %>% filter(weekend == FALSE) %>% select(views) %>% pull

my_zoo_weekday <- zoo(my_weekday_view, my_weekday_date)
my_zoo_weekday_xts <- as.xts(my_zoo_weekday, order.by = my_weekday_date)

dygraph(my_zoo_weekday_xts, main = "Web traffic for https://davetang.org/muse") %>%
  dyRangeSelector(dateWindow = c("2017-01-01", "2018-05-17"))
You can mouse over the graph to show the view counts for particular days and adjust the slider to focus the plot on specific time periods. (The increase in the traffic as of late is because I upgraded my web hosting plan, which provides more resources; I didn't realise I was hitting resource limits.)
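To illustrate what "irregular intervals" means once the weekends are dropped, here is a small sketch with four weekday dates and invented view counts. It also shows that an xts object can be built directly with xts(), which I believe is equivalent to going through zoo first; that part only runs if the xts package is installed.

```r
# Four weekday dates around a weekend: Thursday, Friday, Monday, Tuesday
dates <- as.Date(c("2018-05-10", "2018-05-11", "2018-05-14", "2018-05-15"))

# Invented view counts, for illustration only
views <- c(1200, 1000, 1300, 1350)

# The gap between consecutive observations is not constant:
# one day within the week, three days across the weekend
gaps <- as.integer(diff(dates))
print(gaps)

if (requireNamespace("xts", quietly = TRUE)) {
  # xts() takes the data and a time index directly
  toy_xts <- xts::xts(views, order.by = dates)
  print(toy_xts)
}
```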
For my last plot, I create a separate time-series for each day of the week.
# the time-series interval is per week
# that way I can provide counts per day for that particular week
my_week <- seq(d$date[7], d$date[1938], by = "week")

my_monday <- d %>%
  filter(day == "Monday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_tuesday <- d %>%
  filter(day == "Tuesday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_wednesday <- d %>%
  filter(day == "Wednesday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_thursday <- d %>%
  filter(day == "Thursday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_friday <- d %>%
  filter(day == "Friday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_saturday <- d %>%
  filter(day == "Saturday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull
my_sunday <- d %>%
  filter(day == "Sunday", date > "2013-01-27", date < "2018-05-14") %>%
  select(views) %>% pull

my_monday_xts <- as.xts(zoo(my_monday, my_week), order.by = my_week)
my_tuesday_xts <- as.xts(zoo(my_tuesday, my_week), order.by = my_week)
my_wednesday_xts <- as.xts(zoo(my_wednesday, my_week), order.by = my_week)
my_thursday_xts <- as.xts(zoo(my_thursday, my_week), order.by = my_week)
my_friday_xts <- as.xts(zoo(my_friday, my_week), order.by = my_week)
my_saturday_xts <- as.xts(zoo(my_saturday, my_week), order.by = my_week)
my_sunday_xts <- as.xts(zoo(my_sunday, my_week), order.by = my_week)

my_merged_xts <- merge.zoo(my_monday_xts, my_tuesday_xts, my_wednesday_xts, my_thursday_xts,
                           my_friday_xts, my_saturday_xts, my_sunday_xts)

dygraph(my_merged_xts, main = "Web traffic for https://davetang.org/muse") %>%
  dyRangeSelector(dateWindow = c("2017-01-01", "2018-05-17")) %>%
  dyLegend(width = 200, show = "follow")
Mousing over the plot will show the view counts for each day for a particular week.
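The same week-by-day layout can also be sketched more compactly by reshaping the data into a matrix with one row per week and one column per day. The example below uses two invented weeks of counts and base R's tapply(); the week_start column is a name I made up for the Monday that begins each week. Converting the matrix to xts is guarded because it needs the xts package.

```r
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")

# Toy data (invented values): two full weeks starting Monday 2013-01-28
toy <- data.frame(date = as.Date("2013-01-28") + 0:13, views = 1:14)

# Weekday number (1 = Monday ... 7 = Sunday) and matching label
wday_num <- as.integer(format(toy$date, "%u"))
toy$day <- factor(day_names[wday_num], levels = day_names)

# Date of the Monday that starts each observation's week, as text
toy$week_start <- format(toy$date - (wday_num - 1))

# One row per week, one column per day of the week
wide <- tapply(toy$views, list(toy$week_start, toy$day), sum)
print(wide)

if (requireNamespace("xts", quietly = TRUE)) {
  # A multi-column xts object indexed by the week start date
  wide_xts <- xts::xts(wide, order.by = as.Date(rownames(wide)))
  print(wide_xts)
}
```

A multi-column xts object like wide_xts should be usable with dygraph() in the same way as my_merged_xts above, with one series per column.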
Summary
Interactive plots are quite useful, especially for finding outliers. Be sure to check out plotly and htmlwidgets for even more interactive plots.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Thanks, Dave, excellent post on interactive visualization! All the useful tools with good examples. An honorable mention is heatmaply, also doing interactive heatmaps, https://cran.r-project.org/web/packages/heatmaply/vignettes/heatmaply.html
Thanks Mikhail! The d3heatmap is missing a scale, which heatmaply provides, so +1 for heatmaply.