Running RStudio Server with Docker

I highly recommend using RStudio if you use R because it makes working with R so much easier. I primarily use RStudio for writing up my analyses in R Markdown. Some RStudio features I couldn’t live without include: Vim keybindings, code completion, and code highlighting (rainbow parentheses are awesome!). Other nice features I like to use include the re-indent code shortcut, insert chunk shortcut, and the file explorer. RStudio v1.4 comes with a Visual R Markdown editor but I haven’t tried it out yet. This post is about how you can use RStudio by running RStudio Server inside a Docker container.

Firstly, why? You can install RStudio on your computer by simply downloading an installation file for your operating system and then you’re done. Why complicate your life with Docker? (If you aren’t familiar with Docker, I gave a workshop on it.) Below are some reasons I have resorted to using RStudio by running RStudio Server inside a Docker container.

1. I was given a Windows computer for work and some R packages don’t work well on Windows.
2. I work in different environments and I wanted the same RStudio installation in each of them.
3. My computer at home did not have enough compute resources for my RStudio session.

Each of these points comes with requirements. For point 1, you will need administrator privileges to install and use Docker, and not everyone has them. Point 2 has the same requirement, but in every environment you work in. For point 3, you will need access to a more powerful machine, such as a compute server or a cloud instance, that has Docker installed and where you have permission to use it. Unfortunately, Docker access is effectively equivalent to root access, so you may not be able to convince your system administrator to install Docker and give you access. Fortunately for me, I do have Docker access on a server, and I run RStudio Server in a container there that I can reach from different computers.

Docker image

The Rocker Project provides Docker images for the R environment. They have already prepared an RStudio Server image, so all you really have to do is the following.

# pull the image
docker pull rocker/rstudio:4.0.5

# run container
docker run --rm \
           -p 8888:8787 \
           -e PASSWORD=password \
           rocker/rstudio:4.0.5

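Now open your favourite browser and go to http://localhost:8888/. The -p 8888:8787 option maps port 8888 on your computer to port 8787 inside the container, which is the port RStudio Server listens on. You should see a login page; log in with the username “rstudio” and the password “password”. And that’s it! You are running RStudio Server inside a Docker container.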
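
If the page does not load straight away, the server inside the container may still be starting. Here is a minimal sketch for polling the port from the shell, assuming curl is installed and the 8888 port mapping from the command above. The function name wait_for_rstudio is made up for illustration.

```shell
# Poll the mapped port until RStudio Server answers or we give up.
# wait_for_rstudio is a hypothetical helper name, not part of Docker or Rocker.
wait_for_rstudio() {
  local url="${1:-http://localhost:8888/}"
  local tries="${2:-30}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    # -s: silent, -f: treat HTTP errors (>= 400) as failure, -o: discard the body
    if curl -sf -o /dev/null "${url}"; then
      echo "RStudio Server is answering at ${url}"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "No response from ${url} after ${tries} attempts" >&2
  return 1
}
```

Calling wait_for_rstudio with no arguments checks http://localhost:8888/ once a second for up to 30 seconds.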

Adding to the Docker image

Sometimes installing R packages requires additional libraries. You can build on top of rocker/rstudio:4.0.5 to include the necessary libraries by specifying your own Dockerfile. I have created my own Dockerfile that includes some necessary libraries for common bioinformatics tools and also included some R packages I always use. Below is my Dockerfile.

FROM rocker/rstudio:4.0.5

# MAINTAINER is deprecated; a maintainer label carries the same information
LABEL maintainer="Dave Tang <me@davetang.org>"

RUN apt-get clean all && \
	apt-get update && \
	apt-get upgrade -y && \
	apt-get install -y \
		libhdf5-dev \
		libcurl4-gnutls-dev \
		libssl-dev \
		libxml2-dev \
		libpng-dev \
		libxt-dev \
		zlib1g-dev \
		libbz2-dev \
		liblzma-dev \
		libglpk40 \
		libgit2-28 \
	&& apt-get clean all && \
	apt-get purge && \
	rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

RUN Rscript -e "install.packages(c('rmarkdown', 'tidyverse', 'workflowr', 'BiocManager'));"
RUN Rscript -e "BiocManager::install(version = '3.12')"

COPY user-settings /home/rstudio/.rstudio/monitored/user-settings/user-settings
COPY .Rprofile /home/rstudio/
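
To turn a Dockerfile like this into an image, run docker build from the directory that holds it. The sketch below assumes the Dockerfile, user-settings, and .Rprofile all sit in the same directory and reuses the davetang/rstudio:4.0.5 tag that appears later in this post. The wrapper name build_rstudio_image is hypothetical.

```shell
# build_rstudio_image is a hypothetical wrapper around `docker build`.
build_rstudio_image() {
  local tag="${1:-davetang/rstudio:4.0.5}"
  local context="${2:-.}"
  local f
  # The Dockerfile's COPY lines need these files in the build context.
  for f in Dockerfile user-settings .Rprofile; do
    if [ ! -e "${context}/${f}" ]; then
      echo "missing ${context}/${f}" >&2
      return 1
    fi
  done
  # -t names the resulting image; the last argument is the build context.
  docker build -t "${tag}" "${context}"
}
```

Running build_rstudio_image with no arguments is the same as running docker build -t davetang/rstudio:4.0.5 . after checking that the copied files are present.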

User settings

You may have noticed the user-settings line in my Dockerfile. As I mentioned at the start of the post, I have some settings that I really like, such as Vim keybindings. Each time we start a new container, we start with the default preferences and naturally I don’t want to manually change them each time. Luckily, the settings are saved in a specific file inside the container: /home/rstudio/.rstudio/monitored/user-settings/user-settings. Start a container and make all your preferred settings and then save the user-settings file back to your local computer. You can overwrite the default settings just like in my Dockerfile above.
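
One way to get the file out of a running container is docker cp. This is a sketch that assumes you know the container's name or ID (docker ps lists both). The helper name save_user_settings is made up for illustration.

```shell
# save_user_settings is a hypothetical helper; pass the name you gave to
# --name, or the container ID shown by `docker ps`.
save_user_settings() {
  local container="${1:-}"
  local dest="${2:-./user-settings}"
  if [ -z "${container}" ]; then
    echo "usage: save_user_settings CONTAINER [DEST]" >&2
    return 1
  fi
  # docker cp copies CONTAINER:SRC_PATH to a path on the host.
  docker cp "${container}:/home/rstudio/.rstudio/monitored/user-settings/user-settings" "${dest}"
}
```

If the container was started with --rm, copy the file before stopping it, because the container and its files are removed on exit.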

Installing packages

Another line you may have noticed in my Dockerfile is the .Rprofile line. My R profile simply contains the single line .libPaths("/packages/"), which tells R to look for packages in /packages/. When I start my RStudio Server container, I mount a local directory that holds all my R packages at /packages inside the container. This way I don’t need to re-install packages each time I start a new container. Below is an example:

docker run --rm \
           -p 8888:8787 \
           -d \
           --name rstudio_server \
           -v /home/dtang/r_packages/:/packages \
           -e PASSWORD=password \
           -e USERID=$(id -u) \
           -e GROUPID=$(id -g) \
           davetang/rstudio:4.0.5

I run the container in detached mode (-d), so if you run your container this way, make sure you stop it when you’re done by running docker stop followed by the container name (rstudio_server in the example above).
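
Two things need to exist before a run command like the one above behaves as intended: the host directory for the packages and the .Rprofile that the Dockerfile copies into the image. If the host directory is missing, Docker typically creates it owned by root, which can stop the rstudio user from installing packages into it. The sketch below sets both up. The function name prepare_r_packages is hypothetical.

```shell
# prepare_r_packages is a hypothetical helper name.
prepare_r_packages() {
  local pkg_dir="$1"
  local build_dir="${2:-.}"
  # Create the host directory up front so it is owned by you, not root.
  mkdir -p "${pkg_dir}"
  # Write the one-line R profile that points R at the mounted directory.
  printf '%s\n' '.libPaths("/packages/")' > "${build_dir}/.Rprofile"
}
```

For example, prepare_r_packages /home/dtang/r_packages creates the directory used in the -v option above and writes .Rprofile into the current directory.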

I wrote a helper script to run my Docker image, which was built using the Dockerfile shown in this post. You can pass the script directories to mount and it will mount them to /data/ inside the container.

Other tips

You can limit the resource usage of your Docker container if you’re running your container in a shared environment and want to make sure you don’t use all the resources.
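
As a sketch of what that can look like, docker run accepts --memory and --cpus to cap a container's RAM and CPU share. The limits below (8g and 4 CPUs) are arbitrary example values, and run_rstudio_limited is a hypothetical function name.

```shell
# run_rstudio_limited is a hypothetical wrapper; the defaults are example values.
run_rstudio_limited() {
  local memory="${1:-8g}"
  local cpus="${2:-4}"
  # --memory caps RAM (for example 8g); --cpus caps CPU time (for example 4 cores' worth).
  docker run --rm -d \
    -p 8888:8787 \
    --name rstudio_server \
    --memory "${memory}" \
    --cpus "${cpus}" \
    -e PASSWORD=password \
    rocker/rstudio:4.0.5
}
```

Once the container is up, docker stats rstudio_server shows its current CPU and memory usage against those limits.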

If you are running your Docker container on a server and want to access it on your local computer, you can use SSH port forwarding.

# -N Do not execute a remote command. This is useful for just forwarding ports
# -f Requests ssh to go to background just before command execution
# -Y Enables trusted X11 forwarding
# -L Specifies that connections to the given TCP port or Unix socket on the local (client) host are to be forwarded to the given host and port

ssh -N -f -Y -L 8888:localhost:8888 dtang@192.168.1.42
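
Because -f sends ssh to the background, the tunnel keeps running until you close it. One option is a control socket, sketched below, which lets you shut the tunnel down with a second ssh call. The function names and the socket location under ~/.ssh are assumptions for illustration, and -Y is left out because X11 forwarding is not needed for a port forward.

```shell
# tunnel_socket, open_tunnel and close_tunnel are hypothetical helper names.
tunnel_socket() {
  echo "${HOME}/.ssh/rstudio-tunnel-${1}.sock"
}

open_tunnel() {
  local remote="$1"
  local port="${2:-8888}"
  # -M: act as control master, -S: path of the control socket
  ssh -N -f -M -S "$(tunnel_socket "${port}")" -L "${port}:localhost:${port}" "${remote}"
}

close_tunnel() {
  local remote="$1"
  local port="${2:-8888}"
  # -O exit asks the master connection behind the socket to terminate
  ssh -S "$(tunnel_socket "${port}")" -O exit "${remote}"
}
```

With these defined, open_tunnel dtang@192.168.1.42 starts the forward and close_tunnel dtang@192.168.1.42 ends it.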

I have more notes in my GitHub repo, so check it out if you’re interested.

This work is licensed under a Creative Commons Attribution 4.0 International License.
4 comments
  1. Hi Dave – many thanks for the post. I found it right after taking a course on reproducibility for bioinformatics. One thing that came up in the course was using Conda to control Rstudio version as well as package versions.

    With regards to Rstudio itself, it seemed that if one wanted to use the latest Rstudio version (i.e. 1.4, with new interactive markdown features) then Docker (or Rocker) would be the only route as the latest versions aren’t hosted on any Conda channel. Looking this up brought me to the blogpost. Now I’m curious as to whether you have any thoughts on controlling R package versions at the level of Dockerfile or a Conda environment within the Docker container?

    1. Hi Chris,

      is the course on reproducibility publicly available? I would like to check it out too, if possible. I’m always looking for new ways to enhance reproducibility.

      With regards to your question, you could use Conda to install specific R packages when creating the image. I have an example in my Dockerfile that installs Miniconda https://github.com/davetang/learning_docker/blob/master/Dockerfile.base#L28-L33.

      You could also install R packages directly when creating the image. I also have an example here https://github.com/davetang/learning_docker/blob/master/rstudio/Dockerfile#L24-L25.

      With those two approaches, you can ensure that others have the exact environment. One downside may be the potential size of the image, especially if an R package has many dependencies. For example, the rocker/rstudio:4.0.5 image is 1.93GB in size. My image (davetang/rstudio:4.0.5), which installs additional libraries and a couple of R packages, is 2.33GB.

      One additional way for fixing R packages is to mount a volume to the Docker container and install all the R packages necessary for an analysis into the mounted volume. I’m not sure if this will save more space than simply installing the R packages directly when creating the Docker image. I guess this approach is preferable if the size of all the R packages is large, especially if the Docker image will be shared on a public repository. You could then share the R package volume via some dedicated file sharing platform, like storage buckets.

      Cheers,
      Dave

      1. Hi Dave,

        You can find the course at the following link (fairly sure it’s OK to share this because it’s google searchable): https://nbis-reproducible-research.readthedocs.io/en/latest/

        Thanks for your explanation re. the different ways one can do this – my take home is that there are several ways to go about this. I reckon I will go down the Conda route, which you also mentioned. If you dig into the Docker section in the course you’ll see that in their Dockerfiles they specify Miniconda to create an environment based off a YAML file. Maybe that could be something you find interesting. Anyhow, thanks for the discussion!
