Last updated: 2024/05/30
If you are running into problems with using RStudio Server and R version 4.2.2, I wrote a new blog post about it.
I highly recommend using RStudio if you use R because it makes working with R so much easier. I primarily use RStudio for writing up my analyses in R Markdown. Some RStudio features I couldn't live without include: Vim keybindings, code completion, and code highlighting (rainbow parentheses are awesome!). Other nice features I like to use include the re-indent code shortcut, insert chunk shortcut, and the file explorer. RStudio v1.4 comes with a Visual R Markdown editor but I haven't tried it out yet. This post is about how you can use RStudio by running RStudio Server inside a Docker container.
Firstly, why? You can install RStudio on your computer by simply downloading an installation file for your operating system and then you're done. Why complicate your life with Docker? (If you aren't familiar with Docker, I gave a workshop on it.) Below are some reasons I have resorted to using RStudio by running RStudio Server inside a Docker container.
- I was given a Windows computer for work and some R packages don't work well on Windows.
- I work in different environments and I wanted the same RStudio installation in each.
- My computer at home did not have enough compute resources for my RStudio session.
Each of the points above comes with requirements. For point 1, you will need administrator privileges to install and use Docker, and not everyone has this privilege. Point 2 has the same requirement as point 1, but in every environment you work in. For point 3, you will need access to a better computational resource, like a compute server or a cloud instance, that has Docker installed and where you have permission to use Docker. For example, you can run RStudio Server on an Amazon EC2 instance.
Unfortunately, having Docker access is the equivalent of gaining root access, so you may not be able to convince your system administrator to install and give you access to Docker. Fortunately for me, I do have Docker access on the server and have a container running RStudio Server on said server that I can access from different computers.
If you can't convince your sysadmin to install and/or give you access to Docker, ask them to install Singularity, which does not require root access and works nicely with HPCs! Then follow this guide I wrote on running RStudio Server with Singularity. Nowadays I use Singularity more often than Docker and if you're interested, you can check out some of my Singularity notes on GitHub.
Docker image
The Rocker Project has prepared various Docker images for the R environment. They already provide an RStudio Server image, so all you really have to do is the following.
# pull the image
docker pull rocker/rstudio:4.2.2
# run container
docker run --rm \
-p 8888:8787 \
-e PASSWORD=password \
rocker/rstudio:4.2.2
Now open your favourite browser and go to http://localhost:8888/. You should see a login page: enter the username "rstudio" and the password "password" to log in and that's it! You are running RStudio Server inside a Docker container.
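The same command can be wrapped in a small shell function so that the host port is easy to change, for example when 8888 is already taken on a shared server. This is only a sketch: run_rstudio and the DOCKER variable are hypothetical names introduced here (setting DOCKER=echo prints the arguments instead of starting a container), and the password is read from a PASSWORD environment variable rather than hard-coded.

```shell
# Start rocker/rstudio:4.2.2 on a chosen host port (default 8888).
# Container port 8787 is the one RStudio Server is mapped from above.
run_rstudio() {
  host_port=${1:-8888}
  # ${PASSWORD:?...} stops with an error message if PASSWORD is unset or empty
  "${DOCKER:-docker}" run --rm \
    -p "${host_port}:8787" \
    -e PASSWORD="${PASSWORD:?set PASSWORD first}" \
    rocker/rstudio:4.2.2
}

# Usage (hypothetical values):
#   export PASSWORD=choose_something_better
#   run_rstudio 9000   # then browse to http://localhost:9000/
```

Only the left-hand side of the -p mapping changes; the server inside the container still uses the same port.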
Docker group
In the previous section, I ran Docker as my user on the server instead of running sudo docker run. This works because my user belongs to the docker group; you (or your system administrator) should add your user to the docker group instead of running Docker with sudo.
# check /etc/group to see if the docker group exists
cat /etc/group | grep docker
# create a docker group if it does not exist
sudo groupadd docker
# add yourself to the docker group
sudo usermod -aG docker $(whoami)
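To check whether the usermod step is needed at all, or whether it has taken effect, the group list of a user can be tested directly. The sketch below uses a hypothetical helper name, in_group; it relies only on id, tr, and grep. Keep in mind that a new group membership normally only shows up after you log out and back in.

```shell
# Return success (exit status 0) if user $1 belongs to group $2.
in_group() {
  # id -nG prints the user's group names separated by spaces;
  # put one name per line and look for an exact match.
  id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

if in_group "$(id -un)" docker; then
  echo "already in the docker group"
else
  echo "not in the docker group (yet)"
fi
```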
Adding to the Docker image
Sometimes installing R packages requires additional libraries. You can build on top of the Rocker RStudio Server image to include the necessary libraries by creating your own Dockerfile. I have created my own Dockerfile that includes some necessary shared libraries for common bioinformatics tools and also included some R packages I always use. Below is my latest Dockerfile.
FROM rocker/rstudio:4.2.2
LABEL source="https://github.com/davetang/learning_docker/blob/main/rstudio/Dockerfile"
MAINTAINER Dave Tang <me@davetang.org>
ARG bioc_ver=3.16
RUN apt-get clean all && \
apt-get update && \
apt-get upgrade -y && \
apt-get install -y \
libhdf5-dev \
libcurl4-gnutls-dev \
libssl-dev \
libxml2-dev \
libpng-dev \
libxt-dev \
zlib1g-dev \
libbz2-dev \
liblzma-dev \
libglpk40 \
libgit2-dev \
&& apt-get clean all && \
apt-get purge && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN Rscript -e "install.packages(c('rmarkdown', 'tidyverse', 'workflowr', 'BiocManager'));"
RUN Rscript -e "BiocManager::install(version = '${bioc_ver}')"
# the rstudio/ path is set for building with GitHub Actions
COPY --chown=rstudio:rstudio rstudio/rstudio-prefs.json /home/rstudio/.config/rstudio
COPY --chown=rstudio:rstudio rstudio/.Rprofile /home/rstudio/
WORKDIR /home/rstudio
Note that the above Dockerfile was written for building with GitHub Actions, and the resulting image is automatically uploaded to Docker Hub. If you want to build from this Dockerfile yourself, remove the rstudio/ path prefix in the two COPY steps.
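If you would rather not edit the file by hand, the prefix can be stripped with sed. This is a minimal sketch that assumes the COPY lines look exactly like the two in the Dockerfile above (a --chown option followed by a source path starting with rstudio/); strip_copy_prefix, Dockerfile.local, and the my_rstudio tag are hypothetical names.

```shell
# Remove the leading "rstudio/" from the source path of each COPY step.
# All other lines pass through unchanged.
strip_copy_prefix() {
  sed -E 's#^(COPY --chown=[^ ]+) rstudio/#\1 #'
}

# Usage, with rstudio-prefs.json and .Rprofile next to the Dockerfile:
#   strip_copy_prefix < Dockerfile > Dockerfile.local
#   docker build -f Dockerfile.local -t my_rstudio:4.2.2 .
```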
Saving your user settings
The following line in my Dockerfile:
COPY --chown=rstudio:rstudio rstudio/rstudio-prefs.json /home/rstudio/.config/rstudio
copies my user settings/preferences to the Docker image. As I mentioned at the start of the post, I have some settings that I really like, such as Vim keybindings. Each time we start a new container, we start with the default preferences and naturally I don't want to manually change them each time. Luckily, the settings are saved in a single file, rstudio-prefs.json, in a specific directory inside the container: /home/rstudio/.config/rstudio.
In order to generate your own preference file, start a container and set all your preferred settings. Then simply save the rstudio-prefs.json file back to your local/host computer. You can use that file to overwrite the default settings just like in my Dockerfile above. You will need to mount a local directory to the Docker container in order to copy the file; this is discussed next.
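As an alternative to a mounted directory, docker cp can copy a file out of a running container. The sketch below assumes the container is named rstudio_server and that the preferences live at /home/rstudio/.config/rstudio/rstudio-prefs.json; save_prefs and the DOCKER variable are hypothetical names (DOCKER=echo prints the arguments instead of running Docker).

```shell
# Copy rstudio-prefs.json from a running container to the host.
#   $1: container name or ID (default: rstudio_server)
#   $2: destination on the host (default: current directory)
save_prefs() {
  container=${1:-rstudio_server}
  dest=${2:-.}
  "${DOCKER:-docker}" cp \
    "${container}:/home/rstudio/.config/rstudio/rstudio-prefs.json" \
    "${dest}"
}

# Usage:
#   save_prefs rstudio_server rstudio/
```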
Mounting volumes
When you use RStudio Server, the expected/default user is rstudio; you can create and use other users (as I show later in this post). The rstudio user will have the following user and group ID inside the container.
docker run --rm -it davetang/rstudio:4.3.0 cat /etc/passwd | grep rstudio:
rstudio:x:1000:1000::/home/rstudio:/bin/bash
If you mount a volume, the mounted files keep the user and group IDs of your local user, which in my case are 1004:1006.
docker run --rm -it -v $HOME:/tmp davetang/rstudio:4.3.0 ls -al /tmp | head
# --snipped--
drwxr-xr-x 74 1004 1006 4096 Oct 9 23:49 .
# --snipped--
Here are the important parameters to include to make the rstudio user have the same user and group ID as your local user:
docker run --rm -e USERID=$(id -u) -e GROUPID=$(id -g) davetang/rstudio:4.3.0
# start up log
# --snipped
deleting the default user
creating new rstudio with UID 1004
useradd: warning: the home directory /home/rstudio already exists.
useradd: Not copying any file from skel directory into it.
Modifying primary group rstudio
Primary group ID is now custom_group 1006
[cont-init.d] 02_userconf: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
Now you can mount anything and the rstudio user will have the same user and group IDs as your local user, which means that the rstudio user can edit and write to the mounted files.
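One thing to watch: if the command is run as root (for example under sudo), id -u returns 0, and a start-up log quoted in the comments of this post shows the user setup failing with "useradd: UID 0 is not unique". A small guard can catch this before the container starts. This is a sketch; id_flags is a hypothetical helper name.

```shell
# Print the USERID/GROUPID flags for docker run, refusing UID 0 (root).
#   $1: user ID  (default: current user)
#   $2: group ID (default: current group)
id_flags() {
  uid=${1:-$(id -u)}
  gid=${2:-$(id -g)}
  if [ "$uid" -eq 0 ]; then
    echo "refusing to map the rstudio user to UID 0; run as a regular user" >&2
    return 1
  fi
  printf '%s\n' "-e USERID=${uid} -e GROUPID=${gid}"
}

# Usage (the unquoted $(...) is intentional so the flags split into words):
#   docker run --rm $(id_flags) davetang/rstudio:4.3.0
```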
Installing packages
In my Dockerfile, the following line:
COPY --chown=rstudio:rstudio rstudio/.Rprofile /home/rstudio/
copies my .Rprofile file to the Docker image.
My R profile file simply contains the line .libPaths("/packages/"), which tells R to look for packages in /packages/.
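Both pieces, the host directory for packages and the one-line .Rprofile, can be created from the shell. The sketch below uses a hypothetical function name, setup_r_packages, and takes both locations as arguments so nothing is assumed about your directory layout.

```shell
# Create a host directory for R packages and an .Rprofile pointing R at /packages/.
#   $1: host directory that will later be mounted at /packages
#   $2: directory in which to write .Rprofile (e.g. the Docker build context)
setup_r_packages() {
  pkg_dir=$1
  profile_dir=$2
  mkdir -p "$pkg_dir" "$profile_dir"
  printf '%s\n' '.libPaths("/packages/")' > "$profile_dir/.Rprofile"
}

# Usage:
#   setup_r_packages "$HOME/r_packages" rstudio
```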
When I start my RStudio Server container, I mount a volume (i.e. directory) from my local host to the container that will be used to keep installed R packages. This way I don't need to re-install packages each time I start a new container.
Below is an example:
docker run --rm \
-p 8888:8787 \
-d \
--name rstudio_server \
-v /home/dtang/r_packages/:/packages \
-e PASSWORD=password \
-e USERID=$(id -u) \
-e GROUPID=$(id -g) \
davetang/rstudio:4.2.2
When you log into your RStudio Server, packages will be installed in /packages, which is stored on your local host.
.libPaths()
# [1] "/packages" "/usr/local/lib/R/site-library" "/usr/local/lib/R/library"
install.packages("beepr")
# Installing package into ‘/packages’
# snipped
Next time you start the container with the same docker run command, the beepr package will be available.
Note that I ran the container in detached mode with the -d parameter, so if you run your container this way, make sure you stop the container when you're finished by running docker stop container_name, especially if you are on a shared compute resource.
In the example above I named the container rstudio_server, so you can stop it by running the following:
docker stop rstudio_server
I wrote a helper script to run my Docker image. It uses my Docker image that was built using the Dockerfile shown in this post. You can specify additional directories to mount via the script and it will mount them to /data/ inside the container.
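To give an idea of how extra directories could be turned into mount options, here is a rough sketch. It is not the helper script: mount_flags is a hypothetical function, and the /data/<directory name> layout is an assumption made for illustration, not necessarily what the script does.

```shell
# Print one "-v host_dir:/data/<name>" option per directory argument.
# Assumes directory paths without spaces.
mount_flags() {
  for dir in "$@"; do
    printf '%s\n' "-v ${dir}:/data/$(basename "$dir")"
  done
}

# Usage (unquoted on purpose so the options split into words):
#   docker run --rm $(mount_flags /home/dtang/project) davetang/rstudio:4.2.2
```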
Adding other users
If you want to add additional users such that multiple users can use the same RStudio Server container, simply "log" into the container and create a new user.
# change rstudio_dtang to your container name or ID
# you can find the name or ID by typing
# docker ps -a
docker exec -it rstudio_dtang /bin/bash
# once you are inside the container
my_user=new_user
useradd ${my_user}
# enter password when prompted
passwd ${my_user}
mkdir /home/${my_user}
chown ${my_user}:${my_user} /home/${my_user}
exit
When you get to the RStudio login page, enter the newly created user (new_user in the example above) and use the password chosen at the passwd step.
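The same steps can also be run from the host without an interactive shell, which is handy when adding several users. This is a sketch under a few assumptions: add_rstudio_user and the DOCKER variable are hypothetical names (DOCKER=echo prints the arguments instead of running Docker), useradd -m is used to create the home directory in one step, and the password is fed to chpasswd on standard input. Typing a password on the command line can leave it in your shell history, so treat it as a temporary one.

```shell
# Create a user with a home directory inside a running container.
#   $1: container name or ID, $2: new user name, $3: initial password
add_rstudio_user() {
  container=$1
  new_user=$2
  new_pass=$3
  # -m creates /home/<user>; -s sets the login shell
  "${DOCKER:-docker}" exec "$container" useradd -m -s /bin/bash "$new_user" || return 1
  # chpasswd reads "user:password" lines from standard input
  printf '%s:%s\n' "$new_user" "$new_pass" |
    "${DOCKER:-docker}" exec -i "$container" chpasswd
}

# Usage:
#   add_rstudio_user rstudio_dtang new_user 'temporary_password'
```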
Ideally each user should run their own Docker container but since not everyone has Docker privileges, this is an alternative. Some clever solution is necessary for maintaining packages for each user, especially with many users. If there are few users, they can install their R packages in their newly created home directories, but make sure the container is not removed when it stops, i.e. do not use the --rm argument. However, when using a new Docker image (for example when a new version of RStudio or R comes out), they will have to reinstall all their packages from scratch (rather than updating).
I have additional tips for adding multiple users on this GitHub Issues page.
Restarting rstudio-server
Sometimes RStudio Server hangs because of a bad code chunk and we may not have saved our latest changes, so we do not want to restart the Docker container because this will wipe our unsaved changes. One potential solution is to "log" into the container as root and restart the daemon.
The operating system for rocker/rstudio:4.2.2 is Ubuntu and the steps below work for Ubuntu/Debian.
# log into the container, change the container ID/name accordingly
docker exec -it rstudio_server /bin/bash
cat /etc/os-release
# PRETTY_NAME="Ubuntu 22.04.1 LTS"
# NAME="Ubuntu"
# VERSION_ID="22.04"
# VERSION="22.04.1 LTS (Jammy Jellyfish)"
# VERSION_CODENAME=jammy
# ID=ubuntu
# ID_LIKE=debian
# HOME_URL="https://www.ubuntu.com/"
# SUPPORT_URL="https://help.ubuntu.com/"
# BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
# PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
# UBUNTU_CODENAME=jammy
# list services
service --status-all
# [ ? ] hwclock.sh
# [ - ] procps
# [ ? ] rstudio-server
# [ - ] x11-common
# restart
service rstudio-server restart
# * Restarting RStudio Server rserver
# TTY detected. Printing informational message about logging configuration. Logging configuration loaded from '/etc/rstudio/logging.conf'. Logging to 'syslog'.
# [ OK ]
exit
Hopefully when you visit the RStudio Server page again, it is responsive and you can save your work.
Other tips
You can limit the resource usage of your Docker container if you're running your container in a shared environment and want to make sure you don't use all the resources.
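As a sketch of what that can look like, docker run accepts --cpus and --memory to cap CPU and RAM for a container. The limits below (2 CPUs, 8 GB) are arbitrary example values; run_limited and the DOCKER variable are hypothetical names (DOCKER=echo prints the arguments instead of starting a container).

```shell
# Start RStudio Server with CPU and memory caps.
#   $1: number of CPUs (default 2), $2: memory limit (default 8g)
run_limited() {
  cpus=${1:-2}
  memory=${2:-8g}
  "${DOCKER:-docker}" run --rm -d \
    --cpus "$cpus" \
    --memory "$memory" \
    -p 8888:8787 \
    -e PASSWORD="${PASSWORD:?set PASSWORD first}" \
    rocker/rstudio:4.2.2
}

# Usage:
#   export PASSWORD=choose_something_better
#   run_limited 4 16g
```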
If you are running your Docker container on an external server and want to access it on your local computer, you can use SSH port forwarding.
# -N Do not execute a remote command. This is useful for just forwarding ports
# -f Requests ssh to go to background just before command execution
# -Y Enables trusted X11 forwarding
# -L Specifies that connections to the given TCP port or Unix socket on the local (client) host are to be forwarded to the given host and port
ssh -N -f -Y -L 8888:localhost:8888 dtang@192.168.1.42
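If you tunnel to several servers or ports, the command can be parameterised. The sketch below uses hypothetical names (tunnel, and an SSH variable that can be pointed at a stub for a dry run). It leaves out -Y, since X11 forwarding is not needed just to forward a port.

```shell
# Forward a local port to a port on the remote host, in the background.
#   $1: remote login, e.g. dtang@192.168.1.42
#   $2: local port (default 8888), $3: remote port (default 8888)
tunnel() {
  remote=$1
  local_port=${2:-8888}
  remote_port=${3:-8888}
  "${SSH:-ssh}" -N -f -L "${local_port}:localhost:${remote_port}" "$remote"
}

# Usage:
#   tunnel dtang@192.168.1.42 9999 8888   # then browse to http://localhost:9999/
```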
Further reading
I have more notes in my GitHub repository, so check it out if you're interested.
This work is licensed under a Creative Commons
Attribution 4.0 International License.
Hi Dave – many thanks for the post. I found it right after taking a course on reproducibility for bioinformatics. One thing that came up in the course was using Conda to control Rstudio version as well as package versions.
With regards to Rstudio itself, it seemed that if one wanted to use the latest Rstudio version (i.e. 1.4, with new interactive markdown features) then Docker (or Rocker) would be the only route as the latest versions aren’t hosted on any Conda channel. Looking this up brought me to the blogpost. Now I’m curious as to whether you have any thoughts on controlling R package versions at the level of Dockerfile or a Conda environment within the Docker container?
Hi Chris,
is the course on reproducibility publicly available? I would like to check it out too, if possible. I’m always looking for new ways to enhance reproducibility.
With regards to your question, you could use Conda to install specific R packages when creating the image. I have an example in my Dockerfile that installs Miniconda https://github.com/davetang/learning_docker/blob/master/Dockerfile.base#L28-L33.
You could also install R packages directly when creating the image. I also have an example here https://github.com/davetang/learning_docker/blob/master/rstudio/Dockerfile#L24-L25.
With those two approaches, you can ensure that others have the exact environment. One downside may be the potential size of the image, especially if an R package has many dependencies. For example, the rocker/rstudio:4.0.5 image is 1.93GB in size. My image (davetang/rstudio:4.0.5), which installs additional libraries and a couple of R packages, is 2.33GB.
One additional way for fixing R packages is to mount a volume to the Docker container and install all the R packages necessary for an analysis into the mounted volume. I’m not sure if this will save more space than simply installing the R packages directly when creating the Docker image. I guess this approach is preferable if the size of all the R packages is large, especially if the Docker image will be shared on a public repository. You could then share the R package volume via some dedicated file sharing platform, like storage buckets.
Cheers,
Dave
Hi Dave,
You can find the course at the following link (fairly sure it’s OK to share this because it’s google searchable): https://nbis-reproducible-research.readthedocs.io/en/latest/
Thanks for your explanation re. the different ways one can do this – my take home is that there are several ways to go about this. I reckon I will go down the Conda route, which you also mentioned. If you dig into the Docker section in the course you’ll see that in their Dockerfiles they specify Miniconda to create an environment based off a YAML file. Maybe that could be something you find interesting. Anyhow, thanks for the discussion!
Hi Chris,
thank you for the link to the course! I had a quick look and it looks extremely useful. I’ll share my workshop material https://davetang.github.io/reproducible_bioinformatics/.
Cheers,
Dave
Hi Dave,
Great post, thanks. I was wondering if this could be extended to multiple users?
Best,
Vikas
Hi Vikas,
you sure can. I updated the blog post with a new section for adding a new user.
Cheers,
Dave
Hi Davo,
Great, thanks!
Best wishes,
Vikas
Hi Dave, this is great. I was able to install this and add users successfully. How do you mount a share into your R home directory? The share is on my server that runs the rstudio docker
Hi there,
I mount volumes as per https://github.com/davetang/learning_docker/blob/main/rstudio/run_rstudio.sh. In that script I mount three directories, one of which is for R packages. Then in my .Rprofile I have .libPaths("/packages/"). By doing this I don’t have to re-install packages each time.
Cheers,
Dave
Hi Dave,
Thanks for the blog post! I find it really helpful. I’m curious to know whether I can specify which version of R to use for the RStudio server. If so, how do I do that? Thank you!
Cheers,
Evan
Hi Evan,
I have different versions built; see https://hub.docker.com/r/davetang/rstudio/tags. For the latest version run:
docker pull davetang/rstudio:4.2.2
If I don’t have the specific version you want, modify the first line of my Dockerfile with the version you want from https://hub.docker.com/r/rocker/rstudio/tags.
Hope that helps,
Dave
Hi Dave,
Thanks for your reply! That really helped. I was able to get my container up and running but I seem to have problems saving the outputs of my R stuff. When I try to save files the error code says cannot open file and that permission is denied. I didn’t specify a user name when I run this container. I wonder if there’s a way to fix this? Thanks.
Cheers,
Evan
Hi Evan,
I’m glad you could proceed. What are the parameters you issued to docker run? Did you include these lines:
-e USERID=$(id -u) \
-e GROUPID=$(id -g) \
This should make the rstudio user in your Docker container have the same UID and GID as your user in your host environment, which should avoid the permission denied problem.
Cheers,
Dave
Hi Dave,
Thanks for your reply. When I do that I can’t seem to log in using my user ID number. It says that the username/password is incorrect. The username that I put in is the number that was returned when I typed id -u in my terminal.
Cheers,
Evan
Hi Evan,
log in using rstudio/password. Start the Docker container as per the blog post and see how that goes.
Cheers,
Dave
Hi Dave,
Thanks for your reply. I couldn’t log in with rstudio/password either when I include the two lines.
I tried running it without the detached mode and this is the message that I got.
[s6-init] making user provided files available at /var/run/s6/etc…exited 0.
[s6-init] ensuring user provided files have correct perms…exited 0.
[fix-attrs.d] applying ownership & permissions fixes…
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts…
[cont-init.d] 01_set_env: executing…
skipping /var/run/s6/container_environment/HOME
skipping /var/run/s6/container_environment/PASSWORD
skipping /var/run/s6/container_environment/RSTUDIO_VERSION
[cont-init.d] 01_set_env: exited 0.
[cont-init.d] 02_userconf: executing…
0 is less than 1000
setting minumum authorised user to 499
deleting the default user
creating new rstudio with UID 0
useradd: UID 0 is not unique
chown: invalid user: ‘rstudio’
usermod: user ‘rstudio’ does not exist
id: ‘rstudio’: no such user
Modifying primary group
id: ‘rstudio’: no such user
groupmod: group ” does not exist
Primary group ID is now custom_group 0
chpasswd: (user rstudio) pam_chauthtok() failed, error:
Authentication token manipulation error
chpasswd: (line 1, user rstudio) password not changed
[cont-init.d] 02_userconf: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
Not sure what the problem is here…
I really appreciate your help.
Cheers,
Evan
Hi Evan,
no worries, we’ll get it working. Probably easier by email; may you email me@davetang.org with your operating system version (e.g. Ubuntu 22.10), Docker version, and the exact command you are using to start a container?
Cheers,
Dave
Hi evan,
I also tried to log in with rstudio/password, but I can’t log in. The command is "docker run --rm -p 8888:8787 -e PASSWORD=password rocker/rstudio:4.2.2". I’m using rocker/rstudio:4.2.2, my system is Ubuntu 18.04.6 LTS, and Docker is version 20.10.7.
sincerely,
– chaehwa
Hi,
may you take a look at my new blog post, to see whether it provides some clues regarding your issue?
https://davetang.org/muse/2023/02/24/running-rstudio-server-with-r-4-2-2-with-docker/
Cheers,
Dave
Many thanks for this! I will experiment on a server I have access to.
I have a somewhat different question. I can virtual desktop into a Linux server with a GUI desktop environment. Can I use a similar method to launch the RStudio application from a Docker container? In other words, run the RStudio application itself instead of RStudio Server in a browser window?
Sure thing. I guess you’re using something like VNC? Just open your browser in the GUI desktop and enter http://localhost:port_num.
I could use some advice about file permissions. I have code in /home/[user]/Documents/code/ and this is exposed as a Docker volume to the rocker/tidyverse container. I’d like to be able to edit this code from RStudio Server. I log into RStudio Server with user “rstudio.” I can see the code files, but not write to them. Do I need to log into RStudio Server as root in order to edit them (because most things in Docker run as root)? Or should I find a way to widen the permissions of the code files in the global environment? Many thanks!
I updated the post with a section on “Mounting volumes”. Hopefully it answers your question/s.