Backticks in R

The only time I have used backticks in R was when a file was imported into R and the column names had spaces (as a side note, please don’t use spaces in your column or file names; use an underscore, i.e. _, instead). For example, you can access col a by using backticks. my_df <- data.frame(…
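
A minimal sketch of the backtick access described above. The data frame contents are made up for illustration, and `check.names = FALSE` is used only so that `data.frame()` keeps the space in the name rather than converting it to `col.a`.

```r
# Hypothetical data frame with a space in one column name;
# check.names = FALSE stops data.frame() from renaming it to "col.a"
my_df <- data.frame(`col a` = 1:3, col_b = letters[1:3], check.names = FALSE)

names(my_df)
#> [1] "col a" "col_b"

# Backticks let $ refer to the non-syntactic name
my_df$`col a`
#> [1] 1 2 3
```

Double-bracket indexing with a quoted string, `my_df[["col a"]]`, returns the same column without backticks.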

Wrapping R vectors with parentheses

In this vectors and lists tutorial, the example code wraps the R vector assignments in parentheses (round brackets).

(v_log <- c(TRUE, FALSE, FALSE, TRUE))
#> [1] TRUE FALSE FALSE TRUE
(v_int <- 1:4)
#> [1] 1 2 3 4
(v_doub <- 1:4 * 1.2)
#> [1] 1.2 2.4 3.6 4.8
(v_char <- letters[1:4])
#>…
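
A short sketch of why the parentheses matter: an assignment returns its value invisibly, and wrapping it in parentheses makes the value visible so it is printed. The variable `x` is just an illustration.

```r
# Assignment returns its value invisibly, so nothing is printed
x <- c(2, 4, 6)

# Wrapping the assignment in parentheses makes the value visible,
# so it is assigned and printed in one step
(x <- c(2, 4, 6))
#> [1] 2 4 6

# withVisible() reports the visibility flag directly
withVisible(x <- c(2, 4, 6))$visible
#> [1] FALSE
withVisible((x <- c(2, 4, 6)))$visible
#> [1] TRUE
```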

Split single column of key-value pairs into multiple columns

Two widely used file formats in bioinformatics, VCF and GTF, have single columns that are packed with annotation information. This makes them a bit inconvenient to work with in R when using data frames because the values need to be unpacked, i.e. split. In addition, this violates one of the conditions for tidy data, which…
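
As a rough base R sketch of the unpacking step, the code below splits semicolon-separated `key=value` strings into one column per key. The input strings and the helper `split_pairs()` are hypothetical, and the sketch assumes every record carries the same keys in the same order, which real VCF and GTF files do not guarantee.

```r
# Hypothetical packed annotation strings (not taken from a real file)
info <- c("DP=14;AF=0.5;DB=1", "DP=20;AF=1.0;DB=0")

# Split one string into a named character vector: names are keys
split_pairs <- function(x) {
  pairs <- strsplit(x, ";", fixed = TRUE)[[1]]
  kv <- strsplit(pairs, "=", fixed = TRUE)
  values <- vapply(kv, `[`, character(1), 2)
  names(values) <- vapply(kv, `[`, character(1), 1)
  values
}

# One row per record, one column per key (all values stay as text)
wide <- as.data.frame(do.call(rbind, lapply(info, split_pairs)))
wide
```

Keys without a value (flags) and records with differing key sets would need extra handling that this sketch leaves out.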

Map, join, and pivot in R

In this post, I will describe a series of data processing steps in R that I often perform that involve the map_df, inner_join, and pivot_longer functions from the purrr, dplyr, and tidyr packages, respectively. They are all part of the tidyverse, so to follow this post, please install the tidyverse package. install.packages("tidyverse") library(tidyverse) The typical…
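
One way the three functions can chain together is sketched below. The sample names, gene columns, and metadata are invented for illustration and are not the data used in the post; the three packages are loaded individually here, which is equivalent for this purpose to loading the tidyverse.

```r
library(purrr)
library(dplyr)
library(tidyr)

# map_df(): build one data frame per sample and row-bind the results
# (hypothetical values; in practice each iteration might read a file)
measurements <- map_df(c("s1", "s2"), function(s) {
  data.frame(sample = s, gene_a = 1, gene_b = 2)
})

# Hypothetical sample metadata to join on
metadata <- data.frame(
  sample = c("s1", "s2"),
  group = c("control", "treated")
)

long <- measurements %>%
  # inner_join(): keep samples present in both tables
  inner_join(metadata, by = "sample") %>%
  # pivot_longer(): one row per sample-gene combination
  pivot_longer(cols = c(gene_a, gene_b), names_to = "gene", values_to = "value")

long
```

The result has one row for each sample and gene, with the `group` column carried along from the metadata.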

Finding out weather conditions from the command line

In this post, I outline an approach for retrieving weather conditions from the command line. There are websites and widgets that provide weather details but I like using the command line because I find that it’s more efficient than pointing and clicking on stuff. In addition, this approach enables us to program specific tasks. For…

Running RStudio Server with Docker

I highly recommend using RStudio if you use R because it makes working with R so much easier. I primarily use RStudio for writing up my analyses in R Markdown. Some RStudio features I couldn’t live without include: Vim keybindings, code completion, and code highlighting (rainbow parentheses are awesome!). Other nice features I like to…

Compiling R with GNU Readline

Updated 2018 March 23rd for R-3.4.4. I use a lot of shortcuts provided by GNU Readline. I recently compiled R without Readline support and it was almost unusable! This was because I ran into the error: configure: error: --with-readline=yes (default) and headers/libs are not available To circumvent this I compiled R by running: ./configure --with-readline=no…

Matrix to adjacency list in R

An adjacency list is simply an unordered list that describes connections between vertices. It’s a commonly used input format for graphs. In this post, I use the melt() function from the reshape2 package to create an adjacency list from a correlation matrix. I use the geneData dataset, which consists of real but anonymised microarray expression…
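
The sketch below shows the same idea using base R only: `as.data.frame(as.table())` stands in for `melt()`, and a small random matrix stands in for the geneData dataset. Both substitutions are assumptions made to keep the example self-contained.

```r
# Hypothetical expression matrix: 10 samples by 4 genes
set.seed(1)
expr <- matrix(rnorm(40), nrow = 10,
               dimnames = list(NULL, c("g1", "g2", "g3", "g4")))
cor_mat <- cor(expr)

# Flatten the matrix into one row per (row name, column name) pair
adj <- as.data.frame(as.table(cor_mat),
                     responseName = "correlation",
                     stringsAsFactors = FALSE)
names(adj)[1:2] <- c("from", "to")

# Keep each unordered pair once and drop self-correlations
adj <- adj[adj$from < adj$to, ]
adj
```

Four genes give six unordered pairs, so `adj` ends up with six rows.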

Intersect in R

A colleague asked me this question: “I’m trying to find a way to find genes that overlap between three datasets. I have used intersect for two dataframes but can’t seem to find a solution for three dataframes on google. Do you know any snazzy way of doing that?” I thought using the venn() function from…
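
Separately from the venn() approach, a plain base R option is to fold `intersect()` over a list with `Reduce()`. The gene vectors below are hypothetical; with data frames, the relevant identifier column would be extracted from each one first.

```r
# Hypothetical gene identifiers from three datasets
genes_a <- c("TP53", "BRCA1", "EGFR", "MYC")
genes_b <- c("BRCA1", "MYC", "KRAS", "TP53")
genes_c <- c("MYC", "TP53", "PTEN")

# Reduce() applies intersect() pairwise: (a and b), then that result and c
common <- Reduce(intersect, list(genes_a, genes_b, genes_c))
common
```

Here `common` contains TP53 and MYC, the only genes present in all three vectors. The same call works for any number of sets in the list.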

Basic Shiny app to fetch variant information

I created a basic Shiny app that uses the myvariant package to fetch variant information from MyVariant.info. The variants need to be represented in the format recommended by the Human Genome Variation Society. Once you have your variant of interest in the correct format, just hit “Get variant info!” and the annotations will appear on…
