Reading a list of files into a single R data frame

I had been using map_dfr from the purrr package to load multiple files into one single data frame. But this function has been superseded with the following explanation: The functions were superseded in purrr 1.0.0 because their names suggest they work like _lgl(), _int(), etc which require length 1 outputs, but actually they return results…
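As a minimal sketch of one replacement, assuming purrr 1.0.0 or later and R 4.1 or later (for the native pipe), the files can be read with map() and then combined with list_rbind(). The temporary directory, the file names, and the "source" column are illustrative assumptions, not taken from the post.

```r
library(purrr)

# Write two small CSV files to a temporary directory (illustrative data only).
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(tmp_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(tmp_dir, "b.csv"), row.names = FALSE)

files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each element by its file name so that list_rbind() can record
# which file every row came from in the (hypothetical) "source" column.
combined <- files |>
  set_names(basename) |>
  map(read.csv) |>
  list_rbind(names_to = "source")
```

Splitting the work into map() and list_rbind() keeps the reading step and the combining step separate, which is the distinction the superseded name obscured.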


Split single column of key-value pairs into multiple columns

Two widely used file formats in bioinformatics, VCF and GTF, have single columns that are packed with annotation information. This makes them a bit inconvenient to work with in R when using data frames because the values need to be unpacked, i.e. split. In addition, this violates one of the conditions for tidy data, which…
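One way to unpack such a column, sketched here under the assumption that tidyr 1.3.0 or later is available, is to split the packed string into one row per pair, split each pair into a key and a value, and then pivot the keys into columns. The toy data frame below only imitates a VCF-style INFO field; the column names id and info are made up for illustration.

```r
library(tidyr)

# Toy data imitating a VCF-style INFO column (not real VCF records).
df <- data.frame(
  id   = c("v1", "v2"),
  info = c("DP=10;AF=0.5", "DP=20;AF=0.1")
)

wide <- df |>
  # One row per key=value pair.
  separate_longer_delim(info, delim = ";") |>
  # Split each pair into separate key and value columns.
  separate_wider_delim(info, delim = "=", names = c("key", "value")) |>
  # Turn each key into its own column.
  pivot_wider(names_from = key, values_from = value)
```

The unpacked values are character strings, so numeric fields still need converting afterwards. Entries that are flags without a value would need separate handling, because this sketch assumes every entry has exactly one "=".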


Map, join, and pivot in R

In this post, I will describe a series of data processing steps in R that I often perform, which involve the map_df, inner_join, and pivot_longer functions from the purrr, dplyr, and tidyr packages, respectively. They are all part of the tidyverse, so to follow this post, please install the tidyverse package.

    install.packages("tidyverse")
    library(tidyverse)

The typical…
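The following is only a rough sketch of how the three functions can be chained, not the workflow from the post itself. The per-sample tables, the annotation table, and all column names are hypothetical.

```r
library(purrr)
library(dplyr)
library(tidyr)

# Hypothetical per-sample tables of counts.
samples <- list(
  s1 = data.frame(gene = c("g1", "g2"), ctrl = c(1, 2), treat = c(3, 4)),
  s2 = data.frame(gene = c("g1", "g2"), ctrl = c(5, 6), treat = c(7, 8))
)

# Hypothetical annotation table to join against.
annot <- data.frame(gene = c("g1", "g2"), symbol = c("ABC", "XYZ"))

long <- samples |>
  # Row-bind the list into one data frame, keeping the list names in "sample".
  map_df(identity, .id = "sample") |>
  # Keep only the genes present in both tables and add their symbols.
  inner_join(annot, by = "gene") |>
  # Move the two condition columns into name/value pairs.
  pivot_longer(cols = c(ctrl, treat),
               names_to = "condition", values_to = "count")
```

The result has one row per sample, gene, and condition, which is the shape that most tidyverse plotting and summarising functions expect.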
