When I searched for “merging multiple data frames”, I got this hit as the top result. The post was a bit too terse for me, so here’s my expansion of merging multiple data frames in R.
First let’s create multiple data frames that we wish to merge together:
set.seed(31) a <- data.frame(one=sample(1:10), two=runif(10,0,1)) set.seed(21) b <- data.frame(one=sample(1:10), two=runif(10,0,1)) set.seed(11) c <- data.frame(one=sample(1:10), two=runif(10,0,1)) head(a) # one two #1 6 0.32825976 #2 9 0.83023225 #3 4 0.65186753 #4 3 0.01410936 #5 10 0.82100377 #6 7 0.11513744 head(b) # one two #1 8 0.66754012 #2 3 0.93521022 #3 6 0.05818433 #4 2 0.61861583 #5 10 0.17491846 #6 5 0.03767539 head(c) # one two #1 3 0.1751129 #2 1 0.4407503 #3 5 0.9071830 #4 9 0.8510419 #5 7 0.7339875 #6 8 0.5736857
We now have three data frames: a, b, and c. I want to merge them based on the first column, which is not in the same order in each data frame.
#if a, b, and c were in the same order #things would be much easier cbind(a,b$two,c$two) # one two b$two c$two #1 6 0.32825976 0.66754012 0.1751129 #2 9 0.83023225 0.93521022 0.4407503 #3 4 0.65186753 0.05818433 0.9071830 #4 3 0.01410936 0.61861583 0.8510419 #5 10 0.82100377 0.17491846 0.7339875 #6 7 0.11513744 0.03767539 0.5736857 #7 8 0.10124820 0.52531317 0.4817655 #8 1 0.47392895 0.28218425 0.3306110 #9 2 0.23244771 0.49904520 0.1576602 #10 5 0.10766965 0.63382510 0.4801341 #if I wanted to merge a and b #I could use the merge() function merge(a,b, by='one') # one two.x two.y #1 1 0.47392895 0.52531317 #2 2 0.23244771 0.61861583 #3 3 0.01410936 0.93521022 #4 4 0.65186753 0.28218425 #5 5 0.10766965 0.03767539 #6 6 0.32825976 0.05818433 #7 7 0.11513744 0.49904520 #8 8 0.10124820 0.66754012 #9 9 0.83023225 0.63382510 #10 10 0.82100377 0.17491846 #using the reshape and reshape2 package library(reshape) library(reshape2) list_df <- list(a,b,c) merged_df <- merge_all(list_df) head(merged_df) # one two #8 1 0.4739290 #17 1 0.5253132 #22 1 0.4407503 #9 2 0.2324477 #14 2 0.6186158 #29 2 0.1576602 #merging done in the order of a, b, c #check below results with head(merged_df) a[a$one==1,] # one two #8 1 0.473929 b[b$one==1,] # one two #7 1 0.5253132 c[ c$one == 1, ] # one two #2 1 0.4407503
Now what if I wanted the merged dataset in a data frame like object?
blah <- data.frame(one=merged_df$one, class=rep(c('a','b','c'), 10), two=merged_df$two) head(blah) # one class two #1 1 a 0.4739290 #2 1 b 0.5253132 #3 1 c 0.4407503 #4 2 a 0.2324477 #5 2 b 0.6186158 #6 2 c 0.1576602 abc <- dcast(blah, one ~ class) #Using two as value column: use value.var to override. abc # one a b c #1 1 0.47392895 0.52531317 0.4407503 #2 2 0.23244771 0.61861583 0.1576602 #3 3 0.01410936 0.93521022 0.1751129 #4 4 0.65186753 0.28218425 0.3306110 #5 5 0.10766965 0.03767539 0.9071830 #6 6 0.32825976 0.05818433 0.4817655 #7 7 0.11513744 0.49904520 0.7339875 #8 8 0.10124820 0.66754012 0.5736857 #9 9 0.83023225 0.63382510 0.8510419 #10 10 0.82100377 0.17491846 0.4801341 #visualise http://davetang.org/muse/2012/04/09/creating-a-matrix-of-scatter-plots-in-r/ pairs(~a+b+c, abc)
Paired scatter plots of a, b, and c using the merged data frame.
See also

This work is licensed under a Creative Commons
Attribution 4.0 International License.