Using SQL on R data.frames

In R, I typically use data.frames to hold all my data. I've wondered when I should just use a matrix instead of a data.frame and this was nicely answered. It is also quite easy to perform operations on data.frames to obtain a subset of the data.

However if I had two data frames and wanted to connect them based on a common column, nothing seems more appropriate than a SQL join clause (especially if you've worked with relational databases before). Then I found this package in R that lets you perform SQL queries on data.frames (for example, joins!).

Installing:

install.packages("sqldf")
library("sqldf")

Testing:

library(sqldf)
sqldf("select * from iris limit 5")
Loading required package: tcltk
Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class = "tclObj") :
  [tcl] unknown math function "min".

After reading the troubleshooting page, I found out there's some problem with the tcltk library on the server. I don't have root access to the server running CentOS, so I just went with using the slower R engine.

#make sure your gsubfn version is gsubfn 0.6-4 or later
packageVersion("gsubfn")
[1] '0.6.5'
options(gsubfn.engine = "R")
sqldf("select * from iris limit 5")
  Sepal_Length Sepal_Width Petal_Length Petal_Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa

On the contrary, I installed sqldf on my Windows 7 laptop running R 2.15.2 and it worked without any errors.

Getting table information

sqldf("pragma table_info(iris)")
  cid         name type notnull dflt_value pk
1   0 Sepal_Length REAL       0       <NA>  0
2   1  Sepal_Width REAL       0       <NA>  0
3   2 Petal_Length REAL       0       <NA>  0
4   3  Petal_Width REAL       0       <NA>  0
5   4      Species TEXT       0       <NA>  0

More examples

For more examples, see the documentation.

Print Friendly, PDF & Email



Creative Commons License
This work is licensed under a Creative Commons
Attribution 4.0 International License
.
Posted in RTagged
One comment Add yours

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.