Double square brackets in R

This deserved its own post because I had some difficulty understanding the double square brackets in R. Searching for "double square brackets in R" leads to this tutorial, which shows that the double square brackets, i.e. [[]], can be used to extract a column of a data frame directly as a vector:

head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
#vector of sepal lengths using the column name
iris[['Sepal.Length']]
  [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1 5.7 5.1 5.4 5.1
 [23] 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0
 [45] 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7
 [67] 5.6 5.8 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3
 [89] 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3 6.7 7.2
[111] 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1 6.4 7.2 7.4 7.9
[133] 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3 6.5 6.2 5.9
#vector of sepal lengths using the column index
iris[[1]]
  [1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1 5.7 5.1 5.4 5.1
 [23] 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0
 [45] 5.1 4.8 5.1 4.6 5.3 5.0 7.0 6.4 6.9 5.5 6.5 5.7 6.3 4.9 6.6 5.2 5.0 5.9 6.0 6.1 5.6 6.7
 [67] 5.6 5.8 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.8 6.7 6.0 5.7 5.5 5.5 5.8 6.0 5.4 6.0 6.7 6.3
 [89] 5.6 5.5 5.5 6.1 5.8 5.0 5.6 5.7 5.7 6.2 5.1 5.7 6.3 5.8 7.1 6.3 6.5 7.6 4.9 7.3 6.7 7.2
[111] 6.5 6.4 6.8 5.7 5.8 6.4 6.5 7.7 7.7 6.0 6.9 5.6 7.7 6.3 6.7 7.2 6.2 6.1 6.4 7.2 7.4 7.9
[133] 6.4 6.3 6.1 7.7 6.3 6.4 6.0 6.9 6.7 6.9 5.8 6.8 6.7 6.7 6.3 6.5 6.2 5.9
#the double square brackets can be followed by single square brackets
#to pick one element of the extracted vector (here the second sepal length)
iris[[1]][2]
[1] 4.9
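
For comparison, here is a small sketch of how the same column looks under the three common extraction forms. It uses only the built-in iris data; the variable names (as_vector, as_dollar, as_df, col) are made up for illustration.

```r
# [[ ]] and $ both return the column itself (a numeric vector)
as_vector <- iris[["Sepal.Length"]]
as_dollar <- iris$Sepal.Length
identical(as_vector, as_dollar)   # TRUE

# single square brackets keep the data frame wrapper around the column
as_df <- iris["Sepal.Length"]
class(as_vector)                  # "numeric"
class(as_df)                      # "data.frame"
dim(as_df)                        # 150 1

# [[ ]] accepts a column name stored in a variable, which $ does not
col <- "Petal.Width"
mean(iris[[col]])
```

The last line is the practical reason to prefer [[ ]] over $ inside functions and loops, where the column name is usually held in a variable.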

More technically, the R Language Definition states that for lists one generally uses [[ to select any single element, whereas [ returns a list of the selected elements. The [[ form allows only a single element to be selected, using integer or character indices.
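
The distinction can be sketched with a small list. Here x is a made-up example, not taken from the manual.

```r
x <- list(a = 1:3, b = "text", c = TRUE)

# [ returns a list containing the selected elements
class(x["a"])            # "list"
length(x[c("a", "b")])   # 2, because [ can select several elements at once

# [[ returns the element itself, without the list wrapper
class(x[["a"]])          # "integer"
x[["a"]][2]              # 2
```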

Take for example this list created following this tutorial:

#for upper, the first twenty letters are removed
#and thus we're left with the last six
my_list <- list(lower=letters[1:4], upper=letters[-1:-20])
my_list
$lower
[1] "a" "b" "c" "d"

$upper
[1] "u" "v" "w" "x" "y" "z"
#using [[]] we can access the lower characters of the alphabet
my_list[['lower']]
[1] "a" "b" "c" "d"
my_list[['upper']]
[1] "u" "v" "w" "x" "y" "z"

I can see how [[]] is quite useful when we need to reference elements of a list that are themselves vectors or lists.

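I would also like to point out that [[ can be used as a function by wrapping it in backticks. This is handy when you have a list within a list, because the function form can be passed to lapply() to extract the same element from every inner list.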

my_list <- list(one = list(first = "Bob", last = "Smith"),
                two = list(first = "Jane", last = "Turner"))

# retrieve entry "one"
`[[`(my_list, "one")
$first
[1] "Bob"

$last
[1] "Smith"

# retrieve "last" from all entries
lapply(my_list, `[[`, "last")
$one
[1] "Smith"

$two
[1] "Turner"
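
When each entry holds a single string, the same idea can return a plain character vector instead of a list. The sketch below redefines my_list as above and swaps lapply() for vapply(); last_names is a name chosen for illustration. It also shows that a nested value can be reached by chaining [[]].

```r
my_list <- list(one = list(first = "Bob", last = "Smith"),
                two = list(first = "Jane", last = "Turner"))

# chain [[ ]] to reach a value inside an inner list
my_list[["two"]][["first"]]   # "Jane"

# vapply() with `[[` returns a named character vector rather than a list;
# character(1) declares that each result must be a single string
last_names <- vapply(my_list, `[[`, character(1), "last")
last_names
```

vapply() stops with an error if any entry does not yield exactly one string, which makes it a stricter choice than lapply() when the shape of the list is known in advance.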

Sometimes the result returned from an analysis package is a list, which I want to convert to a data frame (say, for merging). Below is some code to obtain gene symbols for probe ids (as per my post on Using the Bioconductor annotation packages) and to convert the resulting list into a data frame:

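#install if necessary
#(BiocManager replaces the retired biocLite.R script)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("illuminaMousev1p1.db")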
library("illuminaMousev1p1.db")
 
gs <- illuminaMousev1p1SYMBOL
gs_probe <- mappedkeys(gs)
 
head(gs_probe)
[1] "ILMN_1212602" "ILMN_1212605" "ILMN_1212607" "ILMN_1212610" "ILMN_1212612" "ILMN_1212614"
 
gs_probe_lookup <- as.list(gs[gs_probe])
head(gs_probe_lookup)
$ILMN_1212602
[1] "Best1"

$ILMN_1212605
[1] "1500011K16Rik"

$ILMN_1212607
[1] "Cradd"

$ILMN_1212610
[1] "Zfp626"

$ILMN_1212612
[1] "Rcan2"

$ILMN_1212614
[1] "Med12l"

#refer to the names of the list
head(names(gs_probe_lookup))
[1] "ILMN_1212602" "ILMN_1212605" "ILMN_1212607" "ILMN_1212610" "ILMN_1212612" "ILMN_1212614"

#refer to only the symbols
#use unlist but this keeps the names
head(unlist(gs_probe_lookup))
   ILMN_1212602    ILMN_1212605    ILMN_1212607    ILMN_1212610    ILMN_1212612    ILMN_1212614 
        "Best1" "1500011K16Rik"         "Cradd"        "Zfp626"         "Rcan2"        "Med12l"
#so use the parameter use.names=FALSE to remove them
head(unlist(gs_probe_lookup, use.names=FALSE))
[1] "Best1"         "1500011K16Rik" "Cradd"         "Zfp626"        "Rcan2"         "Med12l"

#voila
probe_to_symbol <- data.frame(probe=names(gs_probe_lookup), symbol=unlist(gs_probe_lookup, use.names=FALSE))
head(probe_to_symbol)
         probe        symbol
1 ILMN_1212602         Best1
2 ILMN_1212605 1500011K16Rik
3 ILMN_1212607         Cradd
4 ILMN_1212610        Zfp626
5 ILMN_1212612         Rcan2
6 ILMN_1212614        Med12l
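
The same conversion can be tried without installing the annotation package. In the sketch below, lookup is a small made-up list standing in for gs_probe_lookup (the probe ids and symbols are copied from the output above), and flat and first_items are names chosen for illustration. It also shows that `[[` can pull out the symbols when applied to each element in turn, under the assumption that every probe maps to exactly one symbol.

```r
# hypothetical stand-in for gs_probe_lookup
lookup <- list(ILMN_1212602 = "Best1",
               ILMN_1212605 = "1500011K16Rik",
               ILMN_1212607 = "Cradd")

# flatten with unlist(), dropping the names
flat <- unlist(lookup, use.names = FALSE)

# equivalent here: take the first item of every element with `[[`
first_items <- vapply(lookup, `[[`, character(1), 1, USE.NAMES = FALSE)
identical(flat, first_items)   # TRUE

# unlist() would return more values than names if an element held
# several symbols, so check the lengths before building the data frame
stopifnot(length(flat) == length(lookup))
probe_to_symbol <- data.frame(probe = names(lookup), symbol = flat)
probe_to_symbol
```

The length check matters because data.frame() refuses columns of different lengths, and a probe with several symbols would otherwise only surface as an error at that point.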

Conclusions

The double square brackets in R can be used to reference data frame columns, as shown with the iris dataset. An additional set of square brackets can be used in conjunction with the [[]] to reference a specific element in that vector of elements.

The [[]] form is most useful for lists structured like my_list in the examples above, where each element is itself a vector or a list. To flatten a list structure use the unlist() function, which can be useful when you want to convert a list into a data frame.

So how did a post on double square brackets in R turn into a post about lists? Initially, I thought I could use the double square brackets to refer to all the elements of a list at once (such as the results from the annotation package). However, [[ selects only a single element at a time, so I was unable to, and the simplest solution seems to be flattening the list using unlist() with the use.names=FALSE parameter.

See more

The R Language Definition manual.

This work is licensed under a Creative Commons Attribution 4.0 International License.

4 comments
  1. 7 years later, this blog post continues to help anxious learners.

    Also, I liked the way you began the blog.
    That it’s okay to face troubles with concepts that may seem so ordinary.

    Really appreciate it!

    1. Thanks for the comment! I’m glad the post is still useful 7 years down the road. And yes, it’s definitely OK to not understand something that may seem trivial!

  2. Is there any way to get double square bracket semantics when creating a list?
    This:
    1) L <- list("a"=1)
    Is the same as these two statements:
    2) L<-list()
    L["a"] <- 1
    Is there an assignment operator for form 1) that gives the equivalent behavior for the double bracket as-is assignment:
    L[["df"]] <- I(some_complex_obj_like_a_dataframe)
    ?
    Using "=" does not work and results in an embedding instead of the desired simple reference.

    1. If you want to create a list of data frames, you can do something like this:

      my_list <- list(df = chickwts, df2 = cars)

      I'm guessing that's not what you want but I don't quite understand what you meant by results in an embedding rather than a simple reference. Did you want to store a specific column of the data frame?

      my_list <- list(feed = chickwts$feed, df2 = cars)
