Processing rows of a data frame in R

Once you've read a tab-delimited file into a data.frame, here's one way of operating on its rows:

#read in file
data <- read.table("test.tsv",header=TRUE,row.names=1)
#print out row number 1
data[1,]
#      A B C D E  F
#row_1 2 4 3 9 9 13
#calculate the variance of row 1 in the data.frame
var(as.vector(as.matrix(data[1,])))
#[1] 18.66667
#Just to test the results
#make some test variable corresponding to the values in row 1
test <- c(2,4,3,9,9,13)
#calculate the variance
var(test)
#[1] 18.66667
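The file test.tsv itself is not included with this post, so here is a minimal, self-contained sketch that writes a comparable file to a temporary location and reads it back. The layout (a header line whose first column is called "id") is an assumption made for the example, as are the names tsv and df; only the values in row_1 come from the output above. Passing sep="\t" is optional here, but it makes read.table() split on tabs only rather than on any whitespace.

```r
# Write a small tab-delimited file to a temporary location
# (assumed layout; only the row_1 values come from the post)
tsv <- tempfile(fileext = ".tsv")
writeLines(c("id\tA\tB\tC\tD\tE\tF",
             "row_1\t2\t4\t3\t9\t9\t13"), tsv)

# sep="\t" splits on tabs only; row.names=1 uses the first column as row names
df <- read.table(tsv, header = TRUE, sep = "\t", row.names = 1)
df

# check that every column was read in as a number
str(df)

# remove the temporary file again
unlink(tsv)
```

If a column shows up as character in the str() output, the variance calculations further down will not work on it.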

I'm still wondering why I need two conversion steps (e.g. var(as.vector(as.matrix(data[1,])))), since var(as.vector(data[1,])) doesn't work. As I learn more about data.frames and R in general I hope to come back to this; if an expert comes across this post, I'd be grateful for an explanation. Thanks!
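A likely explanation, sketched on a one-row data frame rebuilt from the values printed above (the names df, row1 and v are illustrative): a single row extracted with [1,] is still a data.frame, and a data.frame is a list of columns. as.vector() on it therefore still yields a list rather than a numeric vector, which var() cannot use. as.matrix() first turns the row into a 1 x 6 numeric matrix, which as.vector() can then flatten. unlist() appears to do the same job in one step. A single column, in contrast, already comes back as a plain vector.

```r
# Rebuild the first row shown above as a one-row data frame (illustrative)
df <- data.frame(A = 2, B = 4, C = 3, D = 9, E = 9, F = 13)
rownames(df) <- "row_1"

row1 <- df[1, ]
is.data.frame(row1)   # a single row is still a data frame
is.list(row1)         # ... and a data frame is a list of columns

# matrix first, then a plain numeric vector
is.numeric(as.vector(as.matrix(row1)))

# unlist() flattens the list of columns in a single step
v <- unlist(row1)
var(v)

# a single column is already a plain vector, so no conversion is needed
is.numeric(df[, 1])
```

This would also explain why var(data[,1]) works directly on a column while the same call on a row does not.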

Calculating the variance for each row and storing the variance as an additional column

#print the variance of each row
for (i in seq_len(nrow(data))){
   print(var(as.vector(as.matrix(data[i,]))))
}
#to add the variance as an additional column in the data frame
#apply() hands each row to the function as a vector; 1:6 selects the six value columns
data$variance <- apply(data,1,function(row) var(as.vector(row[1:6])))
#and to delete the variance column
#data <- subset(data,select=-c(variance))
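As a variation on the above, here is a small sketch that avoids hard-coding the column range 1:6 by remembering the names of the value columns before the variance column is added. The data frame is illustrative: row_1 uses the values from this post, row_2 is made up for the example, and the names df, value_cols and row_var are not from the original code.

```r
# Illustrative data: row_1 is from the post, row_2 is made up for the example
df <- data.frame(A = c(2, 1), B = c(4, 1), C = c(3, 1),
                 D = c(9, 1), E = c(9, 1), F = c(13, 1))
rownames(df) <- c("row_1", "row_2")

# remember which columns hold the values before adding anything
value_cols <- names(df)

# apply() over the rows (MARGIN = 1) of the value columns only
row_var <- apply(df[, value_cols], 1, var)
df$variance <- row_var
df

# re-running is safe: value_cols does not include the new "variance" column
df$variance <- apply(df[, value_cols], 1, var)

# drop the column again by assigning NULL
df$variance <- NULL
```

Assigning NULL is a common alternative to subset(..., select=-c(variance)) for removing a single column.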



This work is licensed under a Creative Commons Attribution 4.0 International License.
Comments
  1. read.table() returns a data.frame object, but the following code works…

    > a <- 1:4
    > b <- 5:8
    > data <- data.frame(x=a, y=b)
    > row.names(data) <- c("a","b","c","d")
    > data
      x y
    a 1 5
    b 2 6
    c 3 7
    d 4 8
    > var(as.vector(data[,1]))
    [1] 1.666667
    > var(data[,1])
    [1] 1.666667

      1. > x
          a b
        1 1 5
        2 2 6
        3 3 7
        4 4 8
        > write.table(x, file="test.tab", quote=F, row.names=F)
        > y = read.table(file="test.tab", head=T)
        > y
          a b
        1 1 5
        2 2 6
        3 3 7
        4 4 8
        > x
          a b
        1 1 5
        2 2 6
        3 3 7
        4 4 8

        > var(as.vector(y[,1]))
        [1] 1.666667
        > var(x[,1])
        [1] 1.666667
