A quick and short post on parallel distance calculation in R using the mclapply() function from the parallel package. I'll use data from the Biobase and datamicroarray packages to illustrate.
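As a rough idea of what this looks like, here is a minimal sketch of a row-wise parallel Euclidean distance calculation with mclapply(). The simulated matrix `x`, the helper `row_dist()`, and the choice of two cores are my own assumptions for illustration; they stand in for the Biobase and datamicroarray data used in the post.

```r
library(parallel)

set.seed(1)
# Simulated stand-in for an expression matrix: 50 samples (rows) x 200 features (columns)
x <- matrix(rnorm(50 * 200), nrow = 50)

# mclapply() relies on forking, which is not available on Windows,
# so fall back to a single core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical helper: Euclidean distance from row i to every row of x
row_dist <- function(i) {
  sqrt(colSums((t(x) - x[i, ])^2))
}

# One task per row, spread across the available cores
d_list <- mclapply(seq_len(nrow(x)), row_dist, mc.cores = n_cores)
d_par <- do.call(rbind, d_list)

# Compare against the serial dist() result
all.equal(d_par, as.matrix(dist(x)), check.attributes = FALSE)
```

For a matrix this small the forking overhead will probably outweigh any gain; the approach only starts to pay off with many more rows or a more expensive distance function.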
One of the projects I have been involved with is SeqNextGen, where I'm analysing exomes of patients who have a suspected rare genetic disorder. It's a change from what I was previously researching during my PhD; instead of working on an RNA level, I've reverse transcribed and I'm now examining DNA sequence and analysing genetic variants. There was a lot to learn to get started and I have written posts on "Getting started with analysing DNA sequencing data" and "Getting acquainted with analysing DNA sequencing data". I guess this is part three of the series where I'm "Getting serious with analysing DNA sequencing data".
I saw this question on Quora:
A teacher assigns each of her 18 students a different integer from 1 through 18. The teacher forms pairs of study partners by using the rule that the sum of the pair of numbers is a perfect square. Assuming the 9 pairs of students follow this rule, the student assigned which number must be paired with the student assigned the number 1?
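One way to check the answer is a brute-force search over every valid set of pairs. The sketch below is just one possible approach, and the function names `is_square()` and `find_pairings()` are my own: it always pairs the smallest unpaired number with each partner that gives a perfect square and backtracks when it gets stuck.

```r
# TRUE where n is a perfect square (vectorised)
is_square <- function(n) {
  r <- round(sqrt(n))
  r * r == n
}

# Pair the smallest remaining number with each valid partner, then recurse
# on what is left; returns a list of complete pairings
find_pairings <- function(remaining, pairs = list()) {
  if (length(remaining) == 0) return(list(pairs))
  a <- remaining[1]
  rest <- remaining[-1]
  solutions <- list()
  for (b in rest[is_square(a + rest)]) {
    solutions <- c(
      solutions,
      find_pairings(setdiff(rest, b), c(pairs, list(c(a, b))))
    )
  }
  solutions
}

solutions <- find_pairings(1:18)

# Number of complete pairings that satisfy the rule
length(solutions)

# The first pair in each solution contains 1, so its second element is 1's partner
partner_of_1 <- unique(sapply(solutions, function(s) s[[1]][2]))
partner_of_1
```

The search finds a single valid pairing, in which 1 is paired with 15. This agrees with working it out by hand: 18, 17, and 16 can only pair with 7, 8, and 9, which then forces the remaining pairs one after another.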
I recently completed Data Manipulation in R with dplyr and realised that dplyr can be used to aggregate and summarise data the same way that aggregate() does. I wrote a post on using the aggregate() function in R back in 2013 and in this post I'll contrast dplyr with aggregate().
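To give a flavour of the comparison, here is a small sketch that computes the same grouped mean both ways. The built-in mtcars data set is my own choice for illustration, and the dplyr part only runs if the package is installed.

```r
# Base R: mean miles per gallon for each number of cylinders
agg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
agg

# The dplyr equivalent, guarded in case dplyr is not installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  suppressPackageStartupMessages(library(dplyr))
  dp <- mtcars %>%
    group_by(cyl) %>%
    summarise(mpg = mean(mpg))
  print(dp)
  # Both approaches should give the same group means
  print(all.equal(agg$mpg, dp$mpg))
}
```

aggregate() returns a plain data frame, while summarise() returns a tibble, so the two results print a little differently even though the numbers are the same.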
It has been a quiet year of blogging since my 5th anniversary; there have only been 13 posts since. Though as I have mentioned before, I am using GitHub to share tutorials and some of my work. However, I will try to write at least twice a month, especially now that I have decided to learn more about tidyr, dplyr, and ggplot2.
This is my third post on learning R through the BetaBit package, which contains three mini games for learning R. I wrote about the first game, called proton, late last year and the second game, called frequon, a week and a half ago. The third game is called regression and it's much more statistical than the other two. I actually couldn't complete the last task of the game, so if you know how to approach it, please let me know!
Late last year I discovered proton, an educational game in R about processing data frames, via R-bloggers and had a go at it. I thought it was fun and educational; it was also the first time I tried to use the dplyr package. I recently learned that there are two more games by the same developer. This post is on the frequon game.