I’ve been spending more and more time learning R because many of the statistical procedures used in bioinformatics are being made available (often exclusively) via R and Bioconductor. As I keep learning more about R, I’m continually impressed with its capabilities and wonder why I didn’t learn it earlier. Don’t make the same mistake: learn R as soon as possible if you’re serious about data analysis!
For those who come from a biological background (like me) and want to learn R for the analysis and visualisation of high-throughput sequence data, have a look at the material presented at this course; I found it extremely useful. I have also written a short post on doing simple stuff in R. This book, Bioinformatics Data Skills, also has a nice chapter on getting started with R (among other bioinformatics topics).
You can click on the R tag to retrieve most of my posts related to R. I say most because I’m sure I’ve forgotten to add the R tag to a few posts. Lastly, as with the rest of my site, I use this page as a learning tool for myself, so please take everything with a grain of salt (and please let me know where I have erred!).
Books
- R for Data Science
- Advanced R
- R Packages
- Advanced R Solutions
- Mastering Software Development in R
- R Markdown: The Definitive Guide
- R Graphics Cookbook
Must read
There are some must-read articles available at the R Manuals page, such as “An Introduction to R” and “R Data Import/Export”.
Links to R resources
A bunch of useful R commands that I’ve aggregated at my R wiki.
swirl is a software package for the R statistical programming language. Its purpose is to teach users statistics and R simultaneously and interactively.
A course on data analysis and visualisation
A Survival Guide to Data Science with R
Some R packages that I have found useful
#Bioconductor packages
source("http://bioconductor.org/biocLite.R")
biocLite("ctc")
biocLite("edgeR")
biocLite("DESeq")
biocLite("baySeq")
biocLite("GO.db")
biocLite("GOstats")
biocLite("biomaRt")
biocLite("Ringo")
biocLite("ShortRead")
biocLite("org.Hs.eg.db")
biocLite("goseq")
biocLite("Rsamtools")
biocLite("GenomicRanges")
biocLite("IRanges")
#CAGE analysis
biocLite("CAGEr")
#R packages
install.packages("gplots")
install.packages("ggplot2")
install.packages("snow")
install.packages("RSvgDevice")
install.packages("reshape")
#text mining
install.packages("tm")
install.packages("wordcloud")
#Twitter related
install.packages("ROAuth")
install.packages("twitteR")
#analysing sequences
install.packages("seqinr")
#Enhanced data.frame
install.packages("data.table")
#For the Riemann's Zeta function
#http://rss.acs.unt.edu/Rdoc/library/VGAM/html/zeta.html
install.packages("VGAM")
#Nonlinear regression with R
install.packages("nlrwr")
#Analysis of dose-response curves
install.packages("drc")
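The biocLite.R script used above is how Bioconductor packages were installed when this list was put together; Bioconductor releases from 3.8 onwards use the BiocManager package from CRAN instead. Below is a minimal sketch of the same idea for a recent R installation. The helper `missing_packages()`, the `do_install` switch, and the shortened package vectors are my own illustrative assumptions, not part of the original list.

```r
# Hypothetical helper (not from the original list): return the packages in
# `pkgs` that are not yet installed in the current R library.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# A few of the packages listed above, chosen only as examples
bioc_pkgs <- c("edgeR", "GenomicRanges", "IRanges", "Rsamtools")
cran_pkgs <- c("ggplot2", "data.table", "seqinr")

# Assumption: a manual switch so that sourcing this file does not
# install anything unless you ask it to
do_install <- FALSE

if (do_install) {
  # BiocManager itself is installed from CRAN
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # BiocManager::install() handles Bioconductor and CRAN packages alike
  to_install <- missing_packages(c(bioc_pkgs, cran_pkgs))
  if (length(to_install) > 0) {
    BiocManager::install(to_install)
  }
}
```

Checking what is missing first avoids reinstalling packages that are already present; set `do_install` to TRUE once you are happy with the list.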
Great to have stumbled on your blog. Your post on PCA really helped, since I have been racking my brain like no tomorrow. Cheers!
Hi Siewfong,
Glad it helped! A PCA is not the easiest thing to grasp; sometimes I have to look back at the post to remind myself.
Cheers,
Dave
Now the challenge is to understand my results, and explain that. Will try to have fun there…… 🙂
Have a good day and keep up the (generous) good work!
~2 years later, I found this post https://georgemdallas.wordpress.com/2013/10/30/principal-component-analysis-4-dummies-eigenvectors-eigenvalues-and-dimension-reduction/ which was extremely useful in understanding PCA. I thought I’d share it with you.
100 percent agree with you!
R is well suited to those of us with a biological background, like you and me; it helps us avoid taking a roundabout route in both our study and work.
However, it’s really not easy for most of us to learn it well by ourselves; we should keep practising and communicate with each other.
I am lucky to have found your blog. Thank you again.
Call me Jimmy