Creating a flowchart using R

The diagram package makes it easy to create flowcharts in R. In this post I’ll show an example of creating a simple flowchart. The most important part is to understand how the coordinate system works; once you understand that, it’s just a matter of placing your arrows and boxes accordingly to create your flowchart. To…
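As a rough illustration of the coordinate idea, here is a minimal sketch using the diagram package. It assumes the plotting region opened by openplotmat() runs from 0 to 1 on both axes; the box labels and positions are made up for this example and are not taken from the post.

```r
library(diagram)

# Open an empty plotting region; x and y both run from 0 to 1.
openplotmat(main = "Example flowchart")

# Box centres as (x, y) pairs in that coordinate system.
# These positions and labels are hypothetical, chosen for illustration.
pos <- rbind(
  start  = c(0.5, 0.85),
  filter = c(0.5, 0.50),
  report = c(0.5, 0.15)
)

# Draw the arrows first so the boxes are painted over the arrow ends.
straightarrow(from = pos["start", ], to = pos["filter", ])
straightarrow(from = pos["filter", ], to = pos["report", ])

# radx and rady are the half-width and half-height of each box,
# in the same 0-1 units as the centres.
textrect(mid = pos["start", ],  radx = 0.15, rady = 0.05, lab = "Read data")
textrect(mid = pos["filter", ], radx = 0.15, rady = 0.05, lab = "Filter")
textrect(mid = pos["report", ], radx = 0.15, rady = 0.05, lab = "Report")
```

Moving a box is then just a matter of changing its row in pos; the arrows follow because they read from the same matrix.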

Continue Reading

Gene to OMIM phenotype

A couple of weeks ago, I wrote a post on identifying OMIM phenotypes that are associated with a gene of interest. I thought I solved the problem by using one of my favourite R packages (biomaRt) but alas. For example, I could not find any OMIM IDs associated with the TTN gene using biomaRt. In…

Continue Reading

Matrix to adjacency list in R

An adjacency list is simply an unordered list that describes connections between vertices. It’s a commonly used input format for graphs. In this post, I use the melt() function from the reshape2 package to create an adjacency list from a correlation matrix. I use the geneData dataset, which consists of real but anonymised microarray expression…
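The idea can be sketched as follows. This is only an illustration: it uses four columns of the built-in mtcars dataset as a stand-in for geneData, and the column names from, to and correlation, along with the 0.5 cutoff, are arbitrary choices made for this example.

```r
library(reshape2)

# Stand-in data: correlations between four mtcars columns
# (the post itself uses the geneData dataset instead).
m <- cor(mtcars[, c("mpg", "hp", "wt", "qsec")])

# Keep each pair once by blanking the diagonal and the lower triangle.
m[lower.tri(m, diag = TRUE)] <- NA

# melt() turns the matrix into one row per (row name, column name, value);
# na.rm = TRUE drops the blanked cells.
adj <- melt(m, varnames = c("from", "to"),
            value.name = "correlation", na.rm = TRUE)

# Hypothetical cutoff: keep only the stronger correlations as edges.
adj_strong <- adj[abs(adj$correlation) > 0.5, ]

print(adj)
```

Each remaining row is one edge between two vertices, with the correlation as its weight.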

Continue Reading

gnomAD allele frequency of pathogenic ClinVar variants

Updated 2018 June 7th. Just recently, the Genome Aggregation Database (gnomAD) VCF files became available for download: "The long-awaited gnomAD VCF is here – sites + frequencies for 123,136 exomes and 15,496 genomes: https://t.co/8puaTvJ45w" (Daniel MacArthur, @dgmacarthur, February 27, 2017)

Continue Reading

Gene to OMIM Morbid Map

Update 2017 May 10th: I realised that this approach doesn’t work for all genes, unfortunately. For example, the gene TTN (which is an HGNC approved gene symbol) is associated with the OMIM IDs 600334, 603689, 604145, 608807, 611705, and 613765, but biomaRt returns an NA. Please refer to an updated post. I was interested in the number of…

Continue Reading