A couple of weeks ago, I wrote a post on identifying OMIM phenotypes that are associated with a gene of interest. I thought I had solved the problem with one of my favourite R packages (biomaRt), but alas, it does not work for every gene. For example, I could not find any OMIM IDs associated with the TTN gene using biomaRt. In the end, I resorted to querying the OMIM API through a small R package I wrote called romim.
An adjacency list is simply an unordered list that describes connections between vertices. It's a commonly used input format for graphs. In this post, I use the melt() function from the reshape2 package to create an adjacency list from a correlation matrix. As an example, I use the geneData dataset from the Biobase package, which consists of real but anonymised microarray expression data. Finally, I'll show some features of the igraph package.
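As a rough illustration of the idea, here is a minimal sketch of turning a correlation matrix into an adjacency list with melt(). It uses a small random matrix as a stand-in for geneData, so the gene names, sample names, and column names (`from`, `to`, `correlation`) are assumptions made for this example and are not taken from the post.

```r
library(reshape2)

# Toy stand-in for an expression matrix: genes in rows, samples in columns.
# The names and values are made up for illustration.
set.seed(1)
expr <- matrix(rnorm(4 * 10), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("sample", 1:10)))

# Gene-by-gene correlation matrix; cor() correlates columns, hence t().
cor_mat <- cor(t(expr))

# Keep each unordered pair once by blanking the lower triangle and diagonal.
cor_mat[lower.tri(cor_mat, diag = TRUE)] <- NA

# melt() produces one row per (row name, column name, value) combination;
# na.rm = TRUE drops the blanked entries.
adj_list <- melt(cor_mat, varnames = c("from", "to"),
                 value.name = "correlation", na.rm = TRUE)

print(adj_list)
```

A data frame in this shape, with the two vertex columns first, can then be handed to igraph, for example via graph_from_data_frame(), which treats any further columns as edge attributes.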
Just recently, the genome Aggregation Database (gnomAD) VCF files became available for download, as announced on Twitter by Daniel MacArthur (@dgmacarthur) on February 27, 2017.
Update 2017 March 15th: Unfortunately, I realised that this approach doesn't work for all genes. For example, the gene TTN (which is an HGNC approved gene symbol) is associated with the OMIM IDs 600334, 603689, 604145, 608807, 611705, and 613765, but biomaRt returns NA.
I was interested in how many Online Mendelian Inheritance in Man (OMIM) disorders a particular gene, in this case FGFR2, is associated with. Once again, it was biomaRt to the rescue. OMIM is a collection of genes and disorders, and the morbid map refers to the disorders. This post looks up the OMIM morbid IDs for FGFR2.