Update (10 May 2017): Unfortunately, I realised that this approach doesn't work for all genes. For example, TTN (an HGNC-approved gene symbol) is associated with the OMIM IDs 600334, 603689, 604145, 608807, 611705, and 613765, but biomaRt returns an NA for it. Please refer to an updated post.
I wanted to know how many Online Mendelian Inheritance in Man (OMIM) disorders a particular gene, in this case FGFR2, is associated with. Once again it was biomaRt to the rescue. OMIM is a collection of genes and disorders, and the morbid map refers to the disorders. This post shows how to look up the OMIM morbid IDs for FGFR2.
To begin, we need to find and use the appropriate mart and dataset.
# install if necessary
# (biocLite has been replaced by BiocManager on current Bioconductor releases)
# install.packages("BiocManager")
# BiocManager::install("biomaRt")
library(biomaRt)
listMarts()
  biomart              version
1 ENSEMBL_MART_ENSEMBL Ensembl Genes 87
2 ENSEMBL_MART_MOUSE   Mouse strains 87
3 ENSEMBL_MART_SNP     Ensembl Variation 87
4 ENSEMBL_MART_FUNCGEN Ensembl Regulation 87
5 ENSEMBL_MART_VEGA    Vega 67
# use Ensembl genes
ensembl <- useMart('ENSEMBL_MART_ENSEMBL')
# find human dataset
datasets <- listDatasets(ensembl)
datasets[grep('human', datasets$description, ignore.case = TRUE), ]
   dataset               description             version
32 hsapiens_gene_ensembl Human genes (GRCh38.p7) GRCh38.p7
# use the human genes dataset
ensembl <- useMart('ENSEMBL_MART_ENSEMBL', dataset = 'hsapiens_gene_ensembl')
Now we need to find the right attributes.
attributes <- listAttributes(ensembl)
dim(attributes)
[1] 1468 3
head(attributes)
  name                  description              page
1 ensembl_gene_id       Gene ID                  feature_page
2 ensembl_transcript_id Transcript ID            feature_page
3 ensembl_peptide_id    Protein ID               feature_page
4 ensembl_exon_id       Exon ID                  feature_page
5 description           Description              feature_page
6 chromosome_name       Chromosome/scaffold name feature_page
# find the gene symbol attribute
attributes[grep('symbol', attributes$description, ignore.case = TRUE), ]
   name        description page
71 hgnc_symbol HGNC symbol feature_page
# find the OMIM attributes
attributes[grep('MIM', attributes$description, ignore.case = TRUE), ]
   name                 description          page
74 mim_gene_accession   MIM Gene Accession   feature_page
75 mim_gene_description MIM Gene Description feature_page
79 mim_morbid           MIM MORBID           feature_page
# now to perform the query
my_gene <- 'FGFR2'
my_result <- getBM(attributes = c('hgnc_symbol', 'mim_morbid'),
                   filters = 'hgnc_symbol',
                   values = my_gene,
                   mart = ensembl)
my_result
   hgnc_symbol mim_morbid
1  FGFR2       614592
2  FGFR2       613659
3  FGFR2       609579
4  FGFR2       207410
5  FGFR2       149730
6  FGFR2       123790
7  FGFR2       123500
8  FGFR2       123150
9  FGFR2       101600
10 FGFR2       101400
11 FGFR2       101200
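Since the goal is a count, it can be computed from the result rather than read off the printout. The sketch below is a minimal, offline illustration: `count_morbid` is a hypothetical helper (not part of biomaRt), and `example_result` is a hand-built data frame that mirrors the output above instead of a live query. It drops missing and duplicated IDs before counting.

```r
# Hypothetical helper (not part of biomaRt): count the distinct, non-missing
# MIM morbid IDs in a data frame shaped like the getBM() result above.
count_morbid <- function(result) {
  ids <- result$mim_morbid
  length(unique(ids[!is.na(ids)]))
}

# Hand-built stand-in for my_result, using the IDs printed above
example_result <- data.frame(
  hgnc_symbol = "FGFR2",
  mim_morbid = c(614592, 613659, 609579, 207410, 149730, 123790,
                 123500, 123150, 101600, 101400, 101200)
)

count_morbid(example_result)
```

Assuming a gene with no match comes back with NA in the mim_morbid column, this helper would report 0 for it rather than counting the NA as a disorder.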
FGFR2 is associated with 11 OMIM disorders. I also wanted to see the names of these disorders, but I could not find them in any of the biomaRt attributes, so I wrote a small package called romim that can obtain them. To use it, you will need to obtain your own OMIM API key.
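library(romim)
# set your own OMIM API key (the value here is a placeholder)
my_key <- set_key('not_a_real_key')
# fetch the OMIM entry for each morbid ID
my_list_omim <- sapply(as.list(my_result$mim_morbid), get_omim)
# pull the title out of each entry
sapply(my_list_omim, get_title)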
[1] "BENT BONE DYSPLASIA SYNDROME; BBDS"
[2] "GASTRIC CANCER"
[3] "SCAPHOCEPHALY, MAXILLARY RETRUSION, AND MENTAL RETARDATION"
[4] "ANTLEY-BIXLER SYNDROME WITHOUT GENITAL ANOMALIES OR DISORDERED STEROIDOGENESIS; ABS2"
[5] "LACRIMOAURICULODENTODIGITAL SYNDROME; LADD"
[6] "BEARE-STEVENSON CUTIS GYRATA SYNDROME; BSTVS"
[7] "CROUZON SYNDROME"
[8] "JACKSON-WEISS SYNDROME; JWS"
[9] "PFEIFFER SYNDROME"
[10] "SAETHRE-CHOTZEN SYNDROME; SCS"
[11] "APERT SYNDROME"
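To keep each title next to its ID, the two vectors can be combined into a data frame. This is a minimal sketch that assumes the titles come back in the same order as the IDs were passed in; `ids` and `titles` are hand-copied from the first three rows of the output above rather than fetched live, and `omim_table` is just an illustrative name.

```r
# First three IDs and titles copied from the output above (illustration only)
ids <- c(614592, 613659, 609579)
titles <- c("BENT BONE DYSPLASIA SYNDROME; BBDS",
            "GASTRIC CANCER",
            "SCAPHOCEPHALY, MAXILLARY RETRUSION, AND MENTAL RETARDATION")

# One row per disorder; assumes titles[i] belongs to ids[i]
omim_table <- data.frame(mim_morbid = ids, title = titles,
                         stringsAsFactors = FALSE)
omim_table
```

With live results, the same pattern would take `my_result$mim_morbid` for the IDs and the output of `sapply(my_list_omim, get_title)` for the titles.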

This work is licensed under a Creative Commons Attribution 4.0 International License.