Update 2017 March 15th: Unfortunately, I realised that this approach doesn't work for all genes. For example, the gene TTN (which is an HGNC approved gene symbol) is associated with the OMIM morbid IDs 600334, 603689, 604145, 608807, 611705, and 613765, but biomaRt returns NA for it.
I was interested in the number of Online Mendelian Inheritance in Man (OMIM) disorders a particular gene was associated with, which in this case was FGFR2. Once again it was biomaRt to the rescue. OMIM is a collection of genes and disorders, and the morbid map refers to the disorders. This post is on looking up the OMIM morbid IDs for FGFR2.
To begin, we need to find and use the appropriate mart and dataset.
    # install if necessary
    # source("https://bioconductor.org/biocLite.R")
    # biocLite("biomaRt")
    library(biomaRt)

    listMarts()
                   biomart               version
    1 ENSEMBL_MART_ENSEMBL      Ensembl Genes 87
    2   ENSEMBL_MART_MOUSE      Mouse strains 87
    3     ENSEMBL_MART_SNP  Ensembl Variation 87
    4 ENSEMBL_MART_FUNCGEN Ensembl Regulation 87
    5    ENSEMBL_MART_VEGA               Vega 67

    # use Ensembl genes
    ensembl <- useMart('ENSEMBL_MART_ENSEMBL')

    # find human dataset
    listDatasets(ensembl)[grep('human', listDatasets(ensembl)$description, TRUE),]
                     dataset             description   version
    32 hsapiens_gene_ensembl Human genes (GRCh38.p7) GRCh38.p7

    # use the human genes dataset
    ensembl <- useMart('ENSEMBL_MART_ENSEMBL', dataset = 'hsapiens_gene_ensembl')
Now we need to find the right attributes.
    attributes <- listAttributes(ensembl)
    dim(attributes)
    [1] 1468    3

    head(attributes)
                       name              description         page
    1       ensembl_gene_id                  Gene ID feature_page
    2 ensembl_transcript_id            Transcript ID feature_page
    3    ensembl_peptide_id               Protein ID feature_page
    4       ensembl_exon_id                  Exon ID feature_page
    5           description              Description feature_page
    6       chromosome_name Chromosome/scaffold name feature_page

    # find the gene symbol attribute
    attributes[grep('symbol', attributes$description, TRUE),]
              name description         page
    71 hgnc_symbol HGNC symbol feature_page

    # find the OMIM attributes
    attributes[grep('MIM', attributes$description, TRUE),]
                       name          description         page
    74   mim_gene_accession   MIM Gene Accession feature_page
    75 mim_gene_description MIM Gene Description feature_page
    79           mim_morbid           MIM MORBID feature_page

    # now to perform the query
    my_gene <- 'FGFR2'
    my_result <- getBM(attributes=c('hgnc_symbol', 'mim_morbid'),
                       filters = 'hgnc_symbol',
                       values = my_gene,
                       mart = ensembl)

    my_result
       hgnc_symbol mim_morbid
    1        FGFR2     614592
    2        FGFR2     613659
    3        FGFR2     609579
    4        FGFR2     207410
    5        FGFR2     149730
    6        FGFR2     123790
    7        FGFR2     123500
    8        FGFR2     123150
    9        FGFR2     101600
    10       FGFR2     101400
    11       FGFR2     101200
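The number of disorders can be computed from the result rather than counted by eye. The sketch below is a minimal, offline illustration: it rebuilds by hand a data frame with the same two columns as `my_result` (the FGFR2 IDs are the ones printed above, and the single TTN row with `NA` mirrors the case described in the update at the top of this post), then counts the distinct non-missing `mim_morbid` values per gene. The names `example_result` and `count_morbid` are hypothetical and not part of biomaRt.

```r
# Hand-built stand-in for a getBM() result, so no network access is needed.
example_result <- data.frame(
  hgnc_symbol = c(rep("FGFR2", 11), "TTN"),
  mim_morbid  = c(614592, 613659, 609579, 207410, 149730, 123790,
                  123500, 123150, 101600, 101400, 101200, NA),
  stringsAsFactors = FALSE
)

# count_morbid is a hypothetical helper: it returns the number of
# distinct, non-NA morbid IDs for each gene symbol in the result.
count_morbid <- function(result) {
  sapply(split(result$mim_morbid, result$hgnc_symbol),
         function(ids) length(unique(ids[!is.na(ids)])))
}

morbid_counts <- count_morbid(example_result)
morbid_counts
```

A count of zero here only means that the query returned `NA` for that gene; as the TTN case shows, it does not necessarily mean the gene has no OMIM disorders.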
FGFR2 is associated with 11 OMIM disorders. However, I also wanted to see the names of these disorders, and I could not find them in any of the biomaRt attributes. I did write a small package called romim that can obtain them. To use it you will need to obtain your own OMIM API key.
    library(romim)
    my_key <- set_key('not_a_real_key')
    my_list_omim <- sapply(as.list(my_result$mim_morbid), get_omim)
    sapply(my_list_omim, get_title)
     [1] "BENT BONE DYSPLASIA SYNDROME; BBDS"
     [2] "GASTRIC CANCER"
     [3] "SCAPHOCEPHALY, MAXILLARY RETRUSION, AND MENTAL RETARDATION"
     [4] "ANTLEY-BIXLER SYNDROME WITHOUT GENITAL ANOMALIES OR DISORDERED STEROIDOGENESIS; ABS2"
     [5] "LACRIMOAURICULODENTODIGITAL SYNDROME; LADD"
     [6] "BEARE-STEVENSON CUTIS GYRATA SYNDROME; BSTVS"
     [7] "CROUZON SYNDROME"
     [8] "JACKSON-WEISS SYNDROME; JWS"
     [9] "PFEIFFER SYNDROME"
    [10] "SAETHRE-CHOTZEN SYNDROME; SCS"
    [11] "APERT SYNDROME"
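Since the titles come back in the same order as the IDs that were queried, the two can be kept together in one table. The following is a rough sketch using values copied from the output in this post, so that it runs without an API key; the name `omim_table` is just an illustrative choice. In a live session the same columns would presumably come from `my_result$mim_morbid` and `sapply(my_list_omim, get_title)`.

```r
# IDs and titles copied from the output shown in this post, in the same order.
mim_morbid <- c(614592, 613659, 609579, 207410, 149730, 123790,
                123500, 123150, 101600, 101400, 101200)
titles <- c(
  "BENT BONE DYSPLASIA SYNDROME; BBDS",
  "GASTRIC CANCER",
  "SCAPHOCEPHALY, MAXILLARY RETRUSION, AND MENTAL RETARDATION",
  "ANTLEY-BIXLER SYNDROME WITHOUT GENITAL ANOMALIES OR DISORDERED STEROIDOGENESIS; ABS2",
  "LACRIMOAURICULODENTODIGITAL SYNDROME; LADD",
  "BEARE-STEVENSON CUTIS GYRATA SYNDROME; BSTVS",
  "CROUZON SYNDROME",
  "JACKSON-WEISS SYNDROME; JWS",
  "PFEIFFER SYNDROME",
  "SAETHRE-CHOTZEN SYNDROME; SCS",
  "APERT SYNDROME"
)

# One row per disorder; keep the titles as character strings, not factors.
omim_table <- data.frame(mim_morbid = mim_morbid,
                         title = titles,
                         stringsAsFactors = FALSE)

# Look up a disorder name by its OMIM morbid ID.
omim_table$title[omim_table$mim_morbid == 101200]
```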
This work is licensed under a Creative Commons
Attribution 4.0 International License.