For converting Ensembl Gene IDs to gene symbols, {biomaRt} is often recommended, and it is what I typically use. However, I recently needed Ensembl version 112 and could not get {biomaRt} to work with that specific version. Here's what I tried:
- Used listEnsemblArchives() to find the host URL for version 112, which is https://may2024.archive.ensembl.org, and passed it to useMart().
library(biomaRt)

ensembl <- useMart(
  biomart = "ENSEMBL_MART_ENSEMBL",
  dataset = "hsapiens_gene_ensembl",
  host = "https://may2024.archive.ensembl.org"
)
- Tried to convert Ensembl gene IDs to HUGO Gene Nomenclature Committee (HGNC) gene symbols, which failed:
my_genes <- c("ENSG00000118473", "ENSG00000162426")
getBM(
  attributes = c("ensembl_gene_id", "hgnc_symbol", "description"),
  filters = "ensembl_gene_id",
  values = my_genes,
  mart = ensembl
)
Error in .processResults(postRes, mart = mart, hostURLsep = sep, fullXmlQuery = fullXmlQuery, : Query ERROR: caught BioMart::Exception::Database: Error during query execution: Table 'ensembl_mart_112.hsapiens_gene_ensembl__ox_hgnc__dm' doesn't exist
I also tried the suggestion to use useEnsembl(), but I got the same error.
ensembl_112 <- useEnsembl(
  biomart = "genes",
  dataset = "hsapiens_gene_ensembl",
  version = 112
)
getBM(
  attributes = c("ensembl_gene_id", "hgnc_symbol", "description"),
  filters = "ensembl_gene_id",
  values = my_genes,
  mart = ensembl_112
)
Error in .processResults(postRes, mart = mart, hostURLsep = sep, fullXmlQuery = fullXmlQuery, : Query ERROR: caught BioMart::Exception::Database: Error during query execution: Table 'ensembl_mart_112.hsapiens_gene_ensembl__ox_hgnc__dm' doesn't exist
I needed Ensembl 112, and {biomaRt} no longer seemed like an option, so I went to the Ensembl FTP site for version 112. After looking through all the directories, I couldn't find a simple file for building an Ensembl Gene ID to gene symbol lookup. I was about to give up when I read the README, which contained the following (output snipped):
|-- mysql MySQL database per-table text files
| |
| |-- ensembl_mart_<release> BioMart database for genes
I navigated to https://ftp.ensembl.org/pub/release-112/mysql/ensembl_mart_112/, which takes some time to load because the FTP site is slow and there are a lot of files.
After downloading and checking several files, I think I found the file I needed: hsapiens_gene_ensembl__gene__main.txt.gz
wget https://ftp.ensembl.org/pub/release-112/mysql/ensembl_mart_112/hsapiens_gene_ensembl__gene__main.txt.gz
Unfortunately, this file does not have a header, so I'm not sure what all the columns contain, but I could work out that I needed columns 7 (Ensembl Gene ID) and 8 (HGNC gene symbol).
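Cutting those two columns out once gives a small lookup table that is quicker to search than the full dump. A minimal sketch, assuming columns 7 and 8 really do hold the gene ID and the symbol; the function name build_lookup and the output file name are made up for illustration, and gzip -dc stands in for zcat only because it behaves the same on Linux and macOS.

```shell
# build_lookup FILE
# Print unique "gene ID <TAB> gene symbol" pairs from a gzipped mart dump.
# Assumption: column 7 is the Ensembl Gene ID and column 8 is the gene symbol.
build_lookup() {
  gzip -dc "$1" | cut -f7,8 | sort -u
}

# Example usage with the file downloaded by the wget command above:
# build_lookup hsapiens_gene_ensembl__gene__main.txt.gz > ensembl_112_gene_symbols.tsv
```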
I have some Ensembl Gene IDs for which I already know the HGNC gene symbol, so I looked them up in this file as confirmation.
ensembl_gene_id hgnc_symbol
1 ENSG00000118473 SGIP1
2 ENSG00000162426 SLC45A1
zcat hsapiens_gene_ensembl__gene__main.txt.gz | cut -f7,8 | grep ENSG00000118473
ENSG00000118473 SGIP1
zcat hsapiens_gene_ensembl__gene__main.txt.gz | cut -f7,8 | grep ENSG00000162426
ENSG00000162426 SLC45A1
Looks like it's the file I need!
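Grepping one ID at a time re-reads the whole file for every gene. A sketch of looking up a list of IDs in a single pass with awk, assuming a plain text file with one Ensembl Gene ID per line; lookup_symbols and my_genes.txt are hypothetical names, and the column numbers are the ones worked out above.

```shell
# lookup_symbols IDS_FILE MART_FILE
# IDS_FILE:  plain text, one Ensembl Gene ID per line.
# MART_FILE: gzipped gene__main dump.
lookup_symbols() {
  gzip -dc "$2" | awk -F'\t' -v OFS='\t' -v ids="$1" '
    # Load the IDs of interest before reading the dump.
    BEGIN { while ((getline id < ids) > 0) wanted[id] = 1 }
    # Assumption: column 7 is the gene ID and column 8 is the symbol.
    ($7 in wanted) { print $7, $8 }
  '
}

# Example usage:
# printf 'ENSG00000118473\nENSG00000162426\n' > my_genes.txt
# lookup_symbols my_genes.txt hsapiens_gene_ensembl__gene__main.txt.gz
```

IDs that are not in the dump simply produce no output line, so comparing the number of input and output lines is a quick way to spot misses.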
zcat hsapiens_gene_ensembl__gene__main.txt.gz | wc -l
70611
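A line count alone does not say whether every line is a distinct gene or whether every gene has a symbol. A quick check, again assuming columns 7 and 8, and assuming that a missing value shows up either as an empty field or as \N (the usual NULL placeholder in MySQL text dumps); summarise_mart is a hypothetical name.

```shell
# summarise_mart FILE
# Print the number of lines, distinct gene IDs, and lines without a symbol.
summarise_mart() {
  gzip -dc "$1" | awk -F'\t' '
    { total++; if (!($7 in seen)) { seen[$7] = 1; genes++ } }
    # Assumption: a missing symbol is an empty field or the literal \N.
    ($8 == "" || $8 == "\\N") { missing++ }
    END { printf "lines=%d genes=%d no_symbol=%d\n", total, genes, missing }
  '
}

# Example usage:
# summarise_mart hsapiens_gene_ensembl__gene__main.txt.gz
```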
Is the URL consistent across versions, so that I can simply change the version number to download the same file for another release? Yes!
wget https://ftp.ensembl.org/pub/release-113/mysql/ensembl_mart_113/hsapiens_gene_ensembl__gene__main.txt.gz -O hsapiens_gene_ensembl__gene__main_113.txt.gz
zcat hsapiens_gene_ensembl__gene__main_113.txt.gz | cut -f7,8 | grep ENSG00000162426
ENSG00000162426 SLC45A1
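Since only the release number changes, the URL can be built from it. A small sketch; mart_url is a hypothetical helper, and the pattern has only been checked here for releases 112 and 113, so treat any other release as an assumption.

```shell
# mart_url RELEASE
# Print the download URL of the human gene__main dump for an Ensembl release.
mart_url() {
  printf 'https://ftp.ensembl.org/pub/release-%s/mysql/ensembl_mart_%s/hsapiens_gene_ensembl__gene__main.txt.gz\n' "$1" "$1"
}

# Example usage: download release 113 under a release-specific file name.
# wget "$(mart_url 113)" -O hsapiens_gene_ensembl__gene__main_113.txt.gz
```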
I sometimes get a network connection error from BioMart, which breaks my workflows, so I might just download this file to have an offline way to convert Ensembl Gene IDs to gene symbols.
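One way to set that up is to download the file only when a local copy is missing, so later runs never touch the network. A sketch, assuming wget is available; fetch_mart and the cache directory name are hypothetical, and the URL pattern is the one confirmed above for releases 112 and 113.

```shell
# fetch_mart RELEASE [CACHE_DIR]
# Print the path to a local copy of the gene__main dump, downloading it only if absent.
fetch_mart() {
  release=$1
  cache_dir=${2:-ensembl_cache}
  dest="$cache_dir/hsapiens_gene_ensembl__gene__main_${release}.txt.gz"
  url="https://ftp.ensembl.org/pub/release-${release}/mysql/ensembl_mart_${release}/hsapiens_gene_ensembl__gene__main.txt.gz"
  if [ ! -s "$dest" ]; then
    mkdir -p "$cache_dir"
    # Download to a temporary name so an interrupted transfer is not mistaken for a full copy.
    wget -q "$url" -O "$dest.part" && mv "$dest.part" "$dest" || return 1
  fi
  printf '%s\n' "$dest"
}

# Example usage:
# mart_file=$(fetch_mart 112) && gzip -dc "$mart_file" | cut -f7,8 | grep ENSG00000118473
```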

This work is licensed under a Creative Commons Attribution 4.0 International License.