The Pachter Lab have developed some very useful bioinformatics software. In this post, I use gget to quickly query ARCHS4 on the command line to see where a gene of interest is expressed. The gget tool has other functionality too, including sequence alignment, enrichment analysis, and even protein structure prediction using AlphaFold. Check it out!
It's extremely easy to install gget; you can use pip (like I have) or conda.
pip install --upgrade gget
To get the expression of ACE2 across tissues (ignore the strikethrough on ACE2; links are programmatically checked to see if they still exist and GeneCards disallows programmatic access), use gget archs4 with -w tissue.
gget archs4 -w tissue ACE2 | tail
# snipped
# },
# {
# "id": "System.Nervous System.CNS.MIDBRAIN",
# "min": 0.113644,
# "q1": 0.113644,
# "median": 0.113644,
# "q3": 1.81287,
# "max": 2.41968
# }
# ]
The default output is JSON but you can specify CSV.
gget archs4 -w tissue ACE2 --csv | tail
# snipped
# System.Immune System.Thymus.THYMUS,0.113644,0.113644,0.113644,0.113644,2.16272
# System.Integumentary System.Skin.FIBROBLAST,0.113644,0.113644,0.113644,1.20968,4.3519
# System.Integumentary System.Skin.HAIR FOLLICLE,0.113644,0.113644,0.113644,1.20968,3.91029
# System.Muscular System.Skeletal muscle.MYOBLAST,0.113644,0.113644,0.113644,0.113644,1.81287
# System.Nervous System.CNS.CEREBELLUM,0.113644,0.113644,0.113644,0.113644,1.20968
# System.Connective Tissue.Bone.STROMAL CELL,0.113644,0.113644,0.113644,1.20968,4.47643
# System.Nervous System.CNS.NEURON,0.113644,0.113644,0.113644,1.81287,2.92457
# System.Nervous System.CNS.OLIGODENDROCYTE,0.113644,0.113644,0.113644,1.20968,2.41968
# System.Nervous System.CNS.SPINAL CORD,0.113644,0.113644,0.113644,1.20968,2.16272
# System.Nervous System.CNS.MIDBRAIN,0.113644,0.113644,0.113644,1.81287,2.41968
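Because the CSV output has one tissue per line, standard Unix tools can rank it without any extra software. The sketch below is only an illustration: rank_tissues is a hypothetical helper name, and it assumes the columns are id, min, q1, median, q3, max (as in the JSON output above), that tissue ids contain no commas, and that any header row starts with "id".

```shell
# Hypothetical helper: rank tissues by one numeric column of the CSV output.
# Assumed column order: id,min,q1,median,q3,max (so 6 = max, 4 = median).
rank_tissues() {
  col="${1:-6}"
  # Drop a header row if there is one, then sort numerically, highest first.
  awk -F, '$1 != "id"' | LC_ALL=C sort -t, -k "${col},${col}nr"
}
```

For example, to list the three tissues with the highest maximum expression:

```shell
gget archs4 -w tissue ACE2 --csv | rank_tissues 6 | head -n 3
```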
The expression quantification is performed using kallisto, so the values should be Transcripts Per Million.
Since it's easier to look at a graph than a table, I wrote a simple R script to plot the JSON output returned by gget archs4. I initially used base R to plot the results but it was easier to create a nicer graph using ggplot2.
If you do not have R, install it first. Then install these two R packages.
install.packages(c("jsonlite", "ggplot2"))
Download archs4.R, the script I wrote to plot JSON output from gget archs4, and make it executable.
wget https://raw.githubusercontent.com/davetang/learning_r/main/code/archs4.R
chmod 755 archs4.R
You can move archs4.R into a directory included in your PATH or run it as follows.
gget archs4 -w tissue ACE2 | ./archs4.R
A PNG file called expr.png will be created.
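Assuming the script always writes expr.png to the current directory, plotting a second gene would overwrite the first figure. A small wrapper can rename the file after each run. This is a minimal sketch rather than part of archs4.R: plot_genes and the ARCHS4_SCRIPT variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical wrapper: plot several genes and keep one PNG per gene.
# Assumes archs4.R writes expr.png to the current directory.
# ARCHS4_SCRIPT is a made-up variable holding the path to the script.
plot_genes() {
  script="${ARCHS4_SCRIPT:-./archs4.R}"
  for gene in "$@"; do
    # Rename only if the plotting script succeeded and produced a file.
    if gget archs4 -w tissue "$gene" | "$script" && [ -f expr.png ]; then
      mv expr.png "${gene}_expr.png"
    else
      echo "failed to plot ${gene}" >&2
    fi
  done
}
```

Calling plot_genes ACE2 TMPRSS2 should then leave ACE2_expr.png and TMPRSS2_expr.png in the current directory.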
Summary
You can easily generate a gene expression figure from the ARCHS4 website; the graph is nice because, in addition to the expression values, it displays the system, organ, and tissue hierarchy. But the point of gget is to quickly make queries on the command line, which is where I spend most of my time. And since it's easier to look at a graph, I wrote a simple R script that plots the output from gget archs4.
Initially I tried to use just base R to plot the output, but it would have taken much more effort to produce a graph similar to the one produced by ggplot2. I could also output CSV from gget archs4 to remove the jsonlite dependency, but since JSON is the default output and is widely used, I decided to use the package in the R script.
(If you are interested in how data can be piped into an R script, take a look at this demo script.)
Finally, I should note that ARCHS4 data is only available for human and mouse. But those working on non-model organisms probably already expected this.

This work is licensed under a Creative Commons
Attribution 4.0 International License.
Great work!! I’ll add a link to this tutorial to the gget archs4 documentation on the gget website (pachterlab.github.io/gget).
Hi Laura,
thanks for the great tool! And thanks for linking to this page; it’ll also motivate me to improve the plotting script. Hopefully people find this useful.
Cheers,
Dave