Check where a gene is expressed from the command line

The Pachter Lab has developed some very useful bioinformatics software. In this post, I use gget to query ARCHS4 from the command line to see where a gene of interest is expressed. gget has other functionality too, including sequence alignment, enrichment analysis, and even protein structure prediction using AlphaFold. Check it out!

It's extremely easy to install gget; you can use pip (like I have) or conda.

pip install --upgrade gget

To get the expression of ACE2 across tissues, use gget archs4 with -w tissue.

gget archs4 -w tissue ACE2 | tail
# snipped
#     },
#     {
#         "id": "System.Nervous System.CNS.MIDBRAIN",
#         "min": 0.113644,
#         "q1": 0.113644,
#         "median": 0.113644,
#         "q3": 1.81287,
#         "max": 2.41968
#     }
# ]
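Each record carries a dotted id plus a five-number summary. As a rough sketch (not part of gget), the id can be split into its hierarchy levels in Python. The function names split_id and iqr are my own, and I am assuming every id has the form System.<system>.<organ>.<tissue>, which is what the records shown here look like.

```python
import json

# One record copied from the output above; the real output is a JSON array
# containing many records of this shape.
SAMPLE = """
[
    {
        "id": "System.Nervous System.CNS.MIDBRAIN",
        "min": 0.113644,
        "q1": 0.113644,
        "median": 0.113644,
        "q3": 1.81287,
        "max": 2.41968
    }
]
"""


def split_id(tissue_id):
    """Return (system, organ, tissue), dropping the leading 'System' label.

    Assumption: ids look like 'System.<system>.<organ>.<tissue>'.
    maxsplit=3 keeps any further dots inside the tissue name.
    """
    parts = tissue_id.split(".", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected id format: {tissue_id!r}")
    _, system, organ, tissue = parts
    return system, organ, tissue


def iqr(record):
    """Interquartile range (q3 - q1) of one record."""
    return record["q3"] - record["q1"]


records = json.loads(SAMPLE)
for rec in records:
    system, organ, tissue = split_id(rec["id"])
    print(f"{system} > {organ} > {tissue}: median={rec['median']}, IQR={iqr(rec):.6f}")
```

If an id turns out not to have four dot-separated parts, split_id raises an error rather than guessing.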

The default output is JSON but you can specify CSV.

gget archs4 -w tissue ACE2 --csv | tail
# snipped
# System.Immune System.Thymus.THYMUS,0.113644,0.113644,0.113644,0.113644,2.16272
# System.Integumentary System.Skin.FIBROBLAST,0.113644,0.113644,0.113644,1.20968,4.3519
# System.Integumentary System.Skin.HAIR FOLLICLE,0.113644,0.113644,0.113644,1.20968,3.91029
# System.Muscular System.Skeletal muscle.MYOBLAST,0.113644,0.113644,0.113644,0.113644,1.81287
# System.Nervous System.CNS.CEREBELLUM,0.113644,0.113644,0.113644,0.113644,1.20968
# System.Connective Tissue.Bone.STROMAL CELL,0.113644,0.113644,0.113644,1.20968,4.47643
# System.Nervous System.CNS.NEURON,0.113644,0.113644,0.113644,1.81287,2.92457
# System.Nervous System.CNS.OLIGODENDROCYTE,0.113644,0.113644,0.113644,1.20968,2.41968
# System.Nervous System.CNS.SPINAL CORD,0.113644,0.113644,0.113644,1.20968,2.16272
# System.Nervous System.CNS.MIDBRAIN,0.113644,0.113644,0.113644,1.81287,2.41968
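The CSV form is convenient for quick ranking. Below is a minimal sketch that sorts rows by their maximum value using only the Python standard library. The column order (id, min, q1, median, q3, max) is an assumption inferred by matching the MIDBRAIN row against the JSON record, and read_rows and top_by are hypothetical helper names.

```python
import csv
import io

# Assumed column order, inferred from the MIDBRAIN record shown in JSON form.
COLUMNS = ["id", "min", "q1", "median", "q3", "max"]

# Three rows copied from the output above.
SAMPLE = """\
System.Immune System.Thymus.THYMUS,0.113644,0.113644,0.113644,0.113644,2.16272
System.Integumentary System.Skin.FIBROBLAST,0.113644,0.113644,0.113644,1.20968,4.3519
System.Connective Tissue.Bone.STROMAL CELL,0.113644,0.113644,0.113644,1.20968,4.47643
"""


def read_rows(stream):
    """Parse CSV rows into dicts, skipping any row whose values are not numeric
    (for example a header row, if one is present)."""
    rows = []
    for fields in csv.reader(stream):
        if len(fields) != len(COLUMNS):
            continue
        try:
            values = [float(x) for x in fields[1:]]
        except ValueError:
            continue
        rows.append(dict(zip(COLUMNS, [fields[0]] + values)))
    return rows


def top_by(rows, column, n=3):
    """Return the n rows with the largest value in the given column."""
    return sorted(rows, key=lambda r: r[column], reverse=True)[:n]


rows = read_rows(io.StringIO(SAMPLE))
for row in top_by(rows, "max"):
    print(f"{row['max']:8.3f}  {row['id']}")
```

To run this on real output, pass sys.stdin to read_rows instead of the sample and pipe in the result of gget archs4 -w tissue ACE2 --csv.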

The expression quantification is performed using kallisto, so the values should be Transcripts Per Million.

Since it's easier to look at a graph instead of a table, I wrote a simple R script to plot the JSON output returned by gget archs4. I initially used base R to plot the results but it was easier to create a nicer graph using ggplot2.

If you do not have R, install it first. Then install these two R packages.

install.packages(c("jsonlite", "ggplot2"))

Download archs4.R, the script I wrote to plot JSON output from gget archs4, and make it executable.

wget https://raw.githubusercontent.com/davetang/learning_r/main/code/archs4.R
chmod 755 archs4.R

You can move archs4.R into a directory included in your PATH or run it as follows.

gget archs4 -w tissue ACE2 | ./archs4.R

A PNG file called expr.png will be created.

Summary

You can easily generate a gene expression figure from the ARCHS4 website; the graph is nice because in addition to the expression values it displays the system, organ, and tissue hierarchy. But the point of gget is to quickly make queries on the command line, which is where I spend most of my time. And since it's easier to look at a graph, I wrote a simple R script that plots the output from gget archs4.

Initially I tried to use just base R to plot the output but it would take me much more effort to produce a graph similar to the one produced by ggplot2. I could also output CSV from gget archs4 to remove the jsonlite dependency but since JSON is the default output and JSON is widely used, I decided to use the package in the R script.

(If you are interested in how data can be piped into an R script, take a look at this demo script.)
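The same pipe pattern works in other languages too. As an illustration only (this is not the R demo script), here is a small Python sketch that reads JSON records from a text stream. The function name medians_from_stream is my own, and an in-memory stream stands in for standard input so the example runs on its own.

```python
import io
import json


def medians_from_stream(stream):
    """Read a JSON array of records from a text stream and
    return a list of (id, median) pairs in the order received."""
    records = json.load(stream)
    return [(rec["id"], rec["median"]) for rec in records]


# Stand-in for piped input, using values from the record shown earlier.
DEMO = '[{"id": "System.Nervous System.CNS.MIDBRAIN", "median": 0.113644}]'

for tissue_id, median in medians_from_stream(io.StringIO(DEMO)):
    print(tissue_id, median, sep="\t")

# In a real script the stream would be standard input:
#   import sys
#   pairs = medians_from_stream(sys.stdin)
```

Taking the stream as an argument, rather than reading sys.stdin inside the function, keeps the parsing logic easy to try out without a pipe.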

Finally, I should note that this is only available for human and mouse. But those working on non-model organisms probably already expected this.




This work is licensed under a Creative Commons Attribution 4.0 International License.
2 comments
  1. Great work!! I’ll add a link to this tutorial to the gget archs4 documentation on the gget website (pachterlab.github.io/gget).

    1. Hi Laura,

      thanks for the great tool! And thanks for linking to this page; it’ll also motivate me to improve the plotting script. Hopefully people find this useful.

      Cheers,
      Dave
