Downloading molecular signatures from MSigDB in R

The Molecular Signatures Database (MSigDB) is a resource containing various gene sets designed for use in Gene Set Enrichment Analysis (GSEA) and its variants. It was co-developed with GSEA by the Broad Institute, which still maintains it; you can read more in the classic paper: Gene set enrichment analysis: A knowledge-based…


Ensembl Gene IDs to gene symbols

For converting Ensembl Gene IDs to gene symbols, {biomaRt} is often recommended, and indeed it is what I typically use. However, I recently needed to use Ensembl version 112 and could not get {biomaRt} to work with this specific version. Here’s what I tried: Used listEnsemblArchives() to find the host URL for version 112,…


Running a fork bomb

Since it was Halloween and all, I shared with some of my colleagues an article listing scary Linux commands that one should never run. One of them was a fork bomb, which looks like this: :(){ :|:& };: In Bash, a function is defined like so: function_name () { commands } So the fork bomb starts…
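As a harmless illustration of the function definition syntax mentioned above, here is a minimal sketch in Bash. The function name `greet` is hypothetical and only there for demonstration. The renamed fork bomb appears solely inside a comment, with the made-up name `bomb` standing in for `:`, so that nothing dangerous is executed.

```shell
# General form: function_name () { commands; }
# `greet` is a hypothetical, harmless function that prints a greeting.
greet () { echo "Hello, $1"; }

# Call it like any other command.
greet world

# In Bash, ":" is also accepted as a function name. With ":" renamed to
# "bomb" (an illustrative name), the fork bomb would read as shown below.
# It is deliberately left as a comment: do NOT run it.
#   bomb () { bomb | bomb & }; bomb
```

Read this way, the one-liner is just a function that pipes itself into itself in the background, followed by a call to that function.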


An example differential gene expression results table

This post contains the analysis steps used to create a differential gene expression results table generated from RNA-seq counts summarised using nf-core/rnaseq. The comparison was done between two conditions: normal versus (lung) cancer. We will be using {edgeR}, so install it if you haven’t already.

    if (!require("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install("edgeR")

We will also…


An example RNA-seq count table

I have been using pnas_expression.txt as a test dataset for count table analyses for many years. It was created by Davis McCarthy and was hosted on their Google Sites website. After some time, the site became unavailable and I have been hosting it on my web server since then. The RNA-seq libraries were generated using…


14th Anniversary

It has been 14 years since I started this blog and, as per tradition, I am writing a blog post reflecting on stuff. It’s a loose tradition since some years I didn’t write anything. When I do write, it’s usually about something new I’ve learned and/or started practising since the last anniversary post. Recently, I’ve been…


Getting stuff done

There’s this nice tip from a book I read a long time ago. Some of the tips/lessons from that book weren’t useful to me, but the following tip is something that I have found quite effective in getting stuff done. Say you have some task to do and you’re finding it hard to…


The potato paradox

I came across this question last night on some trivia app I have on my phone: When you let fruits consisting of 99% water by weight dry so that they become 98% water, what percentage of weight do they lose? Stop scrolling (or looking) down right now, if you want to give it some thought…
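For readers who want to check their answer, here is a short worked version of the arithmetic. The symbols $W$ (initial weight) and $W'$ (weight after drying) are introduced here for illustration and are not from the original question. The key observation is that the dry matter does not change: it is 1% of the initial weight and 2% of the final weight.

```latex
\[
0.01\,W = 0.02\,W'
\;\Longrightarrow\;
W' = \frac{W}{2},
\qquad
\frac{W - W'}{W} = \frac{1}{2} = 50\%.
\]
```

So under these assumptions the fruits lose half of their weight, even though the water percentage drops by only one point.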


Rate limited by GitHub when using remotes::install_github()

When you use remotes::install_github() too often in a short span of time, GitHub may rate limit you. This means you can’t install the package you wanted without waiting for the rate limit to end. One way around this (and to prevent getting rate limited) is to simply clone the repo and use remotes::install_local()! To…


Check what genes are correlated to your gene of interest

ARCHS4 (All RNA-seq and ChIP-seq sample and signature search) is a resource that provides access to gene and transcript counts uniformly processed (using kallisto) from all human and mouse RNA-seq experiments from the Gene Expression Omnibus (GEO) and the Sequence Read Archive (SRA). The tool gget and its sub-tool archs4 can be used to query…
