TIL about code snippets in RStudio

Last updated: 2023/07/28

I recently learned about code snippets in RStudio (ignore the strikethrough). As the name suggests, they are snippets of code that you can quickly insert into your source code in the source pane of RStudio. There are two ways to add code snippets: Click on the following: Tools -> Global Options ->…


Wrap and unwrap a line in Vim

Last updated: 2023/07/06

Vim is a text editor that is renowned for having a steep learning curve. People like to illustrate its difficulty by pointing out that Vim is so unfriendly to new users that they do not even know how to quit the program! I learned Vim (and Perl) because that was what my…


TIL that you can download SRA data from AWS

The Sequence Read Archive (SRA) is the largest publicly available repository of high-throughput sequencing data. (Fun fact: it used to be called the Short Read Archive since most of the data was from short-read sequencers.) The tool fastq-dump from the SRA Toolkit can be used to download SRA data. A while ago I…


Grepping PowerPoint files

Last updated: 2023/03/07

I’m not really a fan of PowerPoint but it’s ubiquitous in research, so I have to work with it. Sometimes I need to find a slide amongst a pile of PowerPoint files and waste a lot of time opening and closing files. I wondered whether I could grep PowerPoint files and sure…


Stop BLAST from phoning home

Some time back I learned from Devon Ryan on the bird app (no link because I have stopped using said app) that BLAST phones home every time you use it, by default. I was never aware of this until I saw the post and I’m not really a fan of having this turned on by…


Twelfth year: Python

As this blog enters its twelfth year, I am finally using Python instead of Perl as my scripting language of choice (which is contrary to what I said two years ago!). As I wrote in my learning Python repo, my interest in deep learning finally tipped me over since several popular deep learning frameworks (TensorFlow,…


Grepping a list with a list

The grep command-line utility is a commonly used tool for searching plain text files for lines that match a pattern. For example, you could search for a gene or SNP ID in a BED/GFF/GTF file to find out its coordinates. In this post, I will demonstrate how you can search for a list of things in…


SQL group by statement on the command line

The GROUP BY statement allows you to perform operations in a group-wise manner. I first learned of the Useful FILe and stream Operations (filo) repository a long, long time ago and keep coming back to it. The filo toolkit comes with three tools: groupBy, stats, and shuffle. The groupBy tool…


Ten years

As of today, it has been a decade since my first post on this blog. It started aimlessly during my PhD as a place to post analysis notes for myself and ten years later it still remains so. However, over the years I have started to focus more on better project management practices and on…


Reproducible Bioinformatics

I will be giving a workshop titled "Reproducible Bioinformatics" at BioC Asia tomorrow. I have been thinking a lot about this topic and my aim for the workshop is to introduce computational tools and demonstrate how they can be used to help promote reproducibility when performing bioinformatic analyses. Ensuring reproducibility shouldn’t be an extra burden…
