Grepping PowerPoint files

Last updated: 2023/03/07

I’m not really a fan of PowerPoint but it’s ubiquitous in research, so I have to work with it. Sometimes I need to find a slide amongst a pile of PowerPoint files and waste a lot of time opening and closing files. I wondered whether I could grep PowerPoint files and sure…
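One way the idea could be sketched in Python, which is not necessarily the approach the post takes: a .pptx file is a zip archive whose slides are stored as XML under ppt/slides/, so the text runs can be searched without opening PowerPoint. The function name grep_pptx is hypothetical, and the sketch assumes slide text sits in plain <a:t> elements and leaves XML entities (such as &amp;) undecoded.

```python
import re
import zipfile


def grep_pptx(path, pattern):
    """Yield (slide_name, text) for each slide text run matching `pattern`.

    Assumption: slide text lives in <a:t>...</a:t> elements inside
    ppt/slides/slideN.xml; XML entities are not decoded in this sketch.
    """
    regex = re.compile(pattern)
    with zipfile.ZipFile(path) as pptx:
        # Names are sorted lexicographically, so slide10 sorts before slide2.
        for name in sorted(pptx.namelist()):
            if not re.fullmatch(r"ppt/slides/slide\d+\.xml", name):
                continue
            xml = pptx.read(name).decode("utf-8")
            for text in re.findall(r"<a:t>(.*?)</a:t>", xml, flags=re.DOTALL):
                if regex.search(text):
                    yield name, text
```

Calling list(grep_pptx("slides.pptx", "BLAST")) on a hypothetical file slides.pptx would return the matching text runs together with the slide file each one came from.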

Stop BLAST from phoning home

Some time back I learned from Devon Ryan on the bird app (no link because I have stopped using said app) that BLAST phones home every time you use it, by default. I was not aware of this until I saw the post and I’m not really a fan of having this turned on by…

Twelfth year: Python

As this blog enters its twelfth year, I am finally using Python instead of Perl as my scripting language of choice (which is contrary to what I said two years ago!). As I wrote in my learning Python repo, my interest in deep learning finally tipped me over since several popular deep learning frameworks (TensorFlow,…

Grepping a list with a list

The grep command-line utility is a commonly used tool for searching plain text files for lines that match a pattern. For example, you could search a gene or SNP ID in a BED/GFF/GTF file to find out its coordinates. In this post, I will demonstrate how you can search for a list of things in…
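As a rough illustration of searching for a list of things in a list, here is a minimal Python sketch. It is similar in spirit to running grep with the -w, -F, and -f options against a file of IDs, but it matches a whole tab-delimited field rather than a word anywhere on the line. The function name grep_list, the column choice, and the sample BED-like records are all hypothetical and not taken from the post.

```python
def grep_list(lines, wanted, column=3):
    """Return lines whose tab-delimited field at `column` (0-based) is in `wanted`."""
    wanted = set(wanted)  # set membership keeps each lookup cheap
    hits = []
    for line in lines:
        record = line.rstrip("\n")
        fields = record.split("\t")
        # Skip short lines instead of raising an IndexError.
        if len(fields) > column and fields[column] in wanted:
            hits.append(record)
    return hits


# Hypothetical BED-like records: chrom, start, end, name.
bed = [
    "chr1\t100\t200\tgeneA",
    "chr1\t300\t400\tgeneB",
    "chr2\t500\t600\tgeneC",
]
ids = ["geneA", "geneC"]

for hit in grep_list(bed, ids):
    print(hit)
```

Matching on an exact field avoids partial hits, for example the ID gene1 also matching gene10, which is a common surprise when patterns are treated as substrings.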

SQL group by statement on the command line

The GROUP BY statement allows you to perform operations in a group-wise manner. I first learned of the Useful FILe and stream Operations (filo) repository a long time ago and keep coming back to it. The filo toolkit comes with three tools: groupBy, stats, and shuffle. The groupBy tool…
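To make the group-wise idea concrete, here is a minimal Python sketch of a GROUP BY with a SUM aggregate over rows of fields. It is not how filo's groupBy is implemented; the function name group_by_sum, the column indices, and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_by_sum(rows, group_col=0, value_col=1):
    """Sum the values in `value_col` for each distinct key in `group_col`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += float(row[value_col])
    return dict(totals)


# Hypothetical rows: (chromosome, score), with scores still stored as text.
rows = [("chr1", "10"), ("chr2", "5"), ("chr1", "2.5")]
print(group_by_sum(rows))
```

This corresponds roughly to a SQL query that selects the first column and the sum of the second, grouped by the first column.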

Ten years

As of today, it has been a decade since my first post on this blog. It started aimlessly during my PhD as a place to post analysis notes for myself and ten years later it still remains so. However, over the years I have started to focus more on better project management practices and on…

Reproducible Bioinformatics

I will be giving a workshop titled “Reproducible Bioinformatics” at BioC Asia tomorrow. I have been thinking a lot about this topic and my aim for the workshop is to introduce computational tools and demonstrate how they can be used to help promote reproducibility when performing bioinformatic analyses. Ensuring reproducibility shouldn’t be an extra burden…

The Golden Rule of Bioinformatics

I’m a big fan of the book Bioinformatics Data Skills by Vince Buffalo and I highly recommend it to everyone who works in the bioinformatics field. The book introduces the reader to The Golden Rule of Bioinformatics, which is: Never ever trust your tools (or data). I am a strong proponent of this rule, which…

Compiling R with GNU Readline

Updated 2018 March 23rd for R-3.4.4

I use a lot of shortcuts provided by GNU Readline. I recently compiled R without Readline support and it was almost unusable! This was because I ran into the error:

configure: error: --with-readline=yes (default) and headers/libs are not available

To circumvent this I compiled R by running: ./configure --with-readline=no…
