Getting started with SQLite

SQLite is an embedded relational database engine that is self-contained, meaning that it has everything it needs to run. It is serverless, so there is nothing to configure, and it is transactional. To get started, all you need to do is download the relevant binary file. My motivation for checking out…
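As a rough illustration of "serverless" and "transactional", here is a minimal sketch using Python's built-in sqlite3 module. The table name `notes` and the function name are made up for this example; they do not come from the post.

```python
import sqlite3


def demo_transaction():
    """Show a rolled-back and a committed insert; returns (rows_after_rollback, rows_after_commit)."""
    # An in-memory database: no server process and no configuration file.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.commit()

    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute("INSERT INTO notes (body) VALUES (?)", ("first",))
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    rows_after_rollback = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

    with conn:
        conn.execute("INSERT INTO notes (body) VALUES (?)", ("second",))
    rows_after_commit = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

    conn.close()
    return rows_after_rollback, rows_after_commit
```

The failed insert leaves no row behind, while the second insert is kept once its block finishes without an error.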


Learning about Makefiles

I started learning about Makefiles around the time I was learning about C. I still don’t know much about Makefiles or C, but I’m revisiting Makefiles because I’m interested in using them for building reproducible pipelines. Initially I thought Makefiles were text files that were used to help compile software. However, as I learned from…


Calculating the h-index

Updated 2014 September 19th to include a method that does not require sorting. The h-index is the largest number $$n$$ such that $$n$$ of your articles each have at least $$n$$ citations. If you published 10 articles, and each of them had four citations, your h-index would be four, since there are…
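To make the definition concrete, here is a minimal Python sketch of two ways to compute it, assuming citation counts are non-negative integers. The counting version is one common way to avoid sorting; it is an assumption for illustration and not necessarily the method the post describes.

```python
def h_index_sorted(citations):
    """h-index via sorting: walk the counts from most to least cited."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` articles all have at least `rank` citations
        else:
            break
    return h


def h_index_counting(citations):
    """h-index without sorting: tally the counts, then scan from the top down."""
    n = len(citations)
    # buckets[k] = number of articles with exactly k citations,
    # with anything above n lumped into buckets[n] (h can never exceed n).
    buckets = [0] * (n + 1)
    for count in citations:
        buckets[min(count, n)] += 1

    at_least = 0  # number of articles with at least h citations
    for h in range(n, -1, -1):
        at_least += buckets[h]
        if at_least >= h:
            return h
    return 0
```

For the example in the text, `h_index_sorted([4] * 10)` and `h_index_counting([4] * 10)` both return 4.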


A transpose tool

Updated 2014 September 19th to compare different transpose tools. I wrote a simple transpose tool in Perl that takes in tabular data and outputs a transposed version of it. The primary motivation for writing it was that, when viewing files with a lot of columns on the command line, it becomes hard to match the…
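The idea of transposing tabular data can be sketched in a few lines of Python. This is not the Perl tool from the post; the function names are hypothetical, and the sketch assumes tab-delimited input in which every row has the same number of columns (`zip` silently truncates ragged rows to the shortest one).

```python
def transpose_rows(rows):
    """Turn a list of rows into a list of columns."""
    return [list(column) for column in zip(*rows)]


def transpose_text(text, delimiter="\t"):
    """Transpose delimited text, returning the result as a single string."""
    rows = [line.split(delimiter) for line in text.splitlines()]
    return "\n".join(delimiter.join(column) for column in transpose_rows(rows))
```

With this, a wide two-line table such as `"a\tb\tc\n1\t2\t3"` becomes three short lines, each pairing a column name with its value.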


Getting started with C

I learned Perl as my first language because it was the language of choice in the first lab I joined. Over the years I’ve heard many criticisms, such as that Perl code looks ugly and that its motto, “There’s more than one way to do it”, allows too much flexibility. I particularly like this description of Perl:…


Getting started with Git

Git is a distributed version control and source code management (SCM) system with an emphasis on speed. What’s version control? Version control is a system that records changes to a file or a set of files over time so that you can recall specific versions later. Here’s an example: check out this tweet and the…


Saving disk space with Perl

Disk space is cheaper these days, but here’s one way of using less of it: working directly with gzipped files. Here’s a very straightforward example of Perl code that opens a gzipped file and outputs a gzipped file. And here’s some other code that just counts the number of lines in a file,…
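The same pattern, reading and writing gzipped files without decompressing them to disk first, can be sketched in Python with the standard gzip module. This is an illustration rather than the post's Perl code; the function names and the upper-casing step are placeholders for whatever per-line processing you need.

```python
import gzip


def count_lines_gz(path):
    """Count the lines in a gzipped text file, streaming it line by line."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle)


def copy_upper_gz(src, dst):
    """Read one gzipped file and write a transformed gzipped copy."""
    with gzip.open(src, "rt") as infile, gzip.open(dst, "wt") as outfile:
        for line in infile:
            # Placeholder transformation: upper-case each line.
            outfile.write(line.upper())
```

Both functions stream the data, so memory use stays small even for large files.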


Equivalents in R, Python and Perl

Last update 2018 May 24th. Perl was used by many computational biologists in the early 2000s. The popularity of Perl may have been driven by its involvement with the Human Genome Project. An article titled “How Perl Saved the Human Genome Project” explains why Perl was a good fit for computational biology projects (as well…


Passing arguments from the command line in Perl

I used to do this for specifying the usage: However, this became a problem when I needed to pass the number “0” as an argument. So I thought I’d improve the code with the Perl module Getopt::Std. Depending on how your script works, you can set up conditional checks (e.g. unless exists $opt{'f'}) to see…
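The underlying pitfall, testing whether a value is true when you meant to test whether it was supplied, has a close analogue in Python. The sketch below uses argparse rather than Getopt::Std; the `-f` option mirrors the one in the text, while the function names and the integer type are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical -f option whose value may legitimately be 0."""
    parser = argparse.ArgumentParser(description="hypothetical example script")
    parser.add_argument("-f", dest="f", type=int, default=None,
                        help="a number; 0 is a valid value")
    return parser.parse_args(argv)


def was_given(value):
    """Test for presence, not truthiness: 0 is falsy but still a supplied value."""
    return value is not None
```

A check such as `if not args.f` would wrongly treat `-f 0` as missing, whereas `was_given(args.f)` only reports the option as absent when it was never passed.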
