Calculating the h-index

Updated 2014 September 19th to include a method that does not require sorting. The h-index is the largest number, say h, such that h of your articles each have at least h citations. If you published 10 articles, and each of them had four citations, your h-index would be four, since there are four...
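
As a rough illustration of the definition, here is a minimal Python sketch of two ways to compute the h-index. The function names are hypothetical, and the counting approach is only one possible way to avoid sorting; it is not necessarily the method described in the post.

```python
def h_index_sorted(citations):
    """Largest h such that h articles have at least h citations (with sorting)."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def h_index_counting(citations):
    """Same result without sorting, using a tally of citation counts."""
    n = len(citations)
    # tally[c] = number of articles with c citations; counts above n are capped
    # at n because the h-index can never exceed the number of articles.
    tally = [0] * (n + 1)
    for count in citations:
        tally[min(count, n)] += 1
    articles = 0
    for h in range(n, -1, -1):
        articles += tally[h]  # articles with at least h citations
        if articles >= h:
            return h
    return 0
```

With ten articles of four citations each, both functions return four, which agrees with the example above.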

A transpose tool

Updated 2014 September 19th to compare different transpose tools. I wrote a simple transpose tool in Perl that takes in tabular data and outputs a transposed version of it. My primary motivation was that when viewing files with a lot of columns on the command-line, it becomes hard to match the...
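
The idea of transposing tabular data can be sketched in a few lines of Python. This is not the Perl tool from the post; the function name, the tab delimiter, and the choice to pad short rows with empty strings are all assumptions made for illustration.

```python
from itertools import zip_longest


def transpose_table(text, sep="\t"):
    """Turn the rows of delimited text into columns and vice versa."""
    rows = [line.split(sep) for line in text.splitlines()]
    # zip_longest pads ragged rows with "" so that no cell is dropped.
    columns = zip_longest(*rows, fillvalue="")
    return "\n".join(sep.join(column) for column in columns)
```

For example, `transpose_table("a\tb\tc\n1\t2\t3")` gives three lines, each holding one header and its value.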

Getting started with C

I learned Perl as my first language, as it was the language of choice in the first lab I joined. Over the years I've heard many criticisms, such as that Perl code looks ugly and that its motto "There's more than one way to do it" allows too much flexibility. I particularly like this description of Perl:...

Saving disk space with Perl

Disk space is cheaper these days, but here's one way of using less of it: working directly with gzipped files. Here's a very straightforward example of Perl code that opens a gzipped file and outputs a gzipped file. And here's some other code that just counts the number of lines in a file,...
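
The post's own examples are in Perl and are not reproduced here. As a hedged Python sketch of the same two tasks, using the standard `gzip` module (the function names are hypothetical):

```python
import gzip


def copy_gzip(src, dst):
    """Read a gzipped text file line by line and write a gzipped copy."""
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        for line in fin:
            # Any per-line processing would go here.
            fout.write(line)


def count_lines(path):
    """Count the lines in a gzipped text file without unpacking it on disk."""
    with gzip.open(path, "rt") as fin:
        return sum(1 for _ in fin)
```

Both functions stream the data, so the uncompressed content never has to be written to disk.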

Equivalents in R, Python and Perl

Last update 2015 September 9th. I used Perl heavily for several years until I started my PhD back in 2010 (I still use it for many tasks, but much more sparingly). Perl was widely used back in the early days when the human genome was yet to be sequenced and this famous article explained...

Passing arguments from the command line in Perl

I used to do this for specifying the usage: However, this became a problem when I needed to pass the number "0" as an argument. So I thought I'd improve the code via the Perl module Getopt::Std. Depending on how your script works, you can set up conditional checks (e.g. unless exists $opt{'f'}) to see...
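
The same pitfall exists outside Perl: testing whether an argument is "true" rejects a legitimate 0, whereas testing whether it was supplied does not. A minimal Python sketch with the standard `argparse` module follows; the `-n` option, the `dest` names, and the error message are hypothetical, and this is an analogue of the Getopt::Std approach rather than a translation of the post's code.

```python
import argparse


def parse_args(argv):
    """Parse -f and -n, accepting 0 as a valid value for -n."""
    parser = argparse.ArgumentParser(description="Hypothetical example script")
    parser.add_argument("-f", dest="infile", help="input file")
    parser.add_argument("-n", dest="number", type=int, help="a number; 0 is valid")
    args = parser.parse_args(argv)
    # Check for presence with "is None" rather than truthiness,
    # so that "-n 0" is not mistaken for a missing argument.
    if args.infile is None or args.number is None:
        parser.error("both -f and -n are required")
    return args
```

Calling `parse_args(["-f", "tags.bed", "-n", "0"])` succeeds, while leaving out `-n` prints the usage message and exits.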

Using bins when comparing genomic features

Comparing two files containing genomic features is a common task, e.g. finding out whether the coordinates of your tags intersect with genes. Of course you could use intersectBed (part of the BEDTools suite) for this purpose, but here's how to do it anyway using Perl. NOTE: I hard-code the length of my tags...
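
The excerpt does not show the binning scheme itself, so the following Python sketch is only one plausible version of the idea: index each gene under every fixed-width bin it touches, then compare a tag only against the genes in the tag's own bins. The bin size, the function names, and the use of closed (start and end inclusive) coordinates are assumptions.

```python
from collections import defaultdict

BIN_SIZE = 100_000  # hypothetical bin width in base pairs


def build_index(features, bin_size=BIN_SIZE):
    """Map (chromosome, bin number) to the features that touch that bin."""
    index = defaultdict(list)
    for chrom, start, end, name in features:
        for b in range(start // bin_size, end // bin_size + 1):
            index[(chrom, b)].append((start, end, name))
    return index


def overlapping(index, chrom, start, end, bin_size=BIN_SIZE):
    """Return the names of indexed features that overlap chrom:start-end."""
    hits = set()  # a feature spanning several bins is reported only once
    for b in range(start // bin_size, end // bin_size + 1):
        for f_start, f_end, name in index.get((chrom, b), []):
            if f_start <= end and start <= f_end:
                hits.add(name)
    return sorted(hits)
```

Only features sharing a bin with the query are ever compared, which avoids checking every tag against every gene.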
