Learning WDL

An approach I like to use when learning a new tool is to get started by trying to run an example and then gradually work out the details. In this post, I’m trying to learn the basics of the Workflow Description Language (WDL) so that I can adapt GATK workflows for my own use. WDL,…

Plotting weather data using R

The Australian Bureau of Meteorology provides historical weather data, some of which can be freely downloaded. In this blog post, I will plot the weather data collected at two weather stations in Brisbane: the Brisbane Regional Office weather station (latitude 27.466 degrees south, longitude 153.0270 degrees east, and elevation of 38 metres) with data available…

Execute gatk-workflows locally

The Broad Institute have shared their GATK workflows on GitHub; however, they are configured to be used with Google Cloud. I could not find much information on executing the workflows locally; I only found this tutorial. I ran into problems while trying to follow the tutorial but eventually got it…

Uploading to Amazon S3 using AWS CLI

Amazon S3 is an affordable resource for storing your data; you pay for what you use. There are four cost components to consider: 1. Storage pricing (how much space you use) 2. Request and data retrieval pricing (number of requests you make) 3. Data transfer and transfer acceleration pricing (how often you transfer the data)…

Reproducible Bioinformatics

I will be giving a workshop titled “Reproducible Bioinformatics” at BioC Asia tomorrow. I have been thinking a lot about this topic and my aim for the workshop is to introduce computational tools and demonstrate how they can be used to help promote reproducibility when performing bioinformatic analyses. Ensuring reproducibility shouldn’t be an extra burden…

Comparing VCF files

In this post, I will compare different tools for comparing VCF files. To create a reproducible example, I will make use of Docker and Conda. I highly recommend learning about these tools if you haven’t already; they make it easier to reproduce your work. I have written some notes on Docker and Conda that maybe…

Importing vector images into R

The grImport package can be used to import vector images into R so that you can edit them and/or combine them with other plots. In this post, I will go through the grImport workflow and finally show how vector images can be incorporated with other graphical objects.

The Golden Rule of Bioinformatics

I’m a big fan of the book Bioinformatics Data Skills by Vince Buffalo and I highly recommend it to everyone who works in the bioinformatics field. The book introduces the reader to The Golden Rule of Bioinformatics, which is: Never ever trust your tools (or data). I am a strong proponent of this rule, which…

Visualising Google Trends results with R

I haven’t been blogging as much as I’d like due to other commitments, but I wanted to write a post before 2018 ends. This post is on plotting Google Trends results with R. If you’ve never heard of or used Google Trends, it’s fun! You can see how certain keywords have trended over the…
