Execute gatk-workflows locally

The Broad Institute have shared their GATK workflows on GitHub; however, they are configured for use with Google Cloud. I could not find much information on executing the workflows locally; the only resource I found was this tutorial. I ran into problems while following the tutorial but eventually got it…


Uploading to Amazon S3 using AWS CLI

Amazon S3 is an affordable resource for storing your data; you pay for what you use. There are four cost components to consider:

1. Storage pricing (how much space you use)
2. Request and data retrieval pricing (number of requests you make)
3. Data transfer and transfer acceleration pricing (how often you transfer the data)…


Reproducible Bioinformatics

I will be giving a workshop titled "Reproducible Bioinformatics" at BioC Asia tomorrow. I have been thinking a lot about this topic and my aim for the workshop is to introduce computational tools and demonstrate how they can be used to help promote reproducibility when performing bioinformatic analyses. Ensuring reproducibility shouldn’t be an extra burden…
