Sequence analysis of SARS-CoV-2 part 3

This post continues a series on the sequence analysis of SARS-CoV-2; see part 1 and part 2 if you haven’t read them already. Since my first post, I found out that you can BLAST sequences against a Betacoronavirus database on NCBI BLAST. The database, as of 2020/03/10, has a total of 7,844…


Sequence analysis of SARS-CoV-2 part 2

I ended my previous post on the sequence analysis of SARS-CoV-2 with the amino acid alignment of the spike protein from SARS-CoV-2 (MN908947) and Bat coronavirus RaTG13 (MN996532). The spike protein is of particular interest because it is through its binding to the angiotensin-converting enzyme 2 (ACE2) receptor that it is able to…


Sequence analysis of SARS-CoV-2

The article “A new coronavirus associated with human respiratory disease in China” released the full genome sequence (29,903 nt) of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In the paper, meta-transcriptomic sequencing was performed on bronchoalveolar lavage fluid (BALF) from a 41-year-old male suffering from a severe respiratory disease. Contigs were assembled using…


Domain renewal

Some of you may have noticed the Buy me a coffee link I have on my blog. I have just used all the contributions to help pay for the hosting fees; this blog should be here for another two years (extended until 24 April 2022). Thank you to everyone who has bought me a…


Learning WDL

An approach I like to use when learning a new tool is to get started by trying to run an example and then gradually work out the details. In this post, I’m trying to learn the basics of the Workflow Description Language (WDL) so that I can adapt GATK workflows for my own use. WDL,…


Plotting weather data using R

The Australian Bureau of Meteorology provides historical weather data, some of which can be freely downloaded. In this blog post, I will plot the weather data collected at two weather stations in Brisbane: the Brisbane Regional Office weather station (latitude 27.466 degrees south, longitude 153.0270 degrees east, and elevation of 38 metres) with data available…


Execute gatk-workflows locally

The Broad Institute have shared their GATK workflows on GitHub; however, they are configured to be used with Google Cloud. I could not find much information on executing the workflows locally; I only found this tutorial. I ran into problems while trying to follow the tutorial but eventually got it…


Uploading to Amazon S3 using AWS CLI

Amazon S3 is an affordable resource for storing your data; you pay for what you use. There are four cost components to consider: 1. Storage pricing (how much space you use) 2. Request and data retrieval pricing (number of requests you make) 3. Data transfer and transfer acceleration pricing (how often you transfer the data)…


Reproducible Bioinformatics

I will be giving a workshop titled “Reproducible Bioinformatics” at BioC Asia tomorrow. I have been thinking a lot about this topic and my aim for the workshop is to introduce computational tools and demonstrate how they can be used to help promote reproducibility when performing bioinformatic analyses. Ensuring reproducibility shouldn’t be an extra burden…
