Twelfth year: Python

As this blog enters its twelfth year, I am finally using Python instead of Perl as my scripting language of choice (which is contrary to what I said two years ago!). As I wrote in my learning Python repo, my interest in deep learning finally tipped me over, since several popular deep learning frameworks (TensorFlow, Keras, and PyTorch) use a Python interface; they have other interfaces but Python is commonly used. Prior to my interest in deep learning, I could do everything I needed with Perl, which was mainly writing parsing and wrapper scripts, and therefore I didn't need to learn another language. But Python and deep learning are kinda like R and genomics (at least to my limited knowledge), so I had no choice.

Since the best way to learn something is to immerse yourself in it, I have been writing all my would-have-been Perl scripts in Python. In addition, I have forced myself not to be lazy and to write better scripts that do not hardcode any values and are well documented. The argparse module makes it very easy to write scripts that accept runtime arguments that can (and should) be documented, and all of this is displayed on the automatically generated help page when the script is invoked with -h or --help (when required arguments are missing, argparse prints a short usage message and an error instead). Other recommendations for writing better scripts can be found in the article Ten recommendations for creating usable bioinformatics command line software and this blog post by Heng Li.
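As a minimal sketch of what this looks like in practice, here is a small argparse setup. The script purpose, the argument names (infile, --min-length), and the example file name are all made up for illustration; the point is only that every argument carries a help string and nothing is hardcoded.

```python
import argparse


def build_parser():
    """Build the command line parser (all argument names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Summarise sequences in a FASTA file (hypothetical example)."
    )
    # A required positional argument instead of a hardcoded path.
    parser.add_argument("infile", help="path to the input FASTA file")
    # An optional argument with a documented default.
    parser.add_argument(
        "-m", "--min-length", type=int, default=0,
        help="ignore sequences shorter than this (default: %(default)s)",
    )
    return parser


parser = build_parser()
# An explicit argument list is passed here so the sketch runs anywhere;
# a real script would call parser.parse_args() to read sys.argv.
args = parser.parse_args(["reads.fa", "--min-length", "50"])
print(args.infile, args.min_length)
```

Running such a script with -h or --help prints the description together with the help string of each argument, so the documentation lives next to the code that uses it.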

I went through the Software Carpentry Python course, which I highly recommend if you're interested in learning Python too, and got introduced to defensive programming. As a result, I have been using assert statements to check whether certain assumptions are met for the sake of sanity checking. On a technical note, assertions can be turned off but aren't by default; I'm still not entirely sure whether I should be raising exceptions instead of using assert, but the point I'm trying to make is that we should always check for exceptional cases in our scripts. Now, all of these "better practices" are definitely possible with Perl, but I guess when we start something anew, we want to begin with good habits.
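One common way to split the two, sketched below with a made-up gc_content function, is to raise an exception for bad input that a user could plausibly supply and to keep assert for internal sanity checks that should never fail if the code is correct. This is just one convention, not the only reasonable one.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a sequence (illustrative helper)."""
    # Bad input from the caller: raise an exception, which is always checked.
    if not seq:
        raise ValueError("sequence must not be empty")

    gc = sum(1 for base in seq.upper() if base in "GC")
    fraction = gc / len(seq)

    # Internal sanity check: this should hold whenever the code above is
    # correct. Assertions are skipped when Python runs with the -O option.
    assert 0.0 <= fraction <= 1.0, "GC fraction outside [0, 1]"
    return fraction


print(gc_content("ATGC"))
```

Because assert statements are removed under python -O, anything that must be checked in every run is safer as an explicit exception.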

In line with developing good habits, I have also been learning about CI/CD and used GitHub Actions to generate my documentation. I also played around with GitLab and was able to set up my own runner, scripts, and .gitlab-ci.yml file that submits the appropriate testing job to a queue when a commit is made. Recently, I have also been playing around with GNU Make, since I purchased the Linux Humble Bundle. While I will probably stick to using a workflow management system (I currently use WDL and Cromwell but I'm interested in learning more about Nextflow), learning more about GNU Make will help me develop better workflows.

All in all, this tweet really summarises what I have been trying to do in the year leading up to the twelfth year of this blog:

The tweet can be reproduced using GNU bc, although I would have rounded up to 37.8 instead of rounding down to 37.7.

bc -l<<<1^365
# 1

bc -l<<<1.01^365
# 37.78343433288715887761
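Since this post is about Python, the same arithmetic can be sketched there too, along with the difference between rounding and truncating to one decimal place. The variable names are only for illustration.

```python
import math

unchanged = 1 ** 365    # no daily change, compounded over a year
improved = 1.01 ** 365  # one percent better each day, compounded over a year

rounded = round(improved, 1)                # rounds to the nearest tenth: 37.8
truncated = math.floor(improved * 10) / 10  # drops everything past the tenth: 37.7

print(unchanged, rounded, truncated)
```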

During my PhD the emphasis was always just on papers, papers, and papers. There was little focus on the PhD process (e.g. learning the tools of the trade) and all the focus was on the results. Because of that, I mostly stuck with what I knew prior to my PhD (I had already worked as a bioinformatician for four years) and focused on papers. Now, I have taken one step back, so that I can move two steps forward. So if you're doing your PhD and are more interested in the technical aspects, take some time to develop new skills and habits; little by little, a little becomes a lot.

In other news, Illumina recently announced the NovaSeq X series that will apparently bring the cost of whole genome sequencing down to 200 USD (if you can afford the sequencer, have space to house it, and get reagents for free), another platform for simplifying bioinformatics analyses gets announced, and my blog continues on its steady decline.

I haven't been writing much in the last five years, which probably explains the decline in visitors.

But hopefully I'll write more posts between now and the thirteenth year of this blog. Until then, please take care and keep learning!

This work is licensed under a Creative Commons Attribution 4.0 International License.
2 comments
  1. Welcome to the Python community! Terraform tends to be pretty useful for cloud deployments too. Been reading the blog for a while. Looking forward to more content!
