Organising computational biology projects with Cookiecutter

A quick guide to organizing computational biology projects was published over 8 years ago, but its main messages are still very relevant (perhaps even more so nowadays, given the exponential increase in biological data). In a nutshell, computational biology projects need to be organised so that we can share and reproduce our work efficiently. The guide provides an example of how a project can be organised.

Figure 1. Directory structure for a sample project from A quick guide to organizing computational biology projects.

The suggested structure may not suit everyone, but the point of this post is to illustrate a command-line utility called Cookiecutter that can be used to template a project as per the example in the guide. This ensures that each of your projects follows a well-defined structure. My example below was adapted from a basic example in the official documentation.

To get started, install Cookiecutter using pip.

pip install cookiecutter

We’ll now create a new directory for our template and move into it.

mkdir new_project
cd new_project

Now we’ll create the same directory structure as in Figure 1, but with templating tags in place of the names. Each tag refers to a key that we’ll define later in the cookiecutter.json file.

# Quoting the names keeps the braces literal in any shell.
mkdir '{{cookiecutter.project_name}}'
cd '{{cookiecutter.project_name}}'
mkdir '{{cookiecutter.documentation}}'
mkdir '{{cookiecutter.data}}'
mkdir '{{cookiecutter.source}}'
mkdir '{{cookiecutter.bin}}'
mkdir '{{cookiecutter.results}}'
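
The same skeleton can also be built in one pass. The loop below is a minimal sketch rather than the method used above: it assumes a POSIX shell, works in a scratch directory under /tmp so that nothing in your working directory is touched, and quotes every name so that the braces are passed through literally. The variable names (scratch, root) and the scratch path are my own.

```shell
# Throwaway location for the sketch (an assumption, pick any path you like).
scratch="${TMPDIR:-/tmp}/cookiecutter-skeleton-demo"
rm -rf "$scratch"
root="$scratch/new_project/{{cookiecutter.project_name}}"

# One templated sub-directory per key that cookiecutter.json will define.
for key in documentation data source bin results; do
    mkdir -p "$root/{{cookiecutter.$key}}"
done

# List what was created, relative to the scratch directory.
(cd "$scratch" && find new_project -type d | sort)
```

Using mkdir -p means the parent directories are created as needed, so the separate cd step is not required.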

I usually have a README file inside each project directory that provides a general overview of the project. We’ll use wget to download a README template and then replace the “Project Title” with a templating tag.

wget https://gist.githubusercontent.com/PurpleBooth/109311bb0361f32d87a2/raw/824da51d0763e6855c338cc8107b2ff890e7dd43/README-Template.md -O tmp.md
sed 's/Project Title/{{cookiecutter.project_name}}/' tmp.md > '{{cookiecutter.README}}.md'
rm -f tmp.md
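
To see what the sed step does without downloading anything, the sketch below runs the same substitution on a small stand-in file. The stand-in content and the scratch path are assumptions for illustration; only the sed expression comes from the commands above.

```shell
# Throwaway location for the sketch (an assumption).
scratch="${TMPDIR:-/tmp}/cookiecutter-readme-demo"
rm -rf "$scratch" && mkdir -p "$scratch"

# Stand-in for the downloaded template (assumed content, not the real file).
printf '# Project Title\n\nA short description.\n' > "$scratch/tmp.md"

# Same sed expression as above: swap the title for a templating tag.
sed 's/Project Title/{{cookiecutter.project_name}}/' "$scratch/tmp.md" \
    > "$scratch/{{cookiecutter.README}}.md"

head -1 "$scratch/{{cookiecutter.README}}.md"
# prints: # {{cookiecutter.project_name}}
```

Because the expression has no g flag, sed replaces only the first match on each line, which is enough as long as the title appears at most once per line.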

Finally, we need to create the cookiecutter.json file, which resides in the new_project directory (one level above the {{cookiecutter.project_name}} directory we have been working in). Here’s how it looks:

{
    "project_name": "new_project",
    "documentation": "doc",
    "data": "data",
    "source": "src",
    "bin": "bin",
    "results": "results",
    "README": "README"
}
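
A stray comma or quote in cookiecutter.json is easy to miss, so it can be worth checking that the file parses before using the template. The sketch below assumes python3 is on your PATH (Cookiecutter itself needs Python) and uses the json.tool module from the standard library; the scratch path is my own choice.

```shell
# Throwaway location for the sketch (an assumption).
scratch="${TMPDIR:-/tmp}/cookiecutter-json-demo"
rm -rf "$scratch" && mkdir -p "$scratch"

# Write the same cookiecutter.json as above into the scratch directory.
cat > "$scratch/cookiecutter.json" <<'EOF'
{
    "project_name": "new_project",
    "documentation": "doc",
    "data": "data",
    "source": "src",
    "bin": "bin",
    "results": "results",
    "README": "README"
}
EOF

# json.tool exits with a non-zero status if the file is not valid JSON.
if python3 -m json.tool "$scratch/cookiecutter.json" > /dev/null; then
    echo "cookiecutter.json is valid JSON"
else
    echo "cookiecutter.json could not be parsed" >&2
fi
```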

If you followed the steps above, you should have this directory structure inside the new_project directory.

tree --charset unicode new_project/
new_project/
|-- cookiecutter.json
`-- {{cookiecutter.project_name}}
    |-- {{cookiecutter.README}}.md
    |-- {{cookiecutter.bin}}
    |-- {{cookiecutter.data}}
    |-- {{cookiecutter.documentation}}
    |-- {{cookiecutter.results}}
    `-- {{cookiecutter.source}}

6 directories, 2 files

To use our newly created template, move into the directory where you want to create the new project. I will use the directory in which I created new_project, and I will accept the default values for everything except project_name.

cookiecutter new_project/
project_name [new_project]: msms
documentation [doc]: 
data [data]: 
source [src]: 
bin [bin]: 
results [results]: 
README [README]: 

tree --charset unicode msms/
msms/
|-- README.md
|-- bin
|-- data
|-- doc
|-- results
`-- src

5 directories, 1 file

head -1 msms/README.md 
# msms
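
The prompts can also be skipped. Cookiecutter's command line accepts --no-input together with key=value pairs that override the defaults, and -o to choose the output directory. The sketch below is a hedged, self-contained example: it rebuilds a small copy of the template in a scratch directory (the path and the one-line README are my own stand-ins) and only runs the generation step if the cookiecutter command is installed.

```shell
# Throwaway location for the sketch (an assumption).
scratch="${TMPDIR:-/tmp}/cookiecutter-noinput-demo"
rm -rf "$scratch"
root="$scratch/new_project/{{cookiecutter.project_name}}"

# Rebuild a small copy of the template with the same tags as above.
for key in documentation data source bin results; do
    mkdir -p "$root/{{cookiecutter.$key}}"
done
# One-line stand-in README (assumed content, not the downloaded template).
printf '# {{cookiecutter.project_name}}\n' > "$root/{{cookiecutter.README}}.md"

cat > "$scratch/new_project/cookiecutter.json" <<'EOF'
{
    "project_name": "new_project",
    "documentation": "doc",
    "data": "data",
    "source": "src",
    "bin": "bin",
    "results": "results",
    "README": "README"
}
EOF

# Generate the project without prompts, overriding only project_name.
mkdir -p "$scratch/out"
if command -v cookiecutter > /dev/null 2>&1; then
    cookiecutter --no-input -o "$scratch/out" "$scratch/new_project" project_name=msms
    head -1 "$scratch/out/msms/README.md"
else
    echo "cookiecutter is not installed; skipping the generation step" >&2
fi
```

This form is handy in scripts, where nobody is around to answer the prompts; any key that is not given on the command line keeps its default from cookiecutter.json.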

Summary

As stated in the basic tutorial:

Cookiecutter takes a source directory tree and copies it into your new project. It replaces all the names that it finds surrounded by templating tags {{ and }} with names that it finds in the file cookiecutter.json. That’s basically it.

Cookiecutter provides many more features, and there are many project templates that you can use and adapt to your liking.

This work is licensed under a Creative Commons Attribution 4.0 International License.