Organising computational biology projects with Cookiecutter

A quick guide to organizing computational biology projects was published over 8 years ago, but its main messages are still very relevant (perhaps even more so nowadays, given the exponential increase in biological data). In a nutshell, computational biology projects need to be organised so that we can share and reproduce our work efficiently. The guide provides an example of how a project can be organised.

Figure 1. Directory structure for a sample project from A quick guide to organizing computational biology projects.

The suggested structure may not suit everyone, but the point of this post is to illustrate a command-line utility called Cookiecutter that can be used to template a project as per the example in the guide. Using a template ensures that each of your projects follows a well-defined structure. My example below was adapted from a basic example in the official documentation.

To get started, install Cookiecutter using pip.

pip install cookiecutter

We'll now create a new directory for our template and move into the new directory.

mkdir new_project
cd new_project

Now we'll create the same directory structure as Figure 1 but with templating tags in place of the names. Each tag refers to a key that we will define in the cookiecutter.json file.

mkdir {{cookiecutter.project_name}}
cd {{cookiecutter.project_name}}
mkdir {{cookiecutter.documentation}}
mkdir {{cookiecutter.data}}
mkdir {{cookiecutter.source}}
mkdir {{cookiecutter.bin}}
mkdir {{cookiecutter.results}}
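
The same skeleton can also be scripted, which is convenient if you rebuild the template often. This is a minimal sketch using only the Python standard library; the function name make_template_skeleton and the SUBDIR_TAGS list are hypothetical names chosen for illustration, and the tag names are the ones from the mkdir commands above.

```python
from pathlib import Path

# Tag names taken from the mkdir commands above.
SUBDIR_TAGS = ["documentation", "data", "source", "bin", "results"]


def make_template_skeleton(template_root):
    """Create the templated project directory and its tagged subdirectories."""
    project_dir = Path(template_root) / "{{cookiecutter.project_name}}"
    for tag in SUBDIR_TAGS:
        # The braces are kept literally; Cookiecutter fills them in later.
        subdir = project_dir / ("{{cookiecutter." + tag + "}}")
        subdir.mkdir(parents=True, exist_ok=True)
    return project_dir
```

Calling make_template_skeleton("new_project") from the parent directory should give the same directories as the commands above.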

I usually have a README file inside each project directory that provides a general overview of the project. We'll use this README template:

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

What things you need to install the software and how to install them

Give examples

Installing

A step by step series of examples that tell you how to get a development env running

Say what the step will be

Give the example

And repeat

until finished

End with an example of getting some data out of the system or using it for a little demo

Running the tests

Explain how to run the automated tests for this system

Break down into end to end tests

Explain what these tests test and why

Give an example

And coding style tests

Explain what these tests test and why

Give an example

Deployment

Add additional notes about how to deploy this on a live system

Built With

  • Dropwizard - The web framework used
  • Maven - Dependency Management
  • ROME - Used to generate RSS Feeds

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

  • Hat tip to anyone whose code was used
  • Inspiration
  • etc

We'll use wget to download the README template and then replace the "Project Title" heading with a templating tag.

wget https://gist.githubusercontent.com/PurpleBooth/109311bb0361f32d87a2/raw/824da51d0763e6855c338cc8107b2ff890e7dd43/README-Template.md -O tmp.md
sed 's/Project Title/{{cookiecutter.project_name}}/' tmp.md > {{cookiecutter.README}}.md
rm -f tmp.md
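
For readers who prefer to do the substitution in Python, here is a rough equivalent of the sed step. It assumes the template has already been downloaded to a local file; the names templatize and write_readme_template are hypothetical. Like sed without the g flag, it replaces only the first match on each line.

```python
from pathlib import Path

PLACEHOLDER = "Project Title"
PROJECT_TAG = "{{cookiecutter.project_name}}"


def templatize(text):
    """Replace the first 'Project Title' on each line with the templating tag."""
    lines = text.splitlines(keepends=True)
    return "".join(line.replace(PLACEHOLDER, PROJECT_TAG, 1) for line in lines)


def write_readme_template(downloaded, project_dir):
    """Write the templated README next to the tagged subdirectories."""
    target = Path(project_dir) / "{{cookiecutter.README}}.md"
    source_text = Path(downloaded).read_text(encoding="utf-8")
    target.write_text(templatize(source_text), encoding="utf-8")
    return target
```

For example, write_readme_template("tmp.md", ".") run inside the {{cookiecutter.project_name}} directory would mirror the sed command above.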

Finally, we need to create the cookiecutter.json file, which resides in the new_project directory, one level above the {{cookiecutter.project_name}} directory we have been working in. Here's how it looks:

{
    "project_name": "new_project",
    "documentation": "doc",
    "data": "data",
    "source": "src",
    "bin": "bin",
    "results": "results",
    "README": "README"
}
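
A stray trailing comma after the last entry is an easy mistake to make in a JSON file, and it makes the file invalid. As a quick sanity check, the sketch below parses the file content with Python's json module and confirms that every key used by the template is present. The helper load_context and the EXPECTED_KEYS list are hypothetical names for illustration.

```python
import json

# Keys referenced by the templating tags in this post.
EXPECTED_KEYS = ["project_name", "documentation", "data", "source",
                 "bin", "results", "README"]


def load_context(text):
    """Parse cookiecutter.json content and check the keys used by the template."""
    context = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    missing = [key for key in EXPECTED_KEYS if key not in context]
    if missing:
        raise KeyError(f"missing keys in cookiecutter.json: {missing}")
    return context
```

Reading new_project/cookiecutter.json and passing its text to load_context should return the dictionary of defaults, or raise an error that points at the problem.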

If you followed the steps above, you should have this directory structure inside the new_project directory.

tree --charset unicode new_project/
new_project/
|-- cookiecutter.json
`-- {{cookiecutter.project_name}}
    |-- {{cookiecutter.README}}.md
    |-- {{cookiecutter.bin}}
    |-- {{cookiecutter.data}}
    |-- {{cookiecutter.documentation}}
    |-- {{cookiecutter.results}}
    `-- {{cookiecutter.source}}

6 directories, 2 files

To use our newly created template, move into the directory where you want to create the new project. I will use the directory that contains new_project. I will also accept the default value at every prompt except project_name.

cookiecutter new_project/
project_name [new_project]: msms
documentation [doc]: 
data [data]: 
source [src]: 
bin [bin]: 
results [results]: 
README [README]: 

tree --charset unicode msms/
msms/
|-- README.md
|-- bin
|-- data
|-- doc
|-- results
`-- src

5 directories, 1 file

head -n 1 msms/README.md
# msms
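
The prompts can also be skipped by calling Cookiecutter from Python. The sketch below assumes the cookiecutter package is installed and relies on its cookiecutter() function with the no_input, extra_context and output_dir arguments; the wrapper generate_project is a hypothetical helper, not part of Cookiecutter.

```python
from pathlib import Path


def generate_project(template_dir, output_dir, **answers):
    """Render a local Cookiecutter template without interactive prompts."""
    # Imported here so the helper can be defined even where Cookiecutter is absent.
    from cookiecutter.main import cookiecutter

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    cookiecutter(
        str(template_dir),
        no_input=True,          # accept the defaults from cookiecutter.json
        extra_context=answers,  # override selected defaults, e.g. project_name
        output_dir=str(output_dir),
    )
    # Assumes project_name is always passed explicitly by the caller.
    return output_dir / answers["project_name"]
```

With the template from this post, generate_project("new_project", ".", project_name="msms") should produce the same msms directory as the interactive session above.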

Summary

As stated in the basic tutorial:

Cookiecutter takes a source directory tree and copies it into your new project. It replaces all the names that it finds surrounded by templating tags {{ and }} with names that it finds in the file cookiecutter.json. That’s basically it.
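
To make that description concrete, here is a toy re-implementation of the idea using only the Python standard library. It is an illustration, not how Cookiecutter itself is written: the real tool renders templates with Jinja2 and supports far more than plain tag replacement. The names render and toy_generate are hypothetical, and all values in cookiecutter.json are assumed to be strings.

```python
import json
import re
from pathlib import Path

# Matches tags such as {{cookiecutter.project_name}} (inner spaces allowed).
TAG = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def render(text, context):
    """Replace every {{cookiecutter.<key>}} tag with context[<key>]."""
    return TAG.sub(lambda match: context[match.group(1)], text)


def toy_generate(template_root, output_dir, **overrides):
    """Copy the template tree into output_dir, filling in tags on the way."""
    template_root, output_dir = Path(template_root), Path(output_dir)
    config = template_root / "cookiecutter.json"
    # Defaults come from cookiecutter.json; keyword arguments override them.
    context = {**json.loads(config.read_text(encoding="utf-8")), **overrides}
    for src in sorted(template_root.rglob("*")):
        if src == config:
            continue  # the config file itself is not part of the project
        relative = src.relative_to(template_root).as_posix()
        dest = output_dir / render(relative, context)
        if src.is_dir():
            dest.mkdir(parents=True, exist_ok=True)
        else:
            dest.parent.mkdir(parents=True, exist_ok=True)
            text = src.read_text(encoding="utf-8")
            dest.write_text(render(text, context), encoding="utf-8")
    return output_dir / context["project_name"]
```

Both the directory and file names and the file contents pass through the same render function, which is why the tag in the README heading and the tags in the directory names are all filled in from one cookiecutter.json.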

Cookiecutter provides many more features, and there are many existing project templates that you can use and adapt to your liking.

This work is licensed under a Creative Commons Attribution 4.0 International License.