Last updated: 2023/03/07
A quick guide to organizing computational biology projects was published over 8 years ago, but its main messages are still very relevant (perhaps even more so nowadays, given the exponential increase in biological data). In a nutshell, computational biology projects need to be organised so that we can share and reproduce our work efficiently. The guide provides an example of how a project can be organised.
Figure 1. Directory structure for a sample project from A quick guide to organizing computational biology projects.
The suggested structure may not suit everyone, but the point of this post is to illustrate a command-line utility called Cookiecutter that can be used to template a project as per the example in the guide. Using a template ensures that each of your projects follows a well-defined structure. My example below was adapted from an example that no longer exists. For other tutorials, please see the official documentation.
To get started, install Cookiecutter using pip.
pip install cookiecutter
We will now create a new directory for our template and move into the new directory.
mkdir new_project && cd new_project
Now we will create the same directory structure as Figure 1, but with templating tags in place of the real names. Each tag name corresponds to a key that we will define in the cookiecutter.json file.
mkdir {{cookiecutter.project_name}} && cd {{cookiecutter.project_name}}
mkdir {{cookiecutter.documentation}}
mkdir {{cookiecutter.data}}
mkdir {{cookiecutter.source}}
mkdir {{cookiecutter.bin}}
mkdir {{cookiecutter.results}}
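The same directories can also be created in a loop. The following is a minimal sketch, assuming bash; it works inside a throwaway directory made with mktemp (the `scratch` and `template_root` variables are my own names, not part of Cookiecutter) so that it does not touch a real template. Quoting the paths keeps the double braces literal.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway location for this sketch (an assumption; use your own path in practice).
scratch=$(mktemp -d)
template_root="${scratch}/new_project/{{cookiecutter.project_name}}"

# One subdirectory per templating tag; -p also creates the parent directories.
for tag in documentation data source bin results; do
  mkdir -p "${template_root}/{{cookiecutter.${tag}}}"
done

# List the templated directory names that were created.
ls "${template_root}"
```

The loop only saves typing; the resulting directory names are identical to those produced by the individual mkdir commands.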
I usually have a README file inside each project directory that provides a general overview of the project. We will use a nice README template that we can download using wget and then replace the Project Title with a templating tag.
url=https://gist.githubusercontent.com/PurpleBooth/109311bb0361f32d87a2/raw/824da51d0763e6855c338cc8107b2ff890e7dd43/README-Template.md
wget -O - ${url} \
| sed 's/Project Title/{{cookiecutter.project_name}}/' \
> {{cookiecutter.README}}.md
Finally, we need to create the cookiecutter.json file, which will reside in the new_project directory.
cat <<EOF > cookiecutter.json
{
"project_name": "new_project",
"documentation": "doc",
"data": "data",
"source": "src",
"bin": "bin",
"results": "results",
"README": "README"
}
EOF
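A stray trailing comma after the last entry is an easy mistake to make in a hand-written JSON file, so it can be worth checking the file before using the template. The following is a minimal sketch, assuming python3 is on the PATH (reasonable here, since Cookiecutter was installed with pip); it uses the standard-library json.tool module, and the `check_json` helper and the shortened JSON file are my own illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds only if the file parses as JSON.
check_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Write a shortened copy of the configuration to a throwaway directory.
scratch=$(mktemp -d)
cat <<'EOF' > "${scratch}/cookiecutter.json"
{
  "project_name": "new_project",
  "README": "README"
}
EOF

if check_json "${scratch}/cookiecutter.json"; then
  echo "cookiecutter.json is valid JSON"
else
  echo "cookiecutter.json is NOT valid JSON" >&2
fi
```

In practice you would call `check_json cookiecutter.json` from inside the new_project directory.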
If you followed the steps above, you should have this directory structure inside the new_project directory.
tree --charset unicode new_project/
new_project/
|-- cookiecutter.json
`-- {{cookiecutter.project_name}}
    |-- {{cookiecutter.README}}.md
    |-- {{cookiecutter.bin}}
    |-- {{cookiecutter.data}}
    |-- {{cookiecutter.documentation}}
    |-- {{cookiecutter.results}}
    `-- {{cookiecutter.source}}
6 directories, 2 files
To use our newly created template, move into the directory where you want to create the new project. I will use the directory in which I created the new_project directory, and I will accept the default values for everything except project_name.
cookiecutter new_project/
project_name [new_project]: msms
documentation [doc]:
data [data]:
source [src]:
bin [bin]:
results [results]:
README [README]:
tree --charset unicode msms/
msms/
|-- README.md
|-- bin
|-- data
|-- doc
|-- results
`-- src
5 directories, 1 file
head -1 msms/README.md
# msms
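The prompts can also be skipped, which is handy in scripts. The following is a minimal sketch, assuming bash and an installed cookiecutter; it relies on the `--no-input` and `--output-dir` options and on passing `key=value` pairs after the template path to override defaults. To stay self-contained it builds a tiny stand-in template in a throwaway directory (the `scratch` and `template_dir` names are my own), and it only renders if cookiecutter is found.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a tiny stand-in template in a throwaway directory (an assumption for
# this sketch; in practice you would point at your own new_project/).
scratch=$(mktemp -d)
template_dir="${scratch}/new_project"
mkdir -p "${template_dir}/{{cookiecutter.project_name}}" "${scratch}/out"

cat <<'EOF' > "${template_dir}/cookiecutter.json"
{
  "project_name": "new_project",
  "README": "README"
}
EOF
printf '%s\n' '# {{cookiecutter.project_name}}' \
  > "${template_dir}/{{cookiecutter.project_name}}/{{cookiecutter.README}}.md"

if command -v cookiecutter > /dev/null 2>&1; then
  # --no-input accepts every default; project_name=msms overrides one of them.
  cookiecutter --no-input --output-dir "${scratch}/out" "${template_dir}" project_name=msms
  head -1 "${scratch}/out/msms/README.md"
else
  echo "cookiecutter is not installed; skipping the render step" >&2
fi
```

With the full template from this post, the equivalent call would be `cookiecutter --no-input new_project/ project_name=msms`.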
Summary
As stated in the basic tutorial (that no longer exists):
Cookiecutter takes a source directory tree and copies it into your new project. It replaces all the names that it finds surrounded by templating tags {{ and }} with names that it finds in the file cookiecutter.json. That’s basically it.
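To make the substitution idea concrete, here is a toy illustration, assuming bash and sed. Cookiecutter itself renders its templates with Jinja2; the sed call below is only a stand-in that mimics the effect for a single tag, and the `project_name` variable is a hypothetical value playing the role of the entry in cookiecutter.json.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical value standing in for the "project_name" entry in cookiecutter.json.
project_name="msms"

# A line as it would appear in the templated README.
template_line='# {{cookiecutter.project_name}}'

# Replace the tag with the value; braces are literal in basic regular expressions.
rendered=$(printf '%s\n' "${template_line}" \
  | sed "s/{{cookiecutter\.project_name}}/${project_name}/g")

echo "${rendered}"
```

The real tool does the same kind of replacement for file contents as well as for file and directory names, and supports far more than plain substitution.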
There are a lot of features provided by Cookiecutter and a lot of project templates that you can use and adapt to your liking.
This work is licensed under a Creative Commons
Attribution 4.0 International License.
Thanks for a great blog with a lot of great content such as cookiecutter which I have begun to implement in my projects.
It seems like you have one comma too many in the final line of your JSON file.
Cheers
Thanks Simon! In the file on my computer, I actually didn’t have the comma. Don’t know how it snuck in 🙂 I’ve updated the post.