Please refer to Setting up Windows for bioinformatics in 2019.
I use Windows on all of my computers. Using just Windows for bioinformatics is not impossible, but it's really much easier to have access to a Linux operating system. My desktop PC has a dual-boot setup (Ubuntu and Windows 8). My laptop came pre-installed with Windows 8, which made it a pain to set up Ubuntu, so there I use VirtualBox to have access to Linux.
Each time I set up a new computer, I always install the list of programs below:
Putty: SSH client
Xming X Server: X Window System Server for Windows
WordWeb: handy dictionary program, which looks up any word you highlight
Launchy: a keystroke program, for quick access to programs
7zip: general purpose zip program
R for Windows
RStudio
Opera: still one of my favourite web browsers and email clients
Avast: antivirus
ActivePerl
Cygwin: Unix-like environment and command-line tools on Windows
Dropbox: cloud file sharing program
VirtualBox: virtualisation software
My Linux distribution of choice is Ubuntu, and with VirtualBox you can run Ubuntu inside your Windows installation. Below I outline a list of must-have Linux bioinformatics tools for those working in genomics and transcriptomics.
Ubuntu
Download Ubuntu (I would recommend Ubuntu 12.04 LTS). After installing VirtualBox and Ubuntu, here's what I installed immediately:
# VirtualBox guest additions
sudo ./VBoxLinuxAdditions.run

# git (needed below to clone BEDTools2)
sudo apt-get install git-core

# zlib headers (for bwa)
sudo apt-get install zlib1g-dev

# download bwa: http://sourceforge.net/projects/bio-bwa/files/
tar -xjf bwa-0.7.5a.tar.bz2
cd bwa-0.7.5a/
make
cd ..

# ncurses headers (for samtools)
sudo apt-get install ncurses-dev

# download SAMTools: http://sourceforge.net/projects/samtools/files/
tar -xjf samtools-0.1.19.tar.bz2
cd samtools-0.1.19/
make
cd ..

# BEDTools2
sudo apt-get install build-essential g++
git clone https://github.com/arq5x/bedtools2.git
cd bedtools2
make clean; make all
cd ..

# FASTX-Toolkit: http://hannonlab.cshl.edu/fastx_toolkit/download.html
# libgtextutils is a dependency and has to be installed first
wget http://hannonlab.cshl.edu/fastx_toolkit/libgtextutils-0.6.1.tar.bz2
tar -xjf libgtextutils-0.6.1.tar.bz2
cd libgtextutils-0.6.1/
./configure
make
make check
sudo make install
cd ..

wget http://hannonlab.cshl.edu/fastx_toolkit/fastx_toolkit-0.0.13.2.tar.bz2
tar -xjf fastx_toolkit-0.0.13.2.tar.bz2
cd fastx_toolkit-0.0.13.2/
./configure
make
sudo make install
cd ..
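The builds of bwa, SAMTools and BEDTools2 leave their binaries inside the build directories, so they are not on your PATH until you add them. The snippet below is a minimal sketch of a sanity check, not part of the original setup: `check_tools` is a hypothetical helper name, and the `$HOME/src` location in the commented-out PATH line is an assumption about where the archives were unpacked.

```shell
# Report which of the given commands can be found on PATH.
# Returns 0 if all of them are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: everything was unpacked and built under $HOME/src (hypothetical path).
# Uncomment and adjust to wherever the build directories actually live.
# PATH="$PATH:$HOME/src/bwa-0.7.5a:$HOME/src/samtools-0.1.19:$HOME/src/bedtools2/bin"

check_tools bwa samtools bedtools fastx_trimmer git || echo "some tools are not on PATH yet"
```

FASTX-Toolkit and git are installed system-wide by `make install` and `apt-get`, so they should be found without any PATH change; `fastx_trimmer` is used here as a representative FASTX-Toolkit command.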
To share folders between VirtualBox and Windows, click on Devices/Shared Folders. Afterwards, add your user to the vboxsf group:
sudo adduser `whoami` vboxsf
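Group changes only apply to new login sessions, so the shared folder may still look unreadable right after running `adduser`. A small sketch for checking whether the current session already has the group; `group_in_list` is a hypothetical helper name, not something provided by VirtualBox.

```shell
# Succeeds if group name $1 appears in the space-separated list $2.
group_in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG prints the group names of the current session.
if group_in_list vboxsf "$(id -nG)"; then
    echo "vboxsf membership is active in this session"
else
    echo "not in vboxsf yet; log out and back in after running adduser"
fi
```

With auto-mount enabled, shared folders typically show up in the guest under /media with an sf_ prefix, which is where you would look once the group membership is active.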
See also my post on installing R on Ubuntu.
Others
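# download your favourite genome (hg19 for me)
wget -O hg19.tar.gz http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
# unpack the per-chromosome FASTA files
tar -xzf hg19.tar.gz
# drop the unlocalised (random), unplaced (chrUn) and alternate haplotype sequences
rm *random.fa
rm chrUn_gl0002*
rm *hap*.fa
# concatenate the remaining chromosomes in version-sorted order
for file in `ls *.fa | sort -k1V`; do echo $file; cat $file >> hg19.fa; done
rm chr*.fa

# download blat and make it executable
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/blat/blat
chmod +x blat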
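Before deleting the per-chromosome files it is worth checking that the combined FASTA holds the sequences you expect, in the order you expect. The sketch below shows the idea on tiny stand-in files with made-up sequences; `fasta_names` is a hypothetical helper, and `sort -V` assumes GNU sort or another sort with version-sort support.

```shell
# Print the sequence names of a FASTA file, one per line, in file order.
fasta_names() {
    grep '^>' "$1" | sed 's/^>//'
}

# Tiny stand-in files (made-up sequences) to illustrate the ordering.
tmp=$(mktemp -d)
for chr in chr1 chr10 chr2 chrX; do
    printf '>%s\nACGT\n' "$chr" > "$tmp/$chr.fa"
done

# Same idea as the concatenation loop: version sort puts chr2 before chr10.
for file in $(ls "$tmp"/*.fa | sort -V); do
    cat "$file" >> "$tmp/combined.fasta"
done

fasta_names "$tmp/combined.fasta"
```

This prints chr1, chr2, chr10 and chrX, one per line. On the real data you would run `fasta_names hg19.fa`; assuming the archive contains chr1 to chr22, chrX, chrY and chrM once the random, chrUn and haplotype files are removed, it should list 25 names.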
Conclusions
Most personal computers available these days are powerful enough to run a virtualised installation of an operating system. In my humble opinion, if you're setting up Windows for bioinformatics, the easiest thing to do is to install VirtualBox and Ubuntu, and then install the bioinformatics programs in that Ubuntu instance.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Hi Dave,
Thank you, this is really useful!
Great blog, btw 🙂
Cheers,
Debora
Hi Debora,
you found me 🙂 Glad you found it useful!
Cheers,
Dave
Hi Dave,
Love the website. I’m just getting into bioinformatics in an amateur capacity and was wondering if you could give me some advice. I have a fairly modern Windows desktop with an i7 and 32 GB of RAM. When installing Linux in a virtualized setting, what sort of performance hit do you usually encounter? Is it possible to make use of large amounts of RAM when running in VirtualBox? I understand that this is an old post, but I’d greatly appreciate some help with this.
Best wishes
Hi David,
I think overall it worked well; I don’t remember running into any performance issues. That being said, I never ran any heavy computational jobs on my virtual machines; I always used the compute servers at work for that.
Have a look at Docker (https://www.docker.com/) too. It’s like a lightweight virtual machine that you can work and develop in, and share with people. I’ve been meaning to write a blog post on Docker and bioinformatics but haven’t had time yet.
Have fun,
Dave