Updated 20th May 2014: I created this post on the 19th of August 2011, when I had just purchased yet another computer (Core i7 2600, 8 GB of RAM and a GTX 460) and was setting it up for bioinformatics. I updated this page around mid-2013, when I bought another laptop (Asus N56V, Core i7 3630QM with 12 GB of RAM) and was setting it up for work. I'm updating this page again because I'm seeing an increase in traffic to it (and thankfully not because I have purchased another computer!).
I use Windows on all of my computers. Using just Windows for bioinformatics is not impossible, but it's much easier to have access to a Linux operating system. My desktop PC has a dual-boot setup (Ubuntu and Windows 8). My laptop came pre-installed with Windows 8, which makes it a pain to set up Ubuntu, so there I use VirtualBox to have access to Linux.
Each time I set up a new computer, I install the programs below:
PuTTY: SSH client
Xming X Server: X Window System Server for Windows
WordWeb: handy dictionary program, which looks up any word you highlight
Launchy: a keystroke launcher, for quick access to programs
7-Zip: general purpose archiving program
R for Windows
Opera: still one of my favourite web browsers and email clients
Cygwin: Unix-like environment and command-line tools on Windows
Dropbox: cloud file sharing program
VirtualBox: virtualisation software
My Linux distribution of choice is Ubuntu, and with VirtualBox you can run Ubuntu inside your Windows installation. Below I outline a list of must-have Linux bioinformatics tools for those working in genomics and transcriptomics.
Download Ubuntu (I would recommend Ubuntu 12.04 LTS). After installing VirtualBox and Ubuntu, here's what I installed immediately:
# VirtualBox guest additions
sudo ./VBoxLinuxAdditions.run

# zlib (for bwa)
sudo apt-get install zlib1g-dev

# download bwa: http://sourceforge.net/projects/bio-bwa/files/
tar -xjf bwa-0.7.5a.tar.bz2
cd bwa-0.7.5a/
make
cd ..

# install ncurses (for samtools)
sudo apt-get install ncurses-dev

# download SAMTools: http://sourceforge.net/projects/samtools/files/
tar -xjf samtools-0.1.19.tar.bz2
cd samtools-0.1.19/
make
cd ..

# git (needed to clone BEDTools2)
sudo apt-get install git

# for BEDTools2
sudo apt-get install build-essential g++
git clone https://github.com/arq5x/bedtools2.git
cd bedtools2
make clean; make all
cd ..

# FASTX-Toolkit: http://hannonlab.cshl.edu/fastx_toolkit/download.html
wget http://hannonlab.cshl.edu/fastx_toolkit/libgtextutils-0.6.1.tar.bz2
tar -xjf libgtextutils-0.6.1.tar.bz2
cd libgtextutils-0.6.1/
./configure
make
make check
sudo make install
cd ..
wget http://hannonlab.cshl.edu/fastx_toolkit/fastx_toolkit-0.0.13.2.tar.bz2
tar -xjf fastx_toolkit-0.0.13.2.tar.bz2
cd fastx_toolkit-0.0.13.2/
./configure
make
sudo make install
cd ..
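Note that bwa, samtools and bedtools are built in place by make and are not installed system-wide, so they only become available as commands once their build directories are on your PATH. The snippet below is a minimal sketch of how you might check this. The `check_tools` function is my own helper name, not part of any of these packages, and the `~/src` location in the commented-out line is only an assumption about where you unpacked the sources.

```shell
# Report which of the given commands can be found on PATH.
# Prints "ok <name>" or "missing <name>" for each one and
# returns non-zero if at least one is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok $tool"
        else
            echo "missing $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: the sources were unpacked and built under ~/src.
# Adjust the paths and uncomment (or add the line to ~/.bashrc).
# export PATH="$HOME/src/bwa-0.7.5a:$HOME/src/samtools-0.1.19:$HOME/src/bedtools2/bin:$PATH"

check_tools bwa samtools bedtools fastx_trimmer git || true
```

Anything reported as missing either failed to build or simply is not on your PATH yet.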
To share folders between VirtualBox and Windows, use Devices/Shared Folders. Afterwards, add your user to the vboxsf group (you need to log out and back in for the change to take effect):
sudo adduser `whoami` vboxsf
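If the shared folder still gives you a permission error, it is worth checking whether the group membership is active in your current session. Here is a rough sketch of such a check. `has_group` is a hypothetical helper of mine, and the `/media/sf_<name>` location is where auto-mounted shares usually appear in a Linux guest, which may differ on your setup.

```shell
# Succeeds if group $1 appears in the space-separated group list $2.
# Without a second argument, the groups of the current login session
# (as reported by `id -nG`) are used.
has_group() {
    local wanted="$1"
    local list="${2:-$(id -nG)}"
    local g
    for g in $list; do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

if has_group vboxsf; then
    echo "vboxsf is active; auto-mounted shares are usually under /media/sf_<name>"
else
    echo "vboxsf is not active in this session; log out and back in after running adduser"
fi
```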
See also my post on installing R on Ubuntu.
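# download your favourite genome
# hg19 for me
wget -O hg19.tar.gz http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz
# unpack the per-chromosome FASTA files into the current directory
tar -xzf hg19.tar.gz
# remove the random, unplaced and haplotype sequences
rm *random.fa
rm chrUn_gl0002*
rm *hap*.fa
# concatenate the remaining chromosomes in version-sorted order
for file in `ls *.fa | sort -k1V`; do echo "$file"; cat "$file" >> hg19.fa; done
rm chr*.fa

# download blat and make it executable
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/blat/blat
chmod +x blat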
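The version sort in the loop above is what puts chr2 before chr10; a plain alphabetical sort would not. Below is a small sketch of the same concatenation wrapped in a function, with a toy example you can run without downloading anything. `concat_fasta` is my own helper name, the three tiny FASTA files are made up purely for the demonstration, and `sort -V` assumes GNU sort, which is what Ubuntu ships.

```shell
# Concatenate the chr*.fa files found in directory $1 into the file $2,
# ordering them by version sort so that chr2 comes before chr10.
# Assumes GNU sort (-V) and file names without spaces or newlines.
concat_fasta() {
    local dir="$1"
    local out="$2"
    printf '%s\n' "$dir"/chr*.fa | sort -V | while IFS= read -r fa; do
        cat "$fa"
    done > "$out"
}

# Toy demonstration with made-up sequences in a temporary directory.
demo=$(mktemp -d)
printf '>chr10\nACGT\n' > "$demo/chr10.fa"
printf '>chr2\nGGCC\n' > "$demo/chr2.fa"
printf '>chr1\nTTAA\n' > "$demo/chr1.fa"

concat_fasta "$demo" "$demo/genome.fa"

# List the sequence headers; they come out as >chr1, >chr2, >chr10.
grep '^>' "$demo/genome.fa"
rm -rf "$demo"
```

Running `grep -c '^>'` on your real hg19.fa is a quick sanity check that the number of sequences matches the number of chromosome files you kept.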
Most personal computers available these days are powerful enough to run a virtualised installation of an operating system. In my humble opinion, if you're setting up Windows for bioinformatics, the easiest thing to do is to install VirtualBox and Ubuntu, and then install the bioinformatics programs in that Ubuntu instance.