XHMM
XHMM (eXome-Hidden Markov Model)
The XHMM C++ software suite calls copy number variation (CNV) from next-generation sequencing projects that used exome capture (or, more generally, targeted sequencing).
XHMM uses principal component analysis (PCA) normalisation and a hidden Markov model (HMM) to detect and genotype CNV from normalised read-depth data generated by targeted sequencing experiments.
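To make the HMM step concrete, here is a minimal sketch of Viterbi decoding over per-target read-depth z-scores for a single sample. It is an illustration under stated assumptions, not XHMM's actual implementation: the three states (DEL, DIP, DUP), the emission means of -3, 0 and +3 with unit variance, the function names, and the transition probabilities `p_start` and `p_stay` are all chosen for the example, and the transitions ignore the distance between targets.

```python
import math

STATES = ("DEL", "DIP", "DUP")
# Assumed emission means for normalised z-scores (unit variance); illustrative only.
MEANS = {"DEL": -3.0, "DIP": 0.0, "DUP": 3.0}


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_normal_pdf(x, mean, sd=1.0):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def make_log_transitions(p_start, p_stay):
    """Hypothetical, distance-independent transition matrix in log space."""
    probs = {
        "DIP": {"DEL": p_start, "DIP": 1 - 2 * p_start, "DUP": p_start},
        "DEL": {"DEL": p_stay, "DIP": 1 - p_stay, "DUP": 0.0},
        "DUP": {"DEL": 0.0, "DIP": 1 - p_stay, "DUP": p_stay},
    }
    return {r: {s: safe_log(p) for s, p in row.items()} for r, row in probs.items()}


def viterbi(z_scores, p_start=1e-3, p_stay=0.8):
    """Return the most likely state path for a list of per-target z-scores."""
    if not z_scores:
        return []
    log_trans = make_log_transitions(p_start, p_stay)
    log_init = log_trans["DIP"]  # assumption: start from the diploid row
    # score[s] is the best log-probability of any path ending in state s.
    score = {s: log_init[s] + log_normal_pdf(z_scores[0], MEANS[s]) for s in STATES}
    back = []  # back[i][s] is the best predecessor of state s at target i + 1
    for z in z_scores[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + log_trans[r][s])
            pointers[s] = prev
            new_score[s] = score[prev] + log_trans[prev][s] + log_normal_pdf(z, MEANS[s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Three targets with strongly negative z-scores flanked by near-zero targets.
    print(viterbi([0.1, -0.2, 0.3, -3.1, -2.8, -3.3, 0.2, 0.0]))
```

A run of consecutive targets with strongly negative z-scores is decoded as a deletion only when the combined emission evidence outweighs the cost of entering and leaving the DEL state, which is why a single noisy target is usually left as diploid.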
XHMM was explicitly designed for targeted exome sequencing at high coverage (at least 60x to 100x) using Illumina HiSeq (or similar) sequencing of at least ~50 samples. However, nothing in XHMM strictly requires these particular experimental conditions; it only needs high coverage of the targeted genomic regions across many samples.
http://atgu.mgh.harvard.edu/xhmm/index.shtml
Installing
git clone https://bitbucket.org/statgen/xhmm.git
cd xhmm
make
make R
Then in R
install.packages("gplots")
install.packages("plotrix")
install.packages("xhmmScripts")
Download test data
wget http://atgu.mgh.harvard.edu/xhmm/EXAMPLE_BAMS.zip
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/exome_alignment/HG00096.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam