XHMM

XHMM (eXome-Hidden Markov Model)

The XHMM C++ software suite calls copy number variation (CNV) from next-generation sequencing data generated with exome capture or, more generally, targeted sequencing.

XHMM normalises read-depth data from targeted sequencing experiments with principal component analysis (PCA), then uses a hidden Markov model (HMM) to detect and genotype CNVs in the normalised data.
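
To make the PCA normalisation idea concrete, here is a minimal pure-Python sketch. It is not XHMM's C++ implementation: the function names, the power-iteration approach and the `n_remove` parameter are all assumptions made for illustration. It takes a samples-by-targets read-depth matrix, centres each target, and subtracts the strongest shared component, which in real data tends to reflect batch or capture effects rather than CNVs.

```python
import math
import random


def centre_by_target(depth):
    """Subtract each target's (column's) mean depth across samples."""
    n_samples = len(depth)
    means = [sum(column) / n_samples for column in zip(*depth)]
    return [[value - mean for value, mean in zip(row, means)] for row in depth]


def leading_component(matrix, iterations=100, seed=0):
    """Approximate the first principal axis over targets by power iteration."""
    rng = random.Random(seed)
    axis = [rng.gauss(0.0, 1.0) for _ in matrix[0]]
    for _ in range(iterations):
        # Project every sample onto the current axis, then map back to target space.
        scores = [sum(v * a for v, a in zip(row, axis)) for row in matrix]
        updated = [sum(s * row[j] for s, row in zip(scores, matrix))
                   for j in range(len(axis))]
        norm = math.sqrt(sum(u * u for u in updated))
        if norm == 0.0:  # no variance left to explain
            break
        axis = [u / norm for u in updated]
    return axis


def remove_top_components(depth, n_remove=1):
    """Centre the matrix, then subtract its n_remove strongest components.

    n_remove is a hypothetical parameter for this sketch only.
    """
    residual = centre_by_target(depth)
    for _ in range(n_remove):
        axis = leading_component(residual)
        cleaned = []
        for row in residual:
            score = sum(v * a for v, a in zip(row, axis))
            cleaned.append([v - score * a for v, a in zip(row, axis)])
        residual = cleaned
    return residual
```

Calling `remove_top_components(depth, n_remove=1)` on a list of per-sample depth lists returns the residual matrix. Depth changes that do not follow the removed component stay in the residuals, and that kind of normalised signal is what an HMM would then scan for CNVs.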

XHMM was explicitly designed for targeted exome sequencing at high coverage (at least 60x to 100x) using Illumina HiSeq (or similar) sequencing of at least ~50 samples. However, no part of XHMM strictly requires these experimental conditions; it only needs high coverage of the targeted genomic regions across many samples.

http://atgu.mgh.harvard.edu/xhmm/index.shtml

Installing

git clone https://bitbucket.org/statgen/xhmm.git
cd xhmm
make
make R

Then in R

install.packages("gplots")
install.packages("plotrix")
install.packages("xhmmScripts")

Download test data

wget http://atgu.mgh.harvard.edu/xhmm/EXAMPLE_BAMS.zip
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/exome_alignment/HG00096.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam