Chromatin state
ENCODE chromatin states
http://www.ncbi.nlm.nih.gov/pubmed/21441907
The study profiled nine human cell types:
#Cell line information: see http://genome.ucsc.edu/ENCODE/cellTypes.html
H1ES - H1 human embryonic stem cells
K562 - an immortalized cell line produced from a female patient with chronic myelogenous leukemia (CML)
GM12878 - a lymphoblastoid cell line produced from the blood of a female donor with northern and western European ancestry by EBV transformation
HepG2 - a cell line derived from a male patient with liver carcinoma
HUVEC - human umbilical vein endothelial cells with a normal karyotype
HSMM - skeletal muscle myoblasts from the mesoderm lineage and muscle tissue with a normal karyotype
NHLF - lung fibroblasts from the endoderm lineage and lung tissue with a normal karyotype
NHEK - epidermal keratinocytes from the ectoderm lineage and skin with a normal karyotype
HMEC - mammary epithelial cells from the ectoderm lineage and breast tissue with a normal karyotype
fullpath=http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/
wget ${fullpath}wgEncodeBroadHmmGm12878HMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmH1hescHMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmHepg2HMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmHmecHMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmHsmmHMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmHuvecHMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmK562HMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmNhekHMM.bed.gz
wget ${fullpath}wgEncodeBroadHmmNhlfHMM.bed.gz
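The nine download commands differ only in the cell-type part of the file name, so they can be generated in a loop. The following is a minimal sketch, assuming the same base URL and file names as above; `list_urls` is a hypothetical helper name introduced here for illustration.

```shell
# Base URL copied from the commands above
fullpath=http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/

# list_urls (hypothetical helper) prints one download URL per cell type
list_urls() {
  for cell in Gm12878 H1hesc Hepg2 Hmec Hsmm Huvec K562 Nhek Nhlf; do
    echo "${fullpath}wgEncodeBroadHmm${cell}HMM.bed.gz"
  done
}
```

To fetch the files, the list can be passed to wget on standard input, for example `list_urls | wget -i -`.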
#the different states
zcat wg*.bed.gz | cut -f4 | sort -un
1_Active_Promoter
2_Weak_Promoter
3_Poised_Promoter
4_Strong_Enhancer
5_Strong_Enhancer
6_Weak_Enhancer
7_Weak_Enhancer
8_Insulator
9_Txn_Transition
10_Txn_Elongation
11_Weak_Txn
12_Repressed
13_Heterochrom/lo
14_Repetitive/CNV
15_Repetitive/CNV
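The listing above shows which state labels exist but not how often each one occurs. The following is a minimal sketch that tallies regions per state, assuming the state label is in column 4 as in the command above; `count_states` is a hypothetical helper name, and `gzip -dc` is used as an equivalent of `zcat`.

```shell
# count_states (hypothetical helper): print "state<TAB>count" for the given
# gzipped BED files, ordered by the numeric prefix of the state label
count_states() {
  gzip -dc "$@" \
    | cut -f4 \
    | sort \
    | uniq -c \
    | awk '{print $2 "\t" $1}' \
    | sort -n
}
```

It can be run on all downloaded files with `count_states wg*.bed.gz`, or on a single file to get the tally for one cell type.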
#how many regions (lines) in each bed file
for file in *.gz; do echo "$file"; zcat "$file" | wc -l; done
wgEncodeBroadHmmGm12878HMM.bed.gz
571339
wgEncodeBroadHmmH1hescHMM.bed.gz
619061
wgEncodeBroadHmmHepg2HMM.bed.gz
546343
wgEncodeBroadHmmHmecHMM.bed.gz
609251
wgEncodeBroadHmmHsmmHMM.bed.gz
638969
wgEncodeBroadHmmHuvecHMM.bed.gz
549915
wgEncodeBroadHmmK562HMM.bed.gz
622257
wgEncodeBroadHmmNhekHMM.bed.gz
628266
wgEncodeBroadHmmNhlfHMM.bed.gz
641016
#number of strong enhancers identified in all cell lines
zcat wg*.bed.gz | cut -f4 | grep Strong_Enhancer | wc
574810 574810 10346580
#number of weak enhancers identified in all cell lines
zcat wg*.bed.gz | cut -f4 | grep Weak_Enhancer | wc
1680951 1680951 26895216
#total number of enhancers identified in all cell lines
#sanity check
zcat wg*.bed.gz | cut -f4 | grep -i enhancer | wc
2255761 2255761 37241796
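The sanity check works because every enhancer label is either strong or weak, so the two separate counts should add up to the combined count. A minimal sketch of that arithmetic, using the line counts reported above (the variable names are chosen here for illustration):

```shell
# Line counts copied from the wc output above
strong=574810
weak=1680951
total=2255761

# The strong and weak counts should sum to the case-insensitive "enhancer" count
if [ $((strong + weak)) -eq "$total" ]; then
  echo "counts agree"
else
  echo "counts differ by $((total - strong - weak))"
fi
```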
#store all the enhancer regions
zcat wg*.bed.gz | grep -i enhancer > enhancer.bed
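Because the files are concatenated before filtering, enhancer.bed no longer records which cell type each region came from. The following is a minimal sketch that keeps that information by appending a cell-type column derived from the file name. `tag_enhancers` is a hypothetical helper name, the extra column is an addition that is not in the ENCODE files, and the match is restricted to column 4, which is slightly stricter than the whole-line grep above.

```shell
# tag_enhancers (hypothetical helper): print enhancer regions from each
# gzipped BED file with the cell type appended as an extra last column
tag_enhancers() {
  for f in "$@"; do
    # wgEncodeBroadHmmGm12878HMM.bed.gz -> Gm12878
    cell=$(basename "$f" HMM.bed.gz)
    cell=${cell#wgEncodeBroadHmm}
    gzip -dc "$f" \
      | awk -v cell="$cell" 'BEGIN { FS = OFS = "\t" }
          tolower($4) ~ /enhancer/ { print $0, cell }'
  done
}
```

For example, `tag_enhancers wg*.bed.gz > enhancer_by_cell.bed` would write the tagged regions to a file; the output file name is only an illustration.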