Using GO.db and GOstats, I took the list of genes with bona fide CpG islands upstream and ran a GO enrichment analysis on it. As before, the gene universe is all RefSeq gene models. Enriched biological processes include:
1 primary metabolic process
2 branched chain family amino acid metabolic process
3 regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process
4 regulation of transcription, DNA-dependent
5 nitrogen compound metabolic process
6 cellular metabolic process
7 mitotic cell cycle checkpoint
8 nucleobase, nucleoside and nucleotide metabolic process
9 cAMP biosynthetic process
10 G-protein signaling, coupled to cAMP nucleotide second messenger
11 chromatin organization
12 cellular biopolymer metabolic process
Molecular functions:
1 sequence-specific DNA binding
2 transcription factor activity
Cellular components:
1 nucleus
2 membrane-bounded organelle
3 chromosomal part
4 cell
5 intracellular part
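
For readers unfamiliar with the test behind these lists: GOstats scores each GO term with a hypergeometric over-representation test, comparing how many genes in the selected list carry the term against how many carry it in the gene universe. The sketch below reproduces only that arithmetic in plain Python. It is not the R code used for this analysis, and the function name and the toy counts are hypothetical, chosen purely for illustration.

```python
from math import comb


def enrichment_p_value(universe_size, annotated_in_universe,
                       selected_size, annotated_in_selected):
    """Upper-tail hypergeometric p-value, P(X >= annotated_in_selected).

    universe_size          number of genes in the gene universe
    annotated_in_universe  universe genes annotated with the GO term
    selected_size          genes in the list of interest
    annotated_in_selected  selected genes annotated with the GO term
    """
    N, K, n, k = (universe_size, annotated_in_universe,
                  selected_size, annotated_in_selected)
    # Number of ways to draw n genes from the universe.
    total = comb(N, n)
    # Sum the probabilities of seeing k, k+1, ... annotated genes.
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical toy counts, NOT taken from the analysis above:
    # 20 genes in the universe, 5 annotated with the term,
    # 4 genes selected, 2 of them annotated.
    print(enrichment_p_value(20, 5, 4, 2))
```

A small p-value means the term shows up in the selected list more often than expected by chance given the universe, which is why the choice of universe (here, all RefSeq gene models) directly affects which terms come out as enriched.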
So, are transcription factors more likely to have CpG islands in the 1,000 bp upstream region, without overlapping the 5' UTR?

This work is licensed under a Creative Commons
Attribution 4.0 International License.