Abstract
DNA methylation, especially position-sensitive co-methylation of CpG islands (CGIs), is one of the key epigenomic mechanisms of gene expression regulation and chromosomal integrity. Therefore, thoroughly mapping the precise position of all CpG sequences within CGIs and non-island clusters, as well as their methylation status at the single-cell level under different physiological and pathological conditions, is one of the ultimate goals of epigenomics. Toward this end, we compare our previously categorized position-defined CpG and methylation sites with the complementary density-defined CpG islands to investigate the patterns of these two categories of methylation sites relative to human gene expression regulation. Based on our previous analysis of LAUPs (lineage-associated underrepresented permutations) and the discovery that CpG-containing sequences are underrepresented when the distance between CpG sequences ranges from 10 bp to 14 bp, we define such distances as discrete intervals at base-pair precision and compute three position-defined CGI groups according to interval length: 12 bp, 25 bp, and 50 bp. These groups cover 1.85 times as many CpG sites (14.98%) as density-defined CGIs (8.08%). This novel scheme reveals: (1) There are three partially overlapping yet distinct position-defined CGI subgroups in the human genome. (2) The 12-bp CGIs appear unique to low-density CGIs (LCGIs), whereas the other two groups, 25-bp and 50-bp, are found in all three density-defined CGI classes. (3) The largest fractions of unmethylated (75.99%) and moderately methylated (12.91%) core promoter-associated CGIs are found among the 12-bp CGIs, with smaller fractions among the 50-bp CGIs (41.77% for HCGIs and 20.03% for ICGIs) of the same sequence region. We conclude that in the precision medicine era, all CpG sites and their clusters should be mapped, annotated, and modelled for gene expression regulation at single-base-pair precision.
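The abstract does not spell out how a position-defined CGI is computed, so the following Python sketch is only one plausible reading. It assumes that a position-defined CGI is a maximal run of CpG sites in which each consecutive pair lies within the chosen interval length (12, 25, or 50 bp), that distance is measured between the start coordinates of the CpG dinucleotides, and that a cluster needs at least two sites. The function names `cpg_positions` and `cluster_by_interval` and the parameter `min_sites` are hypothetical and are not taken from the paper.

```python
import re
from typing import List, Tuple


def cpg_positions(seq: str) -> List[int]:
    """Return the 0-based start position of every CG dinucleotide in seq."""
    return [m.start() for m in re.finditer("CG", seq.upper())]


def cluster_by_interval(
    positions: List[int], max_gap: int, min_sites: int = 2
) -> List[Tuple[int, int, int]]:
    """Group sorted CpG positions into clusters (assumed definition).

    Consecutive CpGs whose start positions differ by at most max_gap are
    placed in the same cluster. Each cluster is reported as
    (start, end_exclusive, number_of_cpg_sites).
    """
    clusters: List[Tuple[int, int, int]] = []
    if not positions:
        return clusters
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev <= max_gap:
            # Still within the interval: extend the current cluster.
            prev = pos
            count += 1
        else:
            # Gap too large: close the current cluster and start a new one.
            if count >= min_sites:
                clusters.append((start, prev + 2, count))
            start = prev = pos
            count = 1
    if count >= min_sites:
        clusters.append((start, prev + 2, count))
    return clusters


# Toy sequence: two CpGs 4 bp apart, a 27-bp gap, then two CpGs 2 bp apart.
toy = "ACGTTCG" + "A" * 25 + "CGCG"
sites = cpg_positions(toy)
for interval in (12, 25, 50):
    print(interval, cluster_by_interval(sites, interval))

# Coverage ratio quoted in the abstract: 14.98% vs. 8.08% of CpG sites.
print(round(14.98 / 8.08, 2))
```

On the toy sequence, the 12-bp and 25-bp thresholds each yield two separate clusters, while the 50-bp threshold merges all four sites into one. This illustrates how, under the assumed definition, the three interval lengths can give partially overlapping yet distinct groupings. The last line reproduces the 1.85-fold coverage figure from the two percentages stated in the abstract.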
Funding
This work was supported by grants from the National Science and Technology Major Project (Grant No. 2018ZX10201002, China), the National Natural Science Foundation of China (82001409, China), the China Postdoctoral Science Foundation (2020M673221, China), and the Fundamental Research Funds for the Central Universities (2020SCU12056, China).
Ethics declarations
Conflict of Interest
None declared.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xiao, M. et al. (2022). Position-Defined CpG Islands Provide Complete Co-methylation Indexing for Human Genes. In: Huang, DS., Jo, KH., Jing, J., Premaratne, P., Bevilacqua, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2022. Lecture Notes in Computer Science, vol 13394. Springer, Cham. https://doi.org/10.1007/978-3-031-13829-4_27
DOI: https://doi.org/10.1007/978-3-031-13829-4_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13828-7
Online ISBN: 978-3-031-13829-4
eBook Packages: Computer Science (R0)