Position-Defined CpG Islands Provide Complete Co-methylation Indexing for Human Genes

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13394)

Abstract

DNA methylation, especially position-sensitive co-methylation of CpG islands (CGIs), is one of the key epigenomic mechanisms regulating gene expression and maintaining chromosomal integrity. Thoroughly mapping the precise position of every CpG sequence within CGIs and non-island clusters, together with its methylation status at the single-cell level under different physiological and pathological conditions, is therefore one of the ultimate goals of epigenomics. Toward this end, we compare our previously categorized position-defined CpG and methylation sites with the complementary sites of density-defined CpG islands, and investigate the patterns of these two categories of methylation sites relative to the regulation of human gene expression. Building on our previous analysis of LAUPs (lineage-associated underrepresented permutations) and the discovery that CpG-containing sequences are underrepresented when the distance between CpG sequences ranges from 10 bp to 14 bp, we define such distances as discrete intervals at basepair precision and compute three groups of position-defined CGIs according to interval length: 12 bp, 25 bp, and 50 bp. Together these groups cover 1.85 times as many CpG sites (14.98%) as density-defined CGIs do (8.08%). This novel scheme reveals the following. (1) There are three partially overlapping yet distinct position-defined CGI subgroups in the human genome. (2) The 12-bp CGIs appear unique to low-density CGIs (LCGIs), whereas the other two groups, 25-bp and 50-bp, are found in all three density-defined CGI classes. (3) The largest fractions of unmethylated (75.99%) and moderately methylated (12.91%) core-promoter-associated CGIs are found among the 12-bp CGIs; the fractions are smaller among the 50-bp CGIs of the same sequence region (41.77% for HCGIs and 20.03% for ICGIs). (4) We conclude that, in the era of precision medicine, all CpG sites and their clusters should be mapped, annotated, and modelled for gene expression regulation at single-basepair precision.
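
To make the idea of interval-based grouping concrete, the following is a minimal Python sketch of distance-based CpG clustering. It is not the authors' pipeline: the abstract does not spell out the exact rule, so the sketch assumes that a "position-defined" cluster is a run of CpG sites in which consecutive sites start at most `max_gap` bp apart, with the 12, 25, and 50 bp interval lengths used as that threshold. The function names `cpg_positions` and `cluster_by_gap`, and the `min_sites` parameter, are hypothetical.

```python
from typing import List, Tuple


def cpg_positions(seq: str) -> List[int]:
    """Return the 0-based start position of every CG dinucleotide in seq."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def cluster_by_gap(
    positions: List[int], max_gap: int, min_sites: int = 2
) -> List[Tuple[int, int, int]]:
    """Group sorted CpG positions into clusters.

    Assumption (not taken from the paper): two consecutive CpG sites belong
    to the same cluster when their start positions differ by at most max_gap.
    Returns (first_cpg_start, last_cpg_start, n_sites) for each cluster that
    holds at least min_sites sites.
    """
    clusters: List[Tuple[int, int, int]] = []
    if not positions:
        return clusters
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev <= max_gap:
            count += 1  # still inside the current cluster
        else:
            if count >= min_sites:
                clusters.append((start, prev, count))
            start = pos  # open a new cluster at this site
            count = 1
        prev = pos
    if count >= min_sites:
        clusters.append((start, prev, count))
    return clusters


if __name__ == "__main__":
    toy = "CGACG" + "T" * 20 + "CGCG"  # toy sequence, not real genomic data
    sites = cpg_positions(toy)
    for gap in (12, 25, 50):
        print(gap, cluster_by_gap(sites, gap))
```

On the toy sequence, a 12 bp threshold splits the four CpG sites into two clusters, while the 25 bp and 50 bp thresholds merge them into one, which illustrates how larger intervals yield partially overlapping but distinct groupings. The coverage ratio quoted above follows directly from the two percentages: 14.98 / 8.08 ≈ 1.85.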

Funding

This work was supported by grants from the National Science and Technology Major Project (Grant No. 2018ZX10201002, China), the National Natural Science Foundation of China (82001409, China), the China Postdoctoral Science Foundation (2020M673221, China), and the Fundamental Research Funds for the Central Universities (2020SCU12056, China).

Author information

Corresponding author

Correspondence to Le Zhang.

Ethics declarations

Conflict of Interest

None declared.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Xiao, M. et al. (2022). Position-Defined CpG Islands Provide Complete Co-methylation Indexing for Human Genes. In: Huang, DS., Jo, KH., Jing, J., Premaratne, P., Bevilacqua, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2022. Lecture Notes in Computer Science, vol 13394. Springer, Cham. https://doi.org/10.1007/978-3-031-13829-4_27

  • DOI: https://doi.org/10.1007/978-3-031-13829-4_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13828-7

  • Online ISBN: 978-3-031-13829-4

  • eBook Packages: Computer Science, Computer Science (R0)
