Short paper. DOI: 10.1145/2147805.2147852

A novel K-mer mixture logistic regression for methylation susceptibility modeling of CpG dinucleotides in human gene promoters

Published: 01 August 2011

Abstract

DNA methylation is essential for normal cell development and differentiation and plays a crucial role in the development of nearly all types of cancer. Although next-generation sequencing technologies now make it possible to assess human methylomes at base resolution, no reports currently exist on modeling cell type-specific DNA methylation susceptibility. We therefore conducted a comprehensive modeling study of cell type-specific DNA methylation susceptibility at three resolutions: CpG dinucleotides, CpG segments, and individual gene promoter regions.
Using a k-mer mixture logistic regression model, we effectively modeled DNA methylation susceptibility across five different cell types. The significance of these results is threefold: 1) this is the first report to indicate that CpG methylation-susceptible "segments" exist; 2) our model demonstrates the significance of certain k-mers for the mixture model, potentially highlighting DNA sequence features (k-mers) of differentially methylated promoter CpG island sequences across different tissue types; 3) because only 3 or 4 bp patterns had previously been used for modeling DNA methylation susceptibility, ours is the first demonstration that 6-mer modeling can be performed without loss of accuracy.
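
The abstract does not spell out how the k-mer mixture model is built, so the following is only a rough sketch of the general idea of feeding k-mer frequencies into a logistic regression. It assumes that "mixture" can be approximated by concatenating frequency vectors for several k-mer lengths, and it uses plain batch gradient descent; neither assumption is confirmed by the paper. All function names (`kmer_counts`, `mixture_features`, `train_logistic`, `predict_proba`) are hypothetical, as is the choice of `ks=(3, 4)`, which keeps the feature vector small (64 + 256 entries); a 6-mer vocabulary would add 4^6 = 4096 features.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def mixture_features(seq, ks=(3, 4)):
    """Concatenate normalized k-mer frequencies for each k in ks.

    Assumption: this concatenation stands in for the paper's "mixture";
    the actual construction is not described in the abstract.
    """
    features = []
    for k in ks:
        counts = kmer_counts(seq, k)
        total = sum(counts.values()) or 1  # avoid division by zero
        for kmer in map("".join, product("ACGT", repeat=k)):
            features.append(counts[kmer] / total)
    return features


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(seq, w, b, ks=(3, 4)):
    """Predicted probability of the positive class for one flanking sequence."""
    x = mixture_features(seq, ks)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

To try it, build `X` with `mixture_features` over a list of CpG flanking sequences, pair it with 0/1 methylation labels, and pass both to `train_logistic`. Any sequences and labels used this way would be toy inputs for illustration, not data from the study.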

Published In

BCB '11: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
August 2011, 688 pages
ISBN: 9781450307963
DOI: 10.1145/2147805
General Chairs: Robert Grossman, Andrey Rzhetsky
Program Chairs: Sun Kim, Wei Wang

Publisher

Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 254 of 885 submissions, 29%