Adapting Machine Learning Technique for Periodicity Detection in Nucleosomal Locations in Sequences

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4881)

Abstract

DNA sequence is an important determinant of the positioning, stability, and activity of nucleosomes, yet the molecular basis of these effects remains elusive. Positioned nucleosomes are believed to play an important role in transcriptional regulation and in the organization of chromatin in cell nuclei. With the completion of genome projects for many organisms, sequence mining has received considerable and increasing attention. Much effort has been devoted to detecting periodicity in DNA sequences, in particular in the DNA segments that wrap around the histone proteins. In this paper, we describe and apply a dynamic periodicity detection algorithm to discover periodicity in DNA sequences. Our algorithm uses a suffix tree as the underlying data structure. The proposed approach considers the periodicity of alternative substrings and uses a dynamic window to detect the periodicity of particular instances of substrings. We demonstrate the applicability and effectiveness of the proposed approach by reporting test results on three data sets.
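
To make the idea of substring periodicity concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a plain dictionary of occurrence positions as a stand-in for the suffix tree, and it scores a candidate period by the fraction of expected positions at which the substring actually occurs. The function names, the fixed substring length `k`, and the `min_conf` threshold are all hypothetical choices made for illustration; the dynamic window and alternative-substring handling mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def occurrence_positions(seq, k):
    """Map each length-k substring to the sorted list of its start positions.

    A simple stand-in (assumption) for the occurrence lists that a
    suffix tree would provide.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def periodicity_confidence(seq, pattern, period, start):
    """Fraction of positions start, start+period, ... where pattern occurs."""
    k = len(pattern)
    slots = range(start, len(seq) - k + 1, period)
    if len(slots) == 0:
        return 0.0
    hits = sum(1 for i in slots if seq[i:i + k] == pattern)
    return hits / len(slots)


def candidate_periods(seq, k, min_conf=0.8):
    """Report (pattern, period, start, confidence) for confident periods.

    Candidate periods are the gaps between consecutive occurrences of
    each length-k substring (a hypothetical heuristic for this sketch).
    """
    results = []
    for pattern, pos in occurrence_positions(seq, k).items():
        seen = set()
        for a, b in zip(pos, pos[1:]):
            period = b - a
            key = (period, a % period)  # skip repeats of the same phase
            if key in seen:
                continue
            seen.add(key)
            conf = periodicity_confidence(seq, pattern, period, a)
            if conf >= min_conf:
                results.append((pattern, period, a, conf))
    return results
```

For example, in the toy sequence `"AATCGAATCGAATCG"` the dinucleotide `AA` occurs at positions 0, 5, and 10, so `candidate_periods(seq, 2)` reports it with period 5 and confidence 1.0. A suffix tree would supply the same occurrence lists for substrings of every length at once, which is presumably why it suits this task better than the fixed-length index used here.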

Editor information

Hujun Yin, Peter Tino, Emilio Corchado, Will Byrne, Xin Yao

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rasheed, F., Alshalalfa, M., Alhajj, R. (2007). Adapting Machine Learning Technique for Periodicity Detection in Nucleosomal Locations in Sequences. In: Yin, H., Tino, P., Corchado, E., Byrne, W., Yao, X. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2007. IDEAL 2007. Lecture Notes in Computer Science, vol 4881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77226-2_87

  • DOI: https://doi.org/10.1007/978-3-540-77226-2_87

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77225-5

  • Online ISBN: 978-3-540-77226-2

  • eBook Packages: Computer Science, Computer Science (R0)
