Discovering the Most Characteristic Motif from a Set of Peak Sequences

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12108)

Abstract

Chromatin immunoprecipitation experiments followed by sequencing of the resulting fragments (ChIP-Seq) are subject to considerable uncertainty, owing to execution errors, technical and computational limitations, and the inherent complexity of the biological systems under study. Consequently, one challenge researchers face when analyzing ChIP-Seq results is elucidating the pattern behind the obtained sequences (peaks) amid a large volume of data and noise. Many statistical tools and algorithms have been proposed in recent years to address this problem. The method presented in this paper innovates by exploiting both the structure of the data obtained in these experiments (peaks) and existing resources. The motif, or pattern, obtained from the peaks by this procedure is considered the most characteristic motif. The method also yields quality metrics for the analyzed experiment. It has been validated with data retrieved from public repositories.

This work has been funded by the Spanish Ministry of Economy, Industry and Competitiveness, the European Regional Development Fund (ERDF) Programme through grant TIN2017-85949-C2-1-R and by the Spanish Ministry of Education, Culture and Sports through fellowship FPU014/06303.

Data Availability

The data generated in this work is available at https://github.com/gines-almagro/ChIP-Seq-motif.

Author information

Corresponding authors

Correspondence to Ginés Almagro-Hernández or Jesualdo Tomás Fernández-Breis.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Almagro-Hernández, G., Fernández-Breis, J.T. (2020). Discovering the Most Characteristic Motif from a Set of Peak Sequences. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2020. Lecture Notes in Computer Science, vol 12108. Springer, Cham. https://doi.org/10.1007/978-3-030-45385-5_40

  • DOI: https://doi.org/10.1007/978-3-030-45385-5_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-45384-8

  • Online ISBN: 978-3-030-45385-5

  • eBook Packages: Computer Science, Computer Science (R0)
