Discovering the Most Characteristic Motif from a Set of Peak Sequences

Almagro-Hernández, Ginés; Fernández-Breis, Jesualdo Tomás

doi:10.1007/978-3-030-45385-5_40

Conference paper
First Online: 30 April 2020

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12108)

Abstract

Chromatin immunoprecipitation followed by sequencing of the recovered fragments (ChIP-Seq) is subject to considerable uncertainty, owing to execution errors, technical and computational limitations, and the inherent complexity of the biological systems under study. Consequently, one of the challenges researchers face when analyzing ChIP-Seq results is elucidating the pattern behind the obtained sequences (peaks) amid a large volume of data and noise. Many statistical tools and algorithms have been proposed in recent years to address this problem. The method presented in this paper innovates by taking advantage of both the structure of the data obtained in these experiments (peaks) and existing resources. The motif, or pattern, that this procedure obtains from the peaks is considered the most characteristic motif. The method also yields quality metrics for the analyzed experiment. It has been validated with data retrieved from public repositories.

This work has been funded by the Spanish Ministry of Economy, Industry and Competitiveness, the European Regional Development Fund (ERDF) Programme through grant TIN2017-85949-C2-1-R and by the Spanish Ministry of Education, Culture and Sports through fellowship FPU014/06303.

Data Availability

The data generated in this work is available at https://github.com/gines-almagro/ChIP-Seq-motif.

Author information

Authors and Affiliations

Department of Informatics and Systems, Faculty of Computer Science, Campus de Espinardo, University of Murcia, 30100, Murcia, Spain
Ginés Almagro-Hernández & Jesualdo Tomás Fernández-Breis
Murcian Bio-Health Institute (IMIB-Arrixaca), Campus de Ciencias de la Salud, 30120, Murcia, Spain
Ginés Almagro-Hernández & Jesualdo Tomás Fernández-Breis

Corresponding authors

Correspondence to Ginés Almagro-Hernández or Jesualdo Tomás Fernández-Breis.

Editor information

Editors and Affiliations

University of Granada, Granada, Spain
Ignacio Rojas
University of Granada, Granada, Spain
Olga Valenzuela
University of Granada, Granada, Spain
Fernando Rojas
University of Granada, Granada, Spain
Luis Javier Herrera
University of Chicago and Fundacion Progreso y Salud, Granada, Spain
Francisco Ortuño

Cite this paper

Almagro-Hernández, G., Fernández-Breis, J.T. (2020). Discovering the Most Characteristic Motif from a Set of Peak Sequences. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2020. Lecture Notes in Computer Science, vol 12108. Springer, Cham. https://doi.org/10.1007/978-3-030-45385-5_40

DOI: https://doi.org/10.1007/978-3-030-45385-5_40
Published: 30 April 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45384-8
Online ISBN: 978-3-030-45385-5
eBook Packages: Computer Science; Computer Science (R0)
