Abstract
Following recent discoveries about the important roles of non-coding RNAs (ncRNAs) in the cellular machinery, there is now great interest in identifying new occurrences of ncRNAs in available genomic sequences. In this paper, we show how the problem of finding new occurrences of characterized ncRNAs can be modeled as the problem of finding all locally-optimal solutions of a weighted constraint network using dedicated weighted global constraints, encapsulating pattern-matching algorithms and data structures. This is embodied in DARN!, a software tool for ncRNA localization, which, compared to existing pattern-matching based tools, offers additional expressivity (such as enabling RNA–RNA interactions to be described) and improved specificity (through the exploitation of scores and local optimality) without compromises in CPU efficiency. This is demonstrated on the actual search for tRNAs and H/ACA sRNA on different genomes.
Similar content being viewed by others
References
Abouelhoda, M., Kurtz, S., & Ohlebusch E. (2004). Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms, 2, 53–86.
Altschul, S., Gish, W., Miller, W., Myers, E., & Lipman D. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.
Billoud, B., Kontic, M., & Viari, A. (1996). Palingol: A declarative programming language to describe nucleic acids’ secondary structures and to scan sequence database. Nucleic Acids Research, 24(8), 1395–1403.
Cherry, J., Ball, C., Weng, S., Juvik, G., Schmidt, R., Adler, C., et al. (1997). Genetic and physical maps of Saccharomyces cerevisiae. Nature, 387(6632 Suppl), 67–73.
Dsouza, M., Larsen, N., & Overbeek, R. (1997). Searching for patterns in genomic data. Trends in Genetics, 13(12).
Eddy, S. (1996). Rnabob: A program to search for RNA secondary structure motifs in sequence databases. http://bioweb.pasteur.fr/docs/man/man/rnabob.1.html#toc1
Eddy, S., & Durbin, R. (1994). RNA sequence analysis using covariance models. Nucleic Acids Research, 22(11), 2079–2088.
Griffiths-Jones, S., Moxon, S., Marshall, M., Khanna, A., Eddy, S., & Bateman, A. (2005). Rfam: Annotating non-coding RNAs in complete genomes. Nucleic Acids Research, 33, D121–D124.
Laferrière, A., Gautheret, D., & Cedergren R. (1994). An RNA pattern matching program with enhanced performance and portability. Computer Applications in Biosciences, 10(2), 211–212.
Larrosa, J., & Schiex, T. (2004). Solving weighted CSP by maintaining arc-consistency. Artificial Intelligence, 159(1–2), 1–26.
Lhomme, O. (1993). Consistency techniques for numeric CSPs. In IJCAI-93 (pp. 232–238).
Macke, T., Ecker, D., Gutell, R., Gautheret, D., Case, D., & Sampath, R. (2001). Rnamotif, an RNA secondary structure definition and search algorithm. Nucleic Acids Research, 29(22), 4724–4735.
Navarro, G., & Raffinot, M. (2002). Flexible pattern matching in strings—practical on-line search algorithms for texts and biological sequences. Cambridge University Press.
Needleman, S., & Wunsch, C. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3), 443–453.
Sakakibara, Y., Brown, M., Hughey, R., Mian, I., Sjölander, K., Underwood, R., et al. (1994). Recent methods for RNA modeling using stochastic context-free grammars. In CPM’94 (pp. 289–306).
Thébault, P., de Givry, S., Schiex, T., & Gaspin, C. (2006). Searching RNA motifs and their intermolecular contacts with constraint networks. Bioinformatics, 22(17), 2074–2080.
Vialette, S. (2004). On the computational complexity of 2-interval pattern matching problems. Theoretical Computer Science, 312(2–3), 223–249.
Zytnicki, M., Gaspin, C., & Schiex, T. (2006). A new local consistency for weighted CSP dedicated to long domains. In SAC’06: Proceedings of the 2006 ACM symposium on applied computing (pp. 394–398).
Zytnicki, M., Gaspin, C., & Schiex, T. (2006). Suffix arrays and weighted CSPs. In A. Dal Palu, A. Dovier, & S. Will (Eds.), Workshop on constraint based methods for bioinformatics (pp. 69–74). Nantes.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Zytnicki, M., Gaspin, C. & Schiex, T. DARN! A Weighted Constraint Solver for RNA Motif Localization. Constraints 13, 91–109 (2008). https://doi.org/10.1007/s10601-007-9033-9
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10601-007-9033-9