Abstract
In this work, we consider a special type of uncertain sequence called a weighted string. In a weighted string, every position contains a subset of the alphabet, and every letter of the alphabet is associated with a probability of occurrence such that the sum of probabilities at each position equals 1. Usually a cumulative weight threshold \(\frac{1}{z}\) is specified, and one considers only strings that match the weighted string with probability at least \(\frac{1}{z}\). We provide an \(\mathcal{O}(nz)\)-time and \(\mathcal{O}(nz)\)-space off-line algorithm, where n is the length of the weighted string and \(\frac{1}{z}\) is the given threshold, to compute a smallest maximal palindromic factorization of a weighted string. This factorization has applications in hairpin structure prediction in a set of closely related DNA or RNA sequences. Along the way, we provide an \(\mathcal{O}(nz)\)-time and \(\mathcal{O}(nz)\)-space off-line algorithm to compute maximal palindromes in weighted strings.
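To make the definitions concrete, the following is a minimal sketch of a weighted string and of the threshold test. It assumes the usual convention that a string matches at a given position with probability equal to the product of the per-position letter probabilities. The example data and the names `X`, `occurrence_probability`, `is_valid` and `is_valid_palindrome` are hypothetical and chosen for illustration only. This is a brute-force check, not the \(\mathcal{O}(nz)\) algorithm of the paper.

```python
from fractions import Fraction

# A weighted string: one dict per position, mapping letters to probabilities.
# Probabilities at each position sum to 1. The values are invented.
X = [
    {"A": Fraction(1)},
    {"C": Fraction(1, 2), "G": Fraction(1, 2)},
    {"A": Fraction(1)},
    {"T": Fraction(3, 4), "A": Fraction(1, 4)},
]


def occurrence_probability(weighted, pattern, start):
    """Probability that `pattern` matches `weighted` at 0-based `start`.

    Assumes the product-of-probabilities convention; letters absent
    from a position contribute probability 0.
    """
    if start < 0 or start + len(pattern) > len(weighted):
        return Fraction(0)
    prob = Fraction(1)
    for offset, letter in enumerate(pattern):
        prob *= weighted[start + offset].get(letter, Fraction(0))
    return prob


def is_valid(weighted, pattern, start, z):
    """True if `pattern` matches at `start` with probability at least 1/z."""
    return occurrence_probability(weighted, pattern, start) >= Fraction(1, z)


def is_valid_palindrome(weighted, pattern, start, z):
    """True if `pattern` is a palindrome and a valid match at `start`."""
    return pattern == pattern[::-1] and is_valid(weighted, pattern, start, z)
```

For instance, with this data `is_valid_palindrome(X, "ACA", 0, 2)` holds because the match probability is 1 · 1/2 · 1 = 1/2, whereas "ACAT" at position 0 has probability 3/8 and is valid only for a threshold of 1/3 or lower (for example z = 4). Exact `Fraction` arithmetic is used so that comparisons against the threshold are not affected by floating-point rounding.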
M. Alzamel and C.S. Iliopoulos—Partially supported by the Onassis Foundation.
© 2017 Springer International Publishing AG
Cite this paper
Alzamel, M., Gao, J., Iliopoulos, C.S., Liu, C., Pissis, S.P. (2017). Efficient Computation of Palindromes in Sequences with Uncertainties. In: Boracchi, G., Iliadis, L., Jayne, C., Likas, A. (eds) Engineering Applications of Neural Networks. EANN 2017. Communications in Computer and Information Science, vol 744. Springer, Cham. https://doi.org/10.1007/978-3-319-65172-9_52
DOI: https://doi.org/10.1007/978-3-319-65172-9_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65171-2
Online ISBN: 978-3-319-65172-9
eBook Packages: Computer Science, Computer Science (R0)