Application of A\(^*\) to the Generalized Constrained Longest Common Subsequence Problem with Many Pattern Strings

  • Conference paper
Pattern Recognition and Artificial Intelligence (ICPRAI 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13364))

Abstract

This paper considers the constrained longest common subsequence problem with an arbitrary set of input strings and an arbitrary set of pattern strings. The problem has applications in, for example, computational biology, where it serves as a measure of similarity among molecules that are characterized by common putative structures. We develop an exact A\(^*\) search to solve it and compare it to the only existing competitor from the literature, an Automaton approach. The results show that A\(^*\) is very efficient on real-world benchmarks, finding provably optimal solutions in run times that are an order of magnitude lower than those of the competitor. Even some of the large-scale real-world instances were solved to optimality by A\(^*\) search.
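
To make the problem statement concrete, the following is a minimal sketch of an exhaustive, memoized search for tiny instances. It is not the A\(^*\) search of the paper and not the Automaton approach; it only illustrates the definition: find a longest string that is a subsequence of every input string and contains every pattern string as a subsequence. The names `gclcs` and `is_subsequence` are hypothetical and chosen for illustration, and the sketch assumes at least one input string.

```python
from functools import lru_cache


def is_subsequence(p, s):
    """Return True if p can be obtained from s by deleting letters."""
    it = iter(s)
    return all(ch in it for ch in p)


def gclcs(inputs, patterns):
    """Illustrative exhaustive solver (hypothetical helper, small instances only).

    Returns a longest string that is a subsequence of every string in
    `inputs` and has every string in `patterns` as a subsequence,
    or None if no such string exists.
    """
    # Only letters of the first input can appear in a common subsequence.
    alphabet = sorted(set(inputs[0]))

    @lru_cache(maxsize=None)
    def best(pos, pat):
        # pos[i]: first index of inputs[i] that may still be used
        # pat[j]: number of letters of patterns[j] matched so far (greedily)
        done = all(u == len(q) for q, u in zip(patterns, pat))
        result = "" if done else None
        for c in alphabet:
            nxt = []
            for s, p in zip(inputs, pos):
                k = s.find(c, p)  # earliest usable occurrence of c
                if k < 0:
                    break
                nxt.append(k + 1)
            else:
                # c occurs in every input: advance matching pattern pointers
                new_pat = tuple(
                    u + 1 if u < len(q) and q[u] == c else u
                    for q, u in zip(patterns, pat)
                )
                tail = best(tuple(nxt), new_pat)
                if tail is not None and (
                    result is None or len(tail) + 1 > len(result)
                ):
                    result = c + tail
        return result

    return best((0,) * len(inputs), (0,) * len(patterns))


if __name__ == "__main__":
    solution = gclcs(("abcbdab", "bdcaba"), ("ca",))
    print(solution, len(solution))
```

A state here is the pair of position vectors over the input strings and the pattern strings. The number of such states grows quickly with the number and length of the strings, which is why an exhaustive search like this one is only usable on toy instances.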

Acknowledgements

Christian Blum was funded by project CI-SUSTAIN of the Spanish Ministry of Science and Innovation (PID2019-104156GB-I00). Dragan Matić is partially supported by the Ministry for Scientific and Technological Development, Higher Education and Information Society, Government of the Republic of Srpska, B&H, under the project “Development of artificial intelligence methods for solving computer biology problems”.

Author information

Corresponding author

Correspondence to Marko Djukanovic.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Djukanovic, M., Matic, D., Blum, C., Kartelj, A. (2022). Application of A\(^*\) to the Generalized Constrained Longest Common Subsequence Problem with Many Pattern Strings. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13364. Springer, Cham. https://doi.org/10.1007/978-3-031-09282-4_5

  • DOI: https://doi.org/10.1007/978-3-031-09282-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09281-7

  • Online ISBN: 978-3-031-09282-4

  • eBook Packages: Computer Science, Computer Science (R0)
