Skip to main content

All Instantiations of the Greedy Algorithm for the Shortest Common Superstring Problem are Equivalent

  • Conference paper
  • First Online:
String Processing and Information Retrieval (SPIRE 2021)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 12944))

Included in the following conference series:

  • 342 Accesses

Abstract

In the Shortest Common Superstring problem (SCS), one needs to find the shortest superstring for a set of strings. While SCS is NP-hard and MAX-SNP-hard, the Greedy Algorithm “choose two strings with the largest overlap; merge them; repeat” achieves a constant factor approximation that is known to be at most 3.5 and conjectured to be equal to 2. The Greedy Algorithm is not deterministic, so its instantiations with different tie-breaking rules may have different approximation factors. In this paper, we show that it is not the case: all factors are equal. To prove this, we show how to transform a set of strings so that all overlaps are different whereas their ratios stay roughly the same.

The research is supported by Russian Science Foundation (18-71-10042).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 54.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 69.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Blum, A., Jiang, T., Li, M., Tromp, J., Yannakakis, M.: Linear approximation of shortest superstrings. In: STOC 1991, pp. 328–336. ACM (1991). https://doi.org/10.1145/179812.179818

  2. Cazaux, B., Juhel, S., Rivals, E.: Practical lower and upper bounds for the shortest linear superstring. In: SEA 2018, vol. 103, pp. 18:1–18:14. LIPIcs (2018). https://doi.org/10.4230/LIPIcs.SEA.2018.18

  3. Gallant, J.K.: String compression algorithms. Ph.D. thesis, Princeton (1982)

    Google Scholar 

  4. Gallant, J., Maier, D., Storer, J.A.: On finding minimal length superstrings. J. Comput. Syst. Sci. 20(1), 50–58 (1980). https://doi.org/10.1016/0022-0000(80)90004-5

    Article  MathSciNet  MATH  Google Scholar 

  5. Gevezes, T.P., Pitsoulis, L.S.: The shortest superstring problem. In: Rassias, T.M., Floudas, C.A., Butenko, S. (eds.) Optimization in Science and Engineering, pp. 189–227. Springer, New York (2014). https://doi.org/10.1007/978-1-4939-0808-0_10

    Chapter  Google Scholar 

  6. Golovnev, A., Kulikov, A.S., Logunov, A., Mihajlin, I., Nikolaev, M.: Collapsing superstring conjecture. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2019). https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.26

  7. Golovnev, A., Kulikov, A.S., Mihajlin, I.: Approximating shortest superstring problem using de bruijn graphs. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 120–129. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38905-4_13

    Chapter  Google Scholar 

  8. Kaplan, H., Shafrir, N.: The greedy algorithm for shortest superstrings. Inf. Process. Lett. 93(1), 13–17 (2005). https://doi.org/10.1016/j.ipl.2004.09.012

    Article  MathSciNet  MATH  Google Scholar 

  9. Kulikov, A.S., Savinov, S., Sluzhaev, E.: Greedy conjecture for strings of length 4. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 307–315. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19929-0_26

    Chapter  Google Scholar 

  10. Laube, U., Weinard, M.: Conditional inequalities and the shortest common superstring problem. Int. J. Found. Comput. Sci. 16(06), 1219–1230 (2005). https://doi.org/10.1142/S0129054105003777

    Article  MathSciNet  MATH  Google Scholar 

  11. Mucha, M.: Lyndon words and short superstrings. In: SODA 2013, pp. 958–972. SIAM (2013). https://doi.org/10.1137/1.9781611973105.69

  12. Pevzner, P.A., Tang, H., Waterman, M.S.: An Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. U.S.A. 98(17), 9748–9753 (2001). https://doi.org/10.1073/pnas.171285098

    Article  MathSciNet  MATH  Google Scholar 

  13. Rivals, E., Cazaux, B.: Superstrings with multiplicities. In: CPM 2018, vol. 105, pp. 21:1–21:16 (2018). https://doi.org/10.4230/LIPIcs.CPM.2018.21

  14. Romero, H.J., Brizuela, C.A., Tchernykh, A.: An experimental comparison of two approximation algorithms for the common superstring problem. In: ENC 2004, pp. 27–34. IEEE (2004). https://doi.org/10.1109/ENC.2004.1342585

  15. Storer, J.A.: Data compression: methods and theory. Computer Science Press Inc., (1987)

    Google Scholar 

  16. Tarhio, J., Ukkonen, E.: A greedy approximation algorithm for constructing shortest common superstrings. Theor. Comput. Sci. 57(1), 131–145 (1988). https://doi.org/10.1016/0304-3975(88)90167-3

    Article  MathSciNet  MATH  Google Scholar 

  17. Turner, J.S.: Approximation algorithms for the shortest common superstring problem. Inf. Comput. 83(1), 1–20 (1989). https://doi.org/10.1016/0890-5401(89)90044-8

    Article  MathSciNet  MATH  Google Scholar 

  18. Ukkonen, E.: A linear-time algorithm for finding approximate shortest common superstrings. Algorithmica 5(1–4), 313–323 (1990). https://doi.org/10.1007/BF01840391

    Article  MathSciNet  MATH  Google Scholar 

  19. Waterman, M.S.: Introduction to Computational Biology: Maps, Sequences andGenomes.CRC Press, Boca Raton (1995). https://doi.org/10.1201/9780203750131

  20. Weinard, M., Schnitger, G.: On the greedy superstring conjecture. SIAM J. Discret. Math. 20(2), 502–522 (2006). https://doi.org/10.1137/040619144

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgments

Many thanks to Alexander Kulikov for valuable discussions and proofreading the text, and the anonymous reviewers for their useful comments.

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Nikolaev, M.S. (2021). All Instantiations of the Greedy Algorithm for the Shortest Common Superstring Problem are Equivalent. In: Lecroq, T., Touzet, H. (eds) String Processing and Information Retrieval. SPIRE 2021. Lecture Notes in Computer Science(), vol 12944. Springer, Cham. https://doi.org/10.1007/978-3-030-86692-1_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-86692-1_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86691-4

  • Online ISBN: 978-3-030-86692-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics