Computability of Models for Sequence Assembly

  • Conference paper
Algorithms in Bioinformatics (WABI 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4645)

Abstract

Graph-theoretic models have come to the forefront as some of the most powerful and practical methods for sequence assembly. Simultaneously, the computational hardness of the underlying graph algorithms has remained open. Here we present two theoretical results about the complexity of these models for sequence assembly. In the first part, we show sequence assembly to be NP-hard under two different models: string graphs and de Bruijn graphs. Together with an earlier result on the NP-hardness of overlap graphs, this demonstrates that all of the popular graph-theoretic sequence assembly paradigms are NP-hard. In our second result, we give the first, to our knowledge, optimal polynomial-time algorithm for genome assembly that explicitly models the double-strandedness of DNA. We solve the Chinese Postman Problem on bidirected graphs using bidirected flow techniques and show how to use it to find the shortest double-stranded DNA sequence which contains a given set of k-long words. This algorithm has applications to sequencing by hybridization and short read assembly.
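To make the objects in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It covers only the single-stranded special case in which the de Bruijn graph of the input k-mers already has an Eulerian path, so no Chinese Postman step and no bidirected flow is needed. The function names (`reverse_complement`, `contains_all`, `eulerian_spelling`) are hypothetical and chosen for illustration; the input is assumed to be a non-empty list of equal-length words over A, C, G, T.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes upper-case ACGT only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence read off the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_all(seq, kmers):
    """True if every word occurs on at least one strand of seq."""
    rc = reverse_complement(seq)
    return all(w in seq or w in rc for w in kmers)


def eulerian_spelling(kmers):
    """Spell a string that uses each k-mer exactly once as an edge.

    Nodes are (k-1)-mers and each k-mer is an edge from its prefix to its
    suffix. Assumption: the resulting graph has an Eulerian path; this
    sketch does not check that and does not handle the general case.
    """
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for w in kmers:
        u, v = w[:-1], w[1:]
        graph[u].append(v)
        balance[u] += 1
        balance[v] -= 1

    # Start at the node with one surplus outgoing edge, if there is one;
    # otherwise the path is a cycle and any node with an edge will do.
    start = next((n for n, b in balance.items() if b == 1), next(iter(graph)))

    # Hierholzer's algorithm with an explicit stack.
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if graph[u]:
            stack.append(graph[u].pop())
        else:
            path.append(stack.pop())
    path.reverse()

    # Consecutive nodes overlap by k-2 characters, so append one character each.
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    words = ["ATG", "TGC", "GCA"]
    assembled = eulerian_spelling(words)
    print(assembled, contains_all(assembled, words))
```

The `contains_all` check treats a word as covered when it appears on either strand, which is the sense of "double-stranded" used in the abstract. The paper's contribution is the harder general problem, where edges may need to be repeated and both strands are modeled in one bidirected graph.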

Editor information

Raffaele Giancarlo, Sridhar Hannenhalli

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Medvedev, P., Georgiou, K., Myers, G., Brudno, M. (2007). Computability of Models for Sequence Assembly. In: Giancarlo, R., Hannenhalli, S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science, vol 4645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74126-8_27

  • DOI: https://doi.org/10.1007/978-3-540-74126-8_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74125-1

  • Online ISBN: 978-3-540-74126-8

  • eBook Packages: Computer Science, Computer Science (R0)
