Skip to main content

Genomic Scaffold Filling: A Progress Report

  • Conference paper
  • First Online:
Frontiers in Algorithmics (FAW 2016)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9711))

Included in the following conference series:

Abstract

The genomic scaffold filling problem has attracted a lot of attention since 2010. The general problem is on filling an incomplete sequence (sequence scaffold) I into \(I'\), with respect to a complete reference genome G, such that the number of adjacencies between G and \(I'\) is maximized. The problem is NP-complete and APX-hard, and admits a 1.2-approximation. In this survey paper, we will first review the progress being made for this setting.

However, the sequence input I is not quite practical and does not fit most of the real datasets (where a scaffold is more often given as a list of contigs). Then, we will review the most recent progress on this new version of the genomic scaffold filling problem where, (1) a scaffold S is given, the missing genes \(X=c(G)-c(S)\) can only be inserted in between the contigs, and the objective is to maximize the number of adjacencies between G and the filled \(S'\), and (2) a scaffold S is given, a subset of the missing genes \(X'\subset X=c(G)-c(S)\) can only be inserted in between the contigs, and the objective is still to maximize the number of adjacencies between G and the filled \(S''\). Some open problems will be posed for further research.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. http://wgs-assembler.sourceforge.net/

  2. Angibaud, S., Fertin, G., Rusu, I., Thevenin, A., Vialette, S.: On the approximability of comparing genomes with duplicates. J. Graph Algorithms Appl. 13(1), 19–53 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  3. Alon, N., Yuster, R., Zwick, U.: Color-coding. J. ACM 42(4), 844–856 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  4. Blin, G., Fertin, G., Sikora, F., Vialette, S.: The exemplar breakpoint distance for non-trivial genomes cannot be approximated. In: Das, S., Uehara, R. (eds.) WALCOM 2009. LNCS, vol. 5431, pp. 357–368. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  5. Bulteau, L., Carrieri, A.P., Dondi, R.: Fixed-parameter algorithms for scaffold filling. Theoret. Comput. Sci. 568, 72–83 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  6. Chain, P.S., Grafham, D.V., Fulton, R.S., et al.: Genome project standards in a new era of sequencing. Science 326, 236–237 (2009)

    Article  Google Scholar 

  7. Chen, Z., Fu, B., Zhu, B.: The approximability of the exemplar breakpoint distance problem. In: Cheng, S.-W., Poon, C.K. (eds.) AAIM 2006. LNCS, vol. 4041, pp. 291–302. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  8. Chen, Z., Fu, B., Xu, J., Yang, B., Zhao, Z., Zhu, B.: Non-breaking similarity of genomes with gene repetitions. In: Ma, B., Zhang, K. (eds.) CPM 2007. LNCS, vol. 4580, pp. 119–130. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  9. Chen, Z., Fu, B., Fowler, R., Zhu, B.: On the inapproximability of the exemplar conserved interval distance problem of genomes. J. Comb. Optim. 15(2), 201–221 (2008)

    Article  MathSciNet  MATH  Google Scholar 

  10. Chen, Z., Fu, B., Goebel, R., Lin, G., Tong, W., Xu, J., Yang, B., Zhao, Z., Zhu, B.: On the approximability of the exemplar adjacency number problem of genomes with gene repetitions. Theoret. Comput. Sci. 550, 59–65 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  11. Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proceedings of the 13th ACM-SIAM Symposium on Discrete Algorithms (SODA 2002), pp. 667–676 (2002)

    Google Scholar 

  12. Downey, R., Fellows, M.: Parameterized Complexity. Springer, New York (1999)

    Book  MATH  Google Scholar 

  13. Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Heidelberg (2006)

    MATH  Google Scholar 

  14. Guo, J., Niedermeier, R., Wernicke, S.: Parameterized complexity of vertex cover variants. Theor. Comput. Syst. 41(3), 501–520 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  15. Jiang, H., Zhong, F., Zhu, B.: Filling scaffolds with gene repetitions: maximizing the number of adjacencies. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 55–64. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  16. Jiang, H., Zheng, C., Sankoff, D., Zhu, B.: Scaffold filling under the breakpoint, related distances. IEEE/ACM Trans. Comput. Biol. Bioinform. 9(4), 1220–1229 (2012)

    Article  Google Scholar 

  17. Jiang, H., Ma, J., Luan, J., Zhu, D.: Approximation and nonapproximability for the one-sided scaffold filling problem. In: Xu, D., Du, D., Du, D. (eds.) COCOON 2015. LNCS, vol. 9198, pp. 251–263. Springer, Heidelberg (2015)

    Chapter  Google Scholar 

  18. Jiang, H., Fan, C., Yang, B., Zhong, F., Zhu, D., Zhu, B.: Genomic scaffold filling revisited. In: Proceedings of the 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016), Tel Aviv, Israel, 27–29 June 2016 (2016)

    Google Scholar 

  19. Jiang, M.: The zero exemplar distance problem. In: Tannier, E. (ed.) RECOMB-CG 2010. LNCS, vol. 6398, pp. 74–82. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  20. Khanna, S., Motwani, R., Sudan, M., Vazirani, U.: On syntactic versus computational views of approximability. SIAM J. Comput. 28(1), 164–191 (1998)

    Article  MathSciNet  MATH  Google Scholar 

  21. Liu, N., Jiang, H., Zhu, D., Zhu, B.: An improved approximation algorithm for scaffold filling to maximize the common adjacencies. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(4), 905–913 (2013)

    Article  MATH  Google Scholar 

  22. Liu, N., Zhu, D., Jiang, H., Zhu, B.: A 1.5-approximation algorithm for two-sided scaffold filling. Algorithmica 74(1), 91–116 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  23. Liu, N., Zou, P., Zhu, B.: A polynomial time solution for permutation scaffold filling. Technical report, Department of Computer Science, Montana State University (2016)

    Google Scholar 

  24. Muñoz, A., Zheng, C., Zhu, Q., Albert, V., Rounsley, S., Sankoff, D.: Scaffold filling, contig fusion and gene order comparison. BMC Bioinform. 11, 304 (2010)

    Article  Google Scholar 

  25. Niedermeier, R.: Invitation to Fixed-Parameter Algorithms. Oxford University Press, New York (2006)

    Book  MATH  Google Scholar 

  26. Yancopoulos, S., Attie, O., Friedberg, R.: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21, 3340–3346 (2005)

    Article  Google Scholar 

  27. Zhu, B.: A retrospective on genomic preprocessing for comparative genomics. In: Chauve, C., et al. (eds.) Models and Algorithms for Genome Evolution, pp. 183–206. Springer, London (2013)

    Chapter  Google Scholar 

Download references

Acknowledgments

I would like to thank my collaborators for this series of research: Chenglin Fan, Haitao Jiang, Nan Liu, David Sankoff, Boting Yang, Chunfang Zheng, Farong Zhong, Daming Zhu and Peng Zou.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Binhai Zhu .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhu, B. (2016). Genomic Scaffold Filling: A Progress Report. In: Zhu, D., Bereg, S. (eds) Frontiers in Algorithmics. FAW 2016. Lecture Notes in Computer Science(), vol 9711. Springer, Cham. https://doi.org/10.1007/978-3-319-39817-4_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-39817-4_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39816-7

  • Online ISBN: 978-3-319-39817-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics