Optimizing Multiple Spaced Seeds for Homology Search

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3109)

Abstract

Optimized spaced seeds improve sensitivity and specificity in local homology search [1]. Recently, several authors [2-4] have shown that multiple seeds can have better sensitivity and specificity than single seeds. We describe a linear programming-based algorithm to optimize a set of seeds. Our algorithm offers a performance guarantee: the sensitivity of a chosen seed set is at least 70% of what can be achieved, in most reasonable models of homologous sequences. Our method achieves performance comparable to that of a greedy algorithm, but our work gives this area a mathematical foundation.
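To make the seed-set selection problem concrete, here is a minimal Python sketch. It is not the authors' linear programming algorithm; it only illustrates the kind of greedy baseline the abstract compares against. All names (`seed_hits`, `sensitivity`, `greedy_seed_set`) and the toy data are hypothetical. The sketch assumes a seed is written over `1` (position must match) and `*` (don't care), and that an alignment is reduced to a string of `1` (match) and `0` (mismatch); sensitivity is estimated as the fraction of sample alignments hit by at least one seed.

```python
def seed_hits(seed: str, alignment: str) -> bool:
    """Return True if the spaced seed hits the alignment at some offset.

    seed: string over {'1', '*'}; '1' positions must be matches.
    alignment: string over {'1', '0'}; '1' is a match, '0' a mismatch.
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(alignment) - len(seed) + 1):
        if all(alignment[start + i] == "1" for i in required):
            return True
    return False


def sensitivity(seeds, alignments) -> float:
    """Fraction of alignments hit by at least one seed in the set."""
    hit = sum(any(seed_hits(s, a) for s in seeds) for a in alignments)
    return hit / len(alignments)


def greedy_seed_set(candidates, alignments, k):
    """Pick up to k seeds, each time adding the seed that hits the most
    alignments not yet hit by the seeds chosen so far."""
    chosen = []
    uncovered = set(range(len(alignments)))
    for _ in range(k):
        best, best_cov = None, set()
        for s in candidates:
            if s in chosen:
                continue
            cov = {i for i in uncovered if seed_hits(s, alignments[i])}
            if len(cov) > len(best_cov):
                best, best_cov = s, cov
        if best is None:  # no remaining seed hits anything new
            break
        chosen.append(best)
        uncovered -= best_cov
    return chosen


# Toy data (illustrative only, not from the paper).
alignments = ["1110010", "1101100", "1011011", "0110100", "0101010"]
candidates = ["111", "11*1", "1*11"]

seeds = greedy_seed_set(candidates, alignments, k=2)
print(seeds, sensitivity(seeds, alignments))
```

On this toy input the single best seed hits three of the five alignments, and adding a second seed raises that to four, which is the effect of multiple seeds that the abstract describes. A real evaluation would compute sensitivity under a probabilistic model of homologous sequences rather than over a handful of fixed strings.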


References

  1. Ma, B., Tromp, J., Li, M.: Patternhunter: Faster and more sensitive homology search. Bioinformatics 18, 440–445 (2002)

  2. Buhler, J., Keich, U., Sun, Y.: Designing seeds for similarity search in genomic DNA. In: Proceedings of the Seventh Annual International Conference on Computational Molecular Biology (RECOMB 2003), Berlin, Germany, April 2003, pp. 67–75 (2003)

  3. Li, M., Ma, B., Kisman, D., Tromp, J.: Patternhunter II: Highly sensitive and fast homology search. Journal of Bioinformatics and Computational Biology (2004) (in Press)

  4. Brejová, B., Brown, D.G., Vinař, T.: Vector seeds: An extension to spaced seeds allows substantial improvements in sensitivity and specificity. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 39–54. Springer, Heidelberg (2003)

  5. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. Journal of Molecular Biology 215, 403–410 (1990)

  6. Keich, U., Li, M., Ma, B., Tromp, J.: On spaced seeds for similarity search. Discrete Applied Mathematics (2004) (in Press)

  7. Choi, K.P., Zhang, L.: Sensitivity analysis and efficient method for identifying optimal spaced seeds. Journal of Computer and System Sciences 68, 22–40 (2004)

  8. Brejová, B., Brown, D.G., Vinař, T.: Optimal spaced seeds for hidden Markov models, with application to homologous coding regions. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 42–54. Springer, Heidelberg (2003)

  9. Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. Journal of Molecular Biology 147, 195–197 (1981)

  10. Hochbaum, D.S.: Approximating covering and packing problems: set cover, vertex cover, independent set, and related problems. In: Hochbaum, D.S. (ed.) Approximation Algorithms for NP-hard Problems, ch. 3, pp. 135–137. PWS (1997)

  11. Feige, U.: A threshold of ln n for approximating set cover. Journal of the ACM 45(4), 634–652 (1998)

  12. Motwani, R., Raghavan, P.: Randomized Algorithms. Cambridge University Press, New York (1995)

  13. Raghavan, P.: Probabilistic construction of deterministic algorithms: Approximating packing integer programs. Journal of Computer and System Sciences 37, 130–143 (1988)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, J., Brown, D.G., Li, M., Ma, B. (2004). Optimizing Multiple Spaced Seeds for Homology Search. In: Sahinalp, S.C., Muthukrishnan, S., Dogrusoz, U. (eds) Combinatorial Pattern Matching. CPM 2004. Lecture Notes in Computer Science, vol 3109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27801-6_4

  • DOI: https://doi.org/10.1007/978-3-540-27801-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22341-2

  • Online ISBN: 978-3-540-27801-6

  • eBook Packages: Springer Book Archive
