Skip to main content

Finding Optimal Alignment and Consensus of Circular Strings

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 6129))

Abstract

We consider the problem of finding the optimal alignment and consensus (string) of circular strings. Circular strings are different from linear strings in that the first (leftmost) symbol of a circular string is wrapped around next to the last (rightmost) symbol. In nature, for example, bacterial and mitochondrial DNAs typically form circular strings. The consensus string problem is finding a representative string (consensus) of a given set of strings, and it has been studied on linear strings extensively. However, only a few efforts have been made for the consensus problem for circular strings, even though circular strings are biologically important. In this paper, we introduce the consensus problem for circular strings and present novel algorithms to find the optimal alignment and consensus of circular strings under the Hamming distance metric. They are O(n 2logn)-time algorithms for three circular strings and an O(n 3logn)-time algorithm for four circular strings. Our algorithms are O(n/ logn) times faster than the naïve algorithm directly using the solutions for the linear consensus problems, which takes O(n 3) time for three circular strings and O(n 4) time for four circular strings. We achieved this speedup by adopting a convolution and a system of linear equations into our algorithms to reflect the characteristics of circular strings that we found.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Gusfield, D.: Algorithms on Strings, Tree, and Sequences. Cambridge University Press, Cambridge (1997)

    Google Scholar 

  2. Frances, M., Litman, A.: On covering problems of codes. Theory of Computing Systems 30(2), 113–119 (1997)

    MATH  MathSciNet  Google Scholar 

  3. Ben-Dor, A., Lancia, G., Perone, J., Ravi, R.: Banishing bias from consensus sequences. In: Hein, J., Apostolico, A. (eds.) CPM 1997. LNCS, vol. 1264, pp. 247–261. Springer, Heidelberg (1997)

    Google Scholar 

  4. Gasieniec, L., Jansson, J., Lingas, A.: Approximation algorithms for Hamming clustering problems. Journal of Discrete Algorithms 2(2), 289–301 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  5. Lanctot, K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string selection problems. In: Proceedings of the 10th ACM-SIAM Symposium on Discrete Algorithms, pp. 633–642 (1999)

    Google Scholar 

  6. Li, M., Ma, B., Wang, L.: Finding similar regions in many strings. In: Proceedings of the 31st Annual ACM Symposium on Theory of Computing, pp. 473–482 (1999)

    Google Scholar 

  7. Li, M., Ma, B., Wang, L.: On the closest string and substring problems. Journal of the ACM 49(2), 157–171 (2002)

    Article  MathSciNet  Google Scholar 

  8. Ma, B., Sun, X.: More efficient algorithms for closest string and substring problems. In: Vingron, M., Wong, L. (eds.) RECOMB 2008. LNCS (LNBI), vol. 4955, pp. 396–409. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  9. Gramm, J., Niedermeier, R., Rossmanith, P.: Exact solutions for closest string and related problems. In: Eades, P., Takaoka, T. (eds.) ISAAC 2001. LNCS, vol. 2223, pp. 441–453. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  10. Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37(1), 25–42 (2003)

    Article  MATH  MathSciNet  Google Scholar 

  11. Stojanovic, N., Berman, P., Gumucio, D., Hardison, R., Miller, W.: A linear-time algorithm for the 1-mismatch problem. In: Rau-Chaplin, A., Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 1997. LNCS, vol. 1272, pp. 126–135. Springer, Heidelberg (1997)

    Chapter  Google Scholar 

  12. Sze, S., Lu, S., Chen, J.: Integrating sample-driven and pattern-driven approaches in motif finding. In: Jonassen, I., Kim, J. (eds.) WABI 2004. LNCS (LNBI), vol. 3240, pp. 438–449. Springer, Heidelberg (2004)

    Google Scholar 

  13. Boucher, C., Brown, D., Durocher, S.: On the structure of small motif recognition instances. In: Amir, A., Turpin, A., Moffat, A. (eds.) SPIRE 2008. LNCS, vol. 5280, pp. 269–281. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  14. Amir, A., Landau, G.M., Na, J.C., Park, H., Park, K., Sim, J.S.: Consensus optimizing both distance sum and radius. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 234–242. Springer, Heidelberg (2009)

    Google Scholar 

  15. Mosig, A., Hofacker, I., Stadler, P.: Comparative analysis of cyclic sequences: Viroids and other small circular RNAs. Lecture Notes in Informatics, vol. P-83, pp. 93–102 (2006)

    Google Scholar 

  16. Fernandes, F., Pereira, L., Freitas, A.: CSA: An efficient algorithm to improve circular DNA multiple alignment. BMC Bioinformatics 10(1), 230 (2009)

    Article  Google Scholar 

  17. Thompson, J., Higgins, D., Gibson, T.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22, 4673–4680 (1994)

    Article  Google Scholar 

  18. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. The MIT Press, Cambridge (2001)

    MATH  Google Scholar 

  19. Fischer, M.J., Paterson, M.S.: String matching and other products. In: Karp, R.M. (ed.) Complexity of Computation. SIAM-AMS Proceedings, pp. 113–125 (1974)

    Google Scholar 

  20. Abrahamson, K.: Generalized string matching. SIAM J. Comput. 16(6), 1039–1051 (1987)

    Article  MATH  MathSciNet  Google Scholar 

  21. Amir, A., Lewenstein, M., Porat, E.: Faster algorithms for string matching with k mismatches. In: SODA 2000: Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, Philadelphia, PA, USA, pp. 794–803. Society for Industrial and Applied Mathematics (2000)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, T., Na, J.C., Park, H., Park, K., Sim, J.S. (2010). Finding Optimal Alignment and Consensus of Circular Strings. In: Amir, A., Parida, L. (eds) Combinatorial Pattern Matching. CPM 2010. Lecture Notes in Computer Science, vol 6129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13509-5_28

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-13509-5_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13508-8

  • Online ISBN: 978-3-642-13509-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics