Optimal Offline Extraction of Irredundant Motif Bases

(Extended Abstract)

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4598)

Abstract

The problem of extracting a basis of irredundant motifs from a sequence is considered. In previous work such bases were built incrementally for all suffixes of the input string $s$ in $O(n^3)$, where $n$ is the length of $s$. Faster, non-incremental algorithms have been based on the landmark approach to string searching due to Fischer and Paterson, and exhibit respective time bounds of $O(n^2 \log n \log|\Sigma|)$ and $O(|\Sigma| n^2 \log^2 n \log\log n)$, with $\Sigma$ denoting the alphabet. The algorithm by Fischer and Paterson makes crucial use of the FFT, which is impractical with long sequences.
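To make the role of the FFT concrete: counting the positions at which a string agrees with a shifted copy of itself can be written as a sum, over alphabet symbols, of correlations of 0/1 indicator vectors, and correlations are exactly what the FFT accelerates. The sketch below computes those counts by direct summation instead. The function name `match_counts_by_symbol` is hypothetical, and the code illustrates the general idea only; it is not the algorithm of Fischer and Paterson or of any other cited paper.

```python
def match_counts_by_symbol(s):
    """For each shift d in 1..n-1, count positions i with s[i] == s[i + d].

    The count is accumulated one symbol at a time as a correlation of
    0/1 indicator vectors. Here each correlation is evaluated directly;
    an FFT-based method would replace the two inner loops for each symbol.
    """
    n = len(s)
    counts = [0] * n  # counts[0] is unused and stays 0
    for c in sorted(set(s)):
        # Indicator vector: 1 where s carries symbol c, 0 elsewhere.
        ind = [1 if ch == c else 0 for ch in s]
        for d in range(1, n):
            counts[d] += sum(ind[i] * ind[i + d] for i in range(n - d))
    return counts


print(match_counts_by_symbol("abab"))  # [0, 0, 2, 0]
```

Evaluated this way, the work is proportional to the number of distinct symbols times $n^2$, which is why the alphabet size appears as a factor in bounds of this kind.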

The algorithm presented in this paper does not need to resort to the FFT and yet is asymptotically faster than previously available ones. Specifically, an off-line algorithm is presented taking time $O(|\Sigma| n^2)$, which is optimal for finite $\Sigma$.
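As a rough illustration of where quadratic work comes from in this setting, the sketch below compares $s$ with each of its $n-1$ shifts and keeps the symbols on which the two copies agree. It assumes the common formulation in which motifs may contain a don't-care symbol (written `.` here) and candidates arise from such comparisons. The names `meet_with_shift` and `candidate_motifs` are hypothetical, and the pruning of redundant candidates, which is the substance of basis extraction, is not shown. This is not the paper's algorithm.

```python
DONT_CARE = "."  # assumed don't-care symbol; must not occur in the input


def meet_with_shift(s, d):
    """Align s with its copy shifted by d and keep agreeing symbols.

    Disagreeing positions become DONT_CARE; leading and trailing
    don't-cares are trimmed. Returns "" when nothing agrees.
    """
    raw = "".join(a if a == b else DONT_CARE for a, b in zip(s, s[d:]))
    return raw.strip(DONT_CARE)


def candidate_motifs(s):
    """Collect the non-empty results over all shifts 1..n-1.

    Each shift costs O(n), so the loop performs O(n^2) comparisons.
    """
    found = set()
    for d in range(1, len(s)):
        m = meet_with_shift(s, d)
        if m:
            found.add(m)
    return found


print(meet_with_shift("abcabd", 3))  # ab
```

For the string `abcabd`, only the shift by 3 yields any agreement, so the single candidate is `ab`.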

References

  1. Apostolico, A., Galil, Z.: Pattern matching algorithms. Oxford University Press, New York (1997)

  2. Apostolico, A., Parida, L.: Incremental paradigms of motif discovery. Journal of Computational Biology 11(1), 15–25 (2004)

  3. Apostolico, A.: Pattern discovery and the algorithmics of surprise. Artificial Intelligence and Heuristic Methods for Bioinformatics, pp. 111–127 (2003)

  4. Cole, R., Hariharan, R.: Verifying candidate matches in sparse and wildcard matching. In: STOC 2002. Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, pp. 592–601 (2002)

  5. Fischer, M.J., Paterson, M.S.: String matching and other products. In: Karp, R. (ed.) Proceedings of the SIAM-AMS Complexity of Computation, Providence, R.I. American Mathematical Society, pp. 113–125 (1974)

  6. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)

  7. Pelfrêne, J., Abdeddaïm, S., Alexandre, J.: Extracting approximate patterns. Journal of Discrete Algorithms 3(2-4), 293–320 (2005)

  8. Parida, L.: Algorithmic Techniques in Computational Genomics. PhD thesis, Department of Computer Science, New York University (1998)

  9. Pisanti, N., Crochemore, M., Grossi, R., Sagot, M.-F.: Bases of motifs for generating repeated patterns with wild cards. IEEE/ACM Trans. Comput. Biol. Bioinformatics 2(1), 40–50 (2005)

  10. Parida, L., Rigoutsos, I., Floratos, A., Platt, D., Gao, Y.: Pattern discovery on character sets and real-valued data: linear bound on irredundant motifs and an efficient polynomial time algorithm. In: Symposium on Discrete Algorithms, pp. 297–308 (2000)

  11. Wang, J.T.L., Shapiro, B.A., Shasha, D.E.: Pattern Discovery in Biomolecular Data: Tools, Techniques and Applications. Oxford University Press, Oxford (1999)

Editor information

Guohui Lin

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Apostolico, A., Tagliacollo, C. (2007). Optimal Offline Extraction of Irredundant Motif Bases. In: Lin, G. (ed.) Computing and Combinatorics. COCOON 2007. Lecture Notes in Computer Science, vol 4598. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73545-8_36

  • DOI: https://doi.org/10.1007/978-3-540-73545-8_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73544-1

  • Online ISBN: 978-3-540-73545-8

  • eBook Packages: Computer Science, Computer Science (R0)
