WM+: An Optimal Multi-pattern String Matching Algorithm Based on the WM Algorithm

  • Conference paper
Advanced Parallel Processing Technologies (APPT 2005)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3756))

Abstract

The WM algorithm, designed by Sun Wu and Udi Manber, is considered the fastest multi-pattern string matching algorithm in practice, except when the number of patterns is very large or the alphabet is small [2]. In theory, the scanning time of WM is average-optimal, i.e. O(n log_σ(rm)/m), but its worst-case scanning time cannot be estimated at all. The maximum shift of the original WM algorithm is m-B+1, where m is the minimum length of all patterns and B is the q-gram size. The tuned WM algorithm (abbreviated as WM+) achieves higher performance by improving the algorithm that builds the shift table and by combining the AC algorithm with the original WM algorithm. In addition, the worst-case scanning time of the WM+ algorithm is predictable. Experiments show that the scanning time of the WM+ algorithm is no greater than, and often less than, that of the WM algorithm across varied values of m and numbers of patterns, especially in the worst case.
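To make the m-B+1 bound concrete, here is a minimal Python sketch of shift-table construction and scanning in the style of the original WM algorithm. It is not the WM+ method, whose improved table construction and AC combination are not described in the abstract. The function names `build_wm_tables` and `wm_search`, the dictionary-based tables (instead of hashed arrays), and the default B=2 are assumptions made for illustration.

```python
def build_wm_tables(patterns, B=2):
    """Build shift and bucket tables over the first m characters of each pattern."""
    m = min(len(p) for p in patterns)       # minimum pattern length
    if m < B:
        raise ValueError("every pattern needs at least B characters")
    default_shift = m - B + 1               # shift for a block seen in no pattern
    shift = {}                              # block -> safe shift distance
    bucket = {}                             # last block of prefix -> patterns
    for p in patterns:
        prefix = p[:m]
        for j in range(B, m + 1):           # j = end offset of the block in prefix
            block = prefix[j - B:j]
            shift[block] = min(shift.get(block, default_shift), m - j)
        bucket.setdefault(prefix[m - B:m], []).append(p)
    return m, default_shift, shift, bucket


def wm_search(text, patterns, B=2):
    """Return sorted (start, pattern) pairs for every occurrence in text."""
    m, default_shift, shift, bucket = build_wm_tables(patterns, B)
    hits = []
    pos = m                                 # current window is text[pos - m:pos]
    while pos <= len(text):
        block = text[pos - B:pos]           # last B characters of the window
        s = shift.get(block, default_shift)
        if s > 0:
            pos += s                        # no pattern can end here; skip ahead
            continue
        start = pos - m
        for p in bucket.get(block, ()):     # verify every candidate in the bucket
            if text.startswith(p, start):
                hits.append((start, p))
        pos += 1
    return sorted(hits)


print(wm_search("ushers", ["hers", "his", "she"]))
```

In this sketch, a block that occurs in no pattern prefix triggers the maximum shift of m-B+1. The verification loop runs whenever the shift is zero, and its cost depends on how many patterns share a bucket, which is one way to see why the worst case of the plain scheme is hard to bound.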

References

  1. Wu, S., Manber, U.: A Fast Algorithm for Multi-pattern Searching. Report TR-94-17, Department of Computer Science, University of Arizona (1994)

  2. Baeza-Yates, R., Navarro, G.: Text Searching: Theory and Practice (2004), http://citeseer.ist.psu.edu/605426.htm

  3. Aho, A., Corasick, M.: Efficient String Matching: An Aid to Bibliographic Search. Communications of the ACM 18, 333–340 (1975)

  4. Boyer, R., Moore, J.: A Fast String Searching Algorithm. Communications of the ACM 20(10), 762–772 (1977)

  5. Allauzen, C., Raffinot, M.: Factor Oracle of a Set of Words. Technical report 99-11, Institute Gaspard-Monge, University de Marne-la-Vallee (1999)

  6. Fredriksson, K., Navarro, G.: Average-optimal Multiple Approximate String Matching. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 109–128. Springer, Heidelberg (2003)

  7. Knuth, D., Morris, J., Pratt, V.: Fast Pattern Matching in Strings. SIAM Journal on Computing 6(2), 323–350 (1977)

  8. Horspool, N.: Practical Fast Searching in Strings. Software – Practice and Experience 10(6), 501–506 (1980)

  9. Xin, Z., Jianlong, T., Xueqi, C.: An Improved Wu-Manber Multi-Pattern Matching Algorithm (in Chinese). Computer Application 23(7), 29–31 (2003)

  10. Wu, S., Manber, U.: Agrep - A Fast Approximate Pattern-matching Tool. In: Usenix Winter 1992 Technical Conference, San Francisco, pp. 153–162 (1992)

  11. Kim, J.Y., Taylor, J.S.: Fast String Matching Using An n-gram Algorithm. Software – Practice and Experience 24(1), 79–88 (1994)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, X., Fang, B., Li, L., Jiang, Y. (2005). WM+: An Optimal Multi-pattern String Matching Algorithm Based on the WM Algorithm. In: Cao, J., Nejdl, W., Xu, M. (eds) Advanced Parallel Processing Technologies. APPT 2005. Lecture Notes in Computer Science, vol 3756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573937_55

  • DOI: https://doi.org/10.1007/11573937_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29639-3

  • Online ISBN: 978-3-540-32107-1

  • eBook Packages: Computer Science, Computer Science (R0)
