Abstract
Bitwise operations are executed very fast in computer architecture. Algorithms aiming to benefit from this intrinsic property can be classified as bit-parallel algorithms. Bit-parallelism has been widely investigated in the pattern matching area since the introduction of the Shift-Or algorithm. In the original idea, there is no shift mechanism, and the input pattern length is required to be less than the computer word size (W) to benefit from the full power of bit-parallelism. The lack of the shift mechanism was removed by the succeeding algorithms of this genre, but W limitation has not been overcome in an elegant way. This study proposes a new bit-parallel algorithm, given name BLIM (bit-parallel length independent matching), for exact pattern matching that does not restrict the input pattern to be shorter than the word size. The multiple pattern case is also addressed, and it is shown that up to computer word size number of patterns, whatever their lengths are, can be searched simultaneously in a single bit-parallel framework. Similar to other algorithms of this genre, BLIM is also capable of handling fixed-length gaps and character classes in the input strings as well. The proposed algorithm is compared with the other alternatives of its class, mainly the shift-or and BNDM variants. Experimental results indicate that BLIM is compatible with the previous bit-parallel algorithms with an additional gain of overcoming the word size limitation.
Similar content being viewed by others
References
Aho A.V., Corasick M.J.: Efficient string matching: an aid to bibliographic search. Commun. ACM 18, 333–340 (1975)
Apostolico, A., Galil, Z. (eds): Pattern Matching Algorithms. Oxford University Press, Oxford (1997)
Baeza-Yates R.A., Gonnet G.H.: A new approach to text searching. Commun. ACM 35(10), 74–82 (1992)
Charras C., Lecroq T.: Handbook of Exact String Matching Algorithms. King’s Collage Publications, London (2004)
Cleophas, L., Watson, B.W., Zwaan, G.: A new taxonomy of sublinear keyword pattern matching algorithms. Computer Science Report 04-07, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, April 2004
Crochemore M., Rytter W.: Jewels of stringology. World Scientific Publishing, Singapore (2003)
Durian, B., Holub, J., Peltola, H., Tarhio, J.: Tuning BNDM with q-grams. In: Proceedings of the Tenth Workshop on Algorithm Engineering and Experiments (ALENEX09), pp. 29–37, New York City, January 2009
Fredriksson, K.: Faster string matching with super–alphabets. In: Proceedings of the 9th International Symposium on String Processing and Information Retrieval (SPIRE’2002), LNCS, vol. 2476, pp. 44–57. Springer-Verlag, New York (2002)
Fredriksson, K., Grabowski, S.: Practical and optimal string matching. In: Proceedings of the 12th International Symposium on String Processing and Information Retrieval (SPIRE’2005), LNCS, vol. 3772, pp. 374–385. Springer-Verlag, New York (2005)
Fredriksson K., Grabowsky S.: Average-optimal string matching. J. Discrete Algorithms 7(4), 579–594 (2009)
Holub, J., Durian, B.: Fast variants of bit parallel approach to suffix automata. Unpublished Lecture, University of Haifa, April 2005
Külekci, M.O.: A method to overcome computer word size limitation in bit-parallel pattern matching. In: Hong, S.-H, Nagamochi, H., Fukunaga, T. (eds.) Proceedings of 19th International Symposium on Algorithms and Computation, ISAAC’2008. Lecture Notes in Computer Science, vol. 5369, pp. 496–506, Gold Coast, Australia, December 2008. Springer-Verlag (2008)
Külekci, M.O.: Overcoming computer word size limitation in bit-parallel pattern matching. Talk given in LSD&LAW’09, London Stringology Days & London Algorithmic Workshop, King’s College, London, UK, February 2009
Navarro G., Raffinot M.: Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM J. Exp. Algorithms 5(4), 1–36 (2000)
Navarro G., Raffinot M.: Flexible Pattern Matching in Strings—Practical On-line Search Algorithms for Texts and Biological Sequences. Cambridge University Press, Cambridge (2002)
Peltola, H., Tarhio, J.: Alternative algorithms for bit-parallel string matching. In: LNCS, vol. 2857, Proceedings of SPIRE’2003, pp. 80–94 (2003)
Sunday D.M.: A very fast substring search algorithm. Commun. ACM 33(8), 132–142 (1990)
Watson, B.W.: A new family of Commentz-Walter-style multiple-keyword pattern matching algorithms. In: Proceedings of the Prague Stringology Club Workshop, pp. 71–76 (2000)
Watson B.W., Cleophas L.: SPARE parts: a C++ toolkit for string pattern recognition. Softw. Pract. Exp. 34, 697–710 (2004)
Wu S., Manber U.: Fast text searching allowing errors. Commun. ACM 35(10), 83–91 (1992)
Author information
Authors and Affiliations
Corresponding author
Additional information
A less extensive version of this study has appeared in ISAAC’08, the 19th International Symposium on Algorithms and Computation [12], and a related talk was given in LSD & LAW’09, London Stringology Days & London Algorithmic Workshop [13].
Rights and permissions
About this article
Cite this article
Külekci, M.O. BLIM: A New Bit-Parallel Pattern Matching Algorithm Overcoming Computer Word Size Limitation. Math.Comput.Sci. 3, 407–420 (2010). https://doi.org/10.1007/s11786-010-0035-4
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11786-010-0035-4