Abstract
We study exact string matching in special texts, which consist of consecutive fixed-length chunks where each position of a chunk has a character distribution of its own. This kind of setting can also be interpreted so that a chunk represents a character of a larger alphabet. If texts and patterns are of this kind, it may ruin the efficiency of common algorithms. We examine anomalies related to the Horspool and Sunday algorithms in this setting. In addition we present two new algorithms.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Baeza-Yates, R.: Improved string searching. Software: Practice and Experience 19(3), 257–271 (1989)
Berry, T., Ravindran, S.: A fast string matching algorithm and experimental results. In: Proc. of the Prague Stringology Club Workshop 1999, Czech Technical University, Prague, Czech Republic, Collaborative Report DC-99-05, pp. 16–28 (1999)
Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Communications of the ACM 20(10), 762–772 (1977)
Horspool, R.N.: Practical fast searching in strings. Software: Practice and Experience 10(6), 501–506 (1980)
Hume, A., Sunday, D.: Fast string searching. Software: Practice and Experience 21(11), 1221–1248 (1991)
Kim, J.Y., Shawe-Taylor, J.: Fast string matching using an n-gram algorithm. Software: Practice and Experience 24(1), 79–88 (1994)
Lecroq, T.: Experiments on string matching in memory structures. Software: Practice and Experience 28(5), 561–568 (1998)
Navarro, G., Raffinot, M.: Flexible pattern matching in strings. Cambridge University Press, Cambridge (2002)
Raita, T.: Tuning the Boyer–Moore–Horspool string searching algorithm. Software: Practice and Experience 22(10), 879–884 (1992)
Sunday, D.M.: A very fast substring search algorithm. Communications of the ACM 33(8), 132–142 (1990)
Zhu, R.F., Takaoka, T.: On improving the average case of the Boyer–Moore string matching algorithm. Journal of Information Processing 10(3), 173–177 (1987)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peltola, H., Tarhio, J. (2007). On String Matching in Chunked Texts. In: Holub, J., Žďárek, J. (eds) Implementation and Application of Automata. CIAA 2007. Lecture Notes in Computer Science, vol 4783. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76336-9_16
Download citation
DOI: https://doi.org/10.1007/978-3-540-76336-9_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76335-2
Online ISBN: 978-3-540-76336-9
eBook Packages: Computer ScienceComputer Science (R0)