Abstract
A myriad of textual problems have been considered in the pattern matching field, with many non-trivial results. Nevertheless, surprisingly little work has been done on the natural combination of pattern matching and hypertext. In contrast to regular text, hypertext has a nonlinear structure, so the techniques of pattern matching for text cannot be applied to it directly.
Manber and Wu pioneered the study of pattern matching in hypertext and defined a hypertext model for pattern matching. Subsequent papers gave algorithms for pattern matching on hypertext with special structures: trees and DAGs.
In this paper we present a much simpler algorithm that achieves the same complexity and runs on any hypertext graph. We then extend the problem to approximate pattern matching in hypertext, first considering Hamming distance and then edit distance. We show that, in contrast to regular text, it does make a difference whether the errors occur in the hypertext or in the pattern. With errors in the hypertext, the approximate pattern matching problem turns out to be NP-complete, whereas with errors in the pattern it has a polynomial-time solution.
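To make the polynomial case concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a simple hypertext model in which every node holds a non-empty string, a link leads from the end of one node's text to the start of another's, and an occurrence may begin and end anywhere inside a node. The names `texts`, `links`, and `min_pattern_mismatches` are hypothetical. The function computes the smallest number of substitutions in the pattern (Hamming distance, errors in the pattern) needed for it to be spelled out along some path; a result of 0 is an exact match.

```python
import math

def min_pattern_mismatches(texts, links, pattern):
    """Fewest pattern characters to substitute so that the pattern is spelled
    along some path of the hypertext; math.inf if no path is long enough.

    texts: dict node -> non-empty string (assumed model, see lead-in)
    links: dict node -> list of successor nodes (cycles are allowed)
    """
    if not pattern:
        return 0
    if any(len(t) == 0 for t in texts.values()):
        raise ValueError("this sketch assumes non-empty node texts")

    # A position is (node, offset); every character of the hypertext is one.
    positions = [(v, j) for v, t in texts.items() for j in range(len(t))]

    def successors(v, j):
        # Next character: inside the same node, or the start of a linked node.
        if j + 1 < len(texts[v]):
            return [(v, j + 1)]
        return [(w, 0) for w in links.get(v, [])]

    # cost[p] = fewest mismatches when pattern[i:] is aligned starting at p.
    # Initialise for the last pattern character, then move i backwards.
    cost = {(v, j): int(texts[v][j] != pattern[-1]) for v, j in positions}
    for i in range(len(pattern) - 2, -1, -1):
        new_cost = {}
        for v, j in positions:
            rest = min((cost[q] for q in successors(v, j)), default=math.inf)
            new_cost[(v, j)] = int(texts[v][j] != pattern[i]) + rest
        cost = new_cost
    return min(cost.values(), default=math.inf)

# Hypothetical three-node hypertext with a cycle between "a" and "b".
texts = {"a": "abc", "b": "de", "c": "xy"}
links = {"a": ["b", "c"], "b": ["a"]}
exact = min_pattern_mismatches(texts, links, "bcde")
one_off = min_pattern_mismatches(texts, links, "bcxe")
```

Because each step consumes exactly one pattern character, the table is filled by pattern index, so cycles in the graph cause no difficulty. The sketch deliberately leaves out the variant with errors in the hypertext, which the abstract reports to be NP-complete.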
Partially supported by NSF grant CCR-95-31939, the Israel Ministry of Science and the Arts grants 6297 and 8560 and a 1997 Bar-Ilan University Internal Research Grant.
Partially supported by the Israel Ministry of Science and the Arts grant 8560. This work is part of Noa Lewenstein's Ph.D. dissertation.
References
K. Abrahamson. Generalized String Matching. SIAM J. Computing 16 (6), 1039–1051, 1987.
T. Akutsu. A Linear Time Pattern Matching Algorithm Between a String and a Tree. Proceedings of the 4th Symposium on Combinatorial Pattern Matching, 1–10, Padova, Italy, 1993.
A. Amir, M. Farach, R. Giancarlo, Z. Galil, and K. Park. Dynamic dictionary matching. Journal of Computer and System Sciences, 49(2):208–222, 1994.
A. Amir, M. Farach, R.M. Idury, J.A. La Poutré, and A.A. Schiffer. Improved dynamic dictionary matching. Information and Computation, 119(2):258–282, 1995.
A. Aviad. Hyper-Talmud: a hypertext system for the Babylonian Talmud and its commentaries. Dept. of Math and Computer Science, Bar-Ilan University, 1993.
R.S. Boyer and J.S. Moore. A fast string searching algorithm. Comm. ACM, 20:762–772, 1977.
T.H. Cormen, C.E. Leiserson and R.L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, 1990.
P. Ferragina and R. Grossi. Optimal on-line search and sublinear time update in string matching. Proc. 7th ACM-SIAM Symposium on Discrete Algorithms, pages 531–540, 1995.
A. S. Fraenkel and S. T. Klein. Information Retrieval from Annotated Texts. TR95-25, Dept. of Applied Math and Computer Science, The Weizmann Institute of Science, 1995.
D.E. Knuth, J.H. Morris, and V.R. Pratt. Fast pattern matching in strings. SIAM J. Computing, 6:323–350, 1977.
D.K. Kim and K. Park. String matching in hypertext. 6th Symposium on Combinatorial Pattern Matching, Helsinki, Finland, 1995.
S. Rao Kosaraju. Efficient String Matching. Manuscript, 1987.
G. M. Landau and U. Vishkin. Fast parallel and serial approximate string matching. Journal of Algorithms 10 (2), 157–169, 1989.
U. Manber and S. Wu. Approximate string matching with arbitrary costs for text and hypertext. IAPR Workshop on structural and syntactic pattern recognition, Bern, Switzerland, 1992.
J. Nielsen. Hypertext and Hypermedia. Academic Press Professional, Boston, 1993.
S. C. Sahinalp and U. Vishkin. Efficient approximate and dynamic matching of patterns using a labeling paradigm. Proc. of 37th Annual Symposium on Foundations of Computer Science, 320–328, Burlington, Vermont, 1996.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Amir, A., Lewenstein, M., Lewenstein, N. (1997). Pattern matching in hypertext. In: Dehne, F., Rau-Chaplin, A., Sack, JR., Tamassia, R. (eds) Algorithms and Data Structures. WADS 1997. Lecture Notes in Computer Science, vol 1272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63307-3_56
DOI: https://doi.org/10.1007/3-540-63307-3_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63307-5
Online ISBN: 978-3-540-69422-9
eBook Packages: Springer Book Archive