Pattern matching in hypertext

  • Session 5B: Invited Lecture
  • Conference paper
Algorithms and Data Structures (WADS 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1272)

Abstract

A myriad of textual problems have been considered in the pattern matching field with many non-trivial results. Nevertheless, surprisingly little work has been done on the natural combination of pattern matching and hypertext. In contrast to regular text, hypertext has a nonlinear structure and the techniques of pattern matching for text cannot be directly applied to hypertext.

Manber and Wu pioneered the study of pattern matching in hypertext and defined a hypertext model for pattern matching. Subsequent papers gave algorithms for pattern matching on hypertext with special structures, namely trees and DAGs.

In this paper we present a much simpler algorithm that achieves the same complexity and runs on any hypertext graph. We then extend the problem to approximate pattern matching in hypertext, considering first Hamming distance and then edit distance. We show that, in contrast to regular text, it matters whether the errors occur in the hypertext or in the pattern: approximate pattern matching in hypertext with errors in the hypertext turns out to be NP-complete, whereas approximate pattern matching in hypertext with errors in the pattern has a polynomial-time solution.
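To make the exact-matching problem concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm of the paper. It assumes a simplified hypertext model in which each node holds a non-empty text and a link leads from the end of one node's text to the start of the target node's text, so a match may begin inside one node and continue along a path of links. The function name `hypertext_contains` and the dictionaries `texts` and `edges` are hypothetical names chosen for illustration. The sketch propagates the set of positions at which a prefix of the pattern can end, one pattern character at a time.

```python
from typing import Dict, List, Set, Tuple

Position = Tuple[str, int]  # (node id, offset inside that node's text)


def hypertext_contains(
    texts: Dict[str, str],
    edges: Dict[str, List[str]],
    pattern: str,
) -> bool:
    """Return True if `pattern` is spelled along some path of the hypertext.

    Assumption (not taken from the paper): every node text is non-empty and
    a link goes from the end of a node's text to the start of its target.
    """
    if not pattern:
        return True
    if any(not text for text in texts.values()):
        raise ValueError("this sketch assumes non-empty node texts")

    def successors(pos: Position) -> List[Position]:
        node, offset = pos
        if offset + 1 < len(texts[node]):
            return [(node, offset + 1)]  # next character in the same node
        # End of the node's text: continue at the start of each linked node.
        return [(nxt, 0) for nxt in edges.get(node, [])]

    # Positions where the first pattern character matches.
    frontier: Set[Position] = {
        (node, i)
        for node, text in texts.items()
        for i, ch in enumerate(text)
        if ch == pattern[0]
    }
    # Extend every surviving partial match by one pattern character.
    for ch in pattern[1:]:
        frontier = {
            nxt
            for pos in frontier
            for nxt in successors(pos)
            if texts[nxt[0]][nxt[1]] == ch
        }
        if not frontier:
            return False
    return bool(frontier)


if __name__ == "__main__":
    texts = {"a": "hyper", "b": "text", "c": "link"}
    edges = {"a": ["b", "c"], "b": ["a"]}
    print(hypertext_contains(texts, edges, "pertex"))  # spans nodes a and b
    print(hypertext_contains(texts, edges, "linkhy"))  # node c has no links
```

Because the frontier is a set of positions rather than a list of paths, cycles in the graph cause no difficulty: each step examines every text position and every link at most once, so the sketch does work proportional to the pattern length times the size of the hypertext. This is a naive bound given only for orientation and is not the complexity claimed in the abstract.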

Partially supported by NSF grant CCR-95-31939, the Israel Ministry of Science and the Arts grants 6297 and 8560 and a 1997 Bar-Ilan University Internal Research Grant.

Partially supported by the Israel Ministry of Science and the Arts grant 8560. This work is part of Noa Lewenstein's Ph.D. dissertation.

References

  1. K. Abrahamson. Generalized String Matching. SIAM J. Computing 16 (6), 1039–1051, 1987.

  2. T. Akutsu. A Linear Time Pattern Matching Algorithm Between a String and a Tree. Proceedings of the 4th Symposium on Combinatorial Pattern Matching, 1–10, Padova, Italy, 1993.

  3. A. Amir, M. Farach, R. Giancarlo, Z. Galil, and K. Park. Dynamic dictionary matching. Journal of Computer and System Sciences, 49(2):208–222, 1994.

  4. A. Amir, M. Farach, R.M. Idury, J.A. La Poutré, and A.A. Schiffer. Improved dynamic dictionary matching. Information and Computation, 119(2):258–282, 1995.

  5. A. Aviad. Hyper-Talmud: a hypertext system for the Babylonian Talmud and its commentaries. Dept. of Math and Computer Science, Bar-Ilan University, 1993.

  6. R.S. Boyer and J.S. Moore. A fast string searching algorithm. Comm. ACM, 20:762–772, 1977.

  7. T.H. Cormen, C.E. Leiserson and R.L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, 1990.

  8. P. Ferragina and R. Grossi. Optimal on-line search and sublinear time update in string matching. Proc. 7th ACM-SIAM Symposium on Discrete Algorithms, pages 531–540, 1995.

  9. A. S. Fraenkel and S. T. Klein. Information Retrieval from Annotated Texts. TR95-25, Dept. of Applied Math and Computer Science, The Weizmann Institute of Science, 1995.

  10. D.E. Knuth, J.H. Morris, and V.R. Pratt. Fast pattern matching in strings. SIAM J. Computing, 6:323–350, 1977.

  11. D.K. Kim and K. Park. String matching in hypertext. 6th Symposium on Combinatorial Pattern Matching, Helsinki, Finland, 1995.

  12. S. Rao Kosaraju. Efficient String Matching. Manuscript, 1987.

  13. G. M. Landau and U. Vishkin. Fast parallel and serial approximate string matching. Journal of Algorithms 10 (2), 157–169, 1989.

  14. U. Manber and S. Wu. Approximate string matching with arbitrary costs for text and hypertext. IAPR Workshop on structural and syntactic pattern recognition, Bern, Switzerland, 1992.

  15. J. Nielsen. Hypertext and Hypermedia. Academic Press Professional, Boston, 1993.

  16. S. C. Sahinalp and U. Vishkin. Efficient approximate and dynamic matching of patterns using a labeling paradigm. Proc. of 37th Annual Symposium on Foundations of Computer Science, 320–328, Burlington, Vermont, 1996.

Editor information

Frank Dehne Andrew Rau-Chaplin Jörg-Rüdiger Sack Roberto Tamassia

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amir, A., Lewenstein, M., Lewenstein, N. (1997). Pattern matching in hypertext. In: Dehne, F., Rau-Chaplin, A., Sack, JR., Tamassia, R. (eds) Algorithms and Data Structures. WADS 1997. Lecture Notes in Computer Science, vol 1272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63307-3_56

  • DOI: https://doi.org/10.1007/3-540-63307-3_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63307-5

  • Online ISBN: 978-3-540-69422-9

  • eBook Packages: Springer Book Archive
