Abstract
Associative searching using hypertext links is a useful extension for conventional IR systems; manual conversion of texts into hypertexts, however, is feasible only in very restricted environments. For large textual knowledge bases, automatic conversion therefore becomes necessary. In this paper, we survey existing (and implemented) as well as projected approaches to automatic hypertextual representation as a prerequisite for associative searching. We describe and compare the main ideas of these approaches, including their advantages and disadvantages.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Myka, A., Argenton, H., Güntzer, U. (1997). Towards automatic hypertextual representation of linear texts. In: Nicholas, C., Wood, D. (eds) Principles of Document Processing. PODP 1996. Lecture Notes in Computer Science, vol 1293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63620-X_58
DOI: https://doi.org/10.1007/3-540-63620-X_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63620-5
Online ISBN: 978-3-540-69614-8
eBook Packages: Springer Book Archive