Abstract
This paper proposes using ontologies that represent domain and linguistic knowledge to guide natural language (NL) communication about Web content. The proposal addresses the problem of obtaining and processing the Web data required to answer users' queries. Concepts and communication acts are represented in a conceptual ontology (CO). Domain-restricted linguistic resources are obtained automatically by adapting general linguistic knowledge to cover the communication acts of a particular domain. Domain-restricted grammars and lexicons have proved efficient, especially when the user is guided in entering sentences. To answer a user's query, the system fires the appropriate wrappers to extract the data from the Web. The CO provides a unifying framework for representing and processing the knowledge obtained from the Web. Following this proposal, a dialogue system in Spanish for accessing a set of Web sites in the travel domain has been implemented.
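The flow described in the abstract (a communication act tied to a concept in the CO, which fires wrappers and normalizes their output) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class names, the "train" concept, its attributes, and the in-memory `fake_train_wrapper` are all hypothetical, and a real wrapper would fetch and parse a Web page instead of filtering a hard-coded table.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a record is a flat attribute/value mapping, and a
# wrapper is any callable that turns query parameters into records.
Record = Dict[str, str]
Wrapper = Callable[[Dict[str, str]], List[Record]]


@dataclass
class Concept:
    """A domain concept in the conceptual ontology (illustrative only)."""
    name: str
    attributes: List[str]


@dataclass
class CommunicationAct:
    """A query type the user can express about a concept (illustrative only)."""
    name: str
    concept: Concept
    required: List[str]
    wrappers: List[Wrapper] = field(default_factory=list)


def fake_train_wrapper(params: Dict[str, str]) -> List[Record]:
    # Stand-in for a real Web wrapper: filters a hard-coded timetable.
    timetable = [
        {"origin": "Barcelona", "destination": "Madrid",
         "departure": "08:00", "ad_banner": "ignored"},
        {"origin": "Barcelona", "destination": "Valencia",
         "departure": "09:30", "ad_banner": "ignored"},
    ]
    return [row for row in timetable
            if all(row.get(key) == value for key, value in params.items())]


def answer(act: CommunicationAct, params: Dict[str, str]) -> List[Record]:
    """Check the act's required attributes, fire its wrappers, and keep
    only the attributes that the concept defines."""
    missing = [attr for attr in act.required if attr not in params]
    if missing:
        # A guided dialogue would ask the user for these values instead.
        raise ValueError("missing attributes: " + ", ".join(missing))
    results: List[Record] = []
    for wrapper in act.wrappers:
        for record in wrapper(params):
            results.append({key: value for key, value in record.items()
                            if key in act.concept.attributes})
    return results


train = Concept("train", ["origin", "destination", "departure"])
ask_departure = CommunicationAct(
    "ask_departure", train, ["origin", "destination"], [fake_train_wrapper]
)
```

Calling `answer(ask_departure, {"origin": "Barcelona", "destination": "Madrid"})` returns the single matching record restricted to the concept's attributes. Omitting a required attribute raises an error, which marks the point where a guided dialogue would prompt the user for the missing value.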
Grup de recerca consolidat 2001 SGR 00254, supported by DURSI
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gatius, M., Rodríguez, H. (2002). Natural Language Guided Dialogues for Accessing the Web. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2002. Lecture Notes in Computer Science, vol 2448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46154-X_53
DOI: https://doi.org/10.1007/3-540-46154-X_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44129-8
Online ISBN: 978-3-540-46154-8
eBook Packages: Springer Book Archive