A System for Building Intelligent Agents that Learn to Retrieve and Extract Information

Eliassi-Rad, Tina; Shavlik, Jude

doi:10.1023/A:1024009718142


Published: February 2003

User Modeling and User-Adapted Interaction, Volume 13, pages 35–88 (2003)


Abstract

We present a system for rapidly and easily building instructable and self-adaptive software agents that retrieve and extract information. Our Wisconsin Adaptive Web Assistant (WAWA) constructs intelligent agents by accepting user preferences in the form of instructions. These user-provided instructions are compiled into neural networks that are responsible for the adaptive capabilities of an intelligent agent. The agent’s neural networks are modified via user-provided and system-constructed training examples. Users can create training examples by rating Web pages (or documents), but, more importantly, WAWA’s agents use techniques from reinforcement learning to internally create their own examples. Users can also provide additional instructions throughout the life of an agent. Our experimental evaluations on a ‘home-page finder’ agent and a ‘seminar-announcement extractor’ agent illustrate the value of using instructable and adaptive agents for retrieving and extracting information.
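To make the idea of compiling instructions into a trainable network concrete, here is a minimal sketch. It is not WAWA's actual implementation: the abstract does not specify the compilation scheme, so the sketch assumes a knowledge-based-network style encoding in which one conjunctive rule becomes one sigmoid unit. The weight magnitude `OMEGA`, the function names, and the feature names `title_has_name` and `has_photo` are all hypothetical.

```python
import math

OMEGA = 4.0  # assumed weight magnitude given to each rule antecedent


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def compile_rule(antecedents):
    """Turn a conjunctive rule (all antecedents must hold) into one sigmoid unit.

    The bias is set so the unit's net input is positive only when every
    antecedent is active.
    """
    weights = {name: OMEGA for name in antecedents}
    bias = -(len(antecedents) - 0.5) * OMEGA
    return weights, bias


def score(unit, features):
    """Score a page described by a dict of feature activations in [0, 1]."""
    weights, bias = unit
    total = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return sigmoid(total)


def train_step(unit, features, target, lr=0.5):
    """One gradient-descent step on squared error for a single rated example."""
    weights, bias = unit
    out = score(unit, features)
    delta = (target - out) * out * (1.0 - out)
    new_weights = {
        name: w + lr * delta * features.get(name, 0.0) for name, w in weights.items()
    }
    return new_weights, bias + lr * delta


# Hypothetical instruction: "a page is a home page if its title contains
# the person's name and it shows a photo".
unit = compile_rule(["title_has_name", "has_photo"])
both = {"title_has_name": 1.0, "has_photo": 1.0}
only_title = {"title_has_name": 1.0}

# A user rates a title-only page as relevant; the unit adapts toward that rating.
refined = train_step(unit, only_title, target=1.0)
```

Initially the unit scores a page highly only when both conditions hold, which mirrors the instruction. After the rated example, the score for a title-only page rises, showing how an initial instruction can act as a starting point that later training examples refine.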




Author information

Authors and Affiliations

Tina Eliassi-Rad: Lawrence Livermore National Laboratory, Center for Applied Scientific Computing, Box 808, L-560, Livermore, CA, 94551, USA
Jude Shavlik: Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI, 53706, USA


About this article

Cite this article

Eliassi-Rad, T., Shavlik, J. A System for Building Intelligent Agents that Learn to Retrieve and Extract Information. User Model User-Adap Inter 13, 35–88 (2003). https://doi.org/10.1023/A:1024009718142


Issue Date: February 2003
DOI: https://doi.org/10.1023/A:1024009718142
