Abstract
SiteIF is a personal agent for a bilingual news web site that learns a user’s interests from the pages the user requests.
In this paper we propose a content-based document representation as the starting point for building a model of the user’s interests. The documents the user visits are processed, and their relevant senses (disambiguated with respect to WORDNET) are extracted and combined to form a semantic network. A filtering procedure then dynamically predicts which new documents are relevant on the basis of this network.
A content-based approach has two main advantages: first, the model’s predictions, being based on senses rather than words, are more accurate; second, the model is language independent, allowing navigation in multilingual sites. We report the results of a comparative experiment carried out to quantify these improvements.
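The idea of a sense-based user model can be illustrated with a minimal sketch. It is not the SiteIF implementation: the class name `SenseUserModel`, the scoring rule (sum of matched node and co-occurrence weights), and the sense identifiers such as `sense:football` are all hypothetical, and word sense disambiguation is assumed to have been done upstream, so each document arrives as a set of language-neutral sense identifiers.

```python
from collections import defaultdict
from itertools import combinations


class SenseUserModel:
    """Toy user model: a weighted co-occurrence network over sense identifiers.

    Assumption: documents are already disambiguated into sense ids
    (hypothetical strings here, standing in for WordNet synsets).
    """

    def __init__(self):
        self.node_weight = defaultdict(float)  # sense id -> accumulated interest
        self.edge_weight = defaultdict(float)  # frozenset({a, b}) -> co-occurrence

    def update(self, senses):
        """Record one visited document, given as its set of sense ids."""
        unique = set(senses)
        for s in unique:
            self.node_weight[s] += 1.0
        for a, b in combinations(sorted(unique), 2):
            self.edge_weight[frozenset((a, b))] += 1.0

    def score(self, senses):
        """Relevance of a candidate document: matched node plus edge weights."""
        unique = set(senses)
        total = sum(self.node_weight.get(s, 0.0) for s in unique)
        total += sum(
            self.edge_weight.get(frozenset((a, b)), 0.0)
            for a, b in combinations(sorted(unique), 2)
        )
        return total


model = SenseUserModel()
model.update({"sense:football", "sense:match", "sense:goal"})
model.update({"sense:football", "sense:league"})

# An English and an Italian article about the same topic map to the same senses.
english_doc = {"sense:football", "sense:goal"}
italian_doc = {"sense:football", "sense:goal"}
cooking_doc = {"sense:recipe", "sense:oven"}

print(model.score(english_doc), model.score(italian_doc), model.score(cooking_doc))
```

Because the model stores senses rather than surface words, the two documents in different languages receive the same score, which is the language-independence property the abstract describes. A real system would also need weight decay and a relevance threshold, which this sketch leaves out.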
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Magnini, B., Strapparava, C. (2001). Improving User Modelling with Content-Based Techniques. In: Bauer, M., Gmytrasiewicz, P.J., Vassileva, J. (eds) User Modeling 2001. UM 2001. Lecture Notes in Computer Science, vol 2109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44566-8_8
DOI: https://doi.org/10.1007/3-540-44566-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42325-6
Online ISBN: 978-3-540-44566-1
eBook Packages: Springer Book Archive