Abstract
In many retrieval domains there is a problematic gap between what computers can describe and what humans can perceive. This gap is most evident in the indexing of multimedia data such as images, video and sound, where low-level features are too semantically deficient to be of use from a typical user's perspective. Users, on the other hand, can quickly examine and summarise these documents, even subconsciously. Examples of such interaction include specifying relevance between a query and results, rating preferences in film databases, purchasing items from online retailers, and even browsing web sites. Data from these interactions, captured and stored in log files, can be interpreted as carrying semantic meaning, which proves indispensable in a collaborative setting where users share similar preferences or goals. In this chapter we summarise techniques for efficiently exploiting user interaction in its many forms to generate and augment semantic data in large databases. This user interaction can be applied to improve performance in recommender and information retrieval systems. A case study is presented which applies a popular technique, latent semantic analysis, to improve retrieval on an image database.
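As a rough illustration of the idea behind the case study, the sketch below applies latent semantic analysis (a truncated singular value decomposition) to a small session-by-image relevance feedback matrix. It is not the chapter's implementation: the toy log, the function names `lsa_image_embeddings` and `cosine`, and the choice of two latent dimensions are all assumptions made for the example.

```python
import numpy as np

def lsa_image_embeddings(feedback, k):
    """Embed images in a k-dimensional latent space via truncated SVD.

    feedback: (sessions x images) matrix, 1 where an image was marked
    relevant in a session (hypothetical log format, assumed here).
    """
    _, s, vt = np.linalg.svd(np.asarray(feedback, dtype=float),
                             full_matrices=False)
    # Keep the k largest singular directions; rows of the result are images.
    return vt[:k].T * s[:k]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy feedback log (assumed data): 3 sessions, 5 images.
# Images 0 and 2 are never marked together, but each co-occurs with image 1.
feedback = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
])

emb = lsa_image_embeddings(feedback, k=2)

raw_sim = cosine(feedback[:, 0], feedback[:, 2])   # similarity in the raw log
latent_sim = cosine(emb[0], emb[2])                # similarity after LSA
unrelated_sim = cosine(emb[0], emb[3])             # image from the other group

print(f"raw: {raw_sim:.2f}, latent: {latent_sim:.2f}, "
      f"unrelated: {unrelated_sim:.2f}")
```

In this toy log, images 0 and 2 have zero similarity in the raw data because no session marks both, yet they end up close in the latent space because both co-occur with image 1. Images from the other group stay dissimilar.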
Notes
1. Sparsity values were not given for the subsampled datasets, but only users with more than 20 ratings were considered.
2. See [26] for details.
3.
4. In August 2006, AOL Research released millions of anonymised search histories, only to have some queries subsequently linked to at least one individual through identifying keywords [20]. Later, in October 2006, security researchers de-anonymised the Netflix movie ratings data by cross-referencing ratings and dates with publicly available data on IMDB.com [41].
5. Image categories used are: colors_and_textures, cougars, creative_crystals, creative_textures, cuisine, desserts, dolphins_and_whales, elephants, endangered_species, everyday_objects, fabulous_fruit, fireworks, fitness, flowering_potted_plants, flowers_closeup, foxes_and_coyotes, frost_textures, fruits_and_nuts, fungi, hawks_and_falcons.
References
Chris Anderson, The long tail, Wired Magazine 12 (2004), no. 10.
Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern information retrieval, Addison-Wesley, Essex, England, 1999.
Marko Balabanović and Yoav Shoham, Fab: content-based, collaborative recommendation, Commun. ACM 40 (1997), no. 3, 66–72.
Pierre Baldi, Paolo Frasconi, and Padhraic Smyth, Modeling the internet and the web: Probabilistic methods and algorithms, John Wiley & Sons, West Sussex, England, 2003.
Daniel Billsus and Michael Pazzani, Learning probabilistic user models, Proceedings of the Workshop on Machine Learning for User Modeling (Chia Laguna, IT), 1997.
C. Boutilier, R. Zemel, and B. Marlin, Active collaborative filtering, In Proceedings of the Nineteenth Annual Conference on Uncertainty in Artificial Intelligence, 2003, pp. 98–106.
T. Brauen, Document vector modification, The SMART Retrieval System (G. Salton, ed.), Prentice Hall, New Jersey, 1971, pp. 456–484.
M. Cord and P. H. Gosselin, Image retrieval using long-term semantic learning, IEEE International Conference on Image Processing, 2006.
Nick Craswell and Martin Szummer, Random walks on the click graph, In Proceedings of SIGIR 2007, 2007.
S. Deerwester, S. Dumais, T. Landauer, G. Furnas, and R. Harshman, Indexing by latent semantic analysis, Journal of the American Society for Information Science 41 (1990), no. 6, 391–407.
Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman, Indexing by latent semantic analysis, Journal of the American Society for Information Science 41 (1990), no. 6, 391–407.
A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological) 39 (1977), no. 1, 1–38.
Mukund Deshpande and George Karypis, Item-based top-N recommendation algorithms, ACM Trans. Inf. Syst. 22 (2004), no. 1, 143–177.
J. Fournier and M. Cord, Long-term similarity learning in content-based image retrieval, 2002.
Annalisa Franco and Alessandra Lumini, Mixture of KL subspaces for relevance feedback, Multimedia Tools Appl. 37 (2008), no. 2, 189–209.
Dan Frankowski, Shyong K. Lam, Shilad Sen, F. Maxwell Harper, Scott Yilek, Michael Cassano, and John Riedl, Recommenders everywhere: the WikiLens community-maintained recommender system, WikiSym ’07: Proceedings of the 2007 international symposium on Wikis (New York, NY, USA), ACM, 2007, pp. 47–60.
David Goldberg, David Nichols, Brian M. Oki, and Douglas Terry, Using collaborative filtering to weave an information tapestry, Commun. ACM 35 (1992), no. 12, 61–70.
Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins, Eigentaste: A constant time collaborative filtering algorithm, Information Retrieval 4 (2001), no. 2, 133–151.
P.-H. Gosselin and M. Cord, Semantic kernel learning for interactive image retrieval, IEEE International Conference on Image Processing (Genoa, Italy), IEEE, sept. 2005.
Katie Hafner, Tempting data, privacy concerns; researchers yearn to use AOL logs, but they hesitate, Web site: The New York Times, August 23, 2006. Retrieved on 2006-09-13. http://www.nytimes.com/2006/08/23/technology/23search.html
Xiaofei He, O. King, Wei-Ying Ma, Mingjing Li, and Hong-Jiang Zhang, Learning a semantic space from user’s relevance feedback for image retrieval, Circuits and Systems for Video Technology, IEEE Transactions on 13 (2003), no. 1, 39–48.
D. Heisterkamp, Building a latent-semantic index of an image database from patterns of relevance feedback, 2002.
Jon Herlocker, Joseph A. Konstan, and John Riedl, An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms, Inf. Retr. 5 (2002), no. 4, 287–310.
Jonathan L. Herlocker, Joseph A. Konstan, Loren G. Terveen, and John T. Riedl, Evaluating collaborative filtering recommender systems, ACM Trans. Inf. Syst. 22 (2004), no. 1, 5–53.
Will Hill, Larry Stead, Mark Rosenstein, and George Furnas, Recommending and evaluating choices in a virtual community of use, CHI ’95: Proceedings of the SIGCHI conference on Human factors in computing systems (New York, NY, USA), ACM Press/Addison-Wesley Publishing Co., 1995, pp. 194–201.
Thomas Hofmann, Probabilistic latent semantic analysis, Proc. of Uncertainty in Artificial Intelligence, UAI’99 (Stockholm), 1999.
Thomas Hofmann, Unsupervised learning by probabilistic latent semantic analysis, Machine Learning 42 (2001), no. 1–2, 177–196.
Thomas Hofmann, Latent semantic models for collaborative filtering, ACM Trans. Inf. Syst. 22 (2004), no. 1, 89–115.
Jeff Howe, The rise of crowdsourcing, Wired Magazine 14 (2006), no. 06.
Takeo Kanade and Shingo Uchihashi, User-powered “content-free” approach to image retrieval, Proceedings of International Symposium on Digital Libraries and Knowledge Communities in Networked Information Society 2004 (DLKC04), 2004, pp. 24–32.
Charles Kemp and Kotagiri Ramamohanarao, Long-term learning for web search engines, PKDD ’02: Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery (London, UK), Springer-Verlag, 2002, pp. 263–274.
Markus Koskela and Jorma Laaksonen, Using long-term learning to improve efficiency of content-based image retrieval, 2003.
M. Li, Z. Chen, and H. Zhang, Statistical correlation analysis in image retrieval, 2002.
Luo Si, Rong Jin, Steven C. H. Hoi, and Michael R. Lyu, Collaborative image retrieval via regularized metric learning, ACM Multimedia Systems Journal (MMSJ), Special Issue on Machine Learning Approaches to Multimedia Information Retrieval 12 (2006), no. 1, 34–44.
Stéphane Marchand-Maillet and Eric Bruno, Exploiting user interaction for semantic content-based image retrieval, Tech. report, Computer Vision and Multimedia Laboratory, Computing Centre, University of Geneva, 2003.
Benjamin Marlin and Richard S. Zemel, The multiple multiplicative factor model for collaborative filtering, ICML ’04: Proceedings of the twenty-first international conference on Machine learning (New York, NY, USA), ACM, 2004, p. 73.
P. McJones, EachMovie collaborative filtering dataset, Website: http://www.research.compaq.com/SRC/eachmovie/, 1997, DEC (now Compaq) Systems Research Center.
Donn Morrison, Stéphane Marchand-Maillet, and Eric Bruno, Semantic clustering of images using patterns of relevance feedback, Proceedings of the 6th International Workshop on Content-based Multimedia Indexing (London, UK), June 18-20 2008.
Henning Müller, Wolfgang Müller, David McG. Squire, Stéphane Marchand-Maillet, and Thierry Pun, Long-term learning from user behavior in content-based image retrieval, Tech. report, Université de Genève, 2000.
Henning Müller, Thierry Pun, and David Squire, Learning from user behavior in image retrieval: Application of market basket analysis, Int. J. Comput. Vision 56 (2004), no. 1-2, 65–77.
Arvind Narayanan and Vitaly Shmatikov, How to break anonymity of the Netflix Prize dataset, 2006.
O. Nasraoui, C. Cardona, C. Rojas, and F. Gonzalez, Mining evolving user profiles in noisy web clickstream data with a scalable immune system clustering algorithm, 2003.
Netflix, The Netflix Prize, Web site: http://www.netflixprize.com/, 2006.
P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, GroupLens: An Open Architecture for Collaborative Filtering of Netnews, Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work (Chapel Hill, North Carolina), ACM, 1994, pp. 175–186.
J. J. Rocchio, Relevance feedback in information retrieval, The SMART Retrieval System (G. Salton, ed.), Prentice Hall, New Jersey, 1971, pp. 313–323.
Bryan C. Russell, Antonio Torralba, Kevin P. Murphy, and William T. Freeman, LabelMe: A database and web-based tool for image annotation, Int. J. Comput. Vision 77 (2008), no. 1–3, 157–173.
Ian Ruthven and Mounia Lalmas, A survey on the use of relevance feedback for information access systems, Knowl. Eng. Rev. 18 (2003), no. 2, 95–145.
B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, Application of dimensionality reduction in recommender systems–a case study, 2000.
J. B. Schafer, J. A. Konstan, and J. Riedl, The view through MetaLens: Usage patterns for a meta-recommendation system, IEE Proceedings Software 151 (2004), 267–279.
Cees Snoek, Content-based video indexing, Presentation given at Summer School on Multimedia Semantics (SSMS’07), Glasgow, UK, 2007, Slides URL: http://www.dcs.gla.ac.uk/ssms07/teaching-material/SSMS2007_CeesSnoek-part2.pdf.
Alvin Toffler, Future shock, Random House, New York City, NY, USA, 1970.
Carnegie Mellon University, reCAPTCHA, Web site: http://recaptcha.net/, 2007.
Masao Utiyama and Mikio Yamamoto, Relevance feedback models for recommendation, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), 2006, pp. 449–456.
Luis von Ahn and Laura Dabbish, Labeling images with a computer game, CHI ’04: Proceedings of the SIGCHI conference on Human factors in computing systems (New York, NY, USA), ACM Press, 2004, pp. 319–326.
Luis von Ahn, Shiry Ginosar, Mihir Kedia, Ruoran Liu, and Manuel Blum, Improving accessibility of the web with a computer game, CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems (New York, NY, USA), ACM, 2006, pp. 79–82.
Luis von Ahn, Ruoran Liu, and Manuel Blum, Peekaboom: a game for locating objects in images, CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems (New York, NY, USA), ACM, 2006, pp. 55–64.
A. Walker, M. M. Recker, K. Lawless, and D. Wiley, Collaborative information filtering: a review and an educational application, International Journal of Artificial Intelligence in Education 14 (2004), 1–26.
Jun Wang, Arjen P. de Vries, and Marcel J. T. Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (New York, NY, USA), ACM, 2006, pp. 501–508.
Jun Wang, Arjen P. de Vries, and Marcel J.T. Reinders, A user-item relevance model for log-based collaborative filtering, Proc. of European Conference on Information Retrieval (ECIR 2006), London, UK, 2006.
Yu Wang, Mingyue Ding, Chengping Zhou, and Ying Hu, Interactive relevance feedback mechanism for image retrieval using rough set, Know.-Based Syst. 19 (2006), no. 8, 696–703.
L. Wenyin, S. Dumais, Y. Sun, H. Zhang, M. Czerwinski, and B. Field, Semi-automatic image annotation, 2001.
Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Yong Yu, Wei-Ying Ma, WenSi Xi, and WeiGuo Fan, Optimizing web search using web click-through data, CIKM ’04: Proceedings of the thirteenth ACM international conference on Information and knowledge management (New York, NY, USA), ACM, 2004, pp. 118–126.
Alexei Yavlinsky and Daniel Heesch, An online system for gathering image similarity judgements, MULTIMEDIA ’07: Proceedings of the 15th international conference on Multimedia (New York, NY, USA), ACM, 2007, pp. 565–568.
Tomohiro Yoshizawa and Haim Schweitzer, Long-term learning of semantic grouping from relevance-feedback, MIR ’04: Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval (New York, NY, USA), ACM, 2004, pp. 165–172.
Osmar R. Zaïane, Man Xin, and Jiawei Han, Discovering web access patterns and trends by applying OLAP and data mining technology on web logs, Advances in Digital Libraries, 1998, pp. 19–29.
Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, and Georg Lausen, Improving recommendation lists through topic diversification, WWW ’05: Proceedings of the 14th international conference on World Wide Web (New York, NY, USA), ACM, 2005, pp. 22–32.
Acknowledgements
This research was funded by the Swiss National Science Foundation (SNSF) through IM2 (Interactive Multimedia Information Management).
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Morrison, D., Marchand-Maillet, S., Bruno, E. (2010). Capturing the Semantics of User Interaction: A Review and Case Study. In: Chbeir, R., Badr, Y., Abraham, A., Hassanien, AE. (eds) Emergent Web Intelligence: Advanced Information Retrieval. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84996-074-8_10
DOI: https://doi.org/10.1007/978-1-84996-074-8_10
Publisher Name: Springer, London
Print ISBN: 978-1-84996-073-1
Online ISBN: 978-1-84996-074-8
eBook Packages: Computer Science (R0)