Capturing the Semantics of User Interaction: A Review and Case Study

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Abstract

In many retrieval domains there is a problematic gap between what computers can describe and what humans are capable of perceiving. This gap is most evident in the indexing of multimedia data such as images, video and sound, where the low-level features are too semantically deficient to be of use from a typical user's perspective. Users, on the other hand, can quickly examine and summarise these documents, even subconsciously. Examples include specifying relevance between a query and results, rating preferences in film databases, purchasing items from online retailers, and even browsing web sites. Data from these interactions, captured and stored in log files, can be interpreted as having semantic meaning, which proves indispensable in a collaborative setting where users share similar preferences or goals. In this chapter we summarise techniques for efficiently exploiting user interaction in its many forms to generate and augment semantic data in large databases. This user interaction can be applied to improve performance in recommender and information retrieval systems. A case study is presented which applies a popular technique, latent semantic analysis, to improve retrieval on an image database.
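As a rough illustration of the idea behind the case study, latent semantic analysis can be sketched as a truncated SVD of an interaction matrix. The sketch below assumes a binary session-by-image matrix in which a 1 means the image was marked relevant in that feedback session; the toy data, the function names and the choice of k = 2 are hypothetical and are not taken from the chapter.

```python
import numpy as np


def lsa_item_vectors(interactions: np.ndarray, k: int) -> np.ndarray:
    """Return one k-dimensional latent vector per item (column) via truncated SVD."""
    # interactions = U @ diag(s) @ Vt, singular values sorted in descending order
    _, s, vt = np.linalg.svd(interactions, full_matrices=False)
    # Keep the k strongest latent dimensions and scale them by their singular values
    return vt[:k].T * s[:k]


def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)  # guard against zero rows
    return unit @ unit.T


# Hypothetical log: rows are feedback sessions, columns are images 0..5.
# Images 0 and 2 never appear in the same session, but both co-occur with image 1.
LOG = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

raw = cosine_similarity_matrix(LOG.T)                        # similarity on raw co-occurrence
latent = cosine_similarity_matrix(lsa_item_vectors(LOG, 2))  # similarity in the latent space

print("raw similarity of images 0 and 2:   ", round(float(raw[0, 2]), 3))
print("latent similarity of images 0 and 2:", round(float(latent[0, 2]), 3))
print("latent similarity of images 0 and 3:", round(float(latent[0, 3]), 3))
```

In this toy log, images 0 and 2 share no session, so their raw similarity is zero, yet the latent space places them together because both co-occur with image 1. Images from the unrelated group stay apart. Real interaction logs are far sparser and noisier, so the number of latent dimensions would need to be tuned.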

Notes

  1. Sparsity values were not given for the subsampled datasets, but only users with more than 20 ratings were considered.

  2. See [26] for details.

  3. http://archive.ics.uci.edu/ml/

  4. In August 2006, AOL Research released millions of search histories that were intended to be anonymous, only for some queries to be linked to at least one individual through identifying keywords [20]. Later, in October 2006, security researchers de-anonymised the Netflix movie ratings data by cross-referencing ratings and dates with publicly available data on IMDB.com [41].

  5. Image categories used are: colors_and_textures, cougars, creative_crystals, creative_textures, cuisine, desserts, dolphins_and_whales, elephants, endangered_species, everyday_objects, fabulous_fruit, fireworks, fitness, flowering_potted_plants, flowers_closeup, foxes_and_coyotes, frost_textures, fruits_and_nuts, fungi, hawks_and_falcons.

References

  1. Chris Anderson, The long tail, Wired Magazine 12 (2004), no. 10.

  2. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern information retrieval, Addison-Wesley, Essex, England, 1999.

  3. Marko Balabanović and Yoav Shoham, Fab: content-based, collaborative recommendation, Commun. ACM 40 (1997), no. 3, 66–72.

  4. Pierre Baldi, Paolo Frasconi, and Padhraic Smyth, Modeling the internet and the web: Probabilistic methods and algorithms, John Wiley & Sons, West Sussex, England, 2003.

  5. Daniel Billsus and Michael Pazzani, Learning probabilistic user models, Proceedings of the Workshop on Machine Learning for User Modeling (Chia Laguna, IT), 1997.

  6. C. Boutilier, R. Zemel, and B. Marlin, Active collaborative filtering, In Proceedings of the Nineteenth Annual Conference on Uncertainty in Artificial Intelligence, 2003, pp. 98–106.

  7. T. Brauen, Document vector modification, The SMART Retrieval System (G. Salton, ed.), Prentice Hall, New Jersey, 1971, pp. 456–484.

  8. M. Cord and P. H. Gosselin, Image retrieval using long-term semantic learning, IEEE International Conference on Image Processing, 2006.

  9. Nick Craswell and Martin Szummer, Random walks on the click graph, In Proceedings of SIGIR 2007, 2007.

  10. S. Deerwester, S. Dumais, T. Landauer, G. Furnas, and R. Harshman, Indexing by latent semantic analysis, Journal of the American Society of Information Science 41 (1990), no. 6, 391–407.

  11. Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman, Indexing by latent semantic analysis, Journal of the American Society of Information Science 41 (1990), no. 6, 391–407.

  12. A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological) 39 (1977), no. 1, 1–38.

  13. Mukund Deshpande and George Karypis, Item-based top-N recommendation algorithms, ACM Trans. Inf. Syst. 22 (2004), no. 1, 143–177.

  14. J. Fournier and M. Cord, Long-term similarity learning in content-based image retrieval, 2002.

  15. Annalisa Franco and Alessandra Lumini, Mixture of KL subspaces for relevance feedback, Multimedia Tools Appl. 37 (2008), no. 2, 189–209.

  16. Dan Frankowski, Shyong K. Lam, Shilad Sen, F. Maxwell Harper, Scott Yilek, Michael Cassano, and John Riedl, Recommenders everywhere: the wikilens community-maintained recommender system, WikiSym ’07: Proceedings of the 2007 international symposium on Wikis (New York, NY, USA), ACM, 2007, pp. 47–60.

  17. David Goldberg, David Nichols, Brian M. Oki, and Douglas Terry, Using collaborative filtering to weave an information tapestry, Commun. ACM 35 (1992), no. 12, 61–70.

  18. Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins, Eigentaste: A constant time collaborative filtering algorithm, Information Retrieval 4 (2001), no. 2, 133–151.

  19. P.-H. Gosselin and M. Cord, Semantic kernel learning for interactive image retrieval, IEEE International Conference on Image Processing (Genoa, Italy), IEEE, sept. 2005.

  20. Katie Hafner, Tempting data, privacy concerns; researchers yearn to use AOL logs, but they hesitate, Web site: The New York Times, August 23, 2006. Retrieved on 2006-09-13. http://www.nytimes.com/2006/08/23/technology/23search.html

  21. Xiaofei He, O. King, Wei-Ying Ma, Mingjing Li, and Hong-Jiang Zhang, Learning a semantic space from user’s relevance feedback for image retrieval, Circuits and Systems for Video Technology, IEEE Transactions on 13 (2003), no. 1, 39–48.

  22. D. Heisterkamp, Building a latent-semantic index of an image database from patterns of relevance feedback, 2002.

  23. Jon Herlocker, Joseph A. Konstan, and John Riedl, An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms, Inf. Retr. 5 (2002), no. 4, 287–310.

  24. Jonathan L. Herlocker, Joseph A. Konstan, Loren G. Terveen, and John T. Riedl, Evaluating collaborative filtering recommender systems, ACM Trans. Inf. Syst. 22 (2004), no. 1, 5–53.

  25. Will Hill, Larry Stead, Mark Rosenstein, and George Furnas, Recommending and evaluating choices in a virtual community of use, CHI ’95: Proceedings of the SIGCHI conference on Human factors in computing systems (New York, NY, USA), ACM Press/Addison-Wesley Publishing Co., 1995, pp. 194–201.

  26. Thomas Hofmann, Probabilistic latent semantic analysis, Proc. of Uncertainty in Artificial Intelligence, UAI’99 (Stockholm), 1999.

  27. Thomas Hofmann, Unsupervised learning by probabilistic latent semantic analysis, IEEE Trans. on PAMI 25 (2000).

  28. Thomas Hofmann, Latent semantic models for collaborative filtering, ACM Trans. Inf. Syst. 22 (2004), no. 1, 89–115.

  29. Jeff Howe, The rise of crowdsourcing, Wired Magazine 14 (2006), no. 06.

  30. Takeo Kanade and Shingo Uchihashi, User-powered “content-free” approach to image retrieval, Proceedings of International Symposium on Digital Libraries and Knowledge Communities in Networked Information Society 2004 (DLKC04), 2004, pp. 24–32.

  31. Charles Kemp and Kotagiri Ramamohanarao, Long-term learning for web search engines, PKDD ’02: Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery (London, UK), Springer-Verlag, 2002, pp. 263–274.

  32. Markus Koskela and Jorma Laaksonen, Using long-term learning to improve efficiency of content-based image retrieval, 2003.

  33. M. Li, Z. Chen, and H. Zhang, Statistical correlation analysis in image retrieval, 2002.

  34. Luo Si, Rong Jin, Steven C. H. Hoi, and Michael R. Lyu, Collaborative image retrieval via regularized metric learning, ACM Multimedia Systems Journal (MMSJ), Special Issue on Machine Learning Approaches to Multimedia Information Retrieval 12 (2006), no. 1, 34–44.

  35. Stéphane Marchand-Maillet and Eric Bruno, Exploiting user interaction for semantic content-based image retrieval, Tech. report, Computer Vision and Multimedia Laboratory, Computing Centre, University of Geneva, 2003.

  36. Benjamin Marlin and Richard S. Zemel, The multiple multiplicative factor model for collaborative filtering, ICML ’04: Proceedings of the twenty-first international conference on Machine learning (New York, NY, USA), ACM, 2004, p. 73.

  37. P. McJones, EachMovie collaborative filtering dataset, Website: http://www.research.compaq.com/SRC/eachmovie/, 1997, DEC (now Compaq) Systems Research Center.

  38. Donn Morrison, Stéphane Marchand-Maillet, and Eric Bruno, Semantic clustering of images using patterns of relevance feedback, Proceedings of the 6th International Workshop on Content-based Multimedia Indexing (London, UK), June 18-20 2008.

  39. Henning Müller, Wolfgang Müller, David McG. Squire, Stéphane Marchand-Maillet, and Thierry Pun, Long-term learning from user behavior in content-based image retrieval, Tech. report, Université de Genève, 2000.

  40. Henning Müller, Thierry Pun, and David Squire, Learning from user behavior in image retrieval: Application of market basket analysis, Int. J. Comput. Vision 56 (2004), no. 1-2, 65–77.

  41. Arvind Narayanan and Vitaly Shmatikov, How to break anonymity of the netflix prize dataset, 2006.

  42. O. Nasraoui, C. Cardona, C. Rojas, and F. Gonzalez, Mining evolving user profiles in noisy web clickstream data with a scalable immune system clustering algorithm, 2003.

  43. Netflix, The Netflix Prize, Web site: http://www.netflixprize.com/, 2006.

  44. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, GroupLens: An Open Architecture for Collaborative Filtering of Netnews, Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work (Chapel Hill, North Carolina), ACM, 1994, pp. 175–186.

  45. J. J. Rocchio, Relevance feedback in information retrieval, The SMART Retrieval System (G. Salton, ed.), Prentice Hall, New Jersey, 1971, pp. 313–323.

  46. Bryan C. Russell, Antonio Torralba, Kevin P. Murphy, and William T. Freeman, LabelMe: A database and web-based tool for image annotation, Int. J. Comput. Vision 77 (2008), no. 1–3, 157–173.

  47. Ian Ruthven and Mounia Lalmas, A survey on the use of relevance feedback for information access systems, Knowl. Eng. Rev. 18 (2003), no. 2, 95–145.

  48. B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, Application of dimensionality reduction in recommender systems–a case study, 2000.

  49. J.B. Schafer, J.A. Konstan, and J. Riedl, The view through metalens: Usage patterns for meta-recommendation system, IEE Proceedings Software 151 (2004), 267–279.

  50. Cees Snoek, Content-based video indexing, Presentation given at Summer School on Multimedia Semantics (SSMS’07), Glasgow, UK, 2007, Slides URL: http://www.dcs.gla.ac.uk/ssms07/teaching-material/SSMS2007_CeesSnoek-part2.pdf.

  51. Alvin Toffler, Future shock, Random House, New York City, NY, USA, 1970.

  52. Carnegie Mellon University, reCAPTCHA, Web site: http://recaptcha.net/, 2007.

  53. Masao Utiyama and Mikio Yamamoto, Relevance feedback models for recommendation, Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), 2006, pp. 449–456.

  54. Luis von Ahn and Laura Dabbish, Labeling images with a computer game, CHI ’04: Proceedings of the SIGCHI conference on Human factors in computing systems (New York, NY, USA), ACM Press, 2004, pp. 319–326.

  55. Luis von Ahn, Shiry Ginosar, Mihir Kedia, Ruoran Liu, and Manuel Blum, Improving accessibility of the web with a computer game, CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems (New York, NY, USA), ACM, 2006, pp. 79–82.

  56. Luis von Ahn, Ruoran Liu, and Manuel Blum, Peekaboom: a game for locating objects in images, CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems (New York, NY, USA), ACM, 2006, pp. 55–64.

  57. A. Walker, M. M. Recker, K. Lawless, and D. Wiley, Collaborative information filtering: a review and an educational application, International Journal of Artificial Intelligence in Education 14 (2004), 1–26.

  58. Jun Wang, Arjen P. de Vries, and Marcel J. T. Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (New York, NY, USA), ACM, 2006, pp. 501–508.

  59. Jun Wang, Arjen P. de Vries, and Marcel J.T. Reinders, A user-item relevance model for log-based collaborative filtering, Proc. of European Conference on Information Retrieval (ECIR 2006), London, UK, 2006.

  60. Yu Wang, Mingyue Ding, Chengping Zhou, and Ying Hu, Interactive relevance feedback mechanism for image retrieval using rough set, Know.-Based Syst. 19 (2006), no. 8, 696–703.

  61. L. Wenyin, S. Dumais, Y. Sun, H. Zhang, M. Czerwinski, and B. Field, Semi-automatic image annotation, 2001.

  62. Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Yong Yu, Wei-Ying Ma, WenSi Xi, and WeiGuo Fan, Optimizing web search using web click-through data, CIKM ’04: Proceedings of the thirteenth ACM international conference on Information and knowledge management (New York, NY, USA), ACM, 2004, pp. 118–126.

  63. Alexei Yavlinsky and Daniel Heesch, An online system for gathering image similarity judgements, MULTIMEDIA ’07: Proceedings of the 15th international conference on Multimedia (New York, NY, USA), ACM, 2007, pp. 565–568.

  64. Tomohiro Yoshizawa and Haim Schweitzer, Long-term learning of semantic grouping from relevance-feedback, MIR ’04: Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval (New York, NY, USA), ACM, 2004, pp. 165–172.

  65. Osmar R. Zaïane, Man Xin, and Jiawei Han, Discovering web access patterns and trends by applying OLAP and data mining technology on web logs, Advances in Digital Libraries, 1998, pp. 19–29.

  66. Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, and Georg Lausen, Improving recommendation lists through topic diversification, WWW ’05: Proceedings of the 14th international conference on World Wide Web (New York, NY, USA), ACM, 2005, pp. 22–32.

Acknowledgements

This research was funded by the Swiss National Science Foundation (NSF) through IM2 (Interactive Multimedia Information Management).

Author information

Correspondence to Donn Morrison.

Copyright information

© 2010 Springer-Verlag London Limited

About this chapter

Cite this chapter

Morrison, D., Marchand-Maillet, S., Bruno, E. (2010). Capturing the Semantics of User Interaction: A Review and Case Study. In: Chbeir, R., Badr, Y., Abraham, A., Hassanien, AE. (eds) Emergent Web Intelligence: Advanced Information Retrieval. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84996-074-8_10

  • DOI: https://doi.org/10.1007/978-1-84996-074-8_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84996-073-1

  • Online ISBN: 978-1-84996-074-8

  • eBook Packages: Computer Science (R0)
