Contextual Video Recommendation by Multimodal Relevance and User Feedback

Authors:

Xian-Sheng Hua,

Shipeng Li

ACM Transactions on Information Systems (TOIS), Volume 29, Issue 2

Article No.: 10, Pages 1 - 24

https://doi.org/10.1145/1961209.1961213

Published: 01 April 2011

Abstract

With Internet delivery of video content surging to an unprecedented level, video recommendation, which suggests relevant videos to targeted users according to their historical and current viewings or preferences, has become one of the most pervasive online video services. This article presents a novel contextual video recommendation system, called VideoReach, based on multimodal content relevance and user feedback. We observe that an online video usually consists of several modalities (i.e., visual and audio tracks, as well as associated text such as queries, keywords, and surrounding text). Therefore, the recommended videos should be relevant to the video currently being viewed in terms of multimodal relevance. We also observe that different parts of a video hold different degrees of interest for a user, and that different features and modalities contribute differently to the overall relevance. As a result, the recommended videos should also be relevant to current users in terms of user feedback (i.e., user click-through). We then design a unified framework for VideoReach that seamlessly integrates multimodal relevance and user feedback through relevance feedback and attention fusion. VideoReach represents one of the first attempts toward contextual recommendation driven by video content and user click-through, without assuming that a sufficient collection of user profiles is available. We conducted experiments on a large-scale real-world video dataset and report the effectiveness of VideoReach.
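The general idea of combining per-modality relevance with click-through feedback can be illustrated with a minimal sketch. The abstract does not specify the fusion function or the feedback rule, so everything below is an assumption made for illustration: a plain weighted average stands in for attention fusion, a simple additive weight adjustment stands in for relevance feedback, and the modality names, the `fuse`, `rank`, and `update_weights` functions, and the `rate` parameter are all hypothetical.

```python
from typing import Dict, List

# Assumed modality names; the abstract mentions visual, audio, and textual information.
MODALITIES = ("textual", "visual", "aural")

Scores = Dict[str, float]


def fuse(scores: Scores, weights: Scores) -> float:
    """Weighted average of per-modality relevance scores (assumed to lie in [0, 1])."""
    total = sum(weights[m] for m in MODALITIES)
    return sum(weights[m] * scores[m] for m in MODALITIES) / total


def rank(candidates: Dict[str, Scores], weights: Scores) -> List[str]:
    """Return candidate video ids ordered by fused relevance, highest first."""
    return sorted(candidates, key=lambda vid: fuse(candidates[vid], weights), reverse=True)


def update_weights(weights: Scores, clicked: List[Scores],
                   skipped: List[Scores], rate: float = 0.5) -> Scores:
    """Shift weight toward modalities that score clicked videos above skipped ones.

    This is an illustrative rule, not the update used by VideoReach.
    """
    def mean(group: List[Scores], m: str) -> float:
        return sum(s[m] for s in group) / len(group) if group else 0.0

    raw = {
        m: max(weights[m] + rate * (mean(clicked, m) - mean(skipped, m)), 1e-6)
        for m in MODALITIES
    }
    norm = sum(raw.values())
    return {m: raw[m] / norm for m in MODALITIES}


# Hypothetical relevance of two candidates to the video being watched.
candidates = {
    "a": {"textual": 0.9, "visual": 0.2, "aural": 0.1},
    "b": {"textual": 0.3, "visual": 0.8, "aural": 0.7},
}
weights = {m: 1.0 / len(MODALITIES) for m in MODALITIES}
before = rank(candidates, weights)

# The user clicks "a" and ignores "b"; textual relevance gains weight.
weights = update_weights(weights, clicked=[candidates["a"]], skipped=[candidates["b"]])
after = rank(candidates, weights)
```

With uniform weights, candidate "b" ranks first; after a click on "a", the weight on the textual modality grows and "a" moves to the top. This shows how click-through alone, with no stored user profile, can re-balance the modalities.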

Index Terms

Contextual Video Recommendation by Multimodal Relevance and User Feedback
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. World Wide Web
    1. Web applications
    2. Web services

Published In

ACM Transactions on Information Systems Volume 29, Issue 2

April 2011

193 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/1961209

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 2011

Accepted: 01 December 2010

Revised: 01 August 2010

Received: 01 January 2010

Qualifiers

Research-article
Research
Refereed

Article Metrics

Total Citations: 100
Total Downloads: 1,523
Downloads (Last 12 months): 62
Downloads (Last 6 weeks): 7

Reflects downloads up to 28 Feb 2025

Cited By

Zhan Y, Yang R, You J, Huang M, Liu W, Liu X (2025). A systematic literature review on incomplete multimodal learning: techniques and challenges. Systems Science & Control Engineering 13:1. Online publication date: 26-Feb-2025.
https://doi.org/10.1080/21642583.2025.2467083
Wang G, Wu X, Tu X, Liu Z, Yan J (2024). Unsupervised Video Moment Retrieval with Knowledge-Based Pseudo-Supervision Construction. ACM Transactions on Information Systems 43:1 (1-26). Online publication date: 9-Dec-2024.
https://dl.acm.org/doi/10.1145/3701229
Qiao Y, Chen A, Li X, Gao J (2024). Variational Stochastic Multiple Auto-Encoder For Multimodal Recommendation. Proceedings of the 6th ACM International Conference on Multimedia in Asia (1-7). Online publication date: 3-Dec-2024.
https://dl.acm.org/doi/10.1145/3696409.3700269
Hu W, Chen W, Yuan W, Wang Y, Cai S, Zhang Y, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (2024). Dual-Stream Pre-Training Transformer to Enhance Multimodal Learning for Social Media Prediction. Proceedings of the 32nd ACM International Conference on Multimedia (11450-11456). Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3688998
Zeng K, Liu W, Liu D (2024). Solving the Short Video Assignment Problem via Federated Learning and Group Multi-Role Assignment. 2024 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA) (934-939). Online publication date: 30-Oct-2024.
https://doi.org/10.1109/ISPA63168.2024.00124
Lubos S, Felfernig A, Tautschnig M (2023). An overview of video recommender systems: state-of-the-art and research issues. Frontiers in Big Data 6. Online publication date: 30-Oct-2023.
https://doi.org/10.3389/fdata.2023.1281614
Pan Y, Li N, Gao C, Chang J, Niu Y, Song Y, Jin D, Li Y, Frommholz I, Hopfgartner F, Lee M, Oakes M, Lalmas M, Zhang M, Santos R (2023). Learning and Optimization of Implicit Negative Feedback for Industrial Short-video Recommender System. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (4787-4793). Online publication date: 21-Oct-2023.
https://dl.acm.org/doi/10.1145/3583780.3615482
Huang Z, Jin B, Zhao H, Liu Q, Lian D, Tengfei B, Chen E (2023). Personal or General? A Hybrid Strategy with Multi-factors for News Recommendation. ACM Transactions on Information Systems 41:2 (1-29). Online publication date: 13-Apr-2023.
https://dl.acm.org/doi/10.1145/3555373
Liu Y, Cao Q, Shen H, Wu Y, Tao S, Cheng X, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (2023). Popularity Debiasing from Exposure to Interaction in Collaborative Filtering. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (1801-1805). Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3539618.3591947
Chen X, Zhang Y, Tsang I, Pan Y, Su J (2023). Toward Equivalent Transformation of User Preferences in Cross Domain Recommendation. ACM Transactions on Information Systems 41:1 (1-31). Online publication date: 9-Jan-2023.
https://dl.acm.org/doi/10.1145/3522762
