Abstract
In this paper, we present a novel framework for personalized retrieval of sports video, which comprises two research tasks: semantic annotation and user preference acquisition. For semantic annotation, web-casting texts corresponding to the sports videos are first captured from webpages using data region segmentation and labeling. Using these texts, we detect events in the sports video and generate video event clips. The clips are annotated with the semantics extracted from the web-casting texts and indexed in a sports video database. Based on this annotation, the clips can be retrieved by different semantic attributes according to the user's preference. For user preference acquisition, we use click-through data as feedback from the user. Relevance feedback is applied to the text annotations and visual features to infer the user's intention and points of interest. A user preference model is then learned and used to re-rank the initial results. Experiments conducted on broadcast soccer and basketball videos show encouraging performance of the proposed method.
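As a rough illustration of the re-ranking step described in the abstract, the sketch below treats clicked clips as positive feedback and unclicked clips as negative feedback, builds a preference vector over annotation terms, and re-orders the initial results by similarity to it. It assumes a Rocchio-style update, a standard relevance feedback technique used here as a stand-in and not necessarily the preference model of the paper. It uses text annotations only and omits visual features. The clip dictionaries, the function names, and the weights `alpha` and `beta` are hypothetical.

```python
import math
from collections import Counter


def term_vector(terms):
    """Bag-of-words vector over a clip's annotation terms."""
    return {t: float(n) for t, n in Counter(terms).items()}


def centroid(vectors):
    """Average a list of sparse vectors; returns {} for an empty list."""
    out = {}
    for v in vectors:
        for t, w in v.items():
            out[t] = out.get(t, 0.0) + w / len(vectors)
    return out


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def learn_preference(clips, clicked_ids, alpha=1.0, beta=0.5):
    """Rocchio-style update (an assumption, not the paper's model):
    add the centroid of clicked clips, subtract that of unclicked clips."""
    pos = centroid([term_vector(c["terms"]) for c in clips if c["id"] in clicked_ids])
    neg = centroid([term_vector(c["terms"]) for c in clips if c["id"] not in clicked_ids])
    pref = {t: alpha * w for t, w in pos.items()}
    for t, w in neg.items():
        pref[t] = pref.get(t, 0.0) - beta * w
    return pref


def rerank(clips, pref):
    """Sort clips by similarity to the preference vector, best first."""
    return sorted(clips, key=lambda c: cosine(term_vector(c["terms"]), pref), reverse=True)


# Hypothetical initial result list with annotation terms per clip.
clips = [
    {"id": "c1", "terms": ["foul", "team_a"]},
    {"id": "c2", "terms": ["goal", "player_x"]},
    {"id": "c3", "terms": ["goal", "player_y"]},
    {"id": "c4", "terms": ["corner", "team_b"]},
]
pref = learn_preference(clips, clicked_ids={"c2"})
print([c["id"] for c in rerank(clips, pref)])
```

In this toy run the clicked clip moves to the top, followed by the other clip that shares the "goal" annotation, which is the kind of behavior a click-driven preference model is meant to produce.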
Acknowledgement
This work is supported by the National Natural Science Foundation of China (Grant No. 60833006) and the 863 Program of China (Grant No. 2006AA01Z315).
Cite this article
Zhang, YF., Xu, C., Zhang, X. et al. Personalized retrieval of sports video based on multi-modal analysis and user preference acquisition. Multimed Tools Appl 44, 305–330 (2009). https://doi.org/10.1007/s11042-009-0291-y
DOI: https://doi.org/10.1007/s11042-009-0291-y