Unsupervised mining of visually consistent shots for sports genre categorization over large-scale database

Dong, Yuan; Zhao, Nan; Lian, Shiguo; Cen, Shusheng; Liu, Wei

doi:10.1007/s11235-014-9943-y

Published: 12 December 2014

Volume 59, pages 381–391, (2015)

Telecommunication Systems

Yuan Dong¹, Nan Zhao¹, Shiguo Lian³, Shusheng Cen¹ & Wei Liu²

Abstract

This paper proposes an algorithm that summarizes sports videos according to the viewpoints used in TV broadcasts, as a basis for sports genre classification. The redundancy of multiple views is one of the principal limitations in sports genre classification. To remove this redundancy, the algorithm chooses the most representative subset of shots from each game. After the videos are segmented into shots, a single keyframe is used to represent each shot, and a uniform LBP feature is extracted to represent each keyframe. Agglomerative hierarchical clustering is then performed on these keyframes. In this step, an energy-based function for clusters is introduced to match the statistical distribution of the various views, and a refined distance metric is proposed as the similarity measure between two shots. The energy function is modified to reflect the fact that temporally neighboring shots of similar duration are more likely to belong to the same view. To make full use of the high overlap within the selected keyframe subset, sparse coding and geometry visual phrases are introduced in the sports genre categorization stage. The method is evaluated on videos recorded from Orangesports, ESPN and Eurosport TV broadcasts. The average accuracy over 10 sports reaches 87.5%. The proposed algorithm has already been applied in the Orange TV video content delinearization service platform.
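The keyframe description and clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the energy-based function or the refined distance metric, so this sketch assumes a plain L1 distance between histograms, average linkage, and a fixed target number of clusters. The function names (`uniform_lbp_histogram`, `agglomerative`) and the 3x3 neighbourhood are illustrative choices, not taken from the paper.

```python
from itertools import combinations

# Offsets of the 8 neighbours of a pixel, in circular order (radius 1).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def transitions(code):
    """Count 0/1 transitions in the circular 8-bit pattern."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


# Uniform patterns (at most 2 transitions) each get a bin; all others share one.
UNIFORM = [c for c in range(256) if transitions(c) <= 2]  # 58 patterns
BIN_OF = {c: i for i, c in enumerate(UNIFORM)}


def uniform_lbp_histogram(img):
    """Normalised 59-bin uniform LBP histogram of a grayscale image (list of rows)."""
    hist = [0.0] * (len(UNIFORM) + 1)
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                if img[y + dy][x + dx] >= img[y][x]:
                    code |= 1 << bit
            hist[BIN_OF.get(code, len(UNIFORM))] += 1
    total = sum(hist) or 1.0
    return [v / total for v in hist]


def l1(a, b):
    """L1 distance between two histograms (assumed stand-in for the paper's metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerative(features, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of keyframe indices."""
    clusters = [[i] for i in range(len(features))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(l1(features[i], features[j]) for i, j in pairs) / len(pairs)

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy keyframes: two flat frames and two striped frames (synthetic data).
flat_a = [[10] * 8 for _ in range(8)]
flat_b = [[200] * 8 for _ in range(8)]
stripes = [[(x % 2) * 255 for x in range(8)] for _ in range(8)]
shifted = [[((x + 1) % 2) * 255 for x in range(8)] for _ in range(8)]
features = [uniform_lbp_histogram(k) for k in (flat_a, flat_b, stripes, shifted)]
print(agglomerative(features, 2))
```

On these synthetic keyframes the two flat frames share one texture histogram and the two striped frames share another, so the two groups end up in separate clusters. A real system would stop merging according to a criterion such as the energy function described above rather than a fixed cluster count.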

Notes

FIFA document, Football Stadiums: Technical recommendations and requirements, 5th Edition, available at www.fifa.com/aboutfifa/officialdocuments/doclists/laws.html
Fig. 1: TV camera positions required by FIFA and the multiple views captured in one soccer broadcast video

Acknowledgments

This work is sponsored by the collaborative research project (SEV01100474) between Beijing University of Posts and Telecommunications and France Telecom R&D – Orange Lab Beijing, the National High Technology Research and Development Program of China (863 Program, No. 2012AA012505), and the National Natural Science Foundation of China (61372169).

Author information

Authors and Affiliations

Beijing University of Posts and Telecommunications, Beijing, 100876, People’s Republic of China
Yuan Dong, Nan Zhao & Shusheng Cen
France Telecom Research & Development – Orange Lab Beijing, Beijing, 100190, People’s Republic of China
Wei Liu
Huawei Central Research Institute, Beijing, 100086, People’s Republic of China
Shiguo Lian

Corresponding author

Correspondence to Yuan Dong.

About this article

Cite this article

Dong, Y., Zhao, N., Lian, S. et al. Unsupervised mining of visually consistent shots for sports genre categorization over large-scale database. Telecommun Syst 59, 381–391 (2015). https://doi.org/10.1007/s11235-014-9943-y

Issue Date: July 2015
DOI: https://doi.org/10.1007/s11235-014-9943-y
