Abstract
The performance of video genre classification approaches strongly depends on the selected feature set. Feature selection requires expert knowledge and is commonly driven by the underlying data, the investigated video genres, and previous experience in related application scenarios. Any change to the genres of interest requires an expert to reconsider the employed features. In this work, we introduce an unsupervised method for the selection of features that efficiently represent the underlying data. Experiments in the context of audio-based video genre classification demonstrate the outstanding performance of the proposed approach and its robustness across different video datasets and genres.
Acknowledgments
This work has been partly funded by the Vienna Science and Technology Fund (WWTF) through project ICT12-010. The authors thank Marcus Hudec for drawing their attention to canonical correlation analysis (CCA). The authors would also like to thank Maurizio Montagnuolo from the RAI Centre for Research and Technological Innovation for providing the RAI TV dataset.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Sageder, G., Zaharieva, M., Breiteneder, C. (2016). Group Feature Selection for Audio-Based Video Genre Classification. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_3
DOI: https://doi.org/10.1007/978-3-319-27671-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)