Abstract
Audio resources are an important part of multimedia information, and the quality of audio classification directly affects the services that a personal resource management system can offer. Vector features are now widely used in audio classification systems, but a simple vector representation cannot fully express the semantic correlations among different pieces of audio information. Tensors are multidimensional arrays, and their mathematical extension and application can express multi-semantic information. The tensor uniform content locator (TUCL) is proposed as a means of expressing the semantic information of audio, and a third-order tensor semantic space is constructed from the semantic tensor. Tensor semantic dispersion (TSD) aggregates audio resources that share the same semantics, so automatic classification of those resources can be accomplished by calculating the TSD. To make effective use of the TSD classification information, a radial basis function tensor neural network (RBFTNN) is constructed and used to train an intelligent learning model. Experimental results show that the tensor model can significantly improve classification precision for multi-semantic classification requests within an information resource management system.
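The abstract does not give the definitions of TUCL, TSD, or the RBFTNN, so the following is only a rough sketch of the general idea of applying a radial basis function to tensor-valued inputs, not the authors' method. It assumes each audio item is represented as a small third-order (three-index) tensor stored as nested lists, uses a Gaussian kernel on the squared Frobenius distance to a set of center tensors, and assigns the label of the most strongly activated center. The function names, the class labels, the toy values, and the `width` parameter are all hypothetical.

```python
import math


def frobenius_distance_sq(a, b):
    """Squared Frobenius distance between two equally shaped
    third-order tensors given as nested lists."""
    return sum(
        (x - y) ** 2
        for plane_a, plane_b in zip(a, b)
        for row_a, row_b in zip(plane_a, plane_b)
        for x, y in zip(row_a, row_b)
    )


def rbf_activations(sample, centers, width=1.0):
    """Gaussian RBF response of one tensor sample to each center tensor."""
    return [
        math.exp(-frobenius_distance_sq(sample, c) / (2.0 * width ** 2))
        for c in centers
    ]


def classify(sample, centers, labels, width=1.0):
    """Return the label of the center with the largest RBF activation.
    A trained RBF network would instead combine the activations through
    learned output weights; this nearest-center rule is a simplification."""
    acts = rbf_activations(sample, centers, width)
    best = max(range(len(acts)), key=acts.__getitem__)
    return labels[best]


def constant_tensor(value, shape=(2, 2, 2)):
    """Build a toy third-order tensor filled with a single value."""
    i, j, k = shape
    return [[[value for _ in range(k)] for _ in range(j)] for _ in range(i)]


# Hypothetical toy data: one center tensor per semantic class.
centers = [constant_tensor(0.0), constant_tensor(1.0)]
labels = ["speech", "music"]

sample = constant_tensor(0.9)
print(classify(sample, centers, labels))
```

In this toy setup the sample lies much closer to the second center, so it receives that center's label. How the paper actually derives class structure from the TSD is not recoverable from the abstract alone.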
Cite this article
Xing, L., Ma, Q. & Zhu, M. Tensor semantic model for an audio classification system. Sci. China Inf. Sci. 56, 1–9 (2013). https://doi.org/10.1007/s11432-013-4821-x