Tensor semantic model for an audio classification system

  • Research Paper
  • Science China Information Sciences

Abstract

Audio resources are an important part of multimedia information, and the quality of audio classification directly affects the service mode of personal resource management systems. Vector features are currently widely used in audio classification systems. However, a simple vector representation cannot fully express some of the semantic correlations among different pieces of audio information. Tensors generalize matrices to multiple dimensions, and their mathematical expansion and application can express multi-semantic information. The tensor uniform content locator (TUCL) is proposed as a means of expressing the semantic information of audio, and a third-order tensor semantic space is constructed from the semantic tensor. Tensor semantic dispersion (TSD) can aggregate audio resources that share the same semantics, and automatic classification can be accomplished by calculating the TSD. To use the TSD classification information effectively, a radial basis function tensor neural network (RBFTNN) is constructed and used to train an intelligent learning model. Experimental results show that the tensor model can significantly improve classification precision under multi-semantic classification requests within an information resource management system.
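
To make the ingredients named in the abstract concrete, the following is a minimal generic sketch of a radial basis activation evaluated on third-order tensors. It is not the paper's TUCL, TSD, or RBFTNN: the abstract does not define them, so the function names, the choice of the Frobenius distance, the Gaussian form, the `width` parameter, and all tensor values here are assumptions made purely for illustration.

```python
import math


def frobenius_distance(a, b):
    """Frobenius distance between two equally shaped third-order tensors (nested lists)."""
    return math.sqrt(sum(
        (x - y) ** 2
        for mat_a, mat_b in zip(a, b)
        for row_a, row_b in zip(mat_a, mat_b)
        for x, y in zip(row_a, row_b)
    ))


def gaussian_rbf(sample, center, width=1.0):
    """Gaussian radial basis activation exp(-d^2 / (2 * width^2)).

    Assumption: d is the Frobenius distance; the paper may use another measure.
    """
    d = frobenius_distance(sample, center)
    return math.exp(-(d ** 2) / (2.0 * width ** 2))


# Hypothetical 2 x 2 x 2 tensors; the axes and values are placeholders,
# not semantic dimensions taken from the paper.
center = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.0, 0.0]]]
near = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.0, 0.1]]]
far = [[[0.0, 1.0], [1.0, 0.0]], [[0.0, 0.0], [1.0, 1.0]]]

print(gaussian_rbf(near, center))  # close to 1: sample lies near the center
print(gaussian_rbf(far, center))   # much smaller: sample lies far from the center
```

In a network of this kind, each hidden unit would hold one such center, and a sample would be assigned to the class whose units respond most strongly; how the paper derives its centers from the TSD is not stated in the abstract.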

Author information

Corresponding author

Correspondence to Ling Xing.

About this article

Cite this article

Xing, L., Ma, Q. & Zhu, M. Tensor semantic model for an audio classification system. Sci. China Inf. Sci. 56, 1–9 (2013). https://doi.org/10.1007/s11432-013-4821-x

  • DOI: https://doi.org/10.1007/s11432-013-4821-x
