A cross-modal method of labeling music tags

Multimedia Tools and Applications

Abstract

In this paper, we discuss various features of music objects in two kinds of domains. Among these features, Mel-frequency cepstral coefficients (MFCCs) are examined in more detail and modeled with a Gaussian mixture model (GMM), and the similarity between GMMs is investigated accordingly. We then employ a multimedia graph as a cross-modal method to associate the MFCCs and genre tags of music objects. By applying a link analysis algorithm to the graph, we assign appropriate genre tags to target music objects. Finally, we perform experiments to evaluate the performance, effectiveness, and parameter settings of our approach.
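
The graph-based labeling step can be illustrated with a minimal sketch. The abstract does not name the link analysis algorithm, so the code below assumes random walk with restart, one common choice for graphs of this kind. The toy graph, the node names, the `restart` value, and the helper `label_genre` are all hypothetical and are not taken from the paper. Song-to-song edges stand in for acoustic similarity (for example, a small distance between the songs' GMMs), and song-to-tag edges stand in for known genre labels.

```python
# Hypothetical toy graph: edges are undirected. Song-song edges represent
# assumed acoustic similarity; song-tag edges represent known genre labels.
EDGES = [
    ("song_a", "tag:rock"),
    ("song_b", "tag:rock"),
    ("song_c", "tag:jazz"),
    ("song_d", "tag:jazz"),
    ("query", "song_a"),   # the unlabeled query song resembles song_a
    ("query", "song_b"),   # ... and song_b
    ("song_b", "song_c"),  # a weaker bridge toward the jazz cluster
]


def random_walk_with_restart(edges, start, restart=0.15, iters=100):
    """Approximate steady-state scores of a walker that restarts at `start`."""
    # Build an undirected adjacency list from the edge pairs.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    if start not in neighbours:
        raise ValueError(f"start node {start!r} has no edges")

    # Start with all probability mass on the query node.
    score = {n: (1.0 if n == start else 0.0) for n in neighbours}
    for _ in range(iters):
        nxt = {n: 0.0 for n in neighbours}
        for n, mass in score.items():
            # With probability (1 - restart) move to a uniformly chosen neighbour.
            share = (1.0 - restart) * mass / len(neighbours[n])
            for m in neighbours[n]:
                nxt[m] += share
        # With probability `restart` jump back to the query node.
        nxt[start] += restart
        score = nxt
    return score


def label_genre(edges, query, tag_prefix="tag:"):
    """Pick the tag node with the highest score for the query song."""
    scores = random_walk_with_restart(edges, query)
    tags = [n for n in scores if n.startswith(tag_prefix)]
    return max(tags, key=lambda n: scores[n])
```

In this toy graph, `label_genre(EDGES, "query")` selects `tag:rock`, because the query song connects to two rock-labeled songs and reaches the jazz tag only through a longer path. A real system would derive the song-to-song edges from the GMM similarity measure rather than listing them by hand.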

Author information

Correspondence to Jia-Lien Hsu.

Additional information

This research was supported by Fu Jen Catholic University with Project No. 409731044039, and sponsored by the National Science Council under Contract No. NSC-97-2221-E-030-013.

About this article

Cite this article

Hsu, JL., Li, YF. A cross-modal method of labeling music tags. Multimed Tools Appl 58, 521–541 (2012). https://doi.org/10.1007/s11042-011-0729-x

  • DOI: https://doi.org/10.1007/s11042-011-0729-x
