A cross-modal method of labeling music tags

Hsu, Jia-Lien; Li, Yen-Fu

doi:10.1007/s11042-011-0729-x

Published: 26 January 2011

Volume 58, pages 521–541, (2012)

Multimedia Tools and Applications

Jia-Lien Hsu¹ & Yen-Fu Li¹

Abstract

In this paper, we discuss various features of music objects in two kinds of domains. Among these features, we focus on Mel-frequency cepstral coefficients (MFCCs), which we describe with a Gaussian mixture model (GMM), and we investigate the similarity between GMMs accordingly. We then employ a multimedia graph as a cross-modal method to associate the MFCCs and the genre tags of music objects. By applying a link analysis algorithm to the graph, we assign appropriate genre tags to target music objects. Finally, we report experiments on the performance, effectiveness, and parameter settings of our approach.
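
The graph-based labeling step can be illustrated with a small sketch. The abstract does not say which link analysis algorithm is used, so the code below assumes a random walk with restart, one common choice for this kind of graph. The function names, the restart probability of 0.15, and the toy songs and tags are all hypothetical. The song-to-song edges stand in for an audio similarity judgment (for example, a distance between GMMs fitted to MFCCs), which is not computed here.

```python
def build_graph(tagged_songs, similar_pairs):
    """Build an undirected graph whose nodes are songs and genre tags.

    tagged_songs: dict mapping a song name to its known genre tags.
    similar_pairs: (song, song) pairs judged acoustically similar
                   (assumed to be given; not computed in this sketch).
    """
    adjacency = {}

    def link(a, b):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    for song, tags in tagged_songs.items():
        for tag in tags:
            link(("song", song), ("tag", tag))
    for first, second in similar_pairs:
        link(("song", first), ("song", second))
    return adjacency


def random_walk_with_restart(adjacency, start, restart=0.15, iterations=100):
    """Approximate the visiting probabilities of a walk that restarts at `start`."""
    if start not in adjacency:
        raise KeyError(start)
    score = {node: 0.0 for node in adjacency}
    score[start] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in adjacency}
        for node, neighbours in adjacency.items():
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1.0 - restart) * score[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        # The remaining mass jumps back to the start node.
        updated[start] += restart
        score = updated
    return score


def label_tags(adjacency, target_song, top_k=1):
    """Return the `top_k` tags most strongly connected to the target song."""
    scores = random_walk_with_restart(adjacency, ("song", target_song))
    tags = [node for node in scores if node[0] == "tag"]
    tags.sort(key=lambda node: scores[node], reverse=True)
    return [name for _, name in tags[:top_k]]


# Toy data: song "x" is untagged but sounds similar to "a" and "b".
graph = build_graph(
    {"a": ["rock"], "b": ["rock"], "c": ["jazz"], "d": ["jazz"]},
    [("x", "a"), ("x", "b"), ("b", "c")],
)
print(label_tags(graph, "x"))
```

In this toy graph the untagged song reaches the rock tag through two similar songs and the jazz tag only through a longer path, so the walk ranks rock first. A real system would need a principled rule for which similarity edges to keep, which this sketch leaves out.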

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, Fu Jen Catholic University, Taiwan, Republic of China
Jia-Lien Hsu & Yen-Fu Li

Corresponding author

Correspondence to Jia-Lien Hsu.

Additional information

This research was supported by Fu Jen Catholic University with Project No. 409731044039, and sponsored by the National Science Council under Contract No. NSC-97-2221-E-030-013.

About this article

Cite this article

Hsu, JL., Li, YF. A cross-modal method of labeling music tags. Multimed Tools Appl 58, 521–541 (2012). https://doi.org/10.1007/s11042-011-0729-x

Issue Date: June 2012
DOI: https://doi.org/10.1007/s11042-011-0729-x
