Abstract
This chapter describes two approaches that apply the variogram to Mel Frequency Cepstral Coefficients (MFCCs) for the evaluation of music similarity. The first, referred to as the full variogram approach, employs all the lags of the variogram of the second MFCC coefficient. The second, referred to as the reduced variogram approach, considers a subset of the lags of the variogram of the MFCC matrix. In both cases, the variogram is proposed as a tool to summarize the timbre information contained in the MFCCs.
In addition, four weighting functions are tested in the calculation of the distance measure between songs. The performance of the proposed methods is evaluated with the pseudo-objective evaluation scheme of the MIREX Audio Music Similarity and Retrieval (AMS) task, and the results are compared against the scores obtained by other methods submitted to MIREX AMS 2011.
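As an illustration of the idea only, the following sketch computes an empirical variogram of a single coefficient track and a weighted distance between two such variograms. It assumes the classical (Matheron) estimator, gamma(h) = 0.5 * mean((x[t+h] - x[t])^2); the chapter may use a different or standardized estimator. The function names, the weighted Euclidean form of the distance, and the 1/h weighting in the demo are hypothetical and are not taken from the chapter, whose four weighting functions are not specified in this abstract.

```python
import numpy as np


def empirical_variogram(x, max_lag):
    """Classical estimator: gamma(h) = 0.5 * mean((x[t+h] - x[t])**2), h = 1..max_lag.

    `x` is a 1-D sequence, e.g. one MFCC coefficient over time (assumption).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 1:
        raise ValueError("x must be one-dimensional")
    if not 1 <= max_lag < x.size:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(x)")
    return np.array(
        [0.5 * np.mean((x[h:] - x[:-h]) ** 2) for h in range(1, max_lag + 1)]
    )


def weighted_distance(gamma_a, gamma_b, weights=None):
    """Weighted Euclidean distance between two variograms of equal length.

    The distance form and the weights are illustrative assumptions.
    """
    gamma_a = np.asarray(gamma_a, dtype=float)
    gamma_b = np.asarray(gamma_b, dtype=float)
    if gamma_a.shape != gamma_b.shape:
        raise ValueError("variograms must have the same shape")
    if weights is None:
        weights = np.ones_like(gamma_a)  # uniform weighting by default
    weights = np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(weights * (gamma_a - gamma_b) ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for one MFCC coefficient of two songs (not real data).
    song_a = rng.normal(size=500)             # uncorrelated frames
    song_b = np.cumsum(rng.normal(size=500))  # strongly correlated frames
    max_lag = 20
    g_a = empirical_variogram(song_a, max_lag)
    g_b = empirical_variogram(song_b, max_lag)
    # Hypothetical weighting that emphasizes short lags.
    w = 1.0 / np.arange(1, max_lag + 1)
    print(weighted_distance(g_a, g_b, w))
```

Using every lag up to `max_lag` loosely mirrors the full variogram idea, while passing only selected entries of the variograms (for example `g_a[[0, 4, 9]]`) would mimic a reduced set of lags; which lags the chapter actually retains is not stated here.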
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tardón, L.J., Barbancho, I. (2013). Music Similarity Evaluation Using the Variogram for MFCC Modelling. In: Aramaki, M., Barthet, M., Kronland-Martinet, R., Ystad, S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41248-6_17
DOI: https://doi.org/10.1007/978-3-642-41248-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41247-9
Online ISBN: 978-3-642-41248-6
eBook Packages: Computer Science, Computer Science (R0)