Abstract
We present a filter-and-refine method that speeds up nearest-neighbor search under the Kullback–Leibler divergence between multivariate Gaussians. This combination of features and similarity measure is of special interest in automatic music recommendation, where it is widely used to compute music similarity. However, the non-vectorial features and the non-metric divergence make it difficult to use with large corpora, because standard indexing algorithms cannot be applied. This paper proposes a method for fast nearest-neighbor retrieval in large databases based on the filter-and-refine approach. At its core, the method rescales the divergence and uses a modified FastMap implementation to speed up nearest-neighbor queries. Overall, the method accelerates the search for similar music pieces by a factor of 10–30 and yields high recall values of 95–99% compared to a standard linear search.
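The filter-and-refine idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a symmetrized Kullback–Leibler divergence, a square-root rescaling, and the plain FastMap pivot heuristic rather than the modified one the abstract mentions. All function names (`kl_gauss`, `skl`, `rescaled`, `fastmap_fit`, `filter_and_refine`, `random_gaussians`) are hypothetical, and the query is assumed to be a track already in the database.

```python
import numpy as np

def kl_gauss(m0, S0, m1, S1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians."""
    d = m0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff
                  - d + logdet1 - logdet0)

def skl(a, b):
    """Symmetrized KL (assumption); a and b are (mean, covariance) pairs."""
    return kl_gauss(*a, *b) + kl_gauss(*b, *a)

def rescaled(a, b):
    """Square-root rescaling (assumption); clamps tiny negative round-off."""
    return np.sqrt(max(skl(a, b), 0.0))

def fastmap_fit(items, dist, k, rng):
    """Plain FastMap: embed items into k dimensions using only dist()."""
    n = len(items)
    coords = np.zeros((n, k))

    def d2(i, j, dim):
        # Squared residual distance after removing the first `dim` axes.
        r = dist(items[i], items[j]) ** 2
        r -= np.sum((coords[i, :dim] - coords[j, :dim]) ** 2)
        return max(r, 0.0)

    for dim in range(k):
        # Pivot heuristic: random start, farthest object, farthest from that.
        a = int(rng.integers(n))
        b = max(range(n), key=lambda j: d2(a, j, dim))
        a = max(range(n), key=lambda j: d2(b, j, dim))
        dab = d2(a, b, dim)
        if dab == 0.0:
            break  # all residual distances are zero; nothing left to embed
        for i in range(n):
            # Cosine-law projection onto the line through the two pivots.
            coords[i, dim] = (d2(a, i, dim) + dab - d2(b, i, dim)) / (2.0 * np.sqrt(dab))
    return coords

def filter_and_refine(q, items, coords, n_candidates, k_nn):
    """Return indices of the k_nn nearest items to database item q."""
    # Filter: cheap Euclidean distances in the embedded space.
    approx = np.linalg.norm(coords - coords[q], axis=1)
    cand = np.argsort(approx)[:n_candidates]
    # Refine: exact divergence on the surviving candidates only.
    exact = [skl(items[q], items[c]) for c in cand]
    order = np.argsort(exact)[:k_nn]
    return [int(cand[i]) for i in order]

def random_gaussians(n, d, rng):
    """Synthetic (mean, covariance) pairs standing in for track models."""
    out = []
    for _ in range(n):
        A = rng.normal(size=(d, d))
        out.append((rng.normal(size=d), A @ A.T + d * np.eye(d)))
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    items = random_gaussians(200, 4, rng)
    coords = fastmap_fit(items, rescaled, k=5, rng=rng)
    print(filter_and_refine(0, items, coords, n_candidates=20, k_nn=5))
```

In this sketch the speed-up comes from evaluating the exact divergence only on `n_candidates` items instead of the whole collection. Raising `n_candidates` trades speed for recall, and setting it to the collection size reproduces a linear search.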





Acknowledgements
This research is supported by the Austrian Research Fund (FWF) under grant L511-N15, and by the Austrian Research Promotion Agency (FFG) under project number 815474-BRIDGE.
Cite this article
Schnitzer, D., Flexer, A. & Widmer, G. A fast audio similarity retrieval method for millions of music tracks. Multimed Tools Appl 58, 23–40 (2012). https://doi.org/10.1007/s11042-010-0679-8
DOI: https://doi.org/10.1007/s11042-010-0679-8