Abstract
Cover Song Identification (CSI) refers to the process of identifying an alternative version, performance, rendition, or recording of a previously recorded musical composition by quantitatively and objectively measuring and modeling the musical similarity between them. However, a single similarity function cannot describe the similarity between tracks comprehensively and reliably. In this paper, the Similarity Network Fusion (SNF) technique, which was originally proposed for combining different kernels for predicting drug-target interactions, is adopted to fuse similarities that are computed from the same descriptor with different similarity functions. First, the Harmonic Pitch Class Profile (HPCP) is extracted from each track. Next, the similarities between the HPCP descriptors of every pair of tracks are calculated with the Qmax and Dmax measures. Then, track-by-track similarity networks based on the Qmax and Dmax similarities are constructed separately and fused into one network by SNF. Finally, the fused similarities obtained from the fused network are used to train a classifier, which can then determine whether two input tracks form a reference/cover pair or a reference/non-cover pair. Experimental results on Covers80 (http://labrosa.ee.columbia.edu/projects/coversongs/covers80/), a subset of SecondHandSongs (SHS) (http://labrosa.ee.columbia.edu/millionsong/secondhand), and the Mixed Collection and Mazurka Cover Collection provided by MIREX (http://www.music-ir.org/mirex/wiki/2016:Audio_Cover_Song_Identification) demonstrate that the proposed scheme performs comparably with, or even better than, state-of-the-art CSI schemes.
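The fusion step described above can be illustrated with a minimal sketch. It is a simplified two-view variant of SNF cross-diffusion written with NumPy, not the authors' implementation. The function names (`full_kernel`, `local_kernel`, `snf_two_views`), the parameters `k` and `n_iter`, and the toy matrices standing in for Qmax- and Dmax-based track-by-track similarities are all assumptions made for illustration; no real HPCP, Qmax, or Dmax values are used.

```python
import numpy as np

def full_kernel(W):
    """Scale each row so off-diagonal weights sum to 1/2; set the diagonal to 1/2."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    row_sums = W.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0          # guard against isolated tracks
    P = W / (2.0 * row_sums)
    np.fill_diagonal(P, 0.5)
    return P

def local_kernel(W, k):
    """Keep each row's k largest off-diagonal weights and row-normalise them."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    idx = np.argsort(-W, axis=1)[:, :k]    # indices of the k nearest neighbours
    rows = np.arange(W.shape[0])[:, None]
    S = np.zeros_like(W)
    S[rows, idx] = W[rows, idx]
    row_sums = S.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return S / row_sums

def snf_two_views(W_a, W_b, k=2, n_iter=20):
    """Fuse two similarity networks by iterative cross-diffusion (simplified)."""
    P_a, P_b = full_kernel(W_a), full_kernel(W_b)
    S_a, S_b = local_kernel(W_a, k), local_kernel(W_b, k)
    for _ in range(n_iter):
        # Each view is updated by diffusing the OTHER view's status matrix
        # through its own sparse neighbourhood structure.
        next_a = S_a @ P_b @ S_a.T
        next_b = S_b @ P_a @ S_b.T
        P_a, P_b = full_kernel(next_a), full_kernel(next_b)
    fused = (P_a + P_b) / 2.0
    return (fused + fused.T) / 2.0         # enforce symmetry

# Toy data (hypothetical): 6 tracks, two cover cliques {0,1,2} and {3,4,5}.
same_clique = np.kron(np.eye(2), np.ones((3, 3))).astype(bool)
W_qmax = np.where(same_clique, 0.9, 0.1)   # stand-in for Qmax-based similarity
W_dmax = np.where(same_clique, 0.7, 0.3)   # stand-in for Dmax-based similarity

fused = snf_two_views(W_qmax, W_dmax)
print(np.round(fused, 3))
```

In this sketch, each row of `fused` could serve as the fused similarity of one track to all others, and the value for a given pair could then be passed to a classifier as its feature. The re-normalisation after every iteration is a design choice made here for numerical stability, and other SNF implementations handle the diagonal differently.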
Additional information
This work was supported by the National Natural Science Foundation of China (61271349).
About this article
Cite this article
Chen, N., Li, W. & Xiao, H. Fusing similarity functions for cover song identification. Multimed Tools Appl 77, 2629–2652 (2018). https://doi.org/10.1007/s11042-017-4456-9
DOI: https://doi.org/10.1007/s11042-017-4456-9