DOI: 10.1145/3460231.3478847
extended-abstract

Siamese Neural Networks for Content-based Cold-Start Music Recommendation.

Published: 13 September 2021

Abstract

Music recommendation systems typically use collaborative filtering to decide which songs to recommend to their users. This mechanism matches a user with listeners who have similar tastes, and uses their listening histories to find songs that the user will probably like. The fundamental issue with this approach is that artists need an established following before they get a fair chance of being recommended. This is known as the music cold-start problem. In this work, we investigate making music recommendations based on audio content, so that new artists still get a good chance of being recommended even if they do not yet have a sufficient number of listeners.
We propose the use of Siamese Neural Networks (SNNs) to determine the similarity between two audio clips. Each clip is first pre-processed into a Mel-spectrogram, which is then used as input to an SNN consisting of two identical Convolutional Neural Networks (CNNs). The outputs of the two CNNs are then compared to determine whether the two songs are similar. The networks were trained using audio from the Free Music Archive, with genre used as a heuristic for the similarity between song pairs.
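The pairwise comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: a single linear layer with ReLU stands in for the CNN, the Euclidean distance and the contrastive loss with a margin are assumed choices that the abstract does not specify, and all function names and toy values are hypothetical.

```python
import math
import random

def genre_pair_label(genre_a, genre_b):
    """Genre heuristic: 1 if both songs share a genre (similar), else 0."""
    return 1 if genre_a == genre_b else 0

def make_shared_weights(n_inputs, n_outputs, seed=0):
    """Random weights for one tiny embedding layer (a stand-in for the CNN)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_outputs)]

def embed(spectrogram, weights):
    """Flatten a 2-D spectrogram and apply one linear layer followed by ReLU."""
    flat = [value for row in spectrogram for value in row]
    return [max(0.0, sum(w * x for w, x in zip(row, flat))) for row in weights]

def siamese_distance(spec_a, spec_b, weights):
    """Embed both inputs with the SAME weights, then take the Euclidean distance."""
    emb_a = embed(spec_a, weights)
    emb_b = embed(spec_b, weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))

def contrastive_loss(distance, label, margin=1.0):
    """Assumed loss: similar pairs are pulled together, dissimilar pairs are
    pushed at least `margin` apart."""
    if label == 1:
        return distance ** 2
    return max(0.0, margin - distance) ** 2

# Toy 2x3 arrays standing in for Mel-spectrograms (made-up values).
spec_rock_1 = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
spec_rock_2 = [[0.8, 0.2, 0.0], [0.9, 0.1, 0.1]]

weights = make_shared_weights(n_inputs=6, n_outputs=4)
pair_distance = siamese_distance(spec_rock_1, spec_rock_2, weights)
pair_loss = contrastive_loss(pair_distance, genre_pair_label("Rock", "Rock"))
```

The property that makes the network "Siamese" is that `embed` is called with one shared `weights` object for both clips, so identical inputs always map to a distance of zero and the distance is symmetric in its arguments.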
A query-by-multiple-example (QBME) music recommendation system was developed that uses the proposed content-based similarity metric to find songs that match the user’s tastes. It was packaged inside an online blind-test survey, which first prompts participants to select a set of preferred songs and then recommends a number of songs, which the participant is asked to listen to and rate on a Likert scale. As a baseline for comparison, the recommendations from the proposed algorithm were stochastically interleaved with songs selected randomly from the user’s preferred genres. The participants were not aware that the recommendations came from two different algorithms.
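A rough sketch of the survey logic may help make this concrete. It assumes that the multiple query examples are combined by averaging their distances to each candidate, which the abstract does not state, and it uses a one-dimensional toy feature in place of the learned similarity metric. All names, song identifiers, and values are hypothetical.

```python
import random

def qbme_rank(query_ids, candidate_ids, distance, top_k):
    """Rank candidates by their mean distance to all query songs (lower is better).
    Averaging over the queries is an assumed aggregation rule."""
    def score(candidate):
        return sum(distance(q, candidate) for q in query_ids) / len(query_ids)
    pool = [c for c in candidate_ids if c not in query_ids]
    return sorted(pool, key=score)[:top_k]

def random_genre_baseline(songs_by_genre, preferred_genres, exclude, k, rng):
    """Baseline: pick k songs uniformly at random from the preferred genres."""
    pool = sorted({song
                   for genre in preferred_genres
                   for song in songs_by_genre.get(genre, [])
                   if song not in exclude})
    return rng.sample(pool, min(k, len(pool)))

def interleave_blind(proposed, baseline, rng):
    """Shuffle both lists into one playlist; the source of each song is kept
    in a separate mapping that is never shown to the participant."""
    tagged = ([(song, "snn") for song in proposed]
              + [(song, "baseline") for song in baseline])
    rng.shuffle(tagged)
    playlist = [song for song, _ in tagged]
    hidden_source = dict(tagged)
    return playlist, hidden_source

# Toy 1-D "content features" standing in for the learned similarity metric.
features = {"q1": 0.10, "q2": 0.20, "s1": 0.15, "s2": 0.90,
            "s3": 0.25, "s4": 0.60, "s5": 0.70, "s6": 0.40}

def toy_distance(a, b):
    return abs(features[a] - features[b])

songs_by_genre = {"Rock": ["s1", "s2", "s3", "s5"], "Jazz": ["s4", "s6"]}
queries = ["q1", "q2"]
rng = random.Random(42)

proposed = qbme_rank(queries, list(features), toy_distance, top_k=2)
baseline = random_genre_baseline(songs_by_genre, ["Rock", "Jazz"],
                                 exclude=set(queries) | set(proposed),
                                 k=2, rng=rng)
playlist, hidden_source = interleave_blind(proposed, baseline, rng)
```

Excluding the already recommended songs from the baseline pool keeps every playlist entry attributable to exactly one source, which is what a per-algorithm comparison of ratings needs.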
Our findings show that 60.7% of the 150 participants gave higher ratings to the recommendations made by the proposed SNN-based algorithm than to the baseline. Findings also show that 55% of the recommended songs had fewer than 1,500 listens, demonstrating that the proposed content-based approach can provide fairer exposure to all artists based on their music, independent of their fame and popularity.
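The two headline figures are simple proportions. The sketch below shows one way such proportions could be computed from blind-test data. It assumes that "higher ratings" means a higher per-participant mean Likert score, which the abstract does not spell out, and the ratings and listen counts are invented solely to exercise the functions; they are not the study's data.

```python
from statistics import mean

def share_preferring_proposed(ratings):
    """ratings: iterable of (participant_id, source, likert_score), where
    source is "snn" or "baseline". Returns the fraction of participants whose
    mean "snn" rating is strictly higher than their mean "baseline" rating."""
    by_participant = {}
    for participant, source, score in ratings:
        scores = by_participant.setdefault(participant,
                                           {"snn": [], "baseline": []})
        scores[source].append(score)
    # Only participants who rated songs from both sources can be compared.
    compared = [s for s in by_participant.values()
                if s["snn"] and s["baseline"]]
    if not compared:
        raise ValueError("no participant rated songs from both sources")
    higher = sum(1 for s in compared if mean(s["snn"]) > mean(s["baseline"]))
    return higher / len(compared)

def share_below_listens(listen_counts, threshold=1500):
    """Fraction of recommended songs with fewer than `threshold` listens."""
    return sum(1 for n in listen_counts if n < threshold) / len(listen_counts)

# Invented example data (three participants, four recommended songs).
toy_ratings = [
    ("p1", "snn", 4), ("p1", "snn", 5), ("p1", "baseline", 3),
    ("p2", "snn", 2), ("p2", "baseline", 4),
    ("p3", "snn", 3), ("p3", "baseline", 3),
]
toy_listens = [120, 1499, 1500, 20000]

preference_share = share_preferring_proposed(toy_ratings)
low_exposure_share = share_below_listens(toy_listens)
```

In this toy data only the first participant rates the proposed recommendations higher, and a tie (the third participant) does not count as a preference.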

Supplementary Material

MP4 File (Presentation Video.mp4)
Presentation video for "Siamese Neural Networks for Content-based Cold-Start Music Recommendation"


Published In

RecSys '21: Proceedings of the 15th ACM Conference on Recommender Systems
September 2021
883 pages
ISBN: 9781450384582
DOI: 10.1145/3460231
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Deep Metric Learning
  2. Music Recommendation
  3. Siamese Networks

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

RecSys '21: Fifteenth ACM Conference on Recommender Systems
September 27 - October 1, 2021
Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 111
  • Downloads (last 6 weeks): 7
Reflects downloads up to 01 Mar 2025

Cited By

  • (2025) MSR: A Personalized Movie Recommendation Model Based on Gate Mechanism and Attention Network. International Journal of Computational Intelligence and Applications. DOI: 10.1142/S1469026824420045. Online publication date: 19-Feb-2025
  • (2024) A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios. Proceedings of the 18th ACM Conference on Recommender Systems, 380-390. DOI: 10.1145/3640457.3688138. Online publication date: 8-Oct-2024
  • (2024) Multimodal Representation Learning for High-Quality Recommendations in Cold-Start and Beyond-Accuracy. Proceedings of the 18th ACM Conference on Recommender Systems, 1290-1295. DOI: 10.1145/3640457.3688009. Online publication date: 8-Oct-2024
  • (2024) ADSNet: Cross-Domain LTV Prediction with an Adaptive Siamese Network in Advertising. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5872-5881. DOI: 10.1145/3637528.3671612. Online publication date: 25-Aug-2024
  • (2024) Hybrid Recommendation Systems using Adaptive Clustering to address Cold start problems. 2024 International Conference on Electrical, Computer and Energy Technologies (ICECET), 1-6. DOI: 10.1109/ICECET61485.2024.10698666. Online publication date: 25-Jul-2024
  • (2024) Clustering-Based Frequent Pattern Mining Framework for Solving Cold-Start Problem in Recommender Systems. IEEE Access 12, 13678-13698. DOI: 10.1109/ACCESS.2024.3355057. Online publication date: 2024
  • (2024) Content-driven music recommendation. Computer Science Review 51:C. DOI: 10.1016/j.cosrev.2024.100618. Online publication date: 25-Jun-2024
  • (2024) Enhancing user experience: a content-based recommendation approach for addressing cold start in music recommendation. Journal of Intelligent Information Systems. DOI: 10.1007/s10844-024-00872-x. Online publication date: 13-Sep-2024
  • (2024) Addressing the Cold-Start Problem in Content-Based Music Recommendation Systems Through the Implementation of a Weather-Based Music Recommendation System. Innovative Technologies in Intelligent Systems and Industrial Applications, 619-633. DOI: 10.1007/978-3-031-71773-4_38. Online publication date: 30-Dec-2024
  • (2023) Towards addressing item cold-start problem in collaborative filtering by embedding agglomerative clustering and FP-growth into the recommendation system. Computer Science and Information Systems 20:4, 1343-1366. DOI: 10.2298/CSIS221116052K. Online publication date: 2023
