
Towards Deep Learning Strategies for Transcribing Electroacoustic Music

  • Conference paper
  • Published in: Perception, Representations, Image, Sound, Music (CMMR 2019)
  • Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12631)


Abstract

Electroacoustic music is experienced primarily through auditory perception, as it is usually not based on a prescriptive score. For the analysis of such pieces, transcriptions are sometimes created to illustrate events and processes graphically in a readily comprehensible way; these are usually based on the spectrogram of the recording. Although the manual generation of transcriptions is often time-consuming, they provide a useful starting point for anyone interested in a work. Deep-learning algorithms that learn to recognize characteristic spectral patterns through supervised learning are a promising technology for automating this task. This paper investigates the labeling of sound objects in electroacoustic music recordings. We test several neural-network architectures that enable the classification of sound objects, drawing on musicological and signal-processing methods. We also outline future perspectives on how our results can be improved and applied to a new gradient-based visualization approach.
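To make the pipeline sketched in the abstract concrete, the following is a minimal illustration of the three ingredients named there and in the notes below: a log-mel spectrogram computed with librosa, a small Keras CNN that classifies fixed-size spectrogram patches into sound-object categories, and a simple gradient-based saliency map. This is a sketch under stated assumptions, not the authors' actual architecture: the patch size, the layer configuration, and the four-class label set are placeholders.

```python
# Minimal sketch: spectrogram patch -> CNN label -> gradient saliency.
# NOT the paper's architecture; sizes and label count are assumptions.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

N_CLASSES = 4                 # hypothetical number of sound-object labels
N_MELS, N_FRAMES = 128, 256   # assumed fixed input patch size

def audio_to_patch(path, sr=22050):
    """Load audio and convert it to one fixed-size log-mel spectrogram patch."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    logmel = librosa.power_to_db(mel, ref=np.max)
    # Crop or zero-pad to a fixed number of frames.
    logmel = logmel[:, :N_FRAMES]
    if logmel.shape[1] < N_FRAMES:
        logmel = np.pad(logmel, ((0, 0), (0, N_FRAMES - logmel.shape[1])))
    return logmel[np.newaxis, :, :, np.newaxis].astype("float32")

def build_model():
    """A small CNN over spectrogram patches; layer sizes are illustrative."""
    return models.Sequential([
        layers.Input(shape=(N_MELS, N_FRAMES, 1)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.GlobalAveragePooling2D(),
        layers.Dense(N_CLASSES, activation="softmax"),
    ])

def saliency_map(model, patch):
    """Gradient of the top class probability w.r.t. the input spectrogram:
    the simplest form of the gradient-based visualization idea."""
    x = tf.convert_to_tensor(patch)
    with tf.GradientTape() as tape:
        tape.watch(x)
        probs = model(x)
        top_score = tf.reduce_max(probs[0])
    grad = tape.gradient(top_score, x)
    return tf.abs(grad)[0, :, :, 0].numpy()
```

A trained model could then be probed with saliency_map(model, audio_to_patch("excerpt.wav")) (file name hypothetical) to see which time-frequency regions drive a predicted label, which is the intuition behind the gradient-based visualization approach the paper points to.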


Notes

  1. http://www.ubu.com/sound/electronic.html
  2. https://keras.io/
  3. https://librosa.github.io/librosa/
  4. We requested this dataset, but unfortunately it is no longer provided by its creators.


Acknowledgements

This work has been supported by the German Research Foundation (AB 675/2-1, MU 2686/11-1). The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institut für Integrierte Schaltungen IIS.

Author information


Corresponding author

Correspondence to Matthias Nowakowski.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Nowakowski, M., Weiß, C., Abeßer, J. (2021). Towards Deep Learning Strategies for Transcribing Electroacoustic Music. In: Kronland-Martinet, R., Ystad, S., Aramaki, M. (eds) Perception, Representations, Image, Sound, Music. CMMR 2019. Lecture Notes in Computer Science, vol. 12631. Springer, Cham. https://doi.org/10.1007/978-3-030-70210-6_3


  • DOI: https://doi.org/10.1007/978-3-030-70210-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-70209-0

  • Online ISBN: 978-3-030-70210-6

  • eBook Packages: Computer Science, Computer Science (R0)
