
Towards Deep Learning Strategies for Transcribing Electroacoustic Music

  • Conference paper
  • Published in: Perception, Representations, Image, Sound, Music (CMMR 2019)
  • Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12631)


Abstract

Electroacoustic music is experienced primarily through auditory perception, as it is usually not based on a prescriptive score. For the analysis of such pieces, transcriptions are sometimes created to illustrate events and processes graphically in a readily comprehensible way; these are usually based on the spectrogram of the recording. Although the manual generation of transcriptions is often time-consuming, they provide a useful starting point for anyone interested in a work. Deep-learning algorithms that learn to recognize characteristic spectral patterns through supervised learning are a promising technology for automating this task. This paper investigates the labeling of sound objects in electroacoustic music recordings. We test several neural-network architectures that enable the classification of sound objects, drawing on musicological and signal-processing methods. We also outline future perspectives on how our results can be improved and applied to a new gradient-based visualization approach.
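To make the pipeline sketched in the abstract concrete, the following is a minimal illustration of the three ingredients named there and in the notes below: a log-mel spectrogram computed with librosa, a small Keras CNN that classifies fixed-size spectrogram patches into sound-object categories, and a simple gradient-based saliency map. This is a sketch under stated assumptions, not the authors' actual architecture: the patch size, the layer configuration, and the four-class label set are placeholders.

```python
# Minimal sketch: spectrogram patch -> CNN label -> gradient saliency.
# NOT the paper's architecture; sizes and label count are assumptions.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

N_CLASSES = 4                 # hypothetical number of sound-object labels
N_MELS, N_FRAMES = 128, 256   # assumed fixed input patch size

def audio_to_patch(path, sr=22050):
    """Load audio and convert it to one fixed-size log-mel spectrogram patch."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    logmel = librosa.power_to_db(mel, ref=np.max)
    # Crop or zero-pad to a fixed number of frames.
    logmel = logmel[:, :N_FRAMES]
    if logmel.shape[1] < N_FRAMES:
        logmel = np.pad(logmel, ((0, 0), (0, N_FRAMES - logmel.shape[1])))
    return logmel[np.newaxis, :, :, np.newaxis].astype("float32")

def build_model():
    """A small CNN over spectrogram patches; layer sizes are illustrative."""
    return models.Sequential([
        layers.Input(shape=(N_MELS, N_FRAMES, 1)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.GlobalAveragePooling2D(),
        layers.Dense(N_CLASSES, activation="softmax"),
    ])

def saliency_map(model, patch):
    """Gradient of the top class probability w.r.t. the input spectrogram:
    the simplest form of the gradient-based visualization idea."""
    x = tf.convert_to_tensor(patch)
    with tf.GradientTape() as tape:
        tape.watch(x)
        probs = model(x)
        top_score = tf.reduce_max(probs[0])
    grad = tape.gradient(top_score, x)
    return tf.abs(grad)[0, :, :, 0].numpy()
```

A trained model could then be probed with saliency_map(model, audio_to_patch("excerpt.wav")) (file name hypothetical) to see which time-frequency regions drive a predicted label, which is the intuition behind the gradient-based visualization approach the paper points to.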


Notes

  1. http://www.ubu.com/sound/electronic.html
  2. https://keras.io/
  3. https://librosa.github.io/librosa/
  4. We requested this dataset, but unfortunately it is no longer provided by its creators.


Acknowledgements

This work has been supported by the German Research Foundation (AB 675/2-1, MU 2686/11-1). The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institut für Integrierte Schaltungen IIS.

Author information


Corresponding author

Correspondence to Matthias Nowakowski.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Nowakowski, M., Weiß, C., Abeßer, J. (2021). Towards Deep Learning Strategies for Transcribing Electroacoustic Music. In: Kronland-Martinet, R., Ystad, S., Aramaki, M. (eds) Perception, Representations, Image, Sound, Music. CMMR 2019. Lecture Notes in Computer Science, vol. 12631. Springer, Cham. https://doi.org/10.1007/978-3-030-70210-6_3


  • DOI: https://doi.org/10.1007/978-3-030-70210-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-70209-0

  • Online ISBN: 978-3-030-70210-6

  • eBook Packages: Computer Science, Computer Science (R0)
