Abstract
Timbral autoencoders, a class of generative models that learn the timbre distribution of audio data, are a current research focus in music technology. Despite recent improvements, however, they are rarely used in music composition or musical systems, owing to static musical output, a general lack of real-time synthesis, and unwieldy synthesis parameters. This project addresses these issues by combining timbral autoencoder models with a classic computer music technique, wavetable synthesis. A proof-of-concept implementation in Python, with controllers in Max and SuperCollider, demonstrates the timbral autoencoder’s capability as a wavetable generator. The concept is largely architecture agnostic, suggesting that most existing timbral autoencoders could be adapted for real-time music creation today, regardless of their native capabilities for real-time synthesis and time-varying timbre.
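To make the core idea concrete, the following is a minimal sketch, not the paper's implementation: a stand-in VAE decoder maps a latent vector (a point in the model's timbre space) to one cycle of a waveform, which is then played back with a classic wavetable phase accumulator. The names Decoder, LATENT_DIM, TABLE_SIZE, and render are illustrative assumptions, and the untrained toy network stands in for a trained timbral autoencoder's decoder.

```python
import numpy as np
import torch
import torch.nn as nn

LATENT_DIM = 16    # assumed latent size
TABLE_SIZE = 512   # one wavetable cycle, in samples

class Decoder(nn.Module):
    """Toy decoder standing in for a trained timbral VAE decoder."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 128), nn.ReLU(),
            nn.Linear(128, TABLE_SIZE), nn.Tanh(),  # bound samples to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

def render(table, freq=220.0, sr=44100, dur=1.0):
    """Classic wavetable playback: phase accumulator + linear interpolation."""
    n = int(sr * dur)
    phase = (np.arange(n) * freq * TABLE_SIZE / sr) % TABLE_SIZE
    i0 = phase.astype(int)
    i1 = (i0 + 1) % TABLE_SIZE
    frac = phase - i0
    return (1 - frac) * table[i0] + frac * table[i1]

decoder = Decoder()                      # in practice: load trained weights
z = torch.randn(1, LATENT_DIM)           # a point in the latent timbre space
table = decoder(z).detach().numpy()[0]   # decode one cycle of audio
audio = render(table)                    # cheap, real-time-friendly synthesis
```

Because the expensive neural inference happens only once per wavetable, playback itself reduces to table lookup, which is why the approach sidesteps the real-time synthesis constraints of the underlying model.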
Acknowledgments
Special thanks to Karl Yerkes (MAT, University of California Santa Barbara) for his great help with SuperCollider and OSC implementations.
© 2023 Springer Nature Switzerland AG
Cite this paper
Hyrkas, J. (2023). WaVAEtable Synthesis. In: Aramaki, M., Hirata, K., Kitahara, T., Kronland-Martinet, R., Ystad, S. (eds.) Music in the AI Era. CMMR 2021. Lecture Notes in Computer Science, vol. 13770. Springer, Cham. https://doi.org/10.1007/978-3-031-35382-6_3
Print ISBN: 978-3-031-35381-9
Online ISBN: 978-3-031-35382-6