
VOICE2TUBA: transforming singing voice into a musical instrument

Multimedia Tools and Applications

Abstract

In this paper, a scheme to synthesize and convert singing voice into tuba sound is presented. First, our method estimates the fundamental frequency (F0) and the aperiodicity of a monophonic audio signal in order to obtain the pitch and volume variations of the human voice. Then, the extracted parameters are used to generate a musical excerpt that emulates a certain musical instrument (tuba) in such a way that the melody resembles the original singing. To this end, two different generation approaches are devised. One is based on additive signal synthesis from harmonic amplitudes; the other converts the F0 curve into a MIDI stream so that it can be played back with a virtual tuba.
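Two generation approaches are mentioned above: additive synthesis from harmonic amplitudes and conversion of the F0 curve into a MIDI stream. The following Python sketch illustrates both ideas under stated assumptions; the function names, the 1/k harmonic roll-off, and the sample-wise parameter handling are illustrative choices, not the implementation described in the paper.

```python
# Minimal, hypothetical sketch of the two generation approaches named in the
# abstract; it is NOT the paper's implementation. numpy is the only dependency.
import numpy as np


def f0_to_midi(f0_hz):
    """Map an F0 curve in Hz to fractional MIDI note numbers (NaN = unvoiced)."""
    f0_hz = np.asarray(f0_hz, dtype=float)
    midi = np.full(f0_hz.shape, np.nan)
    voiced = f0_hz > 0.0                      # assume 0 Hz marks unvoiced frames
    midi[voiced] = 69.0 + 12.0 * np.log2(f0_hz[voiced] / 440.0)
    return midi


def additive_synthesis(f0_hz, amp, fs=44100, n_harmonics=10):
    """Sum sinusoids at harmonics of a sample-wise F0 trajectory.

    f0_hz and amp are assumed to be already upsampled to the audio rate;
    the 1/k harmonic weights are a placeholder for measured instrument amplitudes.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    amp = np.asarray(amp, dtype=float)
    phase = 2.0 * np.pi * np.cumsum(f0_hz) / fs   # running phase of the fundamental
    out = np.zeros_like(phase)
    for k in range(1, n_harmonics + 1):
        out += (1.0 / k) * np.sin(k * phase)
    return amp * out


# Usage: a one-second 110 -> 120 Hz glide, rendered and transcribed.
fs = 44100
f0 = np.linspace(110.0, 120.0, fs)
audio = additive_synthesis(f0, amp=np.full(fs, 0.3), fs=fs)
notes = f0_to_midi(f0)
```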



Notes

  1. http://www.atic.uma.es/voice2tuba/Voice2Tuba_survey2


Acknowledgments

This work has been funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2013-47276-C6-2-R. It was carried out at Universidad de Málaga, Campus de Excelencia Internacional (CEI) Andalucía TECH.

Author information


Corresponding author

Correspondence to Ana M. Barbancho.


About this article


Cite this article

Santacruz, J.L., Tardón, L.J., Barbancho, I. et al. VOICE2TUBA: transforming singing voice into a musical instrument. Multimed Tools Appl 76, 9855–9875 (2017). https://doi.org/10.1007/s11042-016-3582-0

