RAMCESS 2.X framework—expressive voice analysis for realtime and accurate synthesis of singing

d’Alessandro, Nicolas; Babacan, Onur; Bozkurt, Baris; Dubuisson, Thomas; Holzapfel, Andre; Kessous, Loic; Moinet, Alexis; Vlieghe, Maxime

doi:10.1007/s12193-008-0010-4

Original Paper
Published: 05 June 2008

Journal on Multimodal User Interfaces, volume 2, pages 133–144 (2008)

Nicolas d’Alessandro¹,
Onur Babacan²,
Baris Bozkurt²,
Thomas Dubuisson¹,
Andre Holzapfel³,
Loic Kessous⁴,
Alexis Moinet¹ &
Maxime Vlieghe¹

Abstract

In this paper we present the work achieved for the second version of the RAMCESS singing synthesis framework. The main improvement is the integration of new algorithms for expressive voice analysis, in particular the separation of the glottal source and the vocal tract. Realtime synthesis modules have also been refined. These elements have been integrated into an existing digital instrument: the HandSketch 1.x, a bi-manual controller. Finally, this digital instrument is compared with existing systems.

Author information

Authors and Affiliations

Circuit Theory & Signal Processing Laboratory, Faculté Polytechnique, Mons, Belgium
Nicolas d’Alessandro, Thomas Dubuisson, Alexis Moinet & Maxime Vlieghe
Electrical and Electronics Engineering Dpt, Izmir Institute of Technology, Izmir, Turkey
Onur Babacan & Baris Bozkurt
Computer Science Dpt, University of Crete, Heraklion, Greece
Andre Holzapfel
LIMSI-CNRS, Université Paris XI, Paris, France
Loic Kessous

Corresponding author

Correspondence to Nicolas d’Alessandro.

About this article

Cite this article

d’Alessandro, N., Babacan, O., Bozkurt, B. et al. RAMCESS 2.X framework—expressive voice analysis for realtime and accurate synthesis of singing. J Multimodal User Interfaces 2, 133–144 (2008). https://doi.org/10.1007/s12193-008-0010-4

Received: 03 January 2008
Accepted: 28 April 2008
Published: 05 June 2008
Issue Date: September 2008
DOI: https://doi.org/10.1007/s12193-008-0010-4
