Abstract
This paper describes a recently initiated effort to collect and transcribe read as well as spontaneous speech data in four Indian languages. The completed preparatory work includes the design of phonetically rich sentences, a data acquisition setup for recording speech over a telephone channel, and a Wizard of Oz setup for acquiring speech from a spoken dialogue between a caller and a machine in the context of a remote information retrieval task. The paper gives an account of the care taken to collect speech data that is as close to real-world conditions as possible, and reports the current status of the programme and the actions planned to achieve its goal.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Samudravijaya, K. (2006). Development of Multi-lingual Spoken Corpora of Indian Languages. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_79
DOI: https://doi.org/10.1007/11939993_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)