Abstract
This paper describes the collection and organization of a speaker recognition database for the Indian scenario, named the IITG Multivariability Speaker Recognition Database. The database contains speech from 451 speakers speaking English and other Indian languages, in both conversational and read speech styles, recorded using various sensors in parallel under different environmental conditions. The database is organized into four phases according to the recording conditions employed. Results of initial studies on a speaker verification system, conducted with the collected data to explore the impact of mismatch between training and test conditions, are also included. A copy of the database can be obtained by contacting the authors.
Acknowledgement
This work has been supported by project grant No. 12(4)/2009-ESD, sponsored by the Department of Information Technology, Government of India. The authors sincerely thank Mr. Akhilesh Shukla and Mr. Sumit Shukla for their efforts towards the collection and processing of the database.
Cite this article
Haris B C, Pradhan, G., Misra, A. et al. Multivariability speaker recognition database in Indian scenario. Int J Speech Technol 15, 441–453 (2012). https://doi.org/10.1007/s10772-012-9140-x
DOI: https://doi.org/10.1007/s10772-012-9140-x