Multivariability speaker recognition database in Indian scenario

International Journal of Speech Technology

Abstract

This paper describes the collection and organization of a speaker recognition database for the Indian scenario, named the IITG Multivariability Speaker Recognition Database. The database contains speech from 451 speakers speaking English and other Indian languages, in both conversational and read styles, recorded in parallel using various sensors under different environmental conditions. It is organized into four phases according to the recording conditions. Results of initial studies on a speaker verification system, which use the collected data to explore the impact of mismatch between training and test conditions, are also included. A copy of the database can be obtained by contacting the authors.

Acknowledgement

This work has been supported by project grant No. 12(4)/2009-ESD, sponsored by the Department of Information Technology, Government of India. The authors sincerely thank Mr. Akhilesh Shukla and Mr. Sumit Shukla for their efforts towards the collection and processing of the database.

Author information

Corresponding author

Correspondence to S. R. M. Prasanna.

About this article

Cite this article

Haris B C, Pradhan, G., Misra, A. et al. Multivariability speaker recognition database in Indian scenario. Int J Speech Technol 15, 441–453 (2012). https://doi.org/10.1007/s10772-012-9140-x

  • DOI: https://doi.org/10.1007/s10772-012-9140-x
