VoCMex: a voice corpus in Mexican Spanish for research in speaker recognition

Olguín-Espinoza, José-Martín; Mayorga-Ortiz, Pedro; Hidalgo-Silva, Hugo; Vizcarra-Corral, Luis; Mendiola-Cárdenas, Mónica-Livier

doi:10.1007/s10772-012-9183-z

Published: 24 November 2012

Volume 16, pages 295–302 (2013)

International Journal of Speech Technology

José-Martín Olguín-Espinoza¹,
Pedro Mayorga-Ortiz²,
Hugo Hidalgo-Silva³,
Luis Vizcarra-Corral⁴ &
Mónica-Livier Mendiola-Cárdenas¹

Abstract

A voice corpus is an essential element of automatic speaker recognition systems. To be useful in recognition tasks, a corpus must contain recordings of several speakers pronouncing phonetically balanced utterances, recorded over several sessions using different recording media. This work presents the methodology, development and evaluation of a Mexican Spanish corpus, referred to as VoCMex, which is intended to support research on speaker recognition. It contains telephone and microphone recordings of 20 male and 13 female speakers, obtained over three sessions. To validate the usefulness of the corpus, a speaker identification system was developed, and its recognition results were similar to those obtained using a known voice corpus.

Acknowledgements

The authors would like to thank the Universidad Autónoma de Baja California (Autonomous University of Baja California), which financed the development of this work through program 1899 of the 11th internal call for research funding.

Author information

Authors and Affiliations

Facultad de Ingeniería Mexicali, Universidad Autónoma de Baja California (UABC), Mexicali, BC, México
José-Martín Olguín-Espinoza & Mónica-Livier Mendiola-Cárdenas
Departamento de Posgrado, Instituto Tecnológico de Mexicali, Mexicali, BC, México
Pedro Mayorga-Ortiz
Departamento de Ciencias de la Computación, CICESE, Ensenada, BC, México
Hugo Hidalgo-Silva
Facultad de Ciencias, UABC, Ensenada, BC, México
Luis Vizcarra-Corral

Corresponding author

Correspondence to José-Martín Olguín-Espinoza.

About this article

Cite this article

Olguín-Espinoza, JM., Mayorga-Ortiz, P., Hidalgo-Silva, H. et al. VoCMex: a voice corpus in Mexican Spanish for research in speaker recognition. Int J Speech Technol 16, 295–302 (2013). https://doi.org/10.1007/s10772-012-9183-z

Received: 17 July 2012
Accepted: 14 November 2012
Published: 24 November 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s10772-012-9183-z
