Abstract
In this paper we evaluate the performance of a sorted Gaussian mixture model (GMM) system for text-independent speaker verification under feature-domain normalization. Sorted GMM is a speed-up algorithm proposed for GMM-based systems. The normalization schemes studied for the sorted GMM system are cepstral mean subtraction (CMS) and dynamic range normalization (DRN). The effectiveness of these normalizations has already been demonstrated in speaker recognition systems; this study shows their effect on the speed-up of GMM-based speaker verification. The baseline is a universal background model–Gaussian mixture model (UBM-GMM) system, and evaluations were performed on the NIST 2002 speaker recognition evaluation database following the NIST SRE rules. CMS and DRN improve the performance of both the baseline system and the sorted GMM system. In other words, applying CMS and DRN mitigates the performance loss incurred by reducing the computational load.
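As a rough illustration of the two feature-domain normalizations named in the abstract, the sketch below applies them to a toy utterance. CMS is shown in its usual form, subtracting the per-utterance mean of each cepstral coefficient. The abstract does not define DRN, so the min/max rescaling of each coefficient to [-1, 1] used here is an assumption and may differ from the paper's formulation. The function names and the toy data are hypothetical.

```python
from typing import List

# One inner list of cepstral coefficients per frame (hypothetical layout).
Frames = List[List[float]]


def cepstral_mean_subtraction(frames: Frames) -> Frames:
    """Subtract the per-utterance mean of each coefficient from every frame."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[d] for f in frames) / n_frames for d in range(n_coeffs)]
    return [[f[d] - means[d] for d in range(n_coeffs)] for f in frames]


def dynamic_range_normalization(frames: Frames) -> Frames:
    """Linearly rescale each coefficient so its utterance-level range is [-1, 1].

    Assumption: this min/max form is illustrative only; the abstract does not
    give the exact DRN definition used in the paper.
    """
    n_coeffs = len(frames[0])
    lo = [min(f[d] for f in frames) for d in range(n_coeffs)]
    hi = [max(f[d] for f in frames) for d in range(n_coeffs)]
    normalized = []
    for f in frames:
        row = []
        for d in range(n_coeffs):
            span = hi[d] - lo[d]
            # A constant coefficient has no range to rescale; map it to 0.
            row.append(0.0 if span == 0 else 2.0 * (f[d] - lo[d]) / span - 1.0)
        normalized.append(row)
    return normalized


# Toy utterance: three frames, two coefficients each (made-up values).
utterance = [[1.0, 10.0], [2.0, 14.0], [3.0, 12.0]]
cms = cepstral_mean_subtraction(utterance)
drn = dynamic_range_normalization(cms)
print(cms)
print(drn)
```

Both functions work per utterance and per coefficient, so they can be applied to the features before either the baseline UBM-GMM scoring or the sorted GMM scoring without changing the models' structure.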
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saeidi, R., Sadegh Mohammadi, H.R., Ganchev, T., Rodman, R.D. (2008). Effects of Feature Domain Normalizations on Text Independent Speaker Verification Using Sorted Adapted Gaussian Mixture Models. In: Sarbazi-Azad, H., Parhami, B., Miremadi, SG., Hessabi, S. (eds) Advances in Computer Science and Engineering. CSICC 2008. Communications in Computer and Information Science, vol 6. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89985-3_61
DOI: https://doi.org/10.1007/978-3-540-89985-3_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89984-6
Online ISBN: 978-3-540-89985-3
eBook Packages: Computer Science (R0)