Maximum Likelihood and Maximum a Posteriori Adaptation for Distributed Speaker Recognition Systems

Sit, Chin-Hung; Mak, Man-Wai; Kung, Sun-Yuan

doi:10.1007/978-3-540-25948-0_87

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3072)

Included in the following conference series:

International Conference on Biometric Authentication

Abstract

We apply the ETSI distributed speech recognition (DSR) standard to speaker verification over telephone networks and investigate how extracting spectral features from different stages of the ETSI front-end affects verification performance. We also evaluate two approaches to creating speaker models in the context of distributed speaker verification: maximum likelihood (ML) and maximum a posteriori (MAP). In the ML approach, random vectors whose variances depend on the distance between unquantized training vectors and their closest code vector are added to the vector-quantized feature vectors extracted from client speech; the resulting vectors are then used to train speaker-dependent Gaussian mixture models (GMMs) by ML techniques. In the MAP approach, vector-quantized feature vectors extracted from client speech are used to adapt a universal background model to speaker-dependent GMMs. Experimental results based on 145 speakers from the SPIDRE corpus show that quantized feature vectors extracted from the server side can be used directly for MAP adaptation. The results also show that the best-performing system is based on the ML approach, although that approach is sensitive to the number of input dimensions of the training data.
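
The two modelling routes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the noise variance is derived from the quantization distance, nor which GMM parameters are adapted, so the code assumes a per-dimension mean squared quantization error for the noise variance and a mean-only MAP update with a relevance factor. All function names, the diagonal-covariance assumption, and the default relevance value of 16 are hypothetical choices made for illustration.

```python
import math
import random


def nearest_code_vector(x, codebook):
    """Return the codebook entry with the smallest squared Euclidean distance to x."""
    return min(codebook, key=lambda c: sum((xi - ci) ** 2 for xi, ci in zip(x, c)))


def quantization_error_variance(unquantized, codebook):
    """Assumed noise model: per-dimension mean squared distance between
    unquantized vectors and their closest code vector."""
    dim = len(unquantized[0])
    sq_err = [0.0] * dim
    for x in unquantized:
        c = nearest_code_vector(x, codebook)
        for i in range(dim):
            sq_err[i] += (x[i] - c[i]) ** 2
    return [e / len(unquantized) for e in sq_err]


def add_quantization_noise(quantized, noise_var, rng):
    """ML route: add zero-mean Gaussian noise to each vector-quantized feature
    vector before the vectors are used for ML training of a GMM."""
    return [[q + rng.gauss(0.0, math.sqrt(v)) for q, v in zip(vec, noise_var)]
            for vec in quantized]


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(data, weights, means, variances, relevance=16.0):
    """MAP route (mean-only, assumed): shift each background-model mean toward
    the data assigned to it, weighted by how much data that component saw."""
    k, dim = len(means), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for x in data:
        for j, p in enumerate(posteriors(x, weights, means, variances)):
            counts[j] += p
            for i in range(dim):
                sums[j][i] += p * x[i]
    adapted = []
    for j in range(k):
        alpha = counts[j] / (counts[j] + relevance)  # data-dependent mixing weight
        if counts[j] > 0.0:
            expected = [s / counts[j] for s in sums[j]]
        else:
            expected = means[j]  # component saw no data: keep the prior mean
        adapted.append([alpha * e + (1.0 - alpha) * m
                        for e, m in zip(expected, means[j])])
    return adapted
```

In this sketch, components of the background model that receive little client data stay close to their original means, while well-observed components move toward the client's vectors. For the ML route, `quantization_error_variance` would be computed once from unquantized training vectors and then passed to `add_quantization_noise` with a seeded `random.Random` instance.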

Author information

Authors and Affiliations

Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University, China
Chin-Hung Sit & Man-Wai Mak
Dept. of Electrical Engineering, Princeton University, USA
Sun-Yuan Kung

Editor information

Editors and Affiliations

Biometrics Research Centre, Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong
David Zhang
Department of Computer Science and Engineering, Michigan State University
Anil K. Jain

About this paper

Cite this paper

Sit, CH., Mak, MW., Kung, SY. (2004). Maximum Likelihood and Maximum a Posteriori Adaptation for Distributed Speaker Recognition Systems. In: Zhang, D., Jain, A.K. (eds) Biometric Authentication. ICBA 2004. Lecture Notes in Computer Science, vol 3072. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25948-0_87

DOI: https://doi.org/10.1007/978-3-540-25948-0_87
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22146-3
Online ISBN: 978-3-540-25948-0
eBook Packages: Springer Book Archive
