A GMM-Based Handset Selector for Channel Mismatch Compensation with Applications to Speaker Identification

Yiu, K. K.; Mak, M. W.; Kung, S. Y.

doi:10.1007/3-540-45453-5_156

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2195)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

In telephone-based speaker identification, variation in handset characteristics can introduce severe speech variability, even for speech uttered by the same speaker. This paper proposes a method to compensate for this variation. In the method, a number of Gaussian mixture models (GMMs) are independently trained to identify the most likely handset given a test utterance. The identified handset is used to select a compensation vector from a set of pre-computed vectors, where each pre-computed vector is the average frame-by-frame difference between the clean and distorted utterances. The clean features are then recovered by subtracting the selected compensation vector from the distorted vectors. Experimental results based on 138 speakers of the YOHO and telephone YOHO corpora show that the proposed approach is computationally efficient and increases identification accuracy from 17% (without compensation) to 85% (with compensation).
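The pipeline described in the abstract (score the utterance against one GMM per handset, pick the most likely handset, subtract that handset's compensation vector) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the GMM parameters are hand-set rather than trained with EM, the features are toy 2-D vectors, the handset names and all function names are hypothetical, and the compensation vector is taken as the mean of (distorted minus clean) so that subtraction recovers the clean features.

```python
import math

def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of the frames under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        peak = max(comp_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp_logs))
    return total

def select_handset(frames, handset_gmms):
    """Return the handset whose GMM gives the highest utterance likelihood."""
    return max(handset_gmms,
               key=lambda name: gmm_log_likelihood(frames, *handset_gmms[name]))

def compensation_vector(clean_frames, distorted_frames):
    """Average frame-by-frame difference (assumed sign: distorted - clean)."""
    n, dim = len(clean_frames), len(clean_frames[0])
    return [sum(d[i] - c[i] for c, d in zip(clean_frames, distorted_frames)) / n
            for i in range(dim)]

def compensate(frames, vector):
    """Subtract the compensation vector from every distorted frame."""
    return [[xi - bi for xi, bi in zip(x, vector)] for x in frames]

# --- Toy data; every number below is made up for illustration ---
clean = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
offsets = {"handset_a": [2.0, -1.0], "handset_b": [-3.0, 0.5]}

# Pre-compute one compensation vector per handset from paired recordings.
vectors = {}
for name, off in offsets.items():
    distorted = [[xi + oi for xi, oi in zip(x, off)] for x in clean]
    vectors[name] = compensation_vector(clean, distorted)

# Hypothetical, hand-set GMMs (weights, means, variances); in practice
# these would be trained on distorted speech from each handset.
handset_gmms = {
    "handset_a": ([0.5, 0.5], [[2.0, 0.0], [3.0, -1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "handset_b": ([0.5, 0.5], [[-3.0, 1.5], [-2.0, 0.5]], [[1.0, 1.0], [1.0, 1.0]]),
}

# A test utterance recorded through handset_a.
test_clean = [[0.2, 0.8], [0.9, 0.1]]
test_distorted = [[xi + oi for xi, oi in zip(x, offsets["handset_a"])]
                  for x in test_clean]

selected = select_handset(test_distorted, handset_gmms)
recovered = compensate(test_distorted, vectors[selected])
```

Because the handset decision only requires one likelihood evaluation per handset model and the compensation is a single vector subtraction per frame, the run-time cost of a scheme like this stays low, which is in line with the efficiency claim in the abstract.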

S. Y. Kung is on sabbatical from Princeton University, USA. He is currently a Distinguished Chair Professor in the Department of Electronic and Information Engineering, The Hong Kong Polytechnic University. This project was supported by The Hong Kong Polytechnic University Grant No. 1.42.37.A410.

Author information

Authors and Affiliations

Center for Multimedia Signal Processing, Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong
K. K. Yiu, M. W. Mak & S. Y. Kung

Editor information

Editors and Affiliations

Microsoft Research China, 5/F Beijing Sigma Center, 49 Zhichun Road, Haidian District, Beijing, 100080, China
Heung-Yeung Shum
Institute of Information Science, Academia Sinica, Taiwan
Mark Liao
Department of Electrical Engineering, Columbia University, New York, NY, 10027, USA
Shih-Fu Chang

About this paper

Cite this paper

Yiu, K.K., Mak, M.W., Kung, S.Y. (2001). A GMM-Based Handset Selector for Channel Mismatch Compensation with Applications to Speaker Identification. In: Shum, HY., Liao, M., Chang, SF. (eds) Advances in Multimedia Information Processing — PCM 2001. PCM 2001. Lecture Notes in Computer Science, vol 2195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45453-5_156

DOI: https://doi.org/10.1007/3-540-45453-5_156
Published: 20 November 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42680-6
Online ISBN: 978-3-540-45453-3
eBook Packages: Springer Book Archive
