Integrating Complementary Features with a Confidence Measure for Speaker Identification

Zheng, Nengheng; Ching, P. C.; Wang, Ning; Lee, Tan

doi:10.1007/11939993_57

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

This paper investigates the effectiveness of integrating complementary acoustic features to improve speaker identification performance. It studies the complementary contributions of two acoustic features: the conventional vocal-tract-related feature MFCC and the recently proposed vocal-source-related feature WOCOR. An integrated system, which performs score-level fusion of MFCC and WOCOR with a confidence measure as the weighting parameter, is proposed to take full advantage of the complementarity between the two features. The confidence measure is derived from the speaker discrimination power of MFCC and WOCOR in each individual identification trial, so that more weight is given to the feature with higher confidence in speaker discrimination. Experiments show that information fusion with this confidence-measure-based varying weight outperforms fusion with a pre-trained fixed weight in speaker identification.
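
The idea of score-level fusion with a per-trial weight can be illustrated with a minimal sketch. The abstract does not give the paper's confidence measure, so everything specific here is an assumption made for illustration: the function names are hypothetical, the "discrimination power" of a feature is approximated by the gap between its best and second-best speaker scores, the weight is the WOCOR share of the two gaps, and both score streams are assumed to be already normalised to comparable ranges. It is not the authors' formulation.

```python
def top_two_gap(scores):
    """Assumed proxy for discrimination power: best score minus runner-up."""
    ranked = sorted(scores, reverse=True)
    return ranked[0] - ranked[1]


def wocor_weight(mfcc_scores, wocor_scores):
    """Hypothetical per-trial weight in [0, 1] given to the WOCOR stream.

    The stream that separates its top candidate more clearly gets more weight.
    """
    gap_m = top_two_gap(mfcc_scores)
    gap_w = top_two_gap(wocor_scores)
    total = gap_m + gap_w
    if total == 0.0:
        return 0.5  # neither stream discriminates: fall back to equal weights
    return gap_w / total


def fuse(mfcc_scores, wocor_scores, weight):
    """Score-level fusion: convex combination of the two per-speaker scores."""
    return [(1.0 - weight) * m + weight * w
            for m, w in zip(mfcc_scores, wocor_scores)]


def identify(scores):
    """Return the index of the speaker with the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Illustrative (made-up) scores for three enrolled speakers in one trial.
mfcc = [-41.0, -40.5, -44.0]   # MFCC barely prefers speaker 1
wocor = [-30.0, -35.0, -36.0]  # WOCOR clearly prefers speaker 0

w = wocor_weight(mfcc, wocor)  # varying weight, recomputed for each trial
fused = fuse(mfcc, wocor, w)
print(identify(mfcc), identify(wocor), identify(fused))
```

In this made-up trial the MFCC stream alone narrowly picks speaker 1, while the fused scores pick speaker 0 because the more decisive WOCOR stream receives the larger weight. A pre-trained fixed weight would correspond to passing the same constant `weight` to `fuse` in every trial.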

Author information

Authors and Affiliations

Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong
Nengheng Zheng, P. C. Ching, Ning Wang & Tan Lee

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Zheng, N., Ching, P.C., Wang, N., Lee, T. (2006). Integrating Complementary Features with a Confidence Measure for Speaker Identification. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_57

DOI: https://doi.org/10.1007/11939993_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
