Abstract
Developing a fusion-based system is a key research issue in modern Language Identification (LID). In this paper we investigate existing fusion techniques for LID systems and propose an alternative. By directly exploiting language-dependent contribution information, we introduce and implement a novel Language-Dependent Weighting approach. We investigate several contribution measures, including LID performance, likelihood ratios, and Kullback–Leibler divergence, derived either from development datasets or from class models. The advantage of language-dependent weighting over language-independent weighting is illustrated with a Language-Dependent Contribution Map. Both the OGI and CallFriend databases show very similar contribution patterns, which are related to language characteristics. Experiments on the NIST LRE 2003 task and the OGI database demonstrate that the proposed fusion technique outperforms other recent fusion techniques when the amount of available development data is limited. In particular, the system based on Kullback–Leibler divergence achieved the best performance while eliminating the need for development data.
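The abstract describes weighting each subsystem's likelihood score per language, with weights derived from how well that subsystem separates a given language from the others (e.g. via Kullback–Leibler divergence between class models). The paper's exact formulas are not reproduced here; the following is only an illustrative sketch under two stated assumptions: each class model is a single diagonal-covariance Gaussian (so KL divergence has a closed form), and weights are the mean KL divergence from a language's model to all other languages' models within a subsystem, normalized per language across subsystems. The function names and data layout are hypothetical.

```python
import numpy as np

def gaussian_kl(mu0, var0, mu1, var1):
    """Closed-form KL divergence between two diagonal-covariance Gaussians."""
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def language_dependent_weights(models):
    """models: one dict per subsystem, mapping language -> (mean, variance).

    The weight for (subsystem, language) is the mean KL divergence from that
    language's model to every other language's model in the same subsystem:
    a subsystem that separates a language well gets a larger weight for it.
    Weights are normalized to sum to 1 across subsystems for each language.
    """
    langs = sorted(models[0])
    raw = np.zeros((len(models), len(langs)))
    for s, subsystem in enumerate(models):
        for j, lang in enumerate(langs):
            mu_l, var_l = subsystem[lang]
            kls = [gaussian_kl(mu_l, var_l, *subsystem[m]) for m in langs if m != lang]
            raw[s, j] = np.mean(kls)
    return raw / raw.sum(axis=0, keepdims=True)

def fuse(scores, weights):
    """scores: (n_subsystems, n_languages) likelihood scores.

    Returns the language-dependent weighted combination, one fused
    score per language.
    """
    return (weights * scores).sum(axis=0)
```

Note that this measure needs only the trained class models, which matches the abstract's point that the KL-based variant eliminates the need for held-out development data.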
Cite this article
Yin, B., Ambikairajah, E. & Chen, F. Language-Dependent Contribution Measuring and Weighting for Combining Likelihood Scores in Language Identification Systems. J Sign Process Syst Sign Image Video Technol 59, 201–210 (2010). https://doi.org/10.1007/s11265-008-0291-6