Language Identification Based on the Variations in Intonation Using Multi-classifier Systems

Ghosh, Shinjini

doi:10.1007/978-3-319-71928-3_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10682)

Included in the following conference series:

International Conference on Mining Intelligence and Knowledge Exploration

Abstract

In this article, we use the characteristics of tonal languages, together with machine learning methods, to identify patterns in them. Instead of analyzing absolute pitch or frequency, we analyze how one tone transitions to another in speech. Four features (zero crossing count, short-time energy, minimum formant frequency and maximum formant frequency) are extracted from the tonal transitions over segments of the audio signal. We have developed a multi-classifier system that uses four classifiers, namely maximum likelihood estimate (MLE), minimum distance classifier (MDC), k-nearest neighbor (kNN) classifier and fuzzy k-NN classifier, to automatically identify tonal languages from audio signals. Each classifier is first trained on known data represented by the extracted features and is then used for language identification. The results obtained from the individual classifiers are combined to generate the final output. Experiments are conducted on three tonal languages: Chinese, Thai and Vietnamese. The results show that the multi-classifier model produces promising output, that the extracted features perform better than the commonly used frequency value as a feature, and that an ensemble of classifiers performs better than the individual classifiers.
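The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how the classifier outputs are combined, so majority voting is assumed here, and only two of the four features (zero crossing count and short-time energy) and two of the four classifiers (MDC and kNN) are shown. The function names, the labels `lang_a` and `lang_b`, and the feature values are hypothetical and purely synthetic.

```python
import math
from collections import Counter


def zero_crossing_count(frame):
    """Count sign changes between consecutive samples of one audio frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def short_time_energy(frame):
    """Sum of squared sample values over one audio frame."""
    return sum(x * x for x in frame)


def minimum_distance_classify(x, train):
    """Assign x to the class whose mean feature vector is nearest (Euclidean)."""
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    means = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }
    return min(means, key=lambda label: math.dist(x, means[label]))


def knn_classify(x, train, k=3):
    """Assign x to the most common label among its k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_vote(predictions):
    """Assumed combination rule: the most frequent label wins.
    Ties go to the label that appears first in the list."""
    return Counter(predictions).most_common(1)[0][0]


# Synthetic two-dimensional feature vectors (not data from the paper).
train = [
    ([1.0, 1.0], "lang_a"),
    ([1.2, 0.8], "lang_a"),
    ([5.0, 5.0], "lang_b"),
    ([5.2, 4.8], "lang_b"),
]
sample = [1.1, 1.0]

votes = [
    minimum_distance_classify(sample, train),
    knn_classify(sample, train, k=3),
]
final_label = majority_vote(votes)
```

In a fuller version, each classifier described in the abstract would contribute one vote, and the feature vectors would come from frames around tonal transitions rather than from hand-written numbers.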

Acknowledgment

An earlier version of this work was presented at the Intel International Science and Engineering Fair (Intel ISEF), held in Los Angeles, USA, in May 2017, where it won a Grand Award. The author thanks her school teacher, Dr. Partha Pratim Roy, for advising her throughout the course of this work. Thanks are also due to the Intel Initiative for Research and Innovation in Science (IRIS) Scientific Review Committee and her mentors for their valuable comments. The author also acknowledges Rahul Roy and Ajoy Mondal, her parents’ students, for their help in conducting the experiments.

Author information

Authors and Affiliations

South Point High School, 82/7A Ballygunge Place, Kolkata, 700019, India
Shinjini Ghosh

Corresponding author

Correspondence to Shinjini Ghosh.

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
Ashish Ghosh
Institute for Development and Research in Banking Technology, Hyderabad, India
Rajarshi Pal
Indian Institute of Information Technology, Sri City, India
Rajendra Prasath

About this paper

Cite this paper

Ghosh, S. (2017). Language Identification Based on the Variations in Intonation Using Multi-classifier Systems. In: Ghosh, A., Pal, R., Prasath, R. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2017. Lecture Notes in Computer Science, vol 10682. Springer, Cham. https://doi.org/10.1007/978-3-319-71928-3_8

DOI: https://doi.org/10.1007/978-3-319-71928-3_8
Published: 28 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71927-6
Online ISBN: 978-3-319-71928-3
eBook Packages: Computer Science (R0)