Abstract
This paper studies the problem of language identification for audio files. We solve the problem using digital signal processing methods only, without analyzing language-specific phonemes. Special attention is paid to the shape of the signal in the vicinity of a stop consonant. The evaluation is performed on a set of two languages; the data include speech recordings taken from TV programs. Each file is assumed to contain only one of the two languages (either Tatar or Russian). Experimental evidence demonstrates the feasibility of the proposed techniques.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Latypov, R., Nigmatullin, R., Stolov, E. (2017). Automatic Spoken Language Identification by Digital Signal Processing Methods. Tatar and Russian Languages. In: Damaševičius, R., Mikašytė, V. (eds) Information and Software Technologies. ICIST 2017. Communications in Computer and Information Science, vol 756. Springer, Cham. https://doi.org/10.1007/978-3-319-67642-5_45
DOI: https://doi.org/10.1007/978-3-319-67642-5_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67641-8
Online ISBN: 978-3-319-67642-5
eBook Packages: Computer Science, Computer Science (R0)