Dialect Identification in Ao Using Modulation-Based Representation

Tzudir, Moakala; Sadashiv T.N., Rishith; Agarwal, Ayush; Prasanna, S. R. Mahadeva

doi:10.1007/978-3-031-48312-7_43

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14339)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

This paper presents automatic dialect identification (DID) in Ao using a modulation-based approach. Ao is a low-resource Tibeto-Burman tonal language spoken in Nagaland, a state in North-East India. This work investigates dialect-specific characteristics in order to build a more robust DID system for classifying the three Ao dialects. To this end, a modulation-based representation is explored. Because Ao is a tone language, the experiments were evaluated on segments of 3 sec duration in order to capture the temporal information of the modulation spectrogram. The log Mel spectrogram is used as the feature for the baseline DID system. The proposed modulation spectrogram yields an improvement of \(\approx 8\%\) in accuracy over the baseline Ao DID system. This result indicates the effectiveness of the modulation-based representation in automatically identifying the three dialects of Ao.
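
To make the idea of a modulation spectrogram over a 3 sec segment concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a 16 kHz sampling rate, a 25 ms Hann window with a 10 ms hop, and a plain two-stage Fourier analysis; the function name `modulation_spectrogram` and all parameter values are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def modulation_spectrogram(signal, sample_rate, frame_len=400, hop=160):
    """Return (acoustic_freqs, modulation_freqs, power) for one segment.

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz).
    """
    # Stage 1: short-time Fourier magnitude, one row per frame.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    envelope = np.abs(np.fft.rfft(frames, axis=1))   # (n_frames, n_bins)

    # Remove the per-bin mean so the DC term does not dominate.
    envelope = envelope - envelope.mean(axis=0, keepdims=True)

    # Stage 2: Fourier transform along time for each acoustic bin.
    power = np.abs(np.fft.rfft(envelope, axis=0)) ** 2   # (n_mod, n_bins)

    frame_rate = sample_rate / hop
    acoustic_freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    modulation_freqs = np.fft.rfftfreq(n_frames, d=1.0 / frame_rate)
    return acoustic_freqs, modulation_freqs, power.T     # (n_bins, n_mod)

# Synthetic 3 sec test tone: a 1 kHz carrier amplitude-modulated at 4 Hz.
sr = 16000
t = np.arange(3 * sr) / sr
tone = (1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
ac_f, mod_f, mod_power = modulation_spectrogram(tone, sr)
a_idx, m_idx = np.unravel_index(np.argmax(mod_power), mod_power.shape)
print(f"peak near {ac_f[a_idx]:.0f} Hz acoustic, {mod_f[m_idx]:.2f} Hz modulation")
```

The segment length sets the modulation-frequency resolution: with a 3 sec segment the modulation bins are spaced roughly 1/3 Hz apart, so slow temporal patterns of a few hertz can be resolved. The log Mel baseline mentioned above would correspond to stopping after stage 1 and applying a Mel filterbank and logarithm, which this sketch does not implement.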

Author information

Authors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology Dharwad, Dharwad, 580011, India
Moakala Tzudir, Rishith Sadashiv T.N. & S. R. Mahadeva Prasanna
McAfee, Bengaluru, India
Ayush Agarwal

Corresponding author

Correspondence to Rishith Sadashiv T.N.

Editor information

Editors and Affiliations

St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
K. Samudravijaya
Indian Institute of Information Technology Dharwad, Dharwad, India
K. T. Deepak
Indian Institute of Technology Dharwad, Dharwad, India
Rajesh M. Hegde
KIIT Group of Colleges, Gurugram, India
Shyam S. Agrawal
Indian Institute of Technology Dharwad, Dharwad, India
S. R. Mahadeva Prasanna

About this paper

Cite this paper

Tzudir, M., Sadashiv T.N., R., Agarwal, A., Prasanna, S.R.M. (2023). Dialect Identification in Ao Using Modulation-Based Representation. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14339. Springer, Cham. https://doi.org/10.1007/978-3-031-48312-7_43

DOI: https://doi.org/10.1007/978-3-031-48312-7_43
Published: 22 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48311-0
Online ISBN: 978-3-031-48312-7
eBook Packages: Computer Science, Computer Science (R0)
