Abstract
This paper presents automatic dialect identification (DID) in Ao using a modulation-based approach. Ao is a low-resource Tibeto-Burman tonal language spoken in Nagaland, a state in North-East India. The work investigates dialect-specific characteristics in order to build a more robust DID system for classifying the three Ao dialects. To this end, a modulation-based representation is explored. Because Ao is a tone language, the experiments were conducted on 3-second segments so that the modulation spectrogram captures temporal information. The log Mel spectrogram serves as the feature for the baseline DID system. The proposed modulation spectrogram improves accuracy by \(\approx 8\%\) over the baseline Ao DID system. This result indicates that the modulation-based representation is effective for automatically identifying the three dialects of Ao.
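As a rough illustration of the representation described above, the following sketch cuts a recording into 3-second segments and computes a modulation spectrogram for each segment by taking a second Fourier transform along the time axis of a magnitude spectrogram. It is only a minimal sketch under stated assumptions: the function names (`split_segments`, `modulation_spectrogram`), the 16 kHz sampling rate, the 25 ms window, and the 10 ms hop are illustrative choices, not the settings or toolkit used in the paper.

```python
import numpy as np


def split_segments(x, fs, seg_sec=3.0):
    """Cut a 1-D signal into non-overlapping segments of seg_sec seconds.

    Trailing samples that do not fill a whole segment are dropped.
    """
    n = int(round(fs * seg_sec))
    n_seg = len(x) // n
    return x[: n_seg * n].reshape(n_seg, n)


def modulation_spectrogram(x, fs, win_ms=25.0, hop_ms=10.0):
    """Return (mod_spec, acoustic_freqs, modulation_freqs) for one segment.

    mod_spec has shape (n_acoustic_bins, n_modulation_bins).
    The window and hop values are assumed defaults for illustration.
    """
    win = int(round(fs * win_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))
    n_frames = 1 + (len(x) - win) // hop

    # Frame the signal and apply a Hann window to each frame.
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(win)

    # First transform: magnitude spectrogram, shape (n_frames, n_acoustic_bins).
    spec = np.abs(np.fft.rfft(frames, axis=1))

    # Remove the per-bin mean so the 0 Hz modulation bin does not dominate.
    env = spec - spec.mean(axis=0, keepdims=True)

    # Second transform: FFT along time of each acoustic bin's envelope.
    mod = np.abs(np.fft.rfft(env, axis=0))

    acoustic_freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    frame_rate = fs / hop  # envelope sampling rate in Hz
    modulation_freqs = np.fft.rfftfreq(n_frames, d=1.0 / frame_rate)
    return mod.T, acoustic_freqs, modulation_freqs


# Usage: a 1 kHz tone whose amplitude varies at 4 Hz (synthetic test signal).
fs = 16000
t = np.arange(7 * fs) / fs
signal = (1.0 + 0.5 * np.cos(2 * np.pi * 4.0 * t)) * np.sin(2 * np.pi * 1000.0 * t)
segments = split_segments(signal, fs, seg_sec=3.0)
mod_spec, f_ac, f_mod = modulation_spectrogram(segments[0], fs)
```

For the synthetic signal, the energy in `mod_spec` concentrates near the 1 kHz acoustic bin and the 4 Hz modulation bin, which is the kind of slow temporal structure a 3-second segment is long enough to resolve.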
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tzudir, M., Sadashiv T.N., R., Agarwal, A., Prasanna, S.R.M. (2023). Dialect Identification in Ao Using Modulation-Based Representation. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14339. Springer, Cham. https://doi.org/10.1007/978-3-031-48312-7_43
DOI: https://doi.org/10.1007/978-3-031-48312-7_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48311-0
Online ISBN: 978-3-031-48312-7
eBook Packages: Computer Science (R0)