Dialect Identification in Ao Using Modulation-Based Representation

  • Conference paper
Speech and Computer (SPECOM 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14339)

Abstract

This paper presents automatic dialect identification (DID) in Ao using a modulation-based approach. Ao is a low-resource Tibeto-Burman tonal language spoken in Nagaland, a state in North-East India. The work investigates dialect-specific characteristics in order to build a more robust DID system for classifying the three Ao dialects, and explores a modulation-based representation to that end. Because Ao is a tone language, the experiments were evaluated on segments of 3 s duration so that the modulation spectrogram captures temporal information. The log Mel spectrogram serves as the feature for the baseline DID system. The proposed modulation spectrogram improves accuracy by \(\approx 8\%\) over the baseline Ao DID system, which indicates that the modulation-based representation is effective for automatically identifying the three dialects of Ao.
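To make the idea of a modulation spectrogram over 3 s segments concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not give the analysis parameters, so the sampling rate, the FFT size, the hop length, and the function names `split_segments`, `stft_magnitude`, and `modulation_spectrogram` are all assumptions made for illustration. The sketch uses one common STFT-based definition: take the magnitude spectrogram of a segment, then apply a second Fourier transform along the time axis of each acoustic-frequency bin.

```python
import numpy as np


def split_segments(x, sr, seg_dur=3.0):
    """Cut a signal into non-overlapping segments of seg_dur seconds.

    Any trailing remainder shorter than one segment is dropped.
    """
    seg_len = int(round(seg_dur * sr))
    n_seg = len(x) // seg_len
    return [x[i * seg_len:(i + 1) * seg_len] for i in range(n_seg)]


def stft_magnitude(x, n_fft=256, hop=80):
    """Magnitude STFT with a Hann window, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def modulation_spectrogram(x, sr, n_fft=256, hop=80):
    """Return (mod_spec, acoustic_freqs, modulation_freqs) for one segment.

    mod_spec[i, j] is the strength of modulation frequency j in the
    temporal envelope of acoustic-frequency bin i.
    """
    mag = stft_magnitude(x, n_fft, hop)
    # Remove each bin's mean so the 0 Hz modulation term does not dominate.
    mag = mag - mag.mean(axis=1, keepdims=True)
    mod_spec = np.abs(np.fft.rfft(mag, axis=1))
    acoustic_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    frame_rate = sr / hop  # envelope sampling rate in Hz
    modulation_freqs = np.fft.rfftfreq(mag.shape[1], d=1.0 / frame_rate)
    return mod_spec, acoustic_freqs, modulation_freqs


# Synthetic check (assumed values): a 1 kHz tone amplitude-modulated at 4 Hz.
sr = 8000
t = np.arange(7 * sr) / sr
signal = (1.0 + 0.8 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

segments = split_segments(signal, sr, seg_dur=3.0)
mod_spec, ac_f, mod_f = modulation_spectrogram(segments[0], sr)
i, j = np.unravel_index(np.argmax(mod_spec), mod_spec.shape)
print("peak acoustic frequency (Hz):", ac_f[i])
print("peak modulation frequency (Hz):", mod_f[j])
```

For the synthetic signal, the strongest entry should sit near the 1 kHz acoustic bin and close to 4 Hz on the modulation axis. The segment length sets the modulation-frequency resolution, since a 3 s segment resolves modulation frequencies in steps of roughly 1/3 Hz. The log Mel spectrogram baseline mentioned in the abstract is not covered by this sketch.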

Author information

Corresponding author

Correspondence to Rishith Sadashiv T.N.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tzudir, M., Sadashiv T.N., R., Agarwal, A., Prasanna, S.R.M. (2023). Dialect Identification in Ao Using Modulation-Based Representation. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14339. Springer, Cham. https://doi.org/10.1007/978-3-031-48312-7_43

  • DOI: https://doi.org/10.1007/978-3-031-48312-7_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48311-0

  • Online ISBN: 978-3-031-48312-7

  • eBook Packages: Computer Science, Computer Science (R0)
