Disfluent-to-Fluent Tunisian Dialect Speech Translation with Fine-Tuning Pre-trained Language Models

  • Conference paper
Advancements in Machine Learning and Natural Language Processing: Innovations and Applications (LPKM 2024)

Abstract

In contrast to written text and prepared speech, conversational (spontaneous) speech has a high degree of freedom and contains many disfluencies. Transformer-based models have advanced the state of the art in disfluency detection. In this work, we process disfluencies in spontaneous Tunisian dialect speech by generating fluent utterances from disfluent transcripts. We propose a transformer-based model obtained by fine-tuning the pre-trained T5 language model. This model achieves an F-measure of 74.71% on the evaluation portion of the DisCoTAT data set.
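The abstract reports an F-measure but does not say how it is computed. As a minimal sketch, assuming a token-level scoring in which each transcript token is labeled 1 (disfluent, to be removed) or 0 (fluent, to be kept), the F-measure is the harmonic mean of precision and recall on the disfluent class. The function name `f_measure`, the label scheme, and the English example utterance are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence, Tuple


def f_measure(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the disfluent class (label 1).

    Assumption: gold and pred are aligned token-level labels,
    1 = disfluent token, 0 = fluent token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical utterance: "I want uh I want a ticket"
# The first three tokens form a repetition plus filler and are disfluent.
gold = [1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0]  # a system that misses the filler "uh"

precision, recall, f1 = f_measure(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case the system removes two of the three disfluent tokens and no fluent ones, so precision is perfect while recall is penalized for the missed filler. Other scoring conventions (for example, comparing generated fluent text against a reference) would give different numbers.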

Notes

  1. https://huggingface.co/. Hugging Face is an NLP-focused platform with a large open-source community built around its Transformers library.

Author information

Corresponding author

Correspondence to Emna Boughariou.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Boughariou, E., Bahou, Y., Belguith, L.H. (2025). Disfluent-to-Fluent Tunisian Dialect Speech Translation with Fine-Tuning Pre-trained Language Models. In: Belguith, L.H., Shaalan, K. (eds) Advancements in Machine Learning and Natural Language Processing: Innovations and Applications. LPKM 2024. Lecture Notes in Networks and Systems, vol 1303. Springer, Cham. https://doi.org/10.1007/978-3-031-85067-7_8
