Disfluent-to-Fluent Tunisian Dialect Speech Translation with Fine-Tuning Pre-trained Language Models

Boughariou, Emna; Bahou, Younés; Belguith, Lamia Hadrich

doi:10.1007/978-3-031-85067-7_8

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 1303)

Included in the following conference series:

International Conference on Language Processing and Knowledge Management

Abstract

In contrast to written texts and prepared speeches, conversational or spontaneous speech has a very high degree of freedom and contains a large number of disfluencies. Transformer-based models have advanced the state of the art in disfluency detection. In this work, we aim to process disfluencies in spontaneous Tunisian dialect speech by generating fluent utterances from disfluent transcripts. We propose a transformer-based model obtained by fine-tuning the pre-trained T5 language model. With this model, we achieve an F-measure of 74.71% on the evaluation part of the DisCoTAT data set.
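
For readers unfamiliar with the metric, the F-measure is the weighted harmonic mean of precision and recall. The sketch below illustrates the computation under an assumption: that scoring is done over the set of token positions marked as disfluent. The abstract does not state the scoring unit, and the function name `f_measure` and the example positions are hypothetical, not taken from the paper.

```python
def f_measure(predicted, reference, beta=1.0):
    """Return (precision, recall, F_beta) for two collections of token positions.

    Assumption for illustration only: items are positions of tokens marked as
    disfluent; the paper may score a different unit.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


# Hypothetical example: positions a system marked as disfluent vs. gold positions.
system_positions = [1, 2, 5, 8]
gold_positions = [1, 2, 3, 8]
precision, recall, f1 = f_measure(system_positions, gold_positions)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

With `beta=1.0` precision and recall are weighted equally, which is the usual reading of an unqualified "F-measure".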

Notes

1. https://huggingface.co/. Hugging Face is an NLP-focused platform with a large open-source community built around its Transformers library.

Author information

Authors and Affiliations

Sfax University, Sfax, Tunisia
Emna Boughariou & Lamia Hadrich Belguith
HA’IL University, Hail, Kingdom of Saudi Arabia
Younés Bahou

Corresponding author

Correspondence to Emna Boughariou.

Editor information

Editors and Affiliations

FSEGS, University of Sfax, Sfax, Tunisia
Lamia Hadrich Belguith
The British University in Dubai, Dubai, United Arab Emirates
Khaled Shaalan

About this paper

Cite this paper

Boughariou, E., Bahou, Y., Belguith, L.H. (2025). Disfluent-to-Fluent Tunisian Dialect Speech Translation with Fine-Tuning Pre-trained Language Models. In: Belguith, L.H., Shaalan, K. (eds) Advancements in Machine Learning and Natural Language Processing: Innovations and Applications. LPKM 2024. Lecture Notes in Networks and Systems, vol 1303. Springer, Cham. https://doi.org/10.1007/978-3-031-85067-7_8

DOI: https://doi.org/10.1007/978-3-031-85067-7_8
Published: 28 March 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-85066-0
Online ISBN: 978-3-031-85067-7
eBook Packages: Intelligent Technologies and Robotics (R0)
