Abstract
In contrast to written texts and prepared speeches, conversational (spontaneous) speech is highly unconstrained and contains a large number of disfluencies. Transformer-based models have advanced the state of the art in disfluency detection. In this work, we process disfluencies in spontaneous Tunisian dialect speech by generating fluent utterances from disfluent transcripts. We propose a transformer-based model obtained by fine-tuning the pre-trained T5 language model. With this model, we achieve an F-measure of 74.71% on the evaluation set of DisCoTAT.
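The abstract does not say how the F-measure is computed for a generative disfluent-to-fluent model. As a minimal sketch of one plausible protocol (an assumption, not the authors' stated procedure), each fluent output can be aligned with its disfluent input, the dropped tokens treated as predicted disfluencies, and the result compared with the tokens dropped in the reference. The function names and the English placeholder sentence are hypothetical and are not taken from DisCoTAT.

```python
from difflib import SequenceMatcher


def deleted_positions(disfluent, fluent):
    """Indices of disfluent tokens that do not survive in the fluent tokens."""
    matcher = SequenceMatcher(a=disfluent, b=fluent, autojunk=False)
    kept = set()
    for block in matcher.get_matching_blocks():
        kept.update(range(block.a, block.a + block.size))
    return {i for i in range(len(disfluent)) if i not in kept}


def f_measure(disfluent, reference, hypothesis):
    """Token-level precision, recall and F1 over deleted (disfluent) positions.

    Assumed scoring scheme: a token counts as disfluent when it is removed
    from the input, either in the gold reference or in the model output.
    """
    gold = deleted_positions(disfluent, reference)
    pred = deleted_positions(disfluent, hypothesis)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: a filler ("uh") and a repeated word ("to to").
disfluent = "i want uh a ticket to to tunis".split()
reference = "i want a ticket to tunis".split()       # gold fluent text
hypothesis = "i want a ticket to to tunis".split()   # model removed only "uh"

precision, recall, f1 = f_measure(disfluent, reference, hypothesis)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Here the hypothetical output removes the filler but keeps the repetition, so precision is perfect while recall is halved. Position-based alignment becomes ambiguous when a whole phrase is repeated, since either copy could be treated as the deleted one; a real evaluation would need a convention for such cases.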
Notes
1. https://huggingface.co/. Hugging Face is an NLP-focused platform with a large open-source community built around its Transformers library.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Boughariou, E., Bahou, Y., Belguith, L.H. (2025). Disfluent-to-Fluent Tunisian Dialect Speech Translation with Fine-Tuning Pre-trained Language Models. In: Belguith, L.H., Shaalan, K. (eds) Advancements in Machine Learning and Natural Language Processing: Innovations and Applications. LPKM 2024. Lecture Notes in Networks and Systems, vol 1303. Springer, Cham. https://doi.org/10.1007/978-3-031-85067-7_8
DOI: https://doi.org/10.1007/978-3-031-85067-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-85066-0
Online ISBN: 978-3-031-85067-7
eBook Packages: Intelligent Technologies and Robotics (R0)