Abstract
This work investigates disfluency processing within the field of spoken natural language understanding. Disfluency processing is the task of detecting spontaneous disorders in spoken language transcripts by distinguishing fluent words from disfluent ones. We present a transcription-based method that relies on purely linguistic features to detect disfluencies in transcriptions of the spoken Tunisian dialect. The originality of this method is that it processes several disfluency types automatically and across a wide range of domains in spontaneous spoken Tunisian dialect. It also incorporates various linguistic features, such as morpho-syntactic labels and word synonyms. Syllabic elongations, speech words, word fragments, and simple repetitions are handled by a rule-based approach, while complex repetitions, insertions, substitutions, and deletions are detected by a transition-based model trained with a machine-learning approach. We compare the transition-based model with the sequence-tagging-based model presented in previous work. Experiments show that both models are relevant to disfluency detection in the spoken Tunisian dialect, with F-measures of 79.81% and 78.97%, respectively.
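As an illustration of the kind of processing summarized above, the following minimal Python sketch flags simple repetitions with a single rule and scores the output with the F-measure. It is not the rule set or evaluation script used in this work: the function names, the "adjacent identical token" rule, and the English placeholder tokens are all assumptions made for the example.

```python
from typing import List


def tag_simple_repetitions(tokens: List[str]) -> List[bool]:
    """Flag a token as disfluent when the next token is identical.

    Hypothetical rule for illustration only: in a run such as
    "I I want", every copy except the last one is flagged, so the
    fluent reading "I want" is what remains.
    """
    return [
        i + 1 < len(tokens) and tokens[i] == tokens[i + 1]
        for i in range(len(tokens))
    ]


def f_measure(gold: List[bool], predicted: List[bool]) -> float:
    """Harmonic mean of precision and recall over the disfluent class."""
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0  # no correct detections: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Placeholder English tokens standing in for a transcribed utterance.
tokens = "I I want want to to go".split()
predicted = tag_simple_repetitions(tokens)

# Keep only the tokens that were not flagged as disfluent.
fluent = [tok for tok, is_disfluent in zip(tokens, predicted) if not is_disfluent]
```

In this sketch the fluent/disfluent decision is a per-token Boolean label, which mirrors the task definition of separating fluent from disfluent words; the harder disfluency types (complex repetitions, insertions, substitutions, deletions) are exactly the ones such a surface rule cannot capture.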


Data Availability Statement
The data are available from the authors on request.
Notes
The translation from the Tunisian dialect (TD) to English is right-to-left and word-for-word.
Funding
Not applicable.
Author information
Contributions
All authors contributed to the study. The first draft of the manuscript was written by Emna Boughariou and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Research Involving Humans and/or Animals
Not applicable.
Informed Consent
Not applicable.
About this article
Cite this article
Boughariou, E., Bahou, Y. & Belguith, L.H. Detecting Speech Disorders Using A Machine-Learning Guided Method in Spontaneous Tunisian Dialect Speech. SN COMPUT. SCI. 5, 440 (2024). https://doi.org/10.1007/s42979-024-02775-8