Arabic Readability Assessment for Foreign Language Learners

Nassiri, Naoual; Lakhouaja, Abdelhak; Cavalli-Sforza, Violetta

doi:10.1007/978-3-319-91947-8_49


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10859)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems


Abstract

Reading in a foreign language is difficult, especially when texts are chosen without regard to the reader’s skill level. Foreign language learners need reading material suited to their reading abilities. A basic tool for determining whether a text is appropriate to a reader’s level is the assessment of its readability, a measure that aims to represent the human capacities required to comprehend a given text. Readability prediction is an important aspect of teaching and learning, for reading in a foreign language as well as in one’s native language, and continues to be a central area of research and practice. In this paper, we present our approach to readability assessment for Modern Standard Arabic (MSA) as a foreign language. Readability prediction is carried out using the Global Language Online Support System (GLOSS) corpus, which was developed for independent learners to improve their foreign language skills and was annotated with the Interagency Language Roundtable (ILR) scale. In this study, we introduce a frequency dictionary, which was developed to calculate frequency-based features. The approach gives results that surpass the state-of-the-art results for Arabic.
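To make the idea of frequency-based features concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not list the actual features, so the function name `frequency_features`, the three features it returns, the `top_k` parameter, and the toy dictionary are all assumptions made for illustration.

```python
import math


def frequency_features(tokens, freq_dict, top_k=1000):
    """Compute illustrative frequency-based features for one tokenized text.

    tokens    -- list of word forms (or lemmas) from the text
    freq_dict -- hypothetical frequency dictionary: word -> corpus count
    top_k     -- size of the "most frequent words" list (an assumption)
    """
    n = len(tokens)
    if n == 0:
        return {"avg_log_freq": 0.0, "oov_ratio": 0.0, "top_k_ratio": 0.0}

    # The top_k most frequent dictionary entries, by corpus count.
    top_words = set(sorted(freq_dict, key=freq_dict.get, reverse=True)[:top_k])

    # Tokens that appear in the dictionary at all.
    known = [t for t in tokens if t in freq_dict]

    # Mean natural-log corpus frequency over known tokens only.
    avg_log_freq = (
        sum(math.log(freq_dict[t]) for t in known) / len(known) if known else 0.0
    )

    return {
        "avg_log_freq": avg_log_freq,
        # Share of tokens missing from the dictionary (out-of-vocabulary).
        "oov_ratio": (n - len(known)) / n,
        # Share of tokens that belong to the top_k most frequent words.
        "top_k_ratio": sum(t in top_words for t in tokens) / n,
    }


# Toy data, invented for this example only.
toy_dict = {"في": 100, "من": 80, "كتاب": 10}
toy_tokens = ["في", "كتاب", "xyz", "من"]
features = frequency_features(toy_tokens, toy_dict, top_k=2)
```

Feature vectors of this kind, one per text, could then be paired with the ILR labels and passed to a classifier. Which features and which classifier the paper actually uses is not stated in the abstract.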


Notes

1. https://gloss.dliflc.edu/. The contents of the MSA corpus have varied over time.
2. The ILR scale (https://www.languagetesting.com/ilr-scale), developed by the U.S. Federal Government, rates language ability on a scale from 0 to 5: Level 0 (no proficiency); Level 1 (elementary proficiency); Level 2 (limited working proficiency); Level 3 (general occupational proficiency); Level 4 (advanced professional proficiency); and Level 5 (functionally native proficiency). Levels 0+, 1+, 2+, 3+, and 4+ are used when a person’s skills significantly exceed those of a given level but are insufficient to reach the next level.
3. The Waikato Environment for Knowledge Analysis (WEKA) is an open-source machine learning software package that contains implementations of various algorithms.


Author information

Authors and Affiliations

Department of Computer Science, Faculty of Sciences, University Mohamed First, Oujda, Morocco
Naoual Nassiri & Abdelhak Lakhouaja
School of Science and Engineering, Al Akhawayn University, Ifrane, Morocco
Violetta Cavalli-Sforza


Corresponding author

Correspondence to Naoual Nassiri.

Editor information

Editors and Affiliations

Université de Franche-Comté, Besançon, France
Max Silberztein
Conservatoire National des Arts et Métiers, Paris, France
Faten Atigui
Conservatoire National des Arts et Métiers, Paris, France
Elena Kornyshova
Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Salford, Manchester, United Kingdom
Farid Meziane


About this paper

Cite this paper

Nassiri, N., Lakhouaja, A., Cavalli-Sforza, V. (2018). Arabic Readability Assessment for Foreign Language Learners. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2018. Lecture Notes in Computer Science, vol 10859. Springer, Cham. https://doi.org/10.1007/978-3-319-91947-8_49


DOI: https://doi.org/10.1007/978-3-319-91947-8_49
Published: 22 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91946-1
Online ISBN: 978-3-319-91947-8
eBook Packages: Computer Science, Computer Science (R0)
