Arabic Readability Assessment for Foreign Language Learners

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2018)

Abstract

Reading in a foreign language is a difficult task, especially when texts are chosen without taking the reader's skill level into account. Foreign language learners need reading material suited to their reading ability. A basic tool for determining whether a text is appropriate to a reader's level is the assessment of its readability, a measure that aims to represent the human capacities required to comprehend a given text. Readability prediction is an important part of teaching and learning, for reading in a foreign language as well as in one's native language, and remains a central area of research and practice. In this paper, we present our approach to readability assessment for Modern Standard Arabic (MSA) as a foreign language. Readability prediction is carried out using the Global Language Online Support System (GLOSS) corpus, which was developed to help independent learners improve their foreign language skills and is annotated with the Interagency Language Roundtable (ILR) scale. We also introduce a frequency dictionary, developed to calculate frequency-based features. The approach yields results that surpass the state of the art for Arabic.
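The abstract does not specify how the frequency dictionary is built or which frequency-based features are computed, so the following is only a minimal sketch of the general idea. It assumes the dictionary maps each token to its relative frequency in a reference corpus; the function names, the two features (`mean_log_freq`, `oov_ratio`), and the `floor` value for unseen words are hypothetical choices for illustration, not the authors' method.

```python
import math
from collections import Counter


def build_frequency_dictionary(reference_tokens):
    """Map each token to its relative frequency in a reference corpus.

    Hypothetical helper: the paper's actual dictionary may be built
    differently (for example, over lemmas rather than surface forms).
    """
    counts = Counter(reference_tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def frequency_features(tokens, freq_dict, floor=1e-6):
    """Compute two illustrative frequency-based features for one text.

    mean_log_freq: average log relative frequency of the tokens
                   (unseen tokens are assigned the assumed `floor`).
    oov_ratio:     share of tokens absent from the dictionary.
    """
    if not tokens:
        return {"mean_log_freq": 0.0, "oov_ratio": 0.0}
    log_freqs = [math.log(freq_dict.get(t, floor)) for t in tokens]
    oov = sum(1 for t in tokens if t not in freq_dict)
    return {
        "mean_log_freq": sum(log_freqs) / len(tokens),
        "oov_ratio": oov / len(tokens),
    }


if __name__ == "__main__":
    # Toy placeholder tokens; a real setup would use tokenized Arabic text.
    reference = ["w1", "w1", "w1", "w2", "w2", "w3"]
    freq_dict = build_frequency_dictionary(reference)
    print(frequency_features(["w1", "w3", "w9"], freq_dict))
```

Under these assumptions, a text dominated by rare or unseen words gets a lower `mean_log_freq` and a higher `oov_ratio`, and such values could be passed to a classifier alongside other features.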

Notes

  1. https://gloss.dliflc.edu/. The contents of the MSA corpus have varied over time.

  2. The ILR scale (https://www.languagetesting.com/ilr-scale), developed by the U.S. Federal Government, rates language ability on a scale from 0 to 5: Level 0 (no proficiency); Level 1 (elementary proficiency); Level 2 (limited working proficiency); Level 3 (general occupational proficiency); Level 4 (advanced professional proficiency); and Level 5 (functionally native proficiency). Levels 0+, 1+, 2+, 3+, and 4+ are used when a person's skills significantly exceed those of a given level but are insufficient to reach the next level.

  3. The Waikato Environment for Knowledge Analysis (WEKA) is an open-source machine learning software suite that contains implementations of various algorithms.

Author information

Corresponding author

Correspondence to Naoual Nassiri.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Nassiri, N., Lakhouaja, A., Cavalli-Sforza, V. (2018). Arabic Readability Assessment for Foreign Language Learners. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2018. Lecture Notes in Computer Science, vol 10859. Springer, Cham. https://doi.org/10.1007/978-3-319-91947-8_49

  • DOI: https://doi.org/10.1007/978-3-319-91947-8_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91946-1

  • Online ISBN: 978-3-319-91947-8

  • eBook Packages: Computer Science, Computer Science (R0)
