A resource of errors written in Spanish by people with dyslexia and its linguistic, phonetic and visual analysis

  • Original Paper
  • Language Resources and Evaluation

Abstract

In this work we introduce the analysis of DysList, a language resource for Spanish composed of a list of unique spelling errors extracted from a collection of texts written by people with dyslexia. Each of the errors was annotated with a set of characteristics as well as with visual and phonetic features. To the best of our knowledge, this is the largest resource of this kind in Spanish. We also analyzed all the features of Spanish errors and our main finding is that dyslexic errors are phonetically and visually motivated.

Notes

  1. In some literature, dyslexia is referred to as a specific reading disability only (Vellutino et al. 2004) and dysgraphia as its written manifestation (Romani et al. 1999).

  2. This estimation was carried out taking into account primary schools in the region of Murcia, Spain.

  3. The preliminary resource was described in the Proceedings of the LREC 2012 Workshop on Natural Language Processing for Improving Textual Accessibility (NLP4ITA), 27 May, Istanbul, Turkey, pp. 22–26 (Rello et al. 2012). This work presents an enlarged collection of texts with new annotations. The preliminary analysis appeared in the Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), 26–31 May, Reykjavik, Iceland, pp. 1289–1296 (Rello et al. 2014). Whereas the LREC paper focuses on the language resource only, this paper extends that work by presenting analyses of the errors using linguistic, phonetic and visual features.

  4. Available at: grupoweb.upf.es/WRG/resources/DysWebxia/DysList_resource.csv.gz.

  5. Penfriend XL (http://www.penfriend.biz/).

  6. http://www.dcs.bbk.ac.uk/~jenny/resources.html.

  7. The examples give the correct word (the first word in parentheses) and its English translation (the second word, in quotes). Erroneous words are preceded by an asterisk ‘*’. We use the standard linguistic conventions: ‘\(\langle\) \(\rangle\)’ for graphemes, ‘/ /’ for phonemes and ‘[ ]’ for phones.

  8. The edit or Levenshtein distance (Levenshtein 1965) is the minimum number of substitutions, insertions and deletions needed to transform one string into another. The Damerau version (Damerau 1964) counts a transposition as a single error instead of two errors. Notice that there might be more than one sequence of edits that achieves the edit distance.

  9. Notice that a deletion in the target word is an insertion in the misspelled word and vice versa.

  10. Here we refer to all web pages written in Spanish, not the web pages from Spain. For determining whether a web page was written in Spanish, we used Google Advanced Search settings http://www.google.com/advanced_search.

  11. In Catalan, the sound [i] is always represented by the letter \(\langle\)i\(\rangle\), while in Spanish it may also be represented by \(\langle\)y\(\rangle\); moreover, \(\langle\)y\(\rangle\) in Catalan is only present in the digraph \(\langle\)ny\(\rangle\), used to represent the palatal nasal consonant [ɲ]. Thus, transfer from Catalan might explain the errors.

  12. http://grupoweb.upf.es/WRG/resources/DysWebxia/DysList_resource.csv.gz.

  13. Similarly to the corpus created by Jara for adults in Costa Rica (Jara Murillo 2013), we will create a list of errors from a control group comparable in age and Spanish variant.
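
As an illustration of the distance described in note 8, the following is a minimal Python sketch, not the authors' implementation. It computes the restricted variant of the Damerau distance (often called optimal string alignment), in which a swap of two adjacent letters counts as one edit. The function name `damerau_distance` is hypothetical, and both strings are assumed to use the same Unicode normalization form so that accented letters compare as single characters.

```python
def damerau_distance(source: str, target: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = edits needed to turn source[:i] into target[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters
    for j in range(cols):
        dist[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition counted as a single edit (Damerau).
            if (
                i > 1
                and j > 1
                and source[i - 1] == target[j - 2]
                and source[i - 2] == target[j - 1]
            ):
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


# Correct word first, misspelling second (examples taken from the appendix).
print(damerau_distance("digo", "dijo"))   # one substitution
print(damerau_distance("creía", "qría"))  # one substitution plus one deletion
```

Swapping the two arguments gives the same value, which reflects note 9: a deletion in the target word is an insertion in the misspelled word and vice versa.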

References

  • American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders: DSM-IV-TR. Arlington, VA: American Psychiatric Publishing.

  • Aragón, L. E., & Silva, A. (2000). Análisis cualitativo de un instrumento para detectar errores de tipo disléxico (IDETID-LEA) (Qualitative analysis of an instrument to detect dyslexic errors, IDETID-LEA). Psicothema, 12(Supl. 2), 35–38.

  • Baeza-Yates, R. & Rello, L. (2011). Estimating dyslexia in the Web. In Proceedings of the 8th international cross-disciplinary conference on web accessibility (W4A 2011). ACM Press, Hyderabad, India.

  • Brunswick, N. (2010). Unimpaired reading development and dyslexia across different languages. In S. McDougall & P. del Mornay Davies (Eds.), Reading and dyslexia in different orthographies (pp. 131–154). Hove: Psychology Press.

  • Carrillo, M. S., Alegría, J., Miranda, P., & Sánchez, N. (2011). Evaluación de la dislexia en la escuela primaria: Prevalencia en español (Evaluation of dyslexia in primary school: The prevalence in Spanish). Escritos de Psicología, 4(2), 35–44.

  • Col·legi de Logopedes de Catalunya (2011). PRODISCAT: Protocol de detecció i actuació en la dislèxia. Àmbit educatiu (Protocol for detection and management of dyslexia. Educational scope). Departament d’Educació, Generalitat de Catalunya (Department of Education, Catalan Government), Barcelona.

  • Coleman, C., Gregg, N., McLain, L., & Bellair, L. W. (2009). A comparison of spelling performance across young adults with and without dyslexia. Assessment for Effective Intervention, 34(2), 94–105.

  • Connelly, V., Campbell, S., MacLean, M., & Barnes, J. (2006). Contribution of lower order skills to the written composition of college students with and without dyslexia. Developmental Neuropsychology, 29(1), 175–196.

  • Damerau, F. J. (1964). A technique for computer detection and correction of spelling errors. Communications of the ACM, 7, 171–176.

  • Fernández, M. (2011). Lingüística de corpus y adquisición de la lengua (Corpus linguistics and language acquisition). Madrid: Arco/Libros.

  • Franceschini, S., Gori, S., Ruffino, M., Pedrolli, K., & Facoetti, A. (2012). A causal link between visual spatial attention and reading acquisition. Current Biology, 22(9), 814–819.

  • Garrote, M. (2010). Los corpus de habla infantil. Metodología y análisis (Children’s spoken language corpora. Methodology and analysis). Madrid: Ediciones de la Universidad Autónoma de Madrid.

  • Gelman, I. A., & Barletta, A. L. (2008). A “quick and dirty” website data quality indicator. In The 2nd ACM Workshop on Information Credibility on the Web (WICOW ’08) (pp. 43–46). Napa Valley, USA.

  • Gil, J. (2007). Fonética para profesores de español: de la teoría a la práctica (Phonetics for teachers of Spanish: from theory to practice). Madrid: Arco/Libros.

  • Goulandris, N. (Ed.). (2003). Dyslexia in different languages: Cross-linguistic comparisons. London: Whurr Publishers.

  • Gregor, P., Dickinson, A., Macaffer, A., & Andreasen, P. (2003). Seeword: A personal word processing environment for dyslexic computer users. British Journal of Educational Technology, 34(3), 341–355.

  • Hernández García, C. (1998). Una propuesta de clasificación de la interferencia lingüística a partir de dos lenguas en contacto: el catalán y el español (A proposal for the classification of linguistic transfer based on two languages in contact: Catalan and Spanish). Hesperia. Anuario de filología hispánica, 1, 61–80.

  • Holbrook, D. (1964). English for the rejected: Training literacy in the lower streams of the secondary school. Cambridge: Cambridge University Press.

  • Interagency Commission on Learning Disabilities. (1987). Learning disabilities: A report to the U.S. Congress. Government Printing Office, Washington, DC.

  • International Phonetic Association. (1999). Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet. Cambridge: Cambridge University Press.

  • Jara Murillo, C. V. (2013). COCAE: corpus cacográfico adulto del español de Costa Rica (COCAE: a cacographical corpus of adults in Costa Rica Spanish). Universidad de Costa Rica, San José de Costa Rica. http://hdl.handle.net/10669/8928

  • Korhonen, T. (2008). Adaptive spell checker for dyslexic writers. In Proceedings of the 11th international conference on Computers Helping People with Special Needs (ICCHP ’08) (pp. 733–741). Berlin, Heidelberg: Springer.

  • Levenshtein, V. (1965). Binary codes capable of correcting spurious insertions and deletions of ones. Problems of Information Transmission, 1, 8–17.

  • Li, A. Q., Sbattella, L., & Tedesco, R. (2013). Polispell: An adaptive spellchecker and predictor for people with dyslexia. In S. Carberry, S. Weibelzahl, A. Micarelli, & G. Semeraro (Eds.), User Modeling, Adaptation, and Personalization (pp. 302–309). Berlin, Heidelberg: Springer.

  • Lindgrén, S. A., & Laine, M. (2011). Multilingual dyslexia in university students: Reading and writing patterns in three languages. Clinical Linguistics & Phonetics, 25(9), 753–766.

  • Llisterri, J. & Mariño, J. B. (1993). Spanish adaptation of SAMPA and automatic phonetic transcription. Tech. rep., ESPRIT project 6819 SAM-A Speech Technology Assessment in Multilingual Applications. http://liceu.uab.cat/~joaquim/publicacions/SAMPA_Spanish_93.pdf.

  • Lyon, G. R., Shaywitz, S. E., & Shaywitz, B. A. (2003). A definition of dyslexia. Annals of Dyslexia, 53(1), 1–14.

  • Machuca, M. J. (2000). Problemas de pronunciación (Pronunciation problems). In S. Alcoba (Ed.), La expresión oral (Oral expression) (pp. 71–88). Barcelona: Ariel.

  • Meng, H., Smith, S., Hager, K., Held, M., Liu, J., Olson, R., et al. (2005). DCDC2 is associated with reading disability and modulates neuronal development in the brain. Proceedings of the National Academy of Sciences, 102, 17053–17058.

  • Mitton, R. (1996). English spelling and the computer. Harlow: Longman.

  • Moats, L. C. (1996). Phonological spelling errors in the writing of dyslexic adolescents. Reading and Writing, 8(1), 105–119.

  • Navarro Tomás, T. (1980). Manual de pronunciación española (Manual of Spanish pronunciation) (20th ed.). Madrid: Consejo Superior de Investigaciones Científicas. (Original work published in 1918).

  • Orton Dyslexia Society Research Committee. (1994). Operational definition of dyslexia. In C. Scruggs (Ed.), Perspectives, 20(5), 4.

  • Pedler, J. (2007). Computer correction of real-word spelling errors in dyslexic text. Ph.D. thesis, Birkbeck College, London University.

  • Piskorski, J., Sydow, M., & Weiss, D. (2008). Exploring linguistic features for web spam detection: a preliminary study. In Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web (AIRWeb ’08) (pp. 25–28). New York, NY: ACM Press.

  • Pujol, M. (2004). Análisis de errores grafemáticos en textos libres de estudiantes de enseñanzas medias (Analysis of graphematic errors in free texts by secondary school students). Ph.D. thesis, Departament de Didàctica de la Llengua i la Literatura, Universitat de Barcelona, Barcelona. http://www.tdx.cat/handle/10803/1283.

  • Ramírez, F., & López, E. (2006). Spelling error patterns in Spanish for word processing applications. In Proceedings of the 5th international conference on language resources and evaluation (LREC 2006) (pp. 93–98). Genoa, Italy.

  • Ramus, F. (2003). Developmental dyslexia: Specific phonological deficit or general sensorimotor dysfunction? Current Opinion in Neurobiology, 13(2), 212–218.

  • Real Academia Española. (2001). Diccionario de la lengua española (Dictionary of the Spanish Language) (22nd ed.). Madrid: Espasa-Calpe.

  • Real Academia Española. (2005). Diccionario panhispánico de dudas (Pan-Hispanic Dictionary of Doubts). Madrid: Santillana.

  • Rello, L., & Baeza-Yates, R. (2012). The presence of English and Spanish dyslexia in the Web. New Review of Hypermedia and Multimedia, 8, 131–158.

  • Rello, L., Baeza-Yates, R., & Llisterri, J. (2014). DysList: An annotated resource of dyslexic errors. In Proceedings of the 9th international conference on language resources and evaluation (LREC 2014) (pp. 1289–1296). Reykjavik, Iceland.

  • Rello, L., Baeza-Yates, R., Saggion, H., & Pedler, J. (2012). A first approach to the creation of a Spanish corpus of dyslexic texts. LREC Workshop Natural Language Processing for Improving Textual Accessibility (NLP4ITA) (pp. 22–27). Istanbul, Turkey.

  • Rello, L., Bayarri, C., & Gorriz, A. (2012). What is wrong with this word? Dyseggxia: A game for children with dyslexia (demo). In Proceedings of the 14th international ACM SIGACCESS conference on computers and accessibility (ASSETS ’12) (pp. 219–220). Boulder, USA: ACM Press.

  • Rello, L., Bayarri, C., Otal, Y., & Pielot, M. (2014). A computer-based method to improve the spelling of children with dyslexia. In Proceedings of the 16th international ACM SIGACCESS conference on computers & accessibility (ASSETS ’14) (pp. 153–160). Rochester, NY, USA, October 20–22, 2014.

  • Romani, C., Ward, J., & Olson, A. (1999). Developmental surface dysgraphia: What is the underlying cognitive impairment? The Quarterly Journal of Experimental Psychology, 52(1), 97–128.

  • Schulte-Körne, G., Deimel, W., Müller, K., Gutenbrunner, C., & Remschmidt, H. (1996). Familial aggregation of spelling disability. Journal of Child Psychology and Psychiatry, 37(7), 817–822.

  • Seymour, P. H. K., Aro, M., & Erskine, J. M. (2003). Foundation literacy acquisition in European orthographies. British Journal of Psychology, 94(2), 143–174.

  • Snowling, M. (1998). Dyslexia as a phonological deficit: Evidence and implications. Child and Adolescent Mental Health, 3(1), 4–11.

  • Spooner, R. (1998). A spelling aid for dyslexic writers. Ph.D. thesis, University of York.

  • Sterling, C., Farmer, M., Riddick, B., Morgan, S., & Matthews, C. (1998). Adult dyslexic writing. Dyslexia, 4(1), 1–15.

  • Toro, J., & Cervera, M. (1984). TALE: Test de Análisis de Lectoescritura (TALE: Literacy Analysis Test). Madrid: Visor.

  • Treiman, R. (1997). Spelling in normal children and dyslexics. In B. A. Blachman (Ed.), Foundations of reading acquisition and dyslexia: Implications for early intervention (pp. 191–218). Mahwah, NJ: Lawrence Erlbaum.

  • Vellutino, F. R., Fletcher, J. M., Snowling, M. J., & Scanlon, D. M. (2004). Specific reading disability (dyslexia): What have we learned in the past four decades? Journal of Child Psychology and Psychiatry, 45(1), 2–40.

  • Vidyasagar, T. R., & Pammer, K. (2010). Dyslexia: A deficit in visuo-spatial attention, not in phonological processing. Trends in Cognitive Sciences, 14(2), 57–63.

  • Wells, J. C. (2000). Computer-coding the IPA: A proposed extension of SAMPA. Division of Psychology and Language Sciences, University College London, London. http://www.phon.ucl.ac.uk/home/sampa/x-sampa.htm.

  • Wells, J. C. (2005). SAMPA—Computer readable phonetic alphabet. Division of Psychology and Language Sciences, University College London, London. http://www.phon.ucl.ac.uk/home/sampa/index.html.

  • World Health Organization. (1993). International statistical classification of diseases, injuries and causes of death (ICD-10) (10th ed.). Geneva: World Health Organization.

  • Yannakoudakis, E. J., & Fawthrop, D. (1983). The rules of spelling errors. Information Processing & Management, 19(2), 87–99.

Acknowledgments

We thank Martí Mayo for computing some of the features, and the first author acknowledges the partial funding of a doctoral fellowship (FI-DGR) of the Generalitat de Catalunya (Government of Catalonia). We thank Yolanda Otal de la Torre, teacher and professional at CREIX (Centro de Desarrollo Infantil Barcelona), for helping us to collect texts written by people with dyslexia. We also thank the anonymous reviewers for their comments.

Author information

Corresponding author

Correspondence to Luz Rello.

Appendix: Comparing English and Spanish errors

We took a subgroup of texts from our corpus, composed of 1075 words, and compared its error distribution with that of a similar corpus in English.

English and Spanish are archetypes of deep and shallow orthographies, respectively. On an orthographic transparency scale for European languages, English appears as the language with the deepest orthography and Spanish as the second most shallow one after Finnish (Seymour et al. 2003).

In Tables 14 and 15 we compare the data of the English corpus described in Pedler (2007) with our Spanish texts. We compute the error ratio as the fraction of errors over the correctly spelt words we observe. As expected, Spanish dyslexics make fewer spelling errors (15 %) than English dyslexics (20 %), owing to the different orthographies of their languages. However, the percentage of distinct errors is almost the same.

Table 14 Error ratio and percentage of total errors (with repetitions) and distinct errors in English and Spanish texts written by people with dyslexia

Table 15 presents the distribution of the different types of dyslexic errors for both languages. To determine whether an error was a real-word error, we checked its existence in the Diccionario de la lengua española (Dictionary of the Spanish Language) (Real Academia Española 2001), the standard normative dictionary for Spanish.

Table 15 Distribution of errors in English and Spanish corpora

As expected, there is a greater percentage of multi-errors in a language with deep orthography (English) than in Spanish, e.g. *qría (creía, ‘thought’). However, first letter errors are almost twice as frequent in Spanish, e.g. *tula (ruta, ‘way’). This may look surprising in light of Yannakoudakis and Fawthrop (1983), who report that the first letter of a misspelling is correct in the majority of cases, but in Spanish the letter \(\langle\)h\(\rangle\) at the beginning of a word is not pronounced, and this generates many more errors (4.6 %) in that position (see Table 5).

The rest of the dyslexic error types are similar in both languages. There are slightly more real-word errors in Spanish, e.g. *dijo (digo, ‘said’) or *llegada (llegaba, ‘arrived’). Simple errors are the most frequent ones in both languages. However, each error type has a different frequency. A detailed analysis of the different kinds of dyslexic errors and their occurrence in the Web is given in Rello and Baeza-Yates (2012).
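
Two of the checks mentioned in this appendix, the dictionary lookup for real-word errors and the first letter comparison, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used to build Table 15: `TOY_LEXICON` is a tiny hypothetical stand-in for the Diccionario de la lengua española, and the function name `error_flags` is invented for the example.

```python
# Hypothetical miniature lexicon; a real check would consult a full dictionary.
TOY_LEXICON = {"digo", "dijo", "llegaba", "llegada", "ruta", "creía"}


def error_flags(misspelling: str, target: str, lexicon: set[str]) -> dict[str, bool]:
    """Flag a misspelling as a real-word error and/or a first letter error."""
    written = misspelling.lower()
    intended = target.lower()
    return {
        # Real-word error: the erroneous form is itself a dictionary word.
        "real_word": written in lexicon,
        # First letter error: the initial letters of the two forms differ.
        "first_letter": written[:1] != intended[:1],
    }


# (misspelling, intended word) pairs quoted in the text above
for wrong, right in [("dijo", "digo"), ("llegada", "llegaba"), ("tula", "ruta")]:
    print(wrong, right, error_flags(wrong, right, TOY_LEXICON))
```

The two flags are independent, so a single error can carry both; with a different lexicon the real-word flag may change, which is why the choice of reference dictionary matters for this category.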

Although both corpora consist of texts written by children with dyslexia, in English and Spanish respectively, this comparison is not definitive because the two corpora are not fully comparable. For instance, text type and text size, among other characteristics, were not controlled. Nevertheless, the comparison is still useful for cross-linguistic studies (Brunswick 2010; Goulandris 2003; Seymour et al. 2003) as a preliminary, qualitative cross-linguistic comparison of dyslexic errors in both languages: English and Spanish present similar frequency distributions and, as expected, the differences are due to the different orthographies of the two languages.

Cite this article

Rello, L., Baeza-Yates, R. & Llisterri, J. A resource of errors written in Spanish by people with dyslexia and its linguistic, phonetic and visual analysis. Lang Resources & Evaluation 51, 379–408 (2017). https://doi.org/10.1007/s10579-015-9329-0

  • DOI: https://doi.org/10.1007/s10579-015-9329-0
