Language technologies applied to document simplification for helping autistic people
Introduction
Autism Spectrum Disorder (ASD) is a condition that impairs the proper development of cognitive functions, social skills, and communicative abilities (Mesibov, Adams, & Klinger, 1997). Various studies have shown that individuals affected by ASD have deficits in reading comprehension (Nation, Clarke, Wright, & Williams, 2006).
Other psychological studies have shown that the lexical and syntactic knowledge of children with ASD remains delayed with respect to their other cognitive functions (Lord & Paul, 1997) and that pictures can improve the reading comprehension skills of children with a wide spectrum of developmental disabilities (Fossett & Mirenda, 2006).
The significant number of dedicated computer applications already shows that information technology can enhance the communicative abilities of people with ASD (Mintz, 2013; Ploog et al., 2013). Computer applications can also help people with ASD in other facets of their lives. Chu, Liao, Chen, Lin, and Chen (2011) present a framework to support problem-oriented e-learning, that is, a methodology for developing e-learning platforms focused on adaptive case-based learning, which is highly important for covering the specific learning needs of each person with ASD. Moreover, technology can assist people with ASD with another of their common symptoms: repetitive and stereotyped behaviors. Coronato, Pietro, and Paragliola (2014) describe a method and infrastructure for detecting the stereotyped motion disorders of people with ASD. The main goal of that system is to study the stereotyped movements with the aim of reducing them.
This paper presents several components of a new software tool, called Open Book, developed to help carers of people with ASD use Human Language Technologies (HLT) to transform written documents into a format that is easier to read and understand.
The Open Book tool is being developed under the on-going European Project FIRST (Flexible Interactive Reading Support Tool). The project involves nine institutions from four countries (Spain, Bulgaria, UK and Belgium) and includes experts in language technologies, software development and autism. Currently, the Open Book tool processes documents written in English, Spanish and Bulgarian, but it can be extended to other languages provided that the required resources are available. To accommodate the variability of ASD users, the tool's functionality can be customized to fit their needs.
Based on literature research and on a series of studies performed in the United Kingdom, Spain and Bulgaria with a variety of autistic patients ranging from children to adults, a series of obstacles in reading comprehension has been identified. From a linguistic point of view, they can be classified into syntactic obstacles (difficulty in processing relative clauses, for example) and semantic obstacles (difficulty in understanding rare or specialized terms or in comprehending idioms, for example). The tool applies a series of automatic transformations to user documents to identify and remove these obstacles. The transformations include the replacement of long and complex sentences with several simpler ones, the identification of difficult terms and the retrieval of images for them, the generation of concise summaries, and the automatic generation of tables of contents to help navigation through the text.
In this paper, we focus on three specific challenging issues related to text simplification using Natural Language Processing (NLP). The first is the retrieval of images associated with difficult or complex concepts detected in a document. The second is a module for idiom detection. Idiom detection is an open research issue in NLP and, to the best of our knowledge, there are no existing studies on integrating it into a simplification system. Finally, we study two different approaches to text summarization, based on Topic Model and on the PageRank algorithm. The final evaluation demonstrates the usefulness of our tool and points out the advantages obtained by integrating the different modules. We have also found some weaknesses in our systems that should be improved, most of them related to the imprecision of the integrated modules (images are inappropriate in some situations, idiom detection is inaccurate in some cases, etc.). Thus, more research must be carried out to overcome these issues. For example, improving the disambiguation module could yield greater accuracy in related components such as the image retrieval or idiom detection modules.
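The PageRank-based approach to summarization mentioned above can be illustrated with a minimal sketch. It is not the Open Book implementation: the function names, the bag-of-words cosine similarity used as edge weight, and the damping factor of 0.85 are assumptions chosen only for illustration. Sentences are treated as graph nodes, PageRank is iterated over the similarity-weighted graph, and the highest-scoring sentences are kept in their original order.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with PageRank over a similarity-weighted graph."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(tokenize(s)) for s in sentences]
    # Edge weights: pairwise cosine similarity, with no self-loops.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score
            # proportional to the weight of the edge j -> i.
            incoming = sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In this sketch a sentence that shares no vocabulary with the rest of the document receives only the baseline score, so it is the first to be dropped from the summary. A production system would likely use a stronger similarity measure and language-specific preprocessing.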
Before going further, we clarify some terminology. As noted above, a multidisciplinary team of experts is developing the Open Book tool. Throughout this paper we will refer to the clinicians and the specialists in psychology who contributed to the tool as the psychologists. The partners from academia and industry who programmed the tool will be referred to as the technicians. The people with ASD who benefit from using the tool will be denoted, whenever possible, as the users.
The rest of the paper is organized as follows. The next section reviews the state of the art in NLP and Automatic Text Simplification. In addition, we introduce some important software tools designed to help people with ASD and show how Open Book differs from them. Next, we indicate how the obstacles in user comprehension have been identified. Then the general Open Book Architecture is presented and the different components of the system are shown. Finally, an evaluation of the accuracy is presented. We end the paper with the conclusions.
Section snippets
Related work
In this section, we first present some relevant work on NLP and text simplification (TS). We then discuss the main software tools currently available to help people with ASD.
Obstacles in reading comprehension
To identify the obstacles in text comprehension we rely on four lines of evidence. The first line of evidence comes from studies performed with autistic populations and reported in the literature. For example, Frith and Snowling (1983) show that ASD children can understand the meaning of single words but have difficulty using semantic context to disambiguate ambiguous words. O’Connor and Klein (2004) point out that replacing difficult pronouns with their referents improves the
Open Book Architecture
The components listed above are either syntactical (the first two) or semantic (the last four). Before presenting the semantic components we make some brief considerations about the architecture of Open Book.
The Open Book System is a web application that has a three-tier architecture (Fig. 1). The first layer is composed of Natural Language Processing components implemented as SOAP web services. These web services add information to a GATE document (Cunningham et al., 2011
Evaluation
In this section we evaluate the performance of the Image Retrieval System and Topic Model detection. The performance of the summarizer system has been thoroughly evaluated elsewhere (Erkan & Radev, 2004). The idiom detection system has a very high precision (close to 100%) because the idioms in our list are frozen phrases, so multiple meanings are unlikely.
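The reason a list of frozen phrases yields high precision can be seen in a minimal sketch of list-based idiom detection. The sample idioms, the function names and the regular-expression strategy below are hypothetical and do not reproduce the Open Book idiom list or module. Each idiom is matched only as a complete, fixed word sequence, so partial or embedded matches are rejected.

```python
import re

# Hypothetical sample list; the actual Open Book idiom list is not shown here.
IDIOMS = ["kick the bucket", "piece of cake", "under the weather"]


def build_idiom_pattern(idioms):
    """Compile one case-insensitive pattern matching any idiom as whole words."""
    # Longer idioms first so overlapping entries prefer the longest match.
    ordered = sorted(idioms, key=len, reverse=True)
    alternatives = "|".join(
        r"\s+".join(re.escape(word) for word in idiom.split())
        for idiom in ordered
    )
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def find_idioms(text, pattern):
    """Return (start, end, matched_text) for each idiom occurrence in text."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


pattern = build_idiom_pattern(IDIOMS)
hits = find_idioms("The exam was a piece of cake, but I felt under the weather.", pattern)
```

Exact matching of this kind cannot find inflected or reordered variants of an idiom, so it trades recall for precision.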
Conclusions
In this paper we have presented the Open Book tool, which applies HLT to identify reading comprehension obstacles in text documents in three languages: English, Spanish and Bulgarian. We have implemented several NLP features in the overall Open Book architecture, oriented to simplifying texts for people with ASD. Although the tool integrates several modules into a single system, the main contribution presented in this paper is the implementation of three specific challenging tasks in NLP.
Acknowledgments
This work has been partially supported by a Grant from the Fondo Europeo de Desarrollo Regional (FEDER), ATTOS project (TIN2012-38536-C03-0) from the Spanish Government. The project AORESCU (P11-TIC-7684 MO) from the regional government of Junta de Andalucía and the project CEATIC-2013-001 from the University of Jaén partially supports this manuscript. The work in this paper is partially funded by the European Commission under the Seventh (FP7-2007-2013) Framework Program for Research and
References (43)
- et al. Learning case adaptation for problem-oriented e-learning on mathematics teaching for students with mild disabilities. Expert Systems with Applications (2011)
- et al. A situation-aware system for the detection of motion disorders of patients with autism spectrum disorders. Expert Systems with Applications (2014)
- et al. Sight word reading in children with developmental disabilities: A comparison of paired associate and picture-to-text matching instruction. Research in Developmental Disabilities (2006)
- Additional key factors mediating the use of a mobile technology tool designed to develop social and life skills in children with autism spectrum disorders: Evaluation of the 2nd HANDS prototype. Computers & Education (2013)
- Theory of mind and autism: A review. International Review of Mental Retardation (2001)
- et al. Latent Dirichlet allocation. Journal of Machine Learning Research (2003)
- Bouayad-Agha, N., Casamayor, G., Ferraro, G., & Wanner, L. (2009). Simplification of patent claim sentences for their...
- Canning, Y. (2002). Syntactic simplification of text (Ph.D. thesis). United Kingdom: University of...
- et al. Motivations and methods for text simplification
- Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V., Aswani, N., Roberts, I., et al. (2011). Text processing with...