Language technologies applied to document simplification for helping autistic people

https://doi.org/10.1016/j.eswa.2015.02.044

Highlights

  • A reading comprehension tool for people with ASD is presented.

  • The tool assists people with ASD and their carers in simplifying texts.

  • A key element of the tool is the substitution of concepts with images.

  • Figurative language is also simplified for people with ASD.

  • Summary generation and topic detection are other functions of the tool.

Abstract

People affected by Autism Spectrum Disorders (ASD) have impairments in social interaction because they lack an adequate theory of mind. A significant percentage also have inadequate reading comprehension skills. We present a multilingual tool called Open Book (OB) that applies Human Language Technologies (HLT) to identify reading comprehension obstacles in text documents and propose simpler alternatives, with the aim of assisting the reading comprehension of users. OB involves several text transformations at the lexical, syntactic and semantic levels. In this paper we focus on three challenging components of the OB tool: the image retrieval component, the idiom detection component and the summarization module. Very few studies involve simplification by showing images associated with difficult concepts. In addition, the treatment of figurative language such as idioms or metaphors is one of the most challenging areas in Natural Language Processing (NLP). Finally, although text summarization is a more widely studied field in NLP, its application to text simplification remains an open research issue. Thus, we focus on the integration of these three modules in our OB tool. We present the motivation for building these components and describe how they are integrated into the whole system. Moreover, the usability and usefulness of OB have been evaluated and analysed, showing that the tool helps to produce texts that are easier for autistic people to understand.

Introduction

Autism Spectrum Disorder (ASD) is a condition that impairs the proper development of cognitive functions, social skills, and communicative abilities (Mesibov, Adams, & Klinger, 1997). Various studies (Nation, Clarke, Wright, & Williams, 2006) have shown that individuals affected by ASD have deficits in reading comprehension.

Other psychological studies have shown that ASD children’s lexical and syntactic knowledge remains delayed with respect to their other cognitive functions (Lord & Paul, 1997) and that pictures can improve the reading comprehension skills of children across a wide spectrum of developmental disabilities (Fossett & Mirenda, 2006).

The significant number of dedicated computer applications already shows that information technology can enhance the communicative abilities of people with ASD (Mintz, 2013; Ploog et al., 2013). Computer applications can also help people with ASD in other facets of their lives. Chu, Liao, Chen, Lin, and Chen (2011) present a framework to support problem-oriented e-learning: in plain English, a methodology for developing e-learning platforms focused on adaptive case-based learning, which is highly important for covering the specific learning needs of each person with ASD. Moreover, technology can assist people with ASD with another of their common symptoms: repetitive and stereotyped behaviors. Coronato, Pietro, and Paragliola (2014) describe a method and infrastructure for detecting the stereotyped motion disorders of people with ASD. The main goal of that system is to study stereotyped movements with the aim of reducing them.

This paper presents several components of a new software tool, called Open Book, which uses HLT to help the carers of people with ASD transform written documents into a format that is easier to read and understand.

The Open Book tool is being developed under the on-going European Project FIRST (Flexible Interactive Reading Support Tool). The project involves nine institutions from four countries (Spain, Bulgaria, UK and Belgium) and includes experts in language technologies, software development and autism. Currently, the Open Book tool processes documents written in English, Spanish and Bulgarian, but it can be extended to other languages provided that the required resources are available. To accommodate the variability of ASD users, the tool's functionality can be customized to fit their needs.

Based on literature research and on a series of studies performed in the United Kingdom, Spain and Bulgaria with a variety of autistic patients ranging from children to adults, a series of obstacles in reading comprehension has been identified. From a linguistic point of view, they can be classified into syntactic obstacles (difficulty in processing relative clauses, for example) and semantic obstacles (difficulty in understanding rare or specialized terms or in comprehending idioms, for example). The tool applies a series of automatic transformations to user documents to identify and remove these obstacles. These transformations include the replacement of long and complex sentences with several simpler ones, the identification of difficult terms and the retrieval of images for them, the generation of concise summaries, and the automatic generation of tables of contents to help navigation through the text.

In this paper, we focus on three specific challenging issues related to text simplification using Natural Language Processing (NLP). The first is the retrieval of images associated with difficult or complex concepts detected in a document. Second, an idiom detection module is presented. Idiom detection is an open research issue in NLP and, to the best of our knowledge, there are no existing studies on integrating it into a simplification system. Finally, we study two different approaches to text summarization, based on topic models and on the PageRank algorithm. The final evaluation demonstrates the usefulness of our tool and points out the advantages obtained by integrating the different modules. We have also found some weaknesses in our system that should be addressed, most of them related to the imprecision of the integrated modules (images are inappropriate in some situations, idiom detection is inaccurate in some cases, etc.). Thus, more research must be carried out to overcome these issues. For example, improving the disambiguation module could lead to greater accuracy in related components such as the image retrieval or idiom detection modules.
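To make the PageRank-based approach to summarization more concrete, the following is a minimal sketch of one common way to apply PageRank to extractive summarization: sentences become graph nodes, edges are weighted by word overlap, and the highest-ranked sentences form the summary. It is an illustration under stated assumptions, not the Open Book implementation. The function names, the naive sentence splitter, the bag-of-words cosine similarity and the damping factor of 0.85 are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentence):
    """Lower-cased word counts for one sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Power iteration over a weighted sentence graph.

    Sentences with no edges keep only the base score (1 - damping) / n.
    """
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(text, max_sentences=2):
    """Return the highest-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    if len(sentences) <= max_sentences:
        return sentences
    vectors = [bag_of_words(s) for s in sentences]
    n = len(sentences)
    # Similarity graph without self-loops.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(document_text, max_sentences=3)` returns up to three sentences. A sentence that shares no vocabulary with the rest of the document receives only the base score and is therefore the first to be dropped, which is the behaviour one would want from a concise summary.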

Before going further, we clarify some terminology. As noted above, a multidisciplinary team of experts is developing the Open Book tool. Throughout this paper we will refer to the clinicians and the specialists in psychology who contributed to the tool as the psychologists. The partners from academia and industry who programmed the tool will be referred to as the technicians. The people with ASD who benefit from using the tool will be denoted, whenever possible, as the users.

The rest of the paper is organized as follows. The next section reviews the state of the art in NLP and Automatic Text Simplification. In addition, we introduce some important software tools designed to help people with ASD and show how Open Book differs from them. Next, we indicate how the obstacles in user comprehension have been identified. Then the general Open Book architecture is presented and the different components of the system are described. Finally, an evaluation of the accuracy is presented. We end the paper with the conclusions.

Section snippets

Related work

In this section, we first present some relevant work on NLP and text simplification (TS). We then discuss the main software tools currently available to help people with ASD.

Obstacles in reading comprehension

To identify the obstacles in text comprehension we rely on four lines of evidence. The first line of evidence comes from studies performed with autistic populations and reported in the literature. For example, Frith and Snowling (1983) show that ASD children can understand the meaning of single words but have difficulty using semantic context to disambiguate ambiguous words. O’Connor and Klein (2004) point out that replacing difficult pronouns with their referents improves the

Open Book Architecture

The components listed above are either syntactical (the first two) or semantic (the last four). Before presenting the semantic components we make some brief considerations about the architecture of Open Book.

The Open Book System is a web application that has a three-tier architecture (Fig. 1). The first layer is composed of Natural Language Processing components implemented as SOAP web services. These web services add information to a GATE document (Cunningham et al., 2011

Evaluation

In this section we evaluate the performance of the Image Retrieval System and Topic Model detection. The performance of the summarizer system has been thoroughly evaluated elsewhere (Erkan & Radev, 2004). The idiom detection system has a very high precision (close to 100%) because the idioms in our list are frozen phrases, for which multiple meanings are unlikely.
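The reason a list of frozen phrases yields high precision can be illustrated with a small sketch of list-based idiom matching. This is only an assumption about how such a lookup might work, not the Open Book component itself. The three-entry `IDIOMS` lexicon, its paraphrases and the function names are hypothetical and were introduced for this example.

```python
import re

# Hypothetical mini-lexicon (assumption): lower-cased frozen idioms mapped
# to plain paraphrases. A real list would be far larger.
IDIOMS = {
    "kick the bucket": "die",
    "piece of cake": "something very easy",
    "under the weather": "slightly ill",
}


def build_pattern(idioms):
    """Compile one case-insensitive regex matching any idiom as whole words."""
    # Longest first, so a longer idiom wins over a shorter one it contains.
    ordered = sorted(idioms, key=len, reverse=True)
    alternation = "|".join(re.escape(idiom) for idiom in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def find_idioms(text, idioms=IDIOMS):
    """Return (start, end, matched_text, paraphrase) for each idiom found."""
    pattern = build_pattern(idioms)
    return [
        (m.start(), m.end(), m.group(0), idioms[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]
```

For instance, `find_idioms("The exam was a piece of cake.")` reports one match for "piece of cake" together with its paraphrase. Because only exact word sequences are matched, a literal sentence such as "He kicked the ball." produces no hits, which is consistent with high precision. The trade-off is recall: inflected variants such as "kicked the bucket" are missed by this sketch unless they are added to the list.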

Conclusions

In this paper we have presented the Open Book tool, which applies HLT to identify reading comprehension obstacles in text documents in three languages: English, Spanish and Bulgarian. We have implemented several NLP features in the overall architecture of OB, oriented towards simplifying texts for people with ASD. Although the tool integrates several modules into a single system, the main contribution of this paper is the implementation of three specific challenging NLP tasks.

Acknowledgments

This work has been partially supported by a grant from the Fondo Europeo de Desarrollo Regional (FEDER), ATTOS project (TIN2012-38536-C03-0) from the Spanish Government. The project AORESCU (P11-TIC-7684 MO) from the regional government of Junta de Andalucía and the project CEATIC-2013-001 from the University of Jaén also partially support this manuscript. The work in this paper is partially funded by the European Commission under the Seventh (FP7-2007-2013) Framework Program for Research and

References (43)

  • J. Deng et al. What does classifying more than 10,000 image categories tell us?

  • J. Deng et al. ImageNet: A large-scale hierarchical image database

  • S. Devlin et al. The use of a psycholinguistic database in the simplification of text for aphasic readers. Linguistic Databases (1998)

  • S. Devlin et al. Helping aphasic people process online information

  • H.P. Edmundson. New methods in automatic extracting. Journal of the ACM (1969)

  • G. Erkan et al. LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research (2004)

  • Feng, L. (March 2008). Text simplification: A survey, Tech. rep., The City University of New...

  • U. Frith et al. Reading for meaning and reading for sound in autistic and dyslexic children. British Journal of Developmental Psychology (1983)

  • T. Grandin. Thinking in pictures: And other reports from my life with autism (1996)

  • N. Kaji et al. Verb paraphrase based on case frame alignment

  • Kana, R. K., Keller, T. A., Cherkassky, V. L., Minshew, N. J., & Just, M. A. (2006). Sentence comprehension in autism:...