Design of a Novel Web Utility that Provides Multi-lingual Word Definitions for Child E-Book Applications

Adiani, Deeksha; Lewis, Daniel; Serao, Vanessa; Barrett, Kevin; Bennett, Amelia; Hambly, Derick; Shenoda, Martina; West, Samuel; Coulter, Garrett; Shagal, Sultan; Biala, Toheeb; Sarkar, Medha; Wade, Joshua; Sarkar, Nilanjan

doi:10.1007/978-3-030-21817-1_1

Deeksha Adiani¹⁶,
Daniel Lewis¹⁶,
Vanessa Serao¹⁶,
Kevin Barrett¹⁶,
Amelia Bennett¹⁶,
Derick Hambly¹⁶,
Martina Shenoda¹⁶,
Samuel West¹⁶,
Garrett Coulter¹⁶,
Sultan Shagal¹⁶,
Toheeb Biala¹⁶,
Medha Sarkar¹⁶,
Joshua Wade¹⁷ &
…
Nilanjan Sarkar¹⁷

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11591))

Included in the following conference series:

International Conference on Human-Computer Interaction

1401 Accesses

Abstract

The use of mobile computing devices to gain access to the Internet and to interact with a range of applications has become ubiquitous, impacting on education, entertainment, healthcare, and many other domains. Engaging applications such as e-books used by children and their parents or educators have also become increasingly common, especially in the context of childhood education. An e-book that presents the reader with challenging words has the potential to improve and expand vocabulary. However, the seamless combination of methods for definition-retrieval, word sense disambiguation, and multi-lingual support are not currently available in a simple tool for the specific application of children’s e-book reading. In this work, we present WordWeaver, an open-source tool for use in child reading scenarios where context-sensitive, multi-lingual definitions of unfamiliar words are determined and provided to the user via a simple web API. Our proof-of-concept design includes support for English, Spanish, and French language definitions. Preliminary results support the feasibility of WordWeaver including excellent levels of usability based on the System Usability Scale (i.e., mean SUS = 87.17). Future work will include extending support to include additional languages, definition selection tailored to individual reading skill level, and the ability to address more complicated cases of word and part-of-speech disambiguation.

You have full access to this open access chapter, Download conference paper PDF

Towards an Automatic Easy-to-Read Adaptation of Morphological Features in Spanish Texts

"Easy-to-Read on the Web": State of the Art and Needed Research

Design and Evaluation of a Mobile Application for Interactive Reading

Keywords

1 Introduction

For children, an essential component of literacy development is vocabulary acquisition. Evidence shows that the clarification of word meanings during reading training produces greater gains in vocabulary than training that does not provide this information [1, 2]. E-book use by children and their parents and/or educators has become increasingly common [3, 4] and may offer a unique opportunity to present readers with challenging words that have the capacity to improve and expand vocabulary. A tool that provides context-sensitive definitions for unfamiliar words would therefore be a valuable component of an e-book application. Furthermore, support for a wide variety of languages would facilitate broader adoption and impact.

In this work, we present WordWeaver, an open-source tool for use in child reading scenarios where context-sensitive, multi-lingual definitions of unfamiliar words are provided via a simple web protocol [5]. Our proof-of-concept design currently includes support for English, Spanish, and French language definitions, but support for other languages, including Arabic, is part of ongoing work. The proposed multi-lingual dictionary utility is a web-based resource for an e-book application that provides definitions for words according to both language and context. Unlike existing single-language dictionary APIs, WordWeaver returns definitions tailored to the individual’s preferred language, resulting in a more individualized e-book reading experience. While resources like Webster’s Dictionary API are robust, they do not offer a one-stop solution for word definition-retrieval, language translation, word sense disambiguation, and other natural language processing functions. By unifying such functionality in a single application, we believe that WordWeaver can serve to fill this gap and offer a useful means by which to perform such functions, and, ultimately, to facilitate learning.

2 System Design

Given a specified word and surrounding text from which contextual information can be inferred, WordWeaver, the novel utility, provides context-sensitive definitions to users in a variety of languages. For proof-of-concept demonstration, the frontend consisted of a simple application supporting word selection via double-click or -tap and was implemented using the Unity game engine (www.unity3d.com). Client queries (i.e., requests for definition) are sent to a server application via HTTP POST requests consisting of JavaScript Object Notation (JSON)-serialized objects, and the server responds with a structured JSON object containing query results. The backend consists of a Python application and utilizes a range of powerful resources including Natural Language Toolkit (NLTK, version 3.3) [6], spaCy version 2.0 [7], Glosbe API [8], and Pywsd [9] for context interpretation and translation functionalities.

2.1 Front End

The user interface was created using built-in User Interface (UI) tools and data structures in the Unity game engine (version 2018.2) as well as TextMesh Pro [10], a text-manipulation package supported by the game engine. All frontend scripting was implemented in C♯. The proof-of-concept tool is composed of a simple story and supports features such as text-selection, text-highlighting, requests for definitions, and definition display (Fig. 1). While reading a story in the e-book, the reader may encounter an unfamiliar word. The text-highlighting feature implemented using TextMesh Pro allows the user to select the unfamiliar word by double-clicking on it—or tapping in the case of mobile device use—thus initiating a new definition request. The selected word, the context in which the word is found, and the active story language are transmitted to the server via Transmission Control Protocol (TCP) as a JSON-serialized object, such as that shown in the following example:

The server, after processing this information, returns the most appropriate definition of the word, which can finally be presented to the user by the client in the manner appropriate to the particular client application.

2.2 Backend

The backend of the tool is written entirely in Python (version 3.6) and consists of a simple Flask server [11]. A word sense disambiguation module returns a context-appropriate definition based on information contained in each JSON object. The JSON object includes a language tag, a word to be defined, and a context in which the word appears. The word is defined with respect to the context. In WordWeaver, this is accomplished by performing word sense disambiguation based on the Lesk algorithm [12]. This is accomplished through utilization of information from Wordnet [13] which is a large lexical database of English words. Before this step occurs, WordWeaver first checks whether the word is a stop word (high-frequency words like the, to and also; [14]). Because our domain is specific to children’s literature, it is likely that stop words will be used in the most common sense, and are, moreover, largely ignored by Wordnet (i.e., they are not defined). We simply return the most common definition found in a separate dictionary if a request is for a stop word.

Currently, if a request is not in English, then a translation to English must be performed before the disambiguation step can be executed. This translation is performed through a call to the Glosbe API [8]. Next, word sense disambiguation happens in two main steps. First, two versions of the lesk algorithm from the Pywsd module are used [9]; Pywsd shares an implementation and author with the built in Wordnet version of the Lesk algorithms. A simple Lesk (shown in Fig. 2) and a cosine Lesk are performed with the given context (i.e., the surrounding text), the word, and the part of speech. The results from these two algorithms are compared by checking which definition has the highest percentage of overlap with the context. Overlap is defined as the set of words in common with the context. After this step is done, some checks are made to ensure that the result is reasonable before returning result to the client. For instance, given the domain of children’s literature, we can disregard rare and archaic uses of a word as uses of these words imply a higher reading level. A simple check is made to ensure that the definition returned is in the top four most common definitions for the given word. If an error occurs at any point during this process, an error message is returned and displayed to the user. WordWeaver’s step-by-step process is detailed in Appendix A.

3 Results and Discussion

The preliminary usability of WordWeaver was gauged using the System Usability Scale (SUS), which is a widely used measure of the perceived ease of use of digital systems [15]. N = 15 volunteers in an undergraduate program in the lead author’s university interacted with the frontend application. Subjects were asked to progress through an example story and to select words of their choosing for definition by the novel tool. The mean cumulative SUS reported by participants was 87.17 (SD = 13.46), which is interpreted as “excellent” usability based on benchmarks reported in the literature [16]. Detailed participant responses on the SUS are given in Table 1. In addition, Fig. 3 shows the results of WordWeaver for an example input word (“quick”) and context (“the quick brown fox jumped over the lazy dog”) in English, Spanish, and French. Cumulatively, our preliminary results indicate that WordWeaver is capable of reliably reporting user-requested definitions and that client-server interactions perform as expected.

Table 1. Interpretation of SUS cumulative scores.

Full size table

4 Conclusion

There are several opportunities for future development and improvement of WordWeaver. First, the use of the Glosbe API to translate non-English requests into English has some key limitations. The most important is that many words simply do not translate directly into another single word, and machine translation is often incorrect—the reader has likely experienced this when interacting with commercial tools such as Siri, Amazon Alexa, or Google Assistant. The versions of the Lesk algorithm we used rely on Wordnet, and not every language’s version of Wordnet currently provides this functionality. In the future, we hope to rectify this and fully realize the design equally across languages rather than relying on an English translation, or perhaps by using other methods such as cross referencing among the translated languages to disambiguate word meaning. In the future, we hope to additionally provide definitions at the optimal reading level of the child, but because the Lesk algorithm requires a definition and example sentences in which the word is used, a simple children’s dictionary alone may not provide the algorithm with enough information to permit such fine-tuning at this time. Also, the availability of children’s dictionaries and example sentences is quite limited. Another area of future work involves distinguishing proper nouns such as character names from their literal meanings. For instance, if a character is named “Mrs. Red” and the user taps on the word “Red”, then a definition should not necessarily be returned by WordWeaver. One way to address cases such as these may be to have the client application maintain a list of keywords that are unique to a particular story. Then, when the user attempts to request a definition for an identified keyword from WordWeaver, the client-side application could preempt the request and returns the appropriate definition locally. Such changes as those described are in fact part of ongoing development with WordWeaver.

References

Elley, W.B.: Vocabulary acquisition from listening to stories. Read. Res. Q. 24, 174–187 (1989). https://doi.org/10.2307/747863
Article Google Scholar
Biemiller, A., Boote, C.: An effective method for building meaning vocabulary in primary grades. J. Educ. Psychol. 98(1), 44 (2006). https://doi.org/10.1037/0022-0663.98.1.44
Article Google Scholar
Yuill, N., Martin, A.F.: Curling up with a good e-book: mother-child shared story reading on screen or paper affects embodied interaction and warmth. Front. Psychol. 7, 1951 (2016). https://doi.org/10.3389/fpsyg.2016.01951
Article Google Scholar
Baron, N.S.: Words Onscreen: The Fate of Reading in a Digital World. Oxford University Press, Oxford (2015)
Google Scholar
WordWeaver (2018). https://github.com/danielyoureelewis/dictionary_group. Accessed 30 Jan 2019
Bird, S., Loper, E.: NLTK: the natural language toolkit. In: Proceedings of the ACL 2004 on Interactive Poster and Demonstration Sessions, p. 31 (2004). https://doi.org/10.3115/1219044.1219075
spaCy. https://spacy.io/usage/v2. Accessed 21 Jan 2019
Glosbe - the multilingual online dictionary. https://glosbe.com/. Accessed 30 Jan 2019
Tan, L.: Pywsd: python implementations of word sense disambiguation (WSD) technologies (2014)
Google Scholar
TextMesh Pro 1.2.2 (2018). https://assetstore.unity.com/packages/essentials/beta-projects/textmesh-pro-84126. Accessed 30 Jan 2019
Ronacher, A.: Welcome—flask (a python microframework) (2010). https://flask.pocoo.org/. Accessed 30 Jan 2019
Banerjee, S., Pedersen, T.: An adapted lesk algorithm for word sense disambiguation using wordnet. In: Gelbukh, A. (ed.) CICLing 2002. LNCS, vol. 2276, pp. 136–145. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45715-1_11
Chapter Google Scholar
Miller, G.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998). https://doi.org/10.1002/9781405198431.wbeal1285
Book MATH Google Scholar
Bird, S., Klein, E., Loper, E.: Natural language processing with python (2014). https://www.nltk.org/book/ch02.html#stopwords_index_term. Accessed 30 Jan 2019
Brooke, J.: SUS-a quick and dirty usability scale. Usability Eval. Ind. 189(194), 4–7 (1996). https://doi.org/10.1.1.232.5526
Bangor, A., Kortum, P.T., Miller, J.T.: An empirical evaluation of the system usability scale. Int. J. Hum.-Comput. Interact. 24(6), 574–594 (2008). https://doi.org/10.1080/10447310802205776
Article Google Scholar

Download references

Author information

Authors and Affiliations

Computer Science, Middle Tennessee State University, Murfreesboro, TN, 37132, USA
Deeksha Adiani, Daniel Lewis, Vanessa Serao, Kevin Barrett, Amelia Bennett, Derick Hambly, Martina Shenoda, Samuel West, Garrett Coulter, Sultan Shagal, Toheeb Biala & Medha Sarkar
Adaptive Technology Consulting, LLC, Murfreesboro, TN, 37127, USA
Joshua Wade & Nilanjan Sarkar

Authors

Deeksha Adiani
View author publications
You can also search for this author in PubMed Google Scholar
Daniel Lewis
View author publications
You can also search for this author in PubMed Google Scholar
Vanessa Serao
View author publications
You can also search for this author in PubMed Google Scholar
Kevin Barrett
View author publications
You can also search for this author in PubMed Google Scholar
Amelia Bennett
View author publications
You can also search for this author in PubMed Google Scholar
Derick Hambly
View author publications
You can also search for this author in PubMed Google Scholar
Martina Shenoda
View author publications
You can also search for this author in PubMed Google Scholar
Samuel West
View author publications
You can also search for this author in PubMed Google Scholar
Garrett Coulter
View author publications
You can also search for this author in PubMed Google Scholar
Sultan Shagal
View author publications
You can also search for this author in PubMed Google Scholar
Toheeb Biala
View author publications
You can also search for this author in PubMed Google Scholar
Medha Sarkar
View author publications
You can also search for this author in PubMed Google Scholar
Joshua Wade
View author publications
You can also search for this author in PubMed Google Scholar
Nilanjan Sarkar
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Joshua Wade .

Editor information

Editors and Affiliations

Cyprus University of Technology, Limassol, Cyprus
Panayiotis Zaphiris
Cyprus University of Technology, Limassol, Cyprus
Andri Ioannou

Appendix

1.1 Appendix A. Detailed Procedures for Returning a Definition in WordWeaver

See Fig. 4.

1.2 Appendix B. Instructions for Setting up and Utilizing the Open-Source Project WordWeaver

In order to communicate with WordWeaver’s server-side application, clients must submit queries via HTTP POST requests. A complete example is given in the code listing below (C♯ console application), which utilizes the Json.NET implementation of JSON (i.e., Newtonsoft.Json.dll) (Fig. 4).

WordWeaver was designed to be easily deployed as a lightweight server application. WordWeaver can be setup on either a dedicated hosting service or locally using a service such as ngrok (www.ngrok.com). In order to launch the server, users must execute the file Server.py (see code listing below; source code available at [5]):

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Adiani, D. et al. (2019). Design of a Novel Web Utility that Provides Multi-lingual Word Definitions for Child E-Book Applications. In: Zaphiris, P., Ioannou, A. (eds) Learning and Collaboration Technologies. Ubiquitous and Virtual Environments for Learning and Collaboration. HCII 2019. Lecture Notes in Computer Science(), vol 11591. Springer, Cham. https://doi.org/10.1007/978-3-030-21817-1_1

Download citation

DOI: https://doi.org/10.1007/978-3-030-21817-1_1
Published: 15 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21816-4
Online ISBN: 978-3-030-21817-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics