Abstract
Over 70 million people worldwide face communication difficulties, and many of them use augmentative and alternative communication (AAC) technology. Although AAC systems help improve interaction, the gap in communication rate between individuals with and without speaking difficulties remains significant, which has led to low sustained use of AAC systems. The study reported here combines human-computer interaction (HCI) and language modelling techniques to improve the ease of use, user satisfaction, and communication rates of AAC technology in open-domain interactions. A text input interface that uses word prediction based on BERT and RoBERTa language models was investigated with a view to improving communication rates. Three interface layouts were implemented, and a radial configuration was found to be the most efficient. RoBERTa models fine-tuned on conversational AAC corpora led to the highest communication rate, 25.75 words per minute (WPM), with alphabetical ordering preferred over probabilistic ordering. It was also found that training on conversational corpora such as TV and Reddit outperformed training on generic corpora such as COCA or Wikipedia. Hence, it is concluded that the limited availability of large-scale conversational AAC corpora represents a key challenge for improving communication rates and building robust AAC systems.
Index Terms: Text Input Prediction, Language Modelling, Augmentative and Alternative Communication (AAC), Speech Synthesis
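The two quantities the abstract compares, communication rate in WPM and the ordering of predicted words, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: WPM is computed with the five-characters-per-word convention common in text-entry research, the two orderings are taken to apply to the top-k word candidates shown to the user, and the function names and candidate scores are hypothetical. It is not the authors' implementation.

```python
from typing import List, Tuple


def words_per_minute(transcribed: str, seconds: float, chars_per_word: int = 5) -> float:
    """Entry rate in WPM, assuming the common 5-characters-per-word convention.

    The first character is excluded because timing is assumed to start
    when it is entered.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    chars = max(len(transcribed) - 1, 0)
    return chars * 60.0 / (seconds * chars_per_word)


def order_predictions(
    candidates: List[Tuple[str, float]], k: int, mode: str = "alphabetical"
) -> List[str]:
    """Keep the k most probable words, then order them for display.

    'probabilistic' lists the most likely word first; 'alphabetical'
    sorts the same k words by spelling.
    """
    top_k = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    if mode == "probabilistic":
        return [word for word, _ in top_k]
    if mode == "alphabetical":
        return sorted(word for word, _ in top_k)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical language-model scores for the next word (illustrative only).
example_candidates = [("you", 0.4), ("are", 0.3), ("we", 0.2), ("it", 0.1)]
```

With these made-up scores, `order_predictions(example_candidates, 3, "probabilistic")` lists the words by descending score, while the alphabetical mode shows the same three words sorted by spelling, so each word's position is predictable regardless of its score.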
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yusufali, H., Goetze, S., Moore, R.K. (2023). Bridging the Communication Rate Gap: Enhancing Text Input for Augmentative and Alternative Communication (AAC). In: Gao, Q., Zhou, J., Duffy, V.G., Antona, M., Stephanidis, C. (eds) HCI International 2023 – Late Breaking Papers. HCII 2023. Lecture Notes in Computer Science, vol 14055. Springer, Cham. https://doi.org/10.1007/978-3-031-48041-6_29
DOI: https://doi.org/10.1007/978-3-031-48041-6_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48040-9
Online ISBN: 978-3-031-48041-6
eBook Packages: Computer Science (R0)