Abstract
Desktop interaction solutions are often inappropriate for mobile devices because of their small screens and the need for portability. Speech recognition can improve mobile interactions by providing a largely hands-free input method that can be used in a variety of situations. Although mobile systems are designed to be used while in motion, few studies have examined the effects of motion on mobile interactions. This paper investigates the effect of motion on automatic speech recognition (ASR) input for mobile devices. Recognition error rates (RER) were examined while participants performed text entry tasks either walking or seated, along with the effect of ASR enrollment conditions on RER. The results suggest changes to how users train ASR systems for mobile versus seated use.







Abbreviations
- ASR: Automatic speech recognition
- RER: Recognition error rate
- PDA: Personal digital assistant
- SIID: Situationally induced impairments and disabilities
- NASA TLX: NASA Task Load Index
- ISRC: Interactive Systems Research Center
- MME: Multiple metaphor environments
- NSF: National Science Foundation
References
Akoumianakis D, Stephanidis C (2003) Multiple metaphor environments: designing for diversity. Ergonomics 46(1–3):88–113
Baber C, Noyes J (1996) Automatic speech recognition in adverse environments. Hum Factors 38(1):142–155
Barnard L, Yi JS, Jacko JA, Sears A (in press) An empirical comparison of use-in-motion evaluation scenarios for mobile computing devices. Int J Human-Comput Stud
Barnard L, Yi JS, Jacko JA, Sears A (2004) The effects of context on human performance in mobile computing. Pers Ubiquitous Comput, July 2004
Bradford JH (1995) The human factors of speech-based interfaces. SIGCHI Bull 27(2):61–67
Brewster S, Lumsden J, Bell M, Hall M, Tasker S (2003) Multi-modal ‘eyes-free’ interaction techniques for wearable devices. Lett CHI 5(1): 473–480
Brodie J, Perry M (2001) Designing for mobility, collaboration and information use by blue-collar workers. SIGGROUP Bull 22(3):22–27
Chandrasekhar A (2003) Respiratory rate and pattern of breathing: to evaluate one of the vital signs. Retrieved October 14, 2003, from Loyola University Medical Education Network Web Site: http://www.meddean.luc.edu/lumen/meded/medicine/pulmonar/pd/step73a.htm
Cohen PR, Oviatt SL (1993) The role of voice in human-machine communication. In: Roe DB, Wilpon J (eds) Human-computer interaction by voice. National Academy of Sciences Press, Washington, pp 1–36
Dahlbom B, Ljungberg F (1998) Mobile informatics. Scand J Inform Syst 10(1–2):227–234
Doust JH, Patrick JM (1981) The limitation of exercise ventilation during speech. Respir Physiol 46:137–147
Emery VK, Moloney KP, Jacko JA, Sainfort F (2004) Assessing workload in the context of human-computer interactions: Is the NASA-TLX a suitable measurement tool? (200401). Laboratory for Human-Computer Interaction and Health Care Informatics, Georgia Institute of Technology, Atlanta
Entwistle MS (2003) The performance of automated speech recognition systems under adverse conditions of human exertion. Int J Hum Comput Int 16(2):127–140
Feng J, Sears A (2003) Using confidence scores to improve hands-free speech-based navigation. In: Stephanidis C, Jacko J (eds) Human-computer interaction: theory and practice, Vol 2. Lawrence. Erlbaum Associates, Mahwah, pp 641–645
Feng J, Sears A, Karat C-M (2004) A longitudinal evaluation of hands-free speech-based navigation during dictation (Information Systems Department Technical Report). UMBC, Information Systems Department ISRC, Baltimore
Fiscus JG, Fisher WM, Martin AF, Przybocki MA, Pallett DS (2000) NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results. Retrieved February 28, 2004, from http://www.nist.gov/speech/publications/tw00/pdf/cts10.pdf
Hagen A, Connors DA, Pellom BL (2003) The analysis and design of architecture systems for speech recognition on modern handheld-computing devices. In: Proceedings of the international symposium on systems synthesis. ACM Press, New York, pp 65–70
Hart SG, Staveland LE (1988) Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In: Hancock PA, Meshkati N (eds) Human mental workload. Elsevier Science Publishers B.V., Amsterdam, pp 139–183
Holzman TG (2001) Speech-audio interface for medical information management in field environments. Int J Speech Technol 4:209–226
Huerta JM (2000) Speech recognition in mobile environments. Ph.D. Dissertation, Carnegie Mellon University, Pittsburgh
Iacucci G, Kuutti K, Ranta M (2000) On the move with a magic thing: Role playing in concept design of mobile services and devices. In: Proceedings of the conference on designing interactive systems: Processes, practices, methods, and techniques. ACM Press, New York, pp 193–202
Johnson P (1998) Usability and mobility; interactions on the move. Retrieved August 20, 2003, from Department of Computer Science Web Site: http://www.dcs.gla.ac.uk/∼johnson/papers/mobile/HCIMD1.html
Juul-Kristensen B, Laursen B, Pilegaard M, Jensen BR (2004) Physical workload during use of speech recognition and traditional computer input devices. Ergonomics 47(2):119–133
Karat C-M, Halverson C, Karat J, Horn D (1999) Patterns of entry and correction in large vocabulary continuous speech recognition systems. In: Proceedings of CHI ’99, ACM Press, New York, pp 568–575
Lin M, Price K, Goldman R, Sears A, Jacko J (2005) Tapping on the Move - Fitts’ Law under mobile conditions. In: Proceedings of IRMA 2005 (in press)
Lu Y-C, Xiao Y, Sears A, Jacko J (2003) An observational and interview study on personal digital assistant (PDA) uses by clinicians in different contexts. In: Harris D, Duffy V, Smith M, Stephanidis C (eds) Human-centred computing: cognitive, social and ergonomic aspects. Lawrence Erlbaum Associates, Mahwah, pp 93–97
McCormick J (2003) Speech recognition. Gov Comput News 22(22):24–28
Meckel Y, Rotstein A, Inbar O (2002) The effects of speech production on physiologic responses during submaximal exercise. Med Sci Sports Exerc 34(8):1337–1343
NASA Ames Research Center (1987) NASA Human Performance Research Group Task Load Index (NASA-TLX) instruction manual [Brochure]. Moffett Field, CA
Noyes JM, Frankish CR (1994) Errors and error correction in automatic speech recognition systems. Ergonomics 37:1943–1957
Pascoe J, Ryan N, Morse D (2000) Using while moving: HCI issues in fieldwork environments. ACM Trans Comput Hum Interact 7(3):417–437
Paterno F (2003) Understanding interaction with mobile devices. Interact Comput 15:473–478
Perry M, O’Hara K, Sellen A, Brown B, Harper R (2001) Dealing with mobility: Understanding access anytime, anywhere. ACM Trans Comput Hum Interact 8(4):323–347
Price KJ, Sears A (2002) Speech-based data entry for handheld devices: Speed of entry and error correction techniques (Information Systems Department Technical Report). UMBC, Baltimore, pp 1–8
Price KJ, Sears A (2003) Speech-based text entry for mobile devices. In: Stephanidis C, Jacko J (eds) Human-computer interaction: theory and practice (Part II). Lawrence Erlbaum Associates, Mahwah, pp 766–770
Price KJ, Lin M, Feng J, Goldman R, Sears A, Jacko J (2004) Data entry on the move: An examination of nomadic speech-based text entry. Lect Notes Comput Sci (LNCS) 3196:460–471
Rollins AM (1985) Speech recognition and manner of speaking in noise and in quiet. In: CHI ’85 Proceedings. ACM Press, New York, pp 197–199
Satyanarayanan M (1996) Fundamental challenges in mobile computing. In: Proceedings of the fifteenth annual ACM symposium on principles of distributed computing. ACM Press, New York, pp 1–7
Sawhney N, Schmandt C (2000) Nomadic radio: speech and audio interaction for contextual messaging in nomadic environments. ACM Trans Comput Hum Interact 7(3):353–383
Schumacher EH, Seymour TL, Glass JM, Fencsik DE, Lauber EJ, Kieras DE et al (2001) Virtually perfect time sharing in dual-task performance: uncorking the central cognitive bottleneck. Psychol Sci 12(2):101–108
Sears A, Feng J, Oseitutu K, Karat C-M (2003) Hands-free, speech-based navigation during dictation: Difficulties, consequences, and solutions. Hum Comput Interact 18:229–257
Sears A, Jacko JA, Chu J, Moro F (2001) The role of visual search in the design of effective soft keyboards. Behav Inform Technol 20(3):159–166
Sears A, Karat C-M, Oseitutu K, Karimullah A, Feng J (2001) Productivity, satisfaction, and interaction strategies of individuals with spinal cord injuries and traditional users interacting with speech recognition software. Univer Access Inform Soc 1:4–15
Sears A, Lin M, Jacko J, Xiao Y (2003) When computers fade ... pervasive computing and situationally-induced impairments and disabilities. In: Proceedings of HCII 2003, pp 1298–1302
Sears A, Young M (2003) Physical disabilities and computing technologies: an analysis of impairments. In: Jacko J, Sears A (eds) The human-computer interaction handbook. Lawrence Erlbaum and Associates, Mahwah, pp 482–503
Shneiderman B (2000) The limits of speech recognition. Commun ACM 43(9):63–65
Suhm B, Myers B, Waibel A (2001) Multimodal error correction for speech user interfaces. ACM Trans Comput Hum Interact 8(1):60–98
Ward K, Novick DG (2003) Accessibility: hands-free documentation. In: Proceedings of the 21st annual international conference on documentation. ACM Press, New York, pp 147–154
Wright P, Bartram C, Rogers N, Emslie H, Evans J, Wilson B, Belt S (2000) Text entry on handheld computers by older users. Ergonomics 43(6):702–716
Xie B, Salvendy G (2000) Prediction of mental workload in single and multiple tasks environments. Int J Cogn Ergon 4(3):213–242
Acknowledgements
This material is based upon work supported by the National Science Foundation (NSF) under Grant Nos. IIS-0121570 and IIS-0328391. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. Numerous colleagues at the ISRC were instrumental in the completion of this research, including Liwei Dai who performed analysis and coding of speech data. We would also like to thank the anonymous reviewers for their thoughtful feedback, which led to several improvements in this paper.
Cite this article
Price, K.J., Lin, M., Feng, J. et al. Motion does matter: an examination of speech-based text entry on the move. Univ Access Inf Soc 4, 246–257 (2006). https://doi.org/10.1007/s10209-005-0006-8