Abstract
Desktop interaction solutions are often inappropriate for mobile devices because of their small screens and the need for portability. Speech recognition can improve mobile interactions by providing a largely hands-free input method that can be used in a variety of situations. Although mobile systems are designed to be used while in motion, few studies have examined the effects of motion on mobile interactions. This paper investigates the effect of motion on automatic speech recognition (ASR) input for mobile devices. Recognition error rates (RER) were examined while participants performed text entry tasks either walking or seated, along with the effect of ASR enrollment conditions on RER. The results suggest changes to how users train ASR systems for mobile versus seated use.







Abbreviations
- ASR: Automatic speech recognition
- RER: Recognition error rate
- PDA: Personal digital assistant
- SIID: Situationally induced impairments and disabilities
- NASA TLX: NASA Task Load Index
- ISRC: Interactive Systems Research Center
- MME: Multiple metaphor environments
- NSF: National Science Foundation
References
Akoumianakis D, Stephanidis C (2003) Multiple metaphor environments: designing for diversity. Ergonomics 46(1–3):88–113
Baber C, Noyes J (1996) Automatic speech recognition in adverse environments. Hum Factors 38(1):142–155
Barnard L, Yi JS, Jacko JA, Sears A (in press) An empirical comparison of use-in-motion evaluation scenarios for mobile computing devices. Int J Human-Comput Stud
Barnard L, Yi JS, Jacko JA, Sears A (2004) The effects of context on human performance in mobile computing. Pers Ubiquitous Comput, July 2004
Bradford JH (1995) The human factors of speech-based interfaces. SIGCHI Bull 27(2):61–67
Brewster S, Lumsden J, Bell M, Hall M, Tasker S (2003) Multi-modal ‘eyes-free’ interaction techniques for wearable devices. Lett CHI 5(1): 473–480
Brodie J, Perry M (2001) Designing for mobility, collaboration and information use by blue-collar workers. SIGGROUP Bull 22(3):22–27
Chandrasekhar A (2003) Respiratory rate and pattern of breathing: to evaluate one of the vital signs. Retrieved October 14, 2003, from Loyola University Medical Education Network Web Site: http://www.meddean.luc.edu/lumen/meded/medicine/pulmonar/pd/step73a.htm
Cohen PR, Oviatt SL (1993) The role of voice in human-machine communication. In: Roe DB, Wilpon J (eds) Human-computer interaction by voice. National Academy of Sciences Press, Washington, pp 1–36
Dahlbom B, Ljungberg F (1998) Mobile informatics. Scand J Inform Syst 10(1–2):227–234
Doust JH, Patrick JM (1981) The limitation of exercise ventilation during speech. Respir Physiol 46:137–147
Emery VK, Moloney KP, Jacko JA, Sainfort F (2004) Assessing workload in the context of human-computer interactions: Is the NASA-TLX a suitable measurement tool? (200401). Laboratory for Human-Computer Interaction and Health Care Informatics, Georgia Institute of Technology, Atlanta
Entwistle MS (2003) The performance of automated speech recognition systems under adverse conditions of human exertion. Int J Hum Comput Int 16(2):127–140
Feng J, Sears A (2003) Using confidence scores to improve hands-free speech-based navigation. In: Stephanidis C, Jacko J (eds) Human-computer interaction: theory and practice, Vol 2. Lawrence. Erlbaum Associates, Mahwah, pp 641–645
Feng J, Sears A, Karat C-M (2004) A longitudinal evaluation of hands-free speech-based navigation during dictation (Information Systems Department Technical Report). UMBC, Information Systems Department ISRC, Baltimore
Fiscus JG, Fisher WM, Martin AF, Przybocki MA, Pallett DS (2000) NIST evaluation of conversational speech recognition over the telephone: English and Mandarin performance results. Retrieved February 28, 2004, from http://www.nist.gov/speech/publications/tw00/pdf/cts10.pdf
Hagen A, Connors DA, Pellom BL (2003) The analysis and design of architecture systems for speech recognition on modern handheld-computing devices. In: Proceedings of the international symposium on systems synthesis. ACM Press, New York, pp 65–70
Hart SG, Staveland LE (1988) Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In: Hancock PA, Meshkati N (eds) Human mental workload. Elsevier Science Publishers B.V., Amsterdam, pp 139–183
Holzman TG (2001) Speech-audio interface for medical information management in field environments. Int J Speech Technol 4:209–226
Huerta JM (2000) Speech recognition in mobile environments. Ph.D. Dissertation, Carnegie Mellon University, Pittsburgh
Iacucci G, Kuutti K, Ranta M (2000) On the move with a magic thing: Role playing in concept design of mobile services and devices. In: Proceedings of the conference on designing interactive systems: Processes, practices, methods, and techniques. ACM Press, New York, pp 193–202
Johnson P (1998) Usability and mobility; interactions on the move. Retrieved August 20, 2003, from Department of Computer Science Web Site: http://www.dcs.gla.ac.uk/∼johnson/papers/mobile/HCIMD1.html
Juul-Kristensen B, Laursen B, Pilegaard M, Jensen BR (2004) Physical workload during use of speech recognition and traditional computer input devices. Ergonomics 47(2):119–133
Karat C-M, Halverson C, Karat J, Horn D (1999) Patterns of entry and correction in large vocabulary continuous speech recognition systems. In: Proceedings of CHI ’99, ACM Press, New York, pp 568–575
Lin M, Price K, Goldman R, Sears A, Jacko J (2005) Tapping on the Move - Fitts’ Law under mobile conditions. In: Proceedings of IRMA 2005 (in press)
Lu Y-C, Xiao Y, Sears A, Jacko J (2003) An observational and interview study on personal digital assistant (PDA) uses by clinicians in different contexts. In: Harris D, Duffy V, Smith M, Stephanidis C (eds) Human-centred computing: cognitive, social and ergonomic aspects. Lawrence Erlbaum Associates, Mahwah, pp 93–97
McCormick J (2003) Speech recognition. Gov Comput News 22(22):24–28
Meckel Y, Rotstein A, Inbar O (2002) The effects of speech production on physiologic responses during submaximal exercise. Med Sci Sports Exerc 34(8):1337–1343
NASA Ames Research Center (1987) NASA Human Performance Research Group Task Load Index (NASA-TLX) instruction manual [Brochure]. Moffett Field, CA
Noyes JM, Frankish CR (1994) Errors and error correction in automatic speech recognition systems. Ergonomics 37:1943–1957
Pascoe J, Ryan N, Morse D (2000) Using while moving: HCI issues in fieldwork environments. ACM Trans Comput Hum Interact 7(3):417–437
Paterno F (2003) Understanding interaction with mobile devices. Interact Comput 15:473–478
Perry M, O’Hara K, Sellen A, Brown B, Harper R (2001) Dealing with mobility: Understanding access anytime, anywhere. ACM Trans Comput Hum Interact 8(4):323–347
Price KJ, Sears A (2002) Speech-based data entry for handheld devices: Speed of entry and error correction techniques (Information Systems Department Technical Report). UMBC, Baltimore, pp 1–8
Price KJ, Sears A (2003) Speech-based text entry for mobile devices. In: Stephanidis C, Jacko J (eds) Human-computer interaction: theory and practice (Part II). Lawrence Erlbaum Associates, Mahwah, pp 766–770
Price KJ, Lin M, Feng J, Goldman R, Sears A, Jacko J (2004) Data entry on the move: An examination of nomadic speech-based text entry. Lect Notes Comput Sci (LNCS) 3196:460–471
Rollins AM (1985) Speech recognition and manner of speaking in noise and in quiet. In: CHI ’85 Proceedings. ACM Press, New York, pp 197–199
Satyanarayanan M (1996) Fundamental challenges in mobile computing. In: Proceedings of the fifteenth annual ACM symposium on principles of distributed computing. ACM Press, New York, pp 1–7
Sawhney N, Schmandt C (2000) Nomadic radio: speech and audio interaction for contextual messaging in nomadic environments. ACM Trans Comput Hum Interact 7(3):353–383
Schumacher EH, Seymour TL, Glass JM, Fencsik DE, Lauber EJ, Kieras DE et al (2001) Virtually perfect time sharing in dual-task performance: uncorking the central cognitive bottleneck. Psychol Sci 12(2):101–108
Sears A, Feng J, Oseitutu K, Karat C-M (2003) Hands-free, speech-based navigation during dictation: Difficulties, consequences, and solutions. Hum Comput Interact 18:229–257
Sears A, Jacko JA, Chu J, Moro F (2001) The role of visual search in the design of effective soft keyboards. Behav Inform Technol 20(3):159–166
Sears A, Karat C-M, Oseitutu K, Karimullah A, Feng J (2001) Productivity, satisfaction, and interaction strategies of individuals with spinal cord injuries and traditional users interacting with speech recognition software. Univer Access Inform Soc 1:4–15
Sears A, Lin M, Jacko J, Xiao Y (2003) When computers fade ... pervasive computing and situationally-induced impairments and disabilities. In: Proceedings of HCII 2003, pp 1298–1302
Sears A, Young M (2003) Physical disabilities and computing technologies: an analysis of impairments. In: Jacko J, Sears A (eds) The human-computer interaction handbook. Lawrence Erlbaum and Associates, Mahwah, pp 482–503
Shneiderman B (2000) The limits of speech recognition. Commun ACM 43(9):63–65
Suhm B, Myers B, Waibel A (2001) Multimodal error correction for speech user interfaces. ACM Trans Comput Hum Interact 8(1):60–98
Ward K, Novick DG (2003) Accessibility: hands-free documentation. In: Proceedings of the 21st annual international conference on documentation. ACM Press, New York, pp 147–154
Wright P, Bartram C, Rogers N, Emslie H, Evans J, Wilson B, Belt S (2000) Text entry on handheld computers by older users. Ergonomics 43(6):702–716
Xie B, Salvendy G (2000) Prediction of mental workload in single and multiple tasks environments. Int J Cogn Ergon 4(3):213–242
Acknowledgements
This material is based upon work supported by the National Science Foundation (NSF) under Grant Nos. IIS-0121570 and IIS-0328391. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. Numerous colleagues at the ISRC were instrumental in the completion of this research, including Liwei Dai who performed analysis and coding of speech data. We would also like to thank the anonymous reviewers for their thoughtful feedback, which led to several improvements in this paper.
Cite this article
Price, K.J., Lin, M., Feng, J. et al. Motion does matter: an examination of speech-based text entry on the move. Univ Access Inf Soc 4, 246–257 (2006). https://doi.org/10.1007/s10209-005-0006-8