ABSTRACT
In this paper, we explore the concept of dual-purpose speech: speech that is socially appropriate in the context of a human-to-human conversation and that also provides meaningful input to a computer. We motivate the use of dual-purpose speech and explore privacy issues and technological challenges related to mobile speech recognition. We present three applications that use dual-purpose speech to assist a user in conversational tasks: the Calendar Navigator Agent, DialogTabs, and Speech Courier. The Calendar Navigator Agent navigates a user's calendar based on socially appropriate speech used while scheduling appointments. DialogTabs allows a user to postpone cognitive processing of conversational material by providing short-term capture of transient information. Finally, Speech Courier allows asynchronous delivery of relevant conversational information to a third party.
Index Terms
- Augmenting conversations using dual-purpose speech