Article

Augmenting conversations using dual-purpose speech

Authors:
Kent Lyons, Georgia Institute of Technology, Atlanta, GA
Christopher Skeels, Georgia Institute of Technology, Atlanta, GA
Thad Starner, Georgia Institute of Technology, Atlanta, GA
Cornelis M. Snoeck, Georgia Institute of Technology, Atlanta, GA
Benjamin A. Wong, Georgia Institute of Technology, Atlanta, GA
Daniel Ashbrook, Georgia Institute of Technology, Atlanta, GA

UIST '04: Proceedings of the 17th annual ACM symposium on User interface software and technology, October 2004, Pages 237–246
https://doi.org/10.1145/1029632.1029674

Published: 24 October 2004

ABSTRACT

In this paper, we explore the concept of dual-purpose speech: speech that is socially appropriate in the context of a human-to-human conversation and that also provides meaningful input to a computer. We motivate the use of dual-purpose speech and explore issues of privacy and technological challenges related to mobile speech recognition. We present three applications that utilize dual-purpose speech to assist a user in conversational tasks: the Calendar Navigator Agent, DialogTabs, and Speech Courier. The Calendar Navigator Agent navigates a user's calendar based on socially appropriate speech used while scheduling appointments. DialogTabs allows a user to postpone cognitive processing of conversational material by providing short-term capture of transient information. Finally, Speech Courier allows asynchronous delivery of relevant conversational information to a third party.

Index Terms

Augmenting conversations using dual-purpose speech
1. Hardware
  1. Communication hardware, interfaces and storage
    1. Sound-based input / output
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction devices
      1. Sound-based input / output
      2. Touch screens
    2. Interaction paradigms
      1. Natural language interfaces

Published in
UIST '04: Proceedings of the 17th annual ACM symposium on User interface software and technology
October 2004
312 pages
ISBN: 1581139578
DOI: 10.1145/1029632
General Chair: Steven K. Feiner, Columbia University
Program Chair: James A. Landay, University of Washington & Intel Research
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
dual-purpose speech
mobile computing
speech user interfaces
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 842 of 3,967 submissions, 21%