DOI: 10.1145/1029632.1029674
Article

Augmenting conversations using dual-purpose speech

Published: 24 October 2004

ABSTRACT

In this paper, we explore the concept of dual-purpose speech: speech that is socially appropriate in the context of a human-to-human conversation and that also provides meaningful input to a computer. We motivate the use of dual-purpose speech and explore privacy issues and technological challenges related to mobile speech recognition. We present three applications that utilize dual-purpose speech to assist a user in conversational tasks: the Calendar Navigator Agent, DialogTabs, and Speech Courier. The Calendar Navigator Agent navigates a user's calendar based on socially appropriate speech used while scheduling appointments. DialogTabs allows a user to postpone cognitive processing of conversational material by providing short-term capture of transient information. Finally, Speech Courier allows asynchronous delivery of relevant conversational information to a third party.


Published in

UIST '04: Proceedings of the 17th annual ACM symposium on User interface software and technology
October 2004
312 pages
ISBN: 1581139578
DOI: 10.1145/1029632

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 842 of 3,967 submissions, 21%
