ABSTRACT
Despite decades of research on conversational interaction between humans and computers, the capabilities of automated conversational systems remain limited. In this paper, we introduce Chorus, a crowd-powered conversational assistant. End users converse continuously with what appears to be a single conversational partner; behind the scenes, Chorus leverages multiple crowd workers who propose and vote on responses. A shared memory space helps the dynamic crowd workforce maintain consistency, and a game-theoretic incentive mechanism balances workers' effort between proposing and voting. Studies with 12 end users and 100 crowd workers demonstrate that Chorus provides accurate, topical responses, answering nearly 93% of user queries appropriately and staying on-topic in over 95% of responses. We also observed that Chorus outperforms both pairing an end user with a single crowd worker and having end users complete tasks themselves, in terms of speed, quality, and breadth of assistance. Chorus demonstrates a future in which conversational assistants become usable in the real world by combining human and machine intelligence, and it may enable a useful new way of interacting with the crowds that power other systems.
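The propose-and-vote loop described above can be illustrated with a minimal sketch. This is not the paper's actual implementation or thresholds; the function name, vote encoding, and `accept_fraction` parameter are hypothetical, chosen only to show how worker votes over proposed replies could be aggregated into a single response shown to the end user.

```python
from collections import Counter

def select_response(proposals, votes, accept_fraction=0.4):
    """Return the proposal whose vote share crosses a threshold, else None.

    proposals: candidate reply strings proposed by crowd workers
    votes: list of indices into `proposals`, one per voting worker
    accept_fraction: share of votes a candidate needs to be forwarded
                     (illustrative parameter, not from the paper)
    """
    if not votes:
        return None
    tally = Counter(votes)
    idx, count = tally.most_common(1)[0]
    if count / len(votes) >= accept_fraction:
        return proposals[idx]
    return None  # no consensus yet; keep collecting proposals and votes

# Three workers back proposal 0, one backs proposal 2:
replies = ["It opens at 9am.", "Try calling the store.", "It opens at 9am."]
print(select_response(replies, [0, 0, 2, 0], accept_fraction=0.5))
# → It opens at 9am.
```

In the real system, voting would run continuously as workers join and leave, and the incentive mechanism would reward both accepted proposals and agreeing votes; this sketch shows only the aggregation step.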