research-article

Evaluation of an integrated multi-task machine learning system with humans in the loop

Authors:
Aaron Steinfeld, Carnegie Mellon University, Pittsburgh, PA
S. Rachael Bennett, Carnegie Mellon University, Pittsburgh, PA
Kyle Cunningham, Carnegie Mellon University, Pittsburgh, PA
Matt Lahut, Carnegie Mellon University, Pittsburgh, PA
Pablo-Alejandro Quinones, Carnegie Mellon University, Pittsburgh, PA
Django Wexler, Carnegie Mellon University, Pittsburgh, PA
Dan Siewiorek, Carnegie Mellon University, Pittsburgh, PA
Jordan Hayes, Bitway, Inc.
Paul Cohen, U. of Southern California
Julie Fitzgerald, JSF Consulting
Othar Hansson, Bitway, Inc.
Mike Pool, IET
Mark Drummond, SRI International

PerMIS '07: Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems
August 2007
Pages 168–174
https://doi.org/10.1145/1660877.1660901

Published: 28 August 2007

ABSTRACT

The performance of RADAR, a cognitive personal assistant that combines multiple machine learning components, natural language processing, and optimization, was examined with a test developed explicitly to measure the impact of integrated machine learning when the system is used by a human in a real-world setting. Three conditions (conventional tools, RADAR without learning, and RADAR with learning) were evaluated in a large-scale, between-subjects study. The study revealed that integrated machine learning has a positive impact on overall performance. This paper also discusses how specific machine learning components contributed to human-system performance.


Published in
PerMIS '07: Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems
August 2007
293 pages
ISBN:9781595938541
DOI:10.1145/1660877
General Chair: Elena Messina, Intelligent Systems Division, NIST
Program Chair: Raj Madhavan, Oak Ridge National Laboratory/NIST
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
evaluation
intelligent systems
machine learning
mixed-initiative assistants