ABSTRACT
Designing, conducting, and interpreting evaluation studies with human participants is challenging. While researchers in cognitive psychology, social science, and human-computer interaction view competence in evaluation study methodology as a key job skill, visualization researchers have only recently begun to feel the need to learn this skill as well. Acquiring such competence is a lengthy and difficult process fraught with trial and error. Recent work on patterns for visualization evaluation is now providing much-needed best practices for how to evaluate a visualization technique with human participants. However, negative examples of evaluation methods that fail, yield no usable results, or simply do not work are still missing, mainly because of the difficulty of, and lack of incentive for, publishing negative results or failed research. In this paper, we take the position that many well-intentioned ideas for how to evaluate a visualization tool simply do not work. We call upon the community to help collect these negative examples in order to show the other side of the coin: what not to do when evaluating visualizations.