ABSTRACT
Designing, conducting, and interpreting evaluation studies with human participants is challenging. While researchers in cognitive psychology, social science, and human-computer interaction view competence in evaluation study methodology as a key job skill, visualization researchers have only recently begun to feel the need to learn this skill as well. Acquiring such competence is a lengthy and difficult process fraught with trial and error. Recent work on patterns for visualization evaluation is now providing much-needed best practices for how to evaluate a visualization technique with human participants. However, negative examples of evaluation methods that fail, yield no usable results, or simply do not work are still missing, mainly because of the difficulty of, and lack of incentive for, publishing negative results or failed research. In this paper, we take the position that many well-intentioned ideas for how to evaluate a visualization tool simply do not work. We call upon the community to help collect these negative examples in order to show the other side of the coin: what not to do when evaluating visualizations.