DOI: 10.1145/3313831.3376454
Research Article, Honorable Mention

How Visualizing Inferential Uncertainty Can Mislead Readers About Treatment Effects in Scientific Results

Published: 23 April 2020

ABSTRACT

When presenting visualizations of experimental results, scientists often choose to display either inferential uncertainty (e.g., uncertainty in the estimate of a population mean) or outcome uncertainty (e.g., variation of outcomes around that mean) about their estimates. How does this choice impact readers' beliefs about the size of treatment effects? We investigate this question in two experiments comparing 95% confidence intervals (means and standard errors) to 95% prediction intervals (means and standard deviations). The first experiment finds that participants are willing to pay more for and overestimate the effect of a treatment when shown confidence intervals relative to prediction intervals. The second experiment evaluates how alternative visualizations compare to standard visualizations for different effect sizes. We find that axis rescaling reduces error, but not as well as prediction intervals or animated hypothetical outcome plots (HOPs), and that depicting inferential uncertainty causes participants to underestimate variability in individual outcomes.
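The statistical contrast underlying the abstract is between the standard error of the mean (inferential uncertainty, which shrinks as the sample grows) and the standard deviation of individual outcomes (outcome uncertainty, which does not). As a rough illustration only, not taken from the paper, the sketch below computes both interval types for a made-up sample under a normal approximation; the data, sample size, and variable names are all hypothetical.

```python
# Minimal sketch (illustrative, not from the paper): contrasting a 95% confidence
# interval for the mean with a 95% prediction-style interval for individual
# outcomes, using a normal approximation and fabricated sample data.
import numpy as np

rng = np.random.default_rng(0)
outcomes = rng.normal(loc=100.0, scale=15.0, size=50)  # hypothetical treatment-group scores

mean = outcomes.mean()
sd = outcomes.std(ddof=1)           # sample standard deviation: variability of individual outcomes
se = sd / np.sqrt(len(outcomes))    # standard error of the mean: inferential uncertainty
z = 1.96                            # ~95% coverage under a normal approximation

ci = (mean - z * se, mean + z * se)  # confidence interval: plausible range for the population mean
pi = (mean - z * sd, mean + z * sd)  # prediction interval: plausible range for a single outcome

print(f"95% CI for the mean:         [{ci[0]:.1f}, {ci[1]:.1f}]")
print(f"95% PI for a single outcome: [{pi[0]:.1f}, {pi[1]:.1f}]")
```

Because the standard error divides the standard deviation by the square root of the sample size, the confidence interval is much narrower than the prediction interval for the same data; that visual difference is what the paper's experiments manipulate.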

Supplemental Material

paper327pv.mp4 (MP4, 1.2 MB)


Published in

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
April 2020, 10,688 pages
ISBN: 9781450367080
DOI: 10.1145/3313831
Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

