Abstract
With the growing popularity of visualizations in various fields, visualization comprehension has gained considerable attention. In this work, we focus on the effect of data size and pattern salience on the comprehension of scatterplots, a popular visualization type. We began with a preliminary study in which we interviewed 50 people about the difficulties they had in comprehending 90 different visualizations. The results reveal that data size is one of the top three factors affecting visualization comprehension, and they suggest that the effect of data size may depend on the salience of the patterns within the data. We therefore conducted an experiment on the effect of data size and data-related pattern salience on three intermediate-level comprehension tasks: finding anomalies, judging correlation, and identifying clusters. The tasks were conducted on scatterplots because of their familiarity to users and their ability to support diverse tasks. The experiment revealed a significant interaction effect of data size and pattern salience on the comprehension of trends in scatterplots. Under specific conditions of pattern salience, data size also affects the judgment of anomalies and cluster centers. We discuss these findings and summarize the factors that affect visualization comprehension.
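To make the two experimental factors concrete, the following is a minimal sketch of how scatterplot data for a correlation-judgment task could be generated across a grid of data sizes and pattern-salience levels. It is an illustration under stated assumptions, not the stimulus-generation procedure used in the study: the size levels, the use of the population correlation `rho` as a proxy for pattern salience, and the names `make_correlated_points`, `pearson_r`, and `build_conditions` are all hypothetical.

```python
import math
import random


def make_correlated_points(n, rho, seed=0):
    """Draw n (x, y) pairs from a bivariate normal with population correlation rho."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Mixing x with independent noise yields Var(y) = 1 and Corr(x, y) = rho.
        y = rho * x + math.sqrt(1.0 - rho * rho) * noise
        points.append((x, y))
    return points


def pearson_r(points):
    """Sample Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical factor levels; the paper's actual levels are not given here.
SIZES = [50, 500, 5000]
SALIENCE = {"low": 0.3, "high": 0.9}  # assumed proxy: weak vs. strong trend


def build_conditions():
    """Return one point set per (data size, salience level) cell of the grid."""
    conditions = {}
    for n in SIZES:
        for label, rho in SALIENCE.items():
            conditions[(n, label)] = make_correlated_points(n, rho, seed=n)
    return conditions


if __name__ == "__main__":
    for (n, label), pts in build_conditions().items():
        print(n, label, round(pearson_r(pts), 3))
```

Crossing the two factors in a full grid, rather than varying each alone, is what allows an interaction effect to be observed, for example when a larger data size helps with a weak trend but makes little difference for a strong one.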
Acknowledgements
We thank all participants and reviewers for their thoughtful feedback and comments. The work was supported by Zhejiang Provincial Natural Science Foundation (LR18F020001).
Cite this article
Wang, J., Cai, X., Su, J. et al. What makes a scatterplot hard to comprehend: data size and pattern salience matter. J Vis 25, 59–75 (2022). https://doi.org/10.1007/s12650-021-00778-8
DOI: https://doi.org/10.1007/s12650-021-00778-8