Abstract
Data charts are widely used in practice to communicate insights in complex data. Because of ineffective designs, novice readers may need descriptive content (e.g., chart captions) to understand the underlying data stories, and such content is not always available. This problem hinders the use of data charts by the general public and has raised deep concerns among visualization researchers. Recently, researchers have proposed deep-learning-based methods that automatically provide textual context for data charts. However, these methods ignore the visual links between textual content and visual elements. Moreover, some of them apply mainly to scalable vector graphics and cannot be easily extended to Internet pictures in raster formats (e.g., PNG or JPEG). To overcome these limitations, we propose a novel deep-learning-based framework that automatically discovers visual insights and generates corresponding text descriptions for chart images. Specifically, we train a saliency detection model to reveal the salient area that presents the most important data insights, and we employ an image captioning model to generate the corresponding descriptive text. We also propose a novel method to optimize the saliency map so that viewers can notice visual insights easily. Finally, we develop an interactive system that lets users upload chart images and then displays chart insights together with the related descriptions. We evaluate our saliency detection and image captioning models through quantitative and qualitative experiments and conduct a user study to demonstrate the usefulness of our system.
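The post-processing step of the pipeline, turning a raw saliency map into a crisp highlighted region, can be illustrated with a minimal sketch. This is not the paper's actual method (which uses a trained deep saliency network); it only assumes a raw saliency map is available as a 2-D array, then normalizes it, smooths it with a separable Gaussian blur, and thresholds it to locate the salient region's bounding box. All function names and parameter values here are illustrative.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # 1-D Gaussian kernel, normalized to sum to 1.
    ax = np.arange(size) - size // 2
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def smooth(sal, size=5, sigma=1.0):
    # Separable Gaussian blur: filter rows first, then columns.
    k = gaussian_kernel(size, sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, sal)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)

def salient_bbox(sal, thresh=0.5):
    # Normalize to [0, 1], smooth, threshold, and return the
    # bounding box (row0, col0, row1, col1) of the salient region,
    # or None when nothing exceeds the threshold.
    sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)
    sal = smooth(sal)
    ys, xs = np.nonzero(sal >= thresh)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())
```

The returned box can then be drawn over the chart image to direct the viewer's attention, while the cropped region is passed to the captioning model for description.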
Acknowledgments
The work was supported by NSFC (62072400) and the Collaborative Innovation Center of Artificial Intelligence by MOE and Zhejiang Provincial Government (ZJU). This work was also partially funded by the Zhejiang Laboratory (2021KE0AC02).
Supplementary Information
Supplementary file 1 (mp4, 15,805 KB)
Cite this article
Zhou, Y., Meng, X., Wu, Y. et al. An intelligent approach to automatically discovering visual insights. J Vis 26, 705–722 (2023). https://doi.org/10.1007/s12650-022-00894-z