Abstract
Data charts are widely used in practice to communicate insights in complex data. Because of ineffective designs, novice readers may need descriptive content (e.g., chart captions) to understand the underlying data stories, and such content is not always available. This problem hinders the use of data charts by the general public and has raised deep concerns among visualization researchers. Recently, researchers have proposed deep-learning-based methods that automatically provide textual context for data charts. However, these methods ignore the visual links between textual content and visual elements. Moreover, some of them apply mainly to scalable vector graphics and cannot be easily extended to Internet pictures in raster formats (e.g., PNG or JPEG). To overcome these limitations, we propose a novel deep-learning-based framework that automatically discovers visual insights and generates corresponding text descriptions for chart images. Specifically, we train a saliency detection model to reveal the salient area that presents the most important data insights, and we employ an image captioning model to generate the corresponding descriptive text. We also propose a novel method to optimize the saliency map so that viewers can notice visual insights easily. Finally, we develop an interactive system that lets users upload chart images and then displays chart insights together with the related descriptions. We evaluate our saliency detection and image captioning models through quantitative and qualitative experiments and conduct a user study to demonstrate the usefulness of our system.
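The post-processing step of the pipeline, turning a raw saliency map into a crisp highlighted region, can be illustrated with a minimal sketch. This is not the paper's actual method (which uses a trained deep saliency network); it only assumes a raw saliency map is available as a 2-D array, then normalizes it, smooths it with a separable Gaussian blur, and thresholds it to locate the salient region's bounding box. All function names and parameter values here are illustrative.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # 1-D Gaussian kernel, normalized to sum to 1.
    ax = np.arange(size) - size // 2
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def smooth(sal, size=5, sigma=1.0):
    # Separable Gaussian blur: filter rows first, then columns.
    k = gaussian_kernel(size, sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, sal)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)

def salient_bbox(sal, thresh=0.5):
    # Normalize to [0, 1], smooth, threshold, and return the
    # bounding box (row0, col0, row1, col1) of the salient region,
    # or None when nothing exceeds the threshold.
    sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)
    sal = smooth(sal)
    ys, xs = np.nonzero(sal >= thresh)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())
```

The returned box can then be drawn over the chart image to direct the viewer's attention, while the cropped region is passed to the captioning model for description.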
Acknowledgments
The work was supported by NSFC (62072400) and the Collaborative Innovation Center of Artificial Intelligence by MOE and Zhejiang Provincial Government (ZJU). This work was also partially funded by the Zhejiang Laboratory (2021KE0AC02).
Supplementary Information
Supplementary file 1 (mp4, 15,805 KB)
Cite this article
Zhou, Y., Meng, X., Wu, Y. et al. An intelligent approach to automatically discovering visual insights. J Vis 26, 705–722 (2023). https://doi.org/10.1007/s12650-022-00894-z