
An intelligent approach to automatically discovering visual insights

  • Regular Paper
  • Journal of Visualization

Abstract

Data charts are widely used in practice to convey insights hidden in complex data. Owing to ineffective designs, novice readers may require descriptive content (e.g., chart captions) to understand the underlying data stories, yet such content is not available in many situations. This problem hinders the use of data charts by the general public and has raised concerns among visualization researchers. Recently, researchers have proposed deep-learning-based methods to automatically provide textual context for data charts. However, these methods ignore the visual links between the textual content and the chart figures. Moreover, some of them apply mainly to scalable vector graphics and cannot easily be extended to web images in raster formats (e.g., PNG or JPEG). To overcome these limitations, we propose a novel deep-learning-based framework that automatically discovers visual insights and generates the corresponding text descriptions for chart figures. Specifically, we train a saliency detection model to reveal the salient area presenting the most important data insights, and we employ an image captioning model to generate the corresponding descriptive text. In addition, we propose a novel method to optimize the saliency map so that viewers can notice visual insights easily. Finally, we develop an interactive system that allows users to upload chart figures and then displays the chart insights together with the related descriptions. We evaluate our saliency detection model and image captioning model through quantitative and qualitative experiments, and we conduct a user study to demonstrate the usefulness of our system.


Notes

  1. https://github.com/sairajk/PyTorch-Pyramid-Feature-Attention-Network-for-Saliency-Detection.

  2. https://github.com/ruotianluo/ImageCaptioning.pytorch.
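
The pipeline sketched in the abstract, first detecting the salient region of a raster chart and then captioning it, can be illustrated in miniature. The snippet below is not the authors' implementation: it substitutes OpenCV's classical spectral-residual saliency for the trained Pyramid Feature Attention network (Note 1), the captioning step (Note 2) appears only as a hypothetical caption_model.describe call, and "chart.png" is a placeholder input path.

    import cv2
    import numpy as np


    def locate_salient_region(chart_bgr: np.ndarray) -> np.ndarray:
        """Return the convex hull (an N x 1 x 2 point array) around the most
        salient pixels of a raster chart image.

        Spectral-residual saliency is a classical stand-in used here for
        brevity; the paper trains a deep saliency model instead (Note 1).
        """
        saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
        ok, sal_map = saliency.computeSaliency(chart_bgr)
        if not ok:
            raise RuntimeError("saliency computation failed")
        sal_u8 = (sal_map * 255).astype(np.uint8)
        # Smooth the map, then keep only the strongest responses (Otsu
        # threshold), treating them as the region carrying the main insight.
        sal_u8 = cv2.GaussianBlur(sal_u8, (9, 9), 0)
        _, mask = cv2.threshold(sal_u8, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        points = cv2.findNonZero(mask)
        return cv2.convexHull(points)


    if __name__ == "__main__":
        img = cv2.imread("chart.png")  # hypothetical raster chart (PNG/JPEG)
        hull = locate_salient_region(img)
        out = img.copy()
        # Outline the salient region so viewers can spot the insight at a glance.
        cv2.polylines(out, [hull], isClosed=True, color=(0, 0, 255), thickness=2)
        cv2.imwrite("chart_annotated.png", out)
        # An image captioning model (Note 2) would then describe the highlighted
        # region, e.g.: text = caption_model.describe(img, hull)  # hypothetical

Outlining the salient region before captioning mirrors the framework's goal of visually linking the generated text to the part of the chart it describes, rather than producing a caption detached from the figure.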


Acknowledgments

This work was supported by NSFC (62072400) and the Collaborative Innovation Center of Artificial Intelligence by MOE and Zhejiang Provincial Government (ZJU). It was also partially funded by the Zhejiang Laboratory (2021KE0AC02).

Author information


Corresponding author

Correspondence to Yingcai Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 15805 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhou, Y., Meng, X., Wu, Y. et al. An intelligent approach to automatically discovering visual insights. J Vis 26, 705–722 (2023). https://doi.org/10.1007/s12650-022-00894-z

