What Will You Tell Me About the Chart? – Automated Description of Charts

Seweryn, Karolina; Lorenc, Katarzyna; Wróblewska, Anna; Sysko-Romańczuk, Sylwia

doi:10.1007/978-3-030-92307-5_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1516)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Automatic chart description is a challenging task. A chart contains many more relationships between objects than the images handled in general computer vision problems. Furthermore, charts differ in character from natural-scene pictures, so commonly used methods do not perform well on them. To tackle these problems, we propose a process consisting of three sub-tasks: (1) chart classification, (2) detection of a chart’s essential elements, and (3) generation of a text description.

Due to the lack of plot datasets dedicated to text generation, we prepared a new dataset, ChaTa+, which contains real-world figures. Additionally, we adapted the publicly available FigureQA and PlotQA datasets to our tasks and tested our method on them. We compared our results with those of the Adobe team [3], which we treated as a benchmark. Our models achieved comparable performance, although we trained them on a more complex dataset (semi-synthetic PlotQA) and built a less resource-intensive infrastructure.
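
The three sub-tasks named in the abstract can be read as a simple sequential pipeline. The sketch below is only an illustration of that composition, not the authors' implementation: every class and function name is hypothetical, the stages are toy stand-ins for trained models, and passing the predicted chart type to the detector is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# All names here are hypothetical placeholders, not the authors' code.
BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


@dataclass
class ChartElement:
    label: str  # e.g. "title", "legend", "bar"
    box: BBox


@dataclass
class ChartDescriptionPipeline:
    classify: Callable[[bytes], str]                    # (1) chart classification
    detect: Callable[[bytes, str], List[ChartElement]]  # (2) element detection
    describe: Callable[[str, List[ChartElement]], str]  # (3) text generation

    def run(self, image: bytes) -> str:
        chart_type = self.classify(image)
        # Assumption: the detector may use the predicted chart type.
        elements = self.detect(image, chart_type)
        return self.describe(chart_type, elements)


# Toy stages so the sketch runs end to end; real stages would be trained models.
def toy_classify(image: bytes) -> str:
    return "bar"


def toy_detect(image: bytes, chart_type: str) -> List[ChartElement]:
    return [
        ChartElement("title", (0, 0, 100, 20)),
        ChartElement("bar", (10, 30, 30, 90)),
    ]


def toy_describe(chart_type: str, elements: List[ChartElement]) -> str:
    labels = ", ".join(e.label for e in elements)
    return f"A {chart_type} chart with detected elements: {labels}."


pipeline = ChartDescriptionPipeline(toy_classify, toy_detect, toy_describe)
description = pipeline.run(b"")
```

Keeping each stage behind a plain callable means a classifier, detector, or text generator can be swapped out independently, which is one way to compare models per sub-task.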

Research was funded by the Centre for Priority Research Area Artificial Intelligence and Robotics of Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) programme (grant no 1820/27/Z01/POB2/2021).

References

1. Bahdanau, D.: Neural machine translation by jointly learning to align and translate. In: ICLR (2015)
2. Behzadian, M., Otaghsara, S., Yazdani, M., Ignatius, J.: A state-of the-art survey of TOPSIS applications. Expert Syst. Appl. 39, 13051–13069 (2012)
3. Chen, C., Zhang, R., et al.: Figure captioning with relation maps for reasoning. In: WACV (2020)
4. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2015)
5. Jobin, K.V., Mondal, A., Jawahar, C.V.: Docfigure: a dataset for scientific document figure classification. In: ICDAR (2019)
6. Kafle, K., Price, B., Cohen, S., Kanan, C.: DVQA: understanding data visualizations via question answering. In: CVPR (2018)
7. Kahou, S.E., Michalski, V., Atkinson, A., Kadar, A., Trischler, A., Bengio, Y.: FigureQA: an annotated figure dataset for visual reasoning. In: ICLR (2018)
8. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
9. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: ICCV (2017)
10. Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
11. Nitesh, M., Pritha, G., Mitesh, K., Pratyush, K.: PlotQA: reasoning over scientific plots. In: WACV (2020)
12. Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
13. Savva, M., Kong, N., Chhajta, A., Fei-Fei, L., Agrawala, M., Heer, J.: ReVision: automated classification, analysis and redesign of chart images. In: ACM (2011)
14. Siegel, N., Horvitz, Z., Levin, R., Divvala, S., Farhadi, A.: FigureSeer: parsing result-figures in research papers. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 664–680. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_41

Acknowledgements

We would like to thank Przemysław Biecek and Tomasz Stanisławek for their work on the shared idea of creating the ChaTa dataset of Charts and Tables, together with annotations of their elements, and for their preliminary ideas for the annotation system. We are also grateful to the many students from the Faculty of Mathematics and Information Science who contributed to the annotation tool and helped gather the preliminary ChaTa dataset, which we modified further.

Author information

Authors and Affiliations

Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
Karolina Seweryn, Katarzyna Lorenc & Anna Wróblewska
Management Faculty, Warsaw University of Technology, Warsaw, Poland
Sylwia Sysko-Romańczuk

Corresponding authors

Correspondence to Karolina Seweryn or Anna Wróblewska.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

About this paper

Cite this paper

Seweryn, K., Lorenc, K., Wróblewska, A., Sysko-Romańczuk, S. (2021). What Will You Tell Me About the Chart? – Automated Description of Charts. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_2

DOI: https://doi.org/10.1007/978-3-030-92307-5_2
Published: 02 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92306-8
Online ISBN: 978-3-030-92307-5
eBook Packages: Computer Science, Computer Science (R0)
