DOI:10.1145/3544548.3581067
research-article

GVQA: Learning to Answer Questions about Graphs with Visualizations via Knowledge Base

Published: 19 April 2023

Abstract

Graphs are common charts used to represent the topological relationships between nodes. They are a powerful tool for data analysis, and information retrieval tasks involve asking questions about graphs. In a formative study, we found that questions about graphs concern not only the relationships between nodes but also the properties of graph elements. We propose a pipeline to answer natural language questions about graph visualizations and generate visual answers. We first extract the data from graphs and convert it into GML format. We design data structures to encode graph information and convert them into a knowledge base. We then extract topic entities from questions. We feed the questions, entities, and knowledge bases into our question-answering model to obtain SPARQL queries for textual answers. Finally, we design a module to present the answers visually. A user study demonstrates that these visual and textual answers are useful, credible, and transparent.
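
As a rough illustration of the knowledge-base step described in the abstract, the sketch below flattens a small node-link graph into subject-predicate-object triples and answers one question by matching triple patterns. The predicate names (`label`, `color`, `linkedTo`), the toy graph, and both helper functions are assumptions made for this example and are not taken from the paper. The pattern lookup only stands in for a SPARQL engine.

```python
# Illustrative sketch only: the schema, predicate names, and helpers below are
# assumptions, not the data structures or model used in the paper.
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def graph_to_triples(nodes: Dict[str, Dict[str, str]],
                     edges: List[Tuple[str, str]]) -> Set[Triple]:
    """Flatten a node-link graph into (subject, predicate, object) triples."""
    triples: Set[Triple] = set()
    for node_id, props in nodes.items():
        triples.add((node_id, "type", "Node"))
        for key, value in props.items():
            # Element properties (e.g. color) become facts too, so questions
            # about graph elements, not only topology, can be answered.
            triples.add((node_id, key, value))
    for source, target in edges:
        # An undirected edge is stored in both directions.
        triples.add((source, "linkedTo", target))
        triples.add((target, "linkedTo", source))
    return triples


def objects_of(triples: Set[Triple], subject: str, predicate: str) -> List[str]:
    """Return all objects matching the pattern (subject, predicate, ?o), sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Hypothetical toy graph with three nodes and two edges.
nodes = {
    "n1": {"label": "Valjean", "color": "red"},
    "n2": {"label": "Javert", "color": "blue"},
    "n3": {"label": "Cosette", "color": "red"},
}
edges = [("n1", "n2"), ("n1", "n3")]
kb = graph_to_triples(nodes, edges)

# Question: "Which nodes are connected to Valjean?"
# A SPARQL query over such a knowledge base might look like:
#   SELECT ?name WHERE { ?v :label "Valjean" . ?v :linkedTo ?n . ?n :label ?name }
topic = next(s for s, p, o in kb if p == "label" and o == "Valjean")
neighbors = [objects_of(kb, n, "label")[0] for n in objects_of(kb, topic, "linkedTo")]
print(neighbors)  # ['Javert', 'Cosette']
```

Storing each edge in both directions keeps the neighbor lookup to a single pattern; a directed graph would store only one triple per edge.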

Supplementary Material

MP4 File (3544548.3581067-video-figure.mp4)
Video Figure
MP4 File (3544548.3581067-video-preview.mp4)
Video Preview
MP4 File (3544548.3581067-talk-video.mp4)
Pre-recorded Video Presentation

Information & Contributors

Information

Published In

CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
April 2023
14911 pages
ISBN:9781450394215
DOI:10.1145/3544548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Knowledge Base
  2. Natural Language Process
  3. Network Graph
  4. Question Answering
  5. Reinforcement Learning
  6. Visualization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '23

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 398
  • Downloads (Last 6 weeks): 81
Reflects downloads up to 05 Mar 2025

Citations

Cited By

  • (2025) KNowNEt: Guided Health Information Seeking from LLMs via Knowledge Graph Integration. IEEE Transactions on Visualization and Computer Graphics 31:1 (547-557). DOI:10.1109/TVCG.2024.3456364. Online publication date: Jan-2025
  • (2025) Amplifying commonsense knowledge via bi-directional relation integrated graph-based contrastive pre-training from large language models. Information Processing & Management 62:3 (104068). DOI:10.1016/j.ipm.2025.104068. Online publication date: May-2025
  • (2024) SalChartQA: Question-driven Saliency on Information Visualisations. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-14). DOI:10.1145/3613904.3642942. Online publication date: 11-May-2024
  • (2024) AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-19). DOI:10.1145/3613904.3642752. Online publication date: 11-May-2024
  • (2024) SenseMap: Urban Performance Visualization and Analytics Via Semantic Textual Similarity. IEEE Transactions on Visualization and Computer Graphics 30:9 (6275-6290). DOI:10.1109/TVCG.2023.3333356. Online publication date: Sep-2024
  • (2024) Generative AI for visualization: State of the art and future directions. Visual Informatics 8:2 (43-66). DOI:10.1016/j.visinf.2024.04.003. Online publication date: Jun-2024
  • (2024) Foundation models meet visualizations: Challenges and opportunities. Computational Visual Media 10:3 (399-424). DOI:10.1007/s41095-023-0393-x. Online publication date: 2-May-2024
  • (2023) InkSight: Leveraging Sketch Interaction for Documenting Chart Findings in Computational Notebooks. IEEE Transactions on Visualization and Computer Graphics 30:1 (944-954). DOI:10.1109/TVCG.2023.3327170. Online publication date: 25-Oct-2023
