ABSTRACT
Deep learning requires volume, quality, and variety of training data. In neural question answering, a trade-off between quality and volume arises from the need either to manually curate or construct realistic question answering data, which is costly, or to augment, weakly label, or generate training data from smaller datasets, which yields low variety and sometimes low quality. What can be done to make the best of this necessary trade-off? What can be understood from the endeavor to seek such solutions?
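To make the generation side of this trade-off concrete, the sketch below illustrates one common way synthetic training data is produced for question answering over knowledge graphs: instantiating hand-written templates with triples from the graph to yield (question, query) pairs. This is a minimal illustrative example, not the method of this paper; all predicate names and templates are hypothetical.

```python
# Illustrative sketch (hypothetical predicates and templates): generating
# synthetic (natural-language question, SPARQL query) training pairs by
# filling templates with knowledge-graph triples.

TEMPLATES = {
    # predicate -> (question template, SPARQL template)
    "capital": ("What is the capital of {s}?",
                "SELECT ?o WHERE {{ <{s}> <capital> ?o }}"),
    "author":  ("Who wrote {s}?",
                "SELECT ?o WHERE {{ <{s}> <author> ?o }}"),
}

def generate_pairs(triples):
    """Return (question, query) pairs for every triple whose
    predicate has a template; triples without one are skipped."""
    pairs = []
    for s, p, o in triples:
        if p in TEMPLATES:
            q_tmpl, sparql_tmpl = TEMPLATES[p]
            pairs.append((q_tmpl.format(s=s), sparql_tmpl.format(s=s)))
    return pairs

triples = [("Norway", "capital", "Oslo"),
           ("Hamlet", "author", "Shakespeare"),
           ("Oslo", "population", "700000")]
for question, query in generate_pairs(triples):
    print(question, "->", query)
```

The volume/variety tension is visible even here: the generator scales linearly with the number of triples, but every question follows one of only two surface patterns, so lexical and syntactic variety stays low unless templates are paraphrased or sanitized further.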
Index Terms
- Neural (Knowledge Graph) Question Answering Using Synthetic Training Data