How to Build Templates for RDF Question/Answering: An Uncertain Graph Similarity Join Approach

Authors:
Weiguo Zheng, Peking University, Beijing, China
Lei Zou, Peking University, Beijing, China
Xiang Lian, University of Texas-Pan American, Edinburg, TX, USA
Jeffrey Xu Yu, The Chinese University of Hong Kong, Hong Kong, China
Shaoxu Song, Tsinghua University, Beijing, China
Dongyan Zhao, Peking University, Beijing, China

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, May 2015, Pages 1809–1824. https://doi.org/10.1145/2723372.2747648

Published: 27 May 2015

ABSTRACT

A challenging task in natural language question answering (Q/A for short) over RDF knowledge graphs is how to bridge the gap between unstructured natural language questions (NLQ) and graph-structured RDF data (G). One of the effective tools is the "template", which is used in many existing RDF Q/A systems. However, few of them study how to generate templates automatically. To the best of our knowledge, we are the first to propose a join approach for template generation. Given a workload D of SPARQL queries and a set N of natural language questions, the goal is to find pairs ⟨q, n⟩, for q ∈ D ∧ n ∈ N, where SPARQL query q is the best match for natural language question n. These pairs provide promising hints for automatic template generation. Due to the ambiguity of natural language, we model the problem above as an uncertain graph join task. We propose several structural and probability pruning techniques to speed up joining. Extensive experiments over real RDF Q/A benchmark datasets confirm both the effectiveness and efficiency of our approach.
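To make the join formulation in the abstract concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes that each question is an uncertain graph whose vertices carry alternative labels with independent probabilities, that a pair qualifies when the probability of the distance being at most `tau` reaches `alpha`, and that a simple label-multiset distance stands in for graph edit distance. All names (`similarity_join`, `tau`, `alpha`, the example labels) are hypothetical, and none of the pruning techniques mentioned in the abstract are implemented.

```python
from collections import Counter
from itertools import product


def label_distance(labels_a, labels_b):
    """Multiset label distance: larger multiset size minus the multiset overlap."""
    a, b = Counter(labels_a), Counter(labels_b)
    overlap = sum((a & b).values())
    return max(sum(a.values()), sum(b.values())) - overlap


def graph_distance(g1, g2):
    """Crude stand-in for graph edit distance (assumption): it compares
    vertex labels and edge labels as multisets and ignores structure."""
    return (label_distance(g1["vertices"], g2["vertices"])
            + label_distance(g1["edges"], g2["edges"]))


def possible_worlds(uncertain):
    """Yield (deterministic graph, probability) for every combination of
    vertex labels, assuming the vertices are independent."""
    for combo in product(*uncertain["vertices"]):
        prob = 1.0
        for _, p in combo:
            prob *= p
        world = {"vertices": [label for label, _ in combo],
                 "edges": uncertain["edges"]}
        yield world, prob


def similarity_probability(query, uncertain, tau):
    """Total probability of the worlds within distance tau of the query."""
    return sum(p for world, p in possible_worlds(uncertain)
               if graph_distance(query, world) <= tau)


def similarity_join(queries, questions, tau, alpha):
    """Return the (query id, question id) pairs whose similarity probability
    is at least alpha. Exhaustive nested loop, no pruning."""
    return [(qi, ni)
            for qi, q in queries.items()
            for ni, n in questions.items()
            if similarity_probability(q, n, tau) >= alpha]


# Hypothetical SPARQL query graph: ?x type Actor . ?x birthPlace Berlin
q1 = {"vertices": ["?x", "Actor", "Berlin"], "edges": ["type", "birthPlace"]}

# Hypothetical uncertain graph for "Which actors were born in Berlin?"
n1 = {
    "vertices": [
        [("?x", 1.0)],
        [("Actor", 0.8), ("Film", 0.2)],
        [("Berlin", 0.9), ("Berlin_(band)", 0.1)],
    ],
    "edges": ["type", "birthPlace"],
}

if __name__ == "__main__":
    print(similarity_probability(q1, n1, tau=0))
    print(similarity_join({"q1": q1}, {"n1": n1}, tau=0, alpha=0.7))
```

Enumerating every possible world is exponential in the number of uncertain vertices, which is one reason a practical join would need the kind of structural and probability pruning the abstract refers to. In a fuller model the edge labels would be uncertain as well; they are kept fixed here to keep the sketch short.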

Index Terms

1. Information systems
  1. Information systems applications

Published in
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
May 2015
2110 pages
ISBN:9781450327589
DOI:10.1145/2723372
General Chair: Timos Sellis (RMIT University, Australia)
Program Chairs: Susan B. Davidson (University of Pennsylvania, USA), Zack Ives (University of Pennsylvania, USA)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
graph database
question answering
rdf
Qualifiers
- research-article
Acceptance Rates
SIGMOD '15 Paper Acceptance Rate: 106 of 415 submissions, 26%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%