DOI: 10.1145/2723372.2747648
research-article

How to Build Templates for RDF Question/Answering: An Uncertain Graph Similarity Join Approach

Published: 27 May 2015

ABSTRACT

A challenging task in natural language question answering (Q/A for short) over RDF knowledge graphs is how to bridge the gap between unstructured natural language questions (NLQ) and graph-structured RDF data (G). One of the effective tools is the "template", which is often used in many existing RDF Q/A systems. However, few of them study how to generate templates automatically. To the best of our knowledge, we are the first to propose a join approach for template generation. Given a workload D of SPARQL queries and a set N of natural language questions, the goal is to find pairs ⟨q, n⟩, for q ∈ D and n ∈ N, where SPARQL query q is the best match for natural language question n. These pairs provide promising hints for automatic template generation. Due to the ambiguity of natural language, we model the problem above as an uncertain graph join task. We propose several structural and probability pruning techniques to speed up joining. Extensive experiments over real RDF Q/A benchmark datasets confirm both the effectiveness and efficiency of our approach.
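
The problem statement above can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes each SPARQL query is reduced to a list of predicate labels, each question is an uncertain graph whose edges carry independent candidate labels with probabilities, and similarity is a simple label-multiset distance standing in for a real graph distance. All names (`label_distance`, `match_probability`, `similarity_join`, `tau`, `alpha`) and the sample data are hypothetical.

```python
from collections import Counter
from itertools import product


def label_distance(a, b):
    """Toy distance between two label multisets: max(|a|, |b|) - |a ∩ b|.

    Assumption: a stand-in for a real graph distance, used only for illustration.
    """
    common = sum((Counter(a) & Counter(b)).values())
    return max(len(a), len(b)) - common


def match_probability(query_labels, uncertain_edges, tau):
    """Sum the probabilities of all possible worlds within distance tau.

    Each uncertain edge is a list of (label, probability) candidates;
    edges are assumed independent, so a world's probability is a product.
    """
    total = 0.0
    for world in product(*uncertain_edges):
        prob = 1.0
        for _, p in world:
            prob *= p
        labels = [label for label, _ in world]
        if label_distance(query_labels, labels) <= tau:
            total += prob
    return total


def similarity_join(queries, questions, tau, alpha):
    """For each question, keep the query with the highest match probability >= alpha."""
    pairs = {}
    for n_id, edges in questions.items():
        best = None
        for q_id, labels in queries.items():
            # Structural pruning: the size difference is a lower bound
            # on the toy distance, so such pairs can never qualify.
            if abs(len(labels) - len(edges)) > tau:
                continue
            p = match_probability(labels, edges, tau)
            if p >= alpha and (best is None or p > best[1]):
                best = (q_id, p)
        if best is not None:
            pairs[n_id] = best
    return pairs


# Hypothetical workload D (queries) and question set N (questions).
queries = {"q1": ["spouse"], "q2": ["director", "starring"]}
questions = {
    "n1": [[("spouse", 0.8), ("partner", 0.2)]],
    "n2": [[("director", 0.6), ("producer", 0.4)],
           [("starring", 0.9), ("author", 0.1)]],
}
result = similarity_join(queries, questions, tau=0, alpha=0.5)
print(result)
```

Enumerating possible worlds is exponential in the number of uncertain edges, which is presumably why pruning matters in practice. The sketch prunes only on size before computing any probability.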

Published in

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
May 2015
2110 pages
ISBN: 9781450327589
DOI: 10.1145/2723372

Copyright © 2015 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGMOD '15 paper acceptance rate: 106 of 415 submissions (26%). Overall acceptance rate: 785 of 4,003 submissions (20%).
