research-article
DOI: 10.1145/2661829.2661968

Mining Semi-Structured Online Knowledge Bases to Answer Natural Language Questions on Community QA Websites

Published: 03 November 2014 Publication History

Abstract

Over the past few years, community QA websites (e.g., Yahoo! Answers) have become a useful platform for users to post questions and obtain answers. However, not all questions posted there receive informative answers or are answered in a timely manner. In this paper, we show that the answers to some of these questions are available in online domain-specific knowledge bases, and we propose an approach to discover those answers automatically. The approach first mines appropriate SQL query patterns from an existing collection of QA pairs, and then uses the learned query patterns to answer previously unseen questions by returning relevant entities from the knowledge base. Evaluation on a collection of health-domain questions from Yahoo! Answers shows that the proposed method is effective in discovering potential answers to user questions from an online medical knowledge base.
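The two-stage idea in the abstract (mine query patterns from QA pairs, then reuse them on new questions) can be illustrated with a small sketch. This is not the paper's method: the table `kb`, its fields, the toy rows, the QA pairs, and the word-overlap scoring are all assumptions made for illustration. A "pattern" here is simply a pair (field to match the question against, field to return).

```python
import sqlite3
from collections import Counter

# Hypothetical toy knowledge base: one row per condition, free-text fields.
FIELDS = ("name", "symptoms", "treatment")
ROWS = [
    ("migraine", "throbbing headache nausea light sensitivity", "ibuprofen"),
    ("influenza", "fever cough body aches fatigue", "oseltamivir"),
    ("hay fever", "sneezing itchy eyes runny nose", "cetirizine"),
]
# Hypothetical training QA pairs (question, community answer).
QA_PAIRS = [
    ("What can I take for a throbbing headache and nausea?",
     "Try ibuprofen and rest in a dark room."),
    ("I have fever and body aches, what helps?",
     "My doctor gave me oseltamivir."),
]

def build_kb():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kb (name TEXT, symptoms TEXT, treatment TEXT)")
    conn.executemany("INSERT INTO kb VALUES (?, ?, ?)", ROWS)
    return conn

def tokens(text):
    for ch in "?,.":
        text = text.replace(ch, " ")
    return set(text.lower().split())

def overlap(question, field_text):
    # Number of distinct words shared by the question and a field value.
    return len(tokens(question) & tokens(field_text))

def mine_patterns(conn, qa_pairs):
    """Count (constraint_field, target_field) pairs supported by the QA pairs."""
    counts = Counter()
    rows = conn.execute("SELECT name, symptoms, treatment FROM kb").fetchall()
    for question, answer_text in qa_pairs:
        for row in rows:
            record = dict(zip(FIELDS, row))
            for target in FIELDS:
                # The target field's value must appear in the known answer.
                if record[target].lower() not in answer_text.lower():
                    continue
                for constraint in FIELDS:
                    if constraint != target and overlap(question, record[constraint]) > 0:
                        counts[(constraint, target)] += 1
    return counts

def answer(conn, pattern, question, k=1):
    """Apply a mined pattern to a new question and return the top-k target values."""
    constraint, target = pattern
    # Field names come from the fixed FIELDS tuple, so formatting them into SQL is safe here.
    sql = f"SELECT {constraint}, {target} FROM kb"
    scored = [(overlap(question, c), t) for c, t in conn.execute(sql)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for score, t in scored[:k] if score > 0]

conn = build_kb()
best_pattern = mine_patterns(conn, QA_PAIRS).most_common(1)[0][0]
print(best_pattern)
print(answer(conn, best_pattern, "My eyes are itchy and I keep sneezing, what should I use?"))
```

In this toy setting the only supported pattern matches questions against the `symptoms` field and returns `treatment`. A realistic system would need a proper retrieval model in place of raw word overlap.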



    Published In

    CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
    November 2014
    2152 pages
    ISBN: 9781450325981
    DOI: 10.1145/2661829
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. knowledge base
    2. question answering
    3. retrieval models
    4. text mining

    Qualifiers

    • Research-article

    Acceptance Rates

    CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2024) Boolean interpretation, matching, and ranking of natural language queries in product selection systems. Discover Computing, 27:1. DOI: 10.1007/s10791-024-09432-x. Online publication date: 3-Apr-2024
    • (2022) Retrieving and Ranking Relevant Products from Boolean Natural Language Queries. 2022 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), pages 303-308. DOI: 10.1109/WI-IAT55865.2022.00051. Online publication date: Nov-2022
    • (2020) QWiki. Proceedings of the 16th International Symposium on Open Collaboration, pages 1-12. DOI: 10.1145/3412569.3412576. Online publication date: 25-Aug-2020
    • (2019) Integrating Multi-level Tag Recommendation with External Knowledge Bases for Automatic Question Answering. ACM Transactions on Internet Technology, 19:3, pages 1-22. DOI: 10.1145/3319528. Online publication date: 7-May-2019
    • (2018) Web Forum Retrieval and Text Analytics. Foundations and Trends in Information Retrieval, 12:1, pages 1-163. DOI: 10.1561/1500000062. Online publication date: 3-Jan-2018
    • (2018) Clustering Analysis-Based Approach to Detecting Entity Mixture in Knowledge Bases. Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries, pages 395-396. DOI: 10.1145/3197026.3203896. Online publication date: 23-May-2018
    • (2017) Detect Incorrect Triples in Knowledge Base Based on Triple Confidence Evaluation. Proceedings of the 3rd International Conference on Industrial and Business Engineering, pages 93-101. DOI: 10.1145/3133811.3133829. Online publication date: 17-Aug-2017
    • (2016) Detection of Entity Mixture in Knowledge Bases Using Hierarchical Clustering. Natural Language Understanding and Intelligent Applications, pages 288-299. DOI: 10.1007/978-3-319-50496-4_24. Online publication date: 2-Dec-2016
    • (2015) CQADupStack. Proceedings of the 20th Australasian Document Computing Symposium, pages 1-8. DOI: 10.1145/2838931.2838934. Online publication date: 8-Dec-2015
