
Generic soft pattern models for definitional question answering

Published: 15 August 2005

ABSTRACT

This paper explores probabilistic lexico-syntactic pattern matching, also known as soft pattern matching. While previous methods in soft pattern matching are ad hoc in computing the degree of match, we propose two formal matching models: one based on bigrams and the other on the Profile Hidden Markov Model (PHMM). Both models provide a theoretically sound method to model pattern matching as a probabilistic process that generates token sequences. We demonstrate the effectiveness of these models on definition sentence retrieval for definitional question answering. We show that both models significantly outperform state-of-the-art manually constructed patterns. A critical difference between the two models is that the PHMM technique handles language variations more effectively but requires more training data to converge. We believe that both models can be extended to other areas where lexico-syntactic pattern matching can be applied.
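To make the bigram idea concrete, the following is a minimal sketch of scoring a token sequence under a bigram model learned from pattern instances. It is not the authors' implementation: the class name `BigramSoftPattern`, the interpolation weight `lam`, the add-one unigram smoothing, the `<s>` start marker, and the toy training sequences are all assumptions made for illustration. The abstract states only that matching is modeled as a probabilistic process that generates token sequences.

```python
import math
from collections import Counter


class BigramSoftPattern:
    """Toy bigram model over token sequences (illustrative assumption,
    not the model defined in the paper)."""

    def __init__(self, lam=0.7):
        self.lam = lam              # hypothetical interpolation weight
        self.unigrams = Counter()   # count of each token
        self.bigrams = Counter()    # count of each (prev, cur) pair
        self.contexts = Counter()   # count of each token used as a context
        self.total = 0              # total number of tokens seen

    def fit(self, sequences):
        for seq in sequences:
            tokens = ["<s>"] + list(seq)  # "<s>" marks the sequence start
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1
                self.unigrams[cur] += 1
                self.total += 1
        return self

    def _unigram(self, tok):
        # Add-one smoothing with one extra slot reserved for unseen tokens,
        # so the probability is never zero.
        vocab_size = len(self.unigrams) + 1
        return (self.unigrams[tok] + 1) / (self.total + vocab_size)

    def _bigram(self, prev, cur):
        # Linear interpolation of the maximum-likelihood bigram estimate
        # with the smoothed unigram estimate.
        ctx = self.contexts[prev]
        ml = self.bigrams[(prev, cur)] / ctx if ctx else 0.0
        return self.lam * ml + (1 - self.lam) * self._unigram(cur)

    def score(self, seq):
        # Length-normalized log-probability; higher means a closer match.
        tokens = ["<s>"] + list(seq)
        logp = sum(math.log(self._bigram(p, c))
                   for p, c in zip(tokens, tokens[1:]))
        return logp / max(len(seq), 1)


# Toy pattern instances; "<TERM>" stands in for the definition target.
training = [
    ["<TERM>", "is", "a", "NN"],
    ["<TERM>", ",", "a", "NN"],
    ["<TERM>", "is", "the", "NN"],
]
model = BigramSoftPattern().fit(training)
definition_like = model.score(["<TERM>", "is", "a", "NN"])
unrelated = model.score(["NN", "the", "<TERM>", "bought"])
print(definition_like > unrelated)
```

Because unseen bigrams fall back to the smoothed unigram estimate, a sequence that departs from the training instances receives a lower score rather than a hard mismatch, which is the sense in which the match is "soft".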


Published in

SIGIR '05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034

Copyright © 2005 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


        Qualifiers

        • Article

        Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions, 20%
