Generic soft pattern models for definitional question answering

Authors:
Hang Cui, National University of Singapore
Min-Yen Kan, National University of Singapore
Tat-Seng Chua, National University of Singapore

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 2005, Pages 384–391. https://doi.org/10.1145/1076034.1076101

Published: 15 August 2005

ABSTRACT

This paper explores probabilistic lexico-syntactic pattern matching, also known as soft pattern matching. While previous methods in soft pattern matching are ad hoc in computing the degree of match, we propose two formal matching models: one based on bigrams and the other on the Profile Hidden Markov Model (PHMM). Both models provide a theoretically sound method to model pattern matching as a probabilistic process that generates token sequences. We demonstrate the effectiveness of these models on definition sentence retrieval for definitional question answering. We show that both models significantly outperform state-of-the-art manually constructed patterns. A critical difference between the two models is that the PHMM technique handles language variations more effectively but requires more training data to converge. We believe that both models can be extended to other areas where lexico-syntactic pattern matching can be applied.
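
To make the idea of "pattern matching as a probabilistic process that generates token sequences" concrete, here is a minimal Python sketch of a bigram-style soft pattern scorer. It is not the authors' implementation: the class name `BigramSoftPattern`, the interpolation weight `lam`, the add-delta smoothing constant `delta`, and the length normalisation in `score` are all assumptions chosen for illustration. The sketch only shows how a candidate token sequence can receive a graded score rather than a hard match or no match.

```python
import math
from collections import Counter


class BigramSoftPattern:
    """Toy bigram soft pattern scorer (illustrative, not the paper's model)."""

    def __init__(self, lam=0.7, delta=1.0):
        self.lam = lam      # weight of the bigram term (assumed value)
        self.delta = delta  # add-delta smoothing constant (assumed value)
        self.unigram = Counter()     # (slot, token) -> count
        self.bigram = Counter()      # (previous token, token) -> count
        self.prev_total = Counter()  # previous token -> count as a history
        self.slot_total = Counter()  # slot -> number of tokens observed
        self.vocab = set()

    def fit(self, sequences):
        """Count slot-aware unigrams and token bigrams in training sequences."""
        for seq in sequences:
            for i, tok in enumerate(seq):
                self.vocab.add(tok)
                self.unigram[(i, tok)] += 1
                self.slot_total[i] += 1
                if i > 0:
                    self.bigram[(seq[i - 1], tok)] += 1
                    self.prev_total[seq[i - 1]] += 1
        return self

    def _p_unigram(self, slot, tok):
        v = len(self.vocab) + 1  # one extra type reserves mass for unseen tokens
        return (self.unigram[(slot, tok)] + self.delta) / (
            self.slot_total[slot] + self.delta * v
        )

    def _p_bigram(self, prev, tok):
        v = len(self.vocab) + 1
        return (self.bigram[(prev, tok)] + self.delta) / (
            self.prev_total[prev] + self.delta * v
        )

    def log_prob(self, seq):
        """Log-probability of generating seq token by token."""
        total = 0.0
        for i, tok in enumerate(seq):
            p = self._p_unigram(i, tok)
            if i > 0:
                # Linear interpolation of the bigram and slot-aware unigram terms.
                p = self.lam * self._p_bigram(seq[i - 1], tok) + (1 - self.lam) * p
            total += math.log(p)
        return total

    def score(self, seq):
        """Length-normalised log-probability, comparable across lengths."""
        return self.log_prob(seq) / max(len(seq), 1)


# Hypothetical training instances: token windows following a search target <T>.
model = BigramSoftPattern().fit([
    ["<T>", "is", "a", "NN"],
    ["<T>", ",", "a", "NN"],
    ["<T>", "is", "the", "NN"],
])
seen_like = model.score(["<T>", "is", "a", "NN"])
variant = model.score(["<T>", "was", "a", "NN"])   # unseen token, still scored
scrambled = model.score(["NN", "a", "is", "<T>"])
```

Because every probability is smoothed, a sequence containing an unseen token such as "was" still receives a finite score, which is the property that separates soft matching from regular-expression-style hard matching.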

Index Terms

Generic soft pattern models for definitional question answering
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published in
SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034
General Chairs: Ricardo Baeza-Yates (University of Chile, Chile), Nivio Ziviani (Federal University of Minas Gerais, Brazil)
Program Chairs: Gary Marchionini (University of North Carolina, USA), Alistair Moffat (University of Melbourne, Australia), John Tait (University of Sunderland, UK)
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
definitional question answering
probabilistic models
soft pattern
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%