DOI: 10.1145/2505515.2505688

Building structures from classifiers for passage reranking

Published: 27 October 2013

Abstract

This paper shows that learning-to-rank models can be applied to automatically learn complex patterns, such as relational semantic structures occurring in questions and their answer passages. This is achieved by providing the learning algorithm with a tree representation derived from the syntactic trees of questions and passages, connected by relational tags; the tags are in turn produced by automatic classifiers, i.e., question and focus classifiers and Named Entity Recognizers. In this way, effective structural relational patterns are implicitly encoded in the representation and can be automatically exploited by powerful machine learning models such as kernel methods.
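As a rough illustration of the relational tagging idea, the sketch below marks nodes on both sides of a question/passage pair. It is a simplified stand-in, not the authors' implementation: trees are flattened to lists of (label, word) pairs, and the names `mark_shared`, `mark_typed`, the `REL-` and `REL-FOCUS-` prefixes, the stopword list, and the `compatible` mapping from question class to entity types are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]  # (label, word): a flat stand-in for a syntactic tree node

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = frozenset({"the", "a", "an", "of", "in", "is", "was", "who", "what"})


def mark_shared(question: List[Node], passage: List[Node]) -> Tuple[List[Node], List[Node]]:
    """Prefix with REL- the label of every non-stopword occurring on both sides."""
    q_words = {w.lower() for _, w in question} - STOPWORDS
    p_words = {w.lower() for _, w in passage} - STOPWORDS
    shared = q_words & p_words

    def tag(nodes: List[Node]) -> List[Node]:
        return [("REL-" + lab if w.lower() in shared else lab, w) for lab, w in nodes]

    return tag(question), tag(passage)


def mark_typed(
    question: List[Node],
    passage: List[Node],
    focus_word: str,
    question_class: str,
    entities: Dict[str, str],
    compatible: Dict[str, Set[str]],
) -> Tuple[List[Node], List[Node]]:
    """Link the question focus to passage entities whose type fits the question class.

    focus_word and question_class stand for the outputs of focus and question
    classifiers; entities stands for the output of a named entity recognizer.
    """
    wanted = compatible.get(question_class, set())
    prefix = "REL-FOCUS-" + question_class + "-"
    q = [(prefix + lab if w == focus_word else lab, w) for lab, w in question]
    p = [(prefix + lab if entities.get(w) in wanted else lab, w) for lab, w in passage]
    return q, p


# Toy pair; the tags and classifier outputs are invented for illustration.
question = [("WP", "Who"), ("VBD", "wrote"), ("NNP", "Hamlet")]
passage = [("NNP", "Shakespeare"), ("VBD", "wrote"), ("NNP", "Hamlet")]

q1, p1 = mark_shared(question, passage)
q2, p2 = mark_typed(q1, p1, "Who", "HUM", {"Shakespeare": "PERSON"}, {"HUM": {"PERSON"}})
print(q2)
print(p2)
```

In this toy setup the shared words receive a `REL-` label on both sides, and the focus word and the compatible entity receive a typed label, so a structural learner could pick up patterns that span the pair.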
We conduct an extensive experimental evaluation of our models on well-known benchmarks from the question answering (QA) track of the TREC challenges. The comparison with state-of-the-art systems and BM25 shows a relative improvement in MAP of more than 14% and 45%, respectively. A further comparison on the task restricted to answer sentence reranking shows an improvement in MAP of more than 8% over the state of the art.
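For readers unfamiliar with the metric, the following sketch shows one common way to compute mean average precision (MAP) and a relative improvement between two systems. It is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, average precision is normalized by the number of relevant items found in the ranked list (which assumes every relevant item was retrieved), and the scores in the usage lines are invented, not results from the paper.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision for one ranked list of 0/1 relevance judgments."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant rank
    return total / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[int]]) -> float:
    """Mean of the per-question average precision values."""
    if not runs:
        return 0.0
    return sum(average_precision(r) for r in runs) / len(runs)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Two toy questions with invented judgments (1 = relevant passage).
runs = [[1, 0, 1], [0, 1]]
print(round(mean_average_precision(runs), 4))

# Invented MAP scores, only to show the arithmetic of a relative improvement.
print(round(relative_improvement(0.30, 0.20), 1))
```

Note that a relative improvement is measured against the baseline score, so a 45% relative gain over a baseline MAP of 0.20 would mean a MAP of 0.29, not 0.65.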

    Published In

    CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
    October 2013
    2612 pages
    ISBN: 9781450322638
    DOI: 10.1145/2505515
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. kernel methods
    2. learning to rank
    3. question answering
    4. structural kernels

    Qualifiers

    • Research-article

    Conference

    CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
    October 27 - November 1, 2013
    San Francisco, California, USA

    Acceptance Rates

    CIKM '13 Paper Acceptance Rate 143 of 848 submissions, 17%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2021) GCNs-Based Context-Aware Short Text Similarity Model. 2020 25th International Conference on Pattern Recognition (ICPR), pp. 1329-1335. DOI: 10.1109/ICPR48806.2021.9412460. Online publication date: 10-Jan-2021
    • (2019) A semantic textual similarity measurement model based on the syntactic-semantic representation. Intelligent Data Analysis 23:4, pp. 933-950. DOI: 10.3233/IDA-183947. Online publication date: 26-Sep-2019
    • (2019) An Efficient Framework for Sentence Similarity Modeling. IEEE/ACM Transactions on Audio, Speech and Language Processing 27:4, pp. 853-865. DOI: 10.1109/TASLP.2019.2899494. Online publication date: 1-Apr-2019
    • (2019) Ranking Like Human: Global-View Matching via Reinforcement Learning for Answer Selection. 2019 International Conference on Asian Language Processing (IALP), pp. 456-461. DOI: 10.1109/IALP48816.2019.9037725. Online publication date: Nov-2019
    • (2019) Fusing Syntax and Word Embedding Knowledge for Measuring Semantic Similarity. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), pp. 359-366. DOI: 10.1109/HPCC/SmartCity/DSS.2019.00062. Online publication date: Aug-2019
    • (2019) Making sense of kernel spaces in neural learning. Computer Speech and Language 58:C, pp. 51-75. DOI: 10.1016/j.csl.2019.03.006. Online publication date: 1-Nov-2019
    • (2019) A framework for intelligent question answering system using semantic context-specific document clustering and Wordnet. Sādhanā 44:3. DOI: 10.1007/s12046-018-1022-8. Online publication date: 18-Feb-2019
    • (2019) A Semantic Expansion-Based Joint Model for Answer Ranking in Chinese Question Answering Systems. Information Retrieval Technology, pp. 22-33. DOI: 10.1007/978-3-030-42835-8_3. Online publication date: 7-Nov-2019
    • (2018) ACV-tree. Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp. 4137-4143. DOI: 10.5555/3304222.3304345. Online publication date: 13-Jul-2018
    • (2018) Shallow and Deep Syntactic/Semantic Structures for Passage Reranking in Question-Answering Systems. ACM Transactions on Information Systems 37:1, pp. 1-38. DOI: 10.1145/3233772. Online publication date: 19-Nov-2018
