ABSTRACT
Transformer-based models have dramatically improved performance on a variety of natural language processing tasks, such as question answering, fact verification, topic-driven summarization, and natural language inference. However, these models cannot process input longer than their token-length limit (TLL) at a time. In a long document, the required context may be spread over a wide area and need not be restricted to contiguous sentences, and existing methods fail to handle such situations correctly. In this paper, we propose a method that addresses this issue by detecting the right context within a long document before performing the actual query-context text-pair task. The proposed method fragments a long document into sub-texts and then employs a cross-encoder model to generate a query-focused relevance score for each sub-text. The downstream task is then performed with the most relevant sub-text as the context, rather than with an arbitrarily selected set of top sentences. This frees the model from the traditional approach of iterating over TLL-sized text windows and reduces computational cost. The efficacy of the approach is established on multiple tasks, on which the proposed model outperforms several state-of-the-art models by a significant margin.