Abstract
Long-form text matching plays a significant role in many real-world Natural Language Processing (NLP) and Information Retrieval (IR) applications. Recently, Transformer-based models such as BERT have been widely applied to this problem and have achieved promising results. However, they all adopt the Siamese network architecture and therefore require extra techniques to capture matching signals and remedy the problem of late interaction. In this paper, we investigate the sequence pair classification architecture as a solution to long-form text matching: we concatenate a pair of long-form texts into one sequence and feed it into a pre-trained language model for fine-tuning. Initial experimental results show that this simple baseline outperforms state-of-the-art approaches in this field without further optimization. These findings illustrate that sequence pair classification is a promising choice for this problem that previous studies have not yet explored. We also conduct an in-depth empirical analysis to present more comprehensive results supporting our claim and to provide further insights for researchers in this direction.
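The sketch below illustrates the sequence pair classification setup described in the abstract, using the Hugging Face Transformers library. It is a minimal illustration, not the authors' exact configuration: the model name, maximum length, and label convention are assumptions made for the example.

```python
# Minimal sketch of sequence pair classification for long-form text matching.
# Assumptions (not from the paper): encoder checkpoint, max_length, label convention.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # any pre-trained encoder could be substituted
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

doc_a = "First long-form document ..."
doc_b = "Second long-form document ..."

# The two documents are concatenated into one input sequence
# ([CLS] doc_a [SEP] doc_b [SEP]); tokens beyond the encoder's
# maximum input length are truncated.
inputs = tokenizer(doc_a, doc_b, truncation=True, max_length=512, return_tensors="pt")

# Standard fine-tuning step: cross-entropy loss over match / non-match labels.
labels = torch.tensor([1])  # 1 = matching pair (assumed label convention)
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
```

Because both texts appear in the same input, every self-attention layer attends across the pair, which is exactly the early interaction that Siamese encoders lack.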
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shen, C., Wang, J. (2024). Investigation of Simple-but-Effective Architecture for Long-form Text Matching with Transformers. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14854. Springer, Singapore. https://doi.org/10.1007/978-981-97-5569-1_3
DOI: https://doi.org/10.1007/978-981-97-5569-1_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5568-4
Online ISBN: 978-981-97-5569-1