
Investigation of Simple-but-Effective Architecture for Long-form Text Matching with Transformers

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14854)


Abstract

Long-form text matching plays a significant role in many real-world Natural Language Processing (NLP) and Information Retrieval (IR) applications. Recently, Transformer-based models such as BERT have been widely applied to this problem and have achieved promising results. However, these models are all built on the Siamese network architecture and therefore need extra techniques to capture matching signals and remedy the problem of late interaction. In this paper, we investigate the sequence pair classification architecture as a solution to long-form text matching: we concatenate the pair of long-form texts into one sequence, which serves as the input to a pre-trained language model for fine-tuning. Initial experimental results show that such a simple baseline can outperform state-of-the-art approaches in this field without further optimization. These findings illustrate that sequence pair classification, which has not been explored by previous studies, is a promising choice for this problem. We also conduct an in-depth empirical analysis to present more comprehensive results supporting our claim and to provide further insights for researchers in this direction.
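The baseline described in the abstract can be sketched in a few lines with the Hugging Face Transformers library. The following is a minimal, illustrative sketch rather than the authors' released code; the checkpoint name (bert-base-uncased), the 512-token input limit, and the binary match/non-match label set are assumptions made here for concreteness.

```python
# Sequence pair classification for long-form text matching (illustrative sketch).
# The two documents are concatenated into ONE input sequence
# ("[CLS] doc_a [SEP] doc_b [SEP]"), so self-attention operates across both
# texts from the first layer on, unlike Siamese encoders whose representations
# only interact after each document has been encoded separately.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed checkpoint; any BERT-style encoder works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2  # assumed binary label: match vs. non-match
)

doc_a = "First long-form document ..."
doc_b = "Second long-form document ..."

# Passing both texts to the tokenizer encodes them as a single sequence pair;
# inputs longer than the model limit are truncated to 512 tokens.
inputs = tokenizer(
    doc_a,
    doc_b,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)

# At inference time, the classification head scores the pair directly.
with torch.no_grad():
    logits = model(**inputs).logits
match_probability = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"P(match) = {match_probability:.3f}")
```

Fine-tuning then reduces to standard supervised training of this classifier on labeled document pairs, typically with a cross-entropy loss over the match/non-match labels.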


Notes

  1. https://xuhuizhou.github.io/Multilevel-Text-Alignment/


Author information


Corresponding authors

Correspondence to Chen Shen or Jin Wang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Shen, C., Wang, J. (2024). Investigation of Simple-but-Effective Architecture for Long-form Text Matching with Transformers. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14854. Springer, Singapore. https://doi.org/10.1007/978-981-97-5569-1_3

  • DOI: https://doi.org/10.1007/978-981-97-5569-1_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5568-4

  • Online ISBN: 978-981-97-5569-1

  • eBook Packages: Computer Science, Computer Science (R0)
