
Investigation of Simple-but-Effective Architecture for Long-form Text Matching with Transformers

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14854)


Abstract

Long-form text matching plays a significant role in many real-world Natural Language Processing (NLP) and Information Retrieval (IR) applications. Recently, Transformer-based models such as BERT have been widely applied to this problem and have achieved promising results. However, these models are all built on the Siamese network architecture and therefore need extra techniques to capture matching signals and remedy the problem of late interaction. In this paper, we investigate the sequence pair classification architecture as a solution to long-form text matching: we concatenate the pair of long-form texts into one sequence, which serves as the input to a pre-trained language model for fine-tuning. Initial experimental results show that such a simple baseline can outperform state-of-the-art approaches in this field without further optimization. These findings illustrate that sequence pair classification, which has not been explored by previous studies, is a promising choice for this problem. We also conduct an in-depth empirical analysis to present more comprehensive results supporting our claim and to provide further insights for researchers in this direction.
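The baseline described in the abstract can be sketched in a few lines with the Hugging Face Transformers library. The following is a minimal, illustrative sketch rather than the authors' released code; the checkpoint name (bert-base-uncased), the 512-token input limit, and the binary match/non-match label set are assumptions made here for concreteness.

```python
# Sequence pair classification for long-form text matching (illustrative sketch).
# The two documents are concatenated into ONE input sequence
# ("[CLS] doc_a [SEP] doc_b [SEP]"), so self-attention operates across both
# texts from the first layer on, unlike Siamese encoders whose representations
# only interact after each document has been encoded separately.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed checkpoint; any BERT-style encoder works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2  # assumed binary label: match vs. non-match
)

doc_a = "First long-form document ..."
doc_b = "Second long-form document ..."

# Passing both texts to the tokenizer encodes them as a single sequence pair;
# inputs longer than the model limit are truncated to 512 tokens.
inputs = tokenizer(
    doc_a,
    doc_b,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)

# At inference time, the classification head scores the pair directly.
with torch.no_grad():
    logits = model(**inputs).logits
match_probability = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"P(match) = {match_probability:.3f}")
```

Fine-tuning then reduces to standard supervised training of this classifier on labeled document pairs, typically with a cross-entropy loss over the match/non-match labels.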


Notes

  1. https://xuhuizhou.github.io/Multilevel-Text-Alignment/


Author information


Corresponding authors

Correspondence to Chen Shen or Jin Wang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Shen, C., Wang, J. (2024). Investigation of Simple-but-Effective Architecture for Long-form Text Matching with Transformers. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14854. Springer, Singapore. https://doi.org/10.1007/978-981-97-5569-1_3

  • DOI: https://doi.org/10.1007/978-981-97-5569-1_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5568-4

  • Online ISBN: 978-981-97-5569-1

  • eBook Packages: Computer Science, Computer Science (R0)
