Abstract
Text classification is a crucial task in Natural Language Processing (NLP) that aims to predict the category to which a text belongs. Recently, prompt-based learning has emerged as a powerful approach to handling a wide variety of NLP tasks, effectively bridging the gap between pre-trained language models (PLMs) and downstream tasks. Verbalizers are key components in prompt-based tuning: manually designed verbalizers rely heavily on domain knowledge, while automatically generated ones, whether in discrete or continuous space, have proven suboptimal. In this work, we propose a two-stage training strategy for few-shot text classification that combines prompt-based learning with contrastive learning to learn appropriate verbalizers. In the first stage, we construct positive and negative samples for each input text and obtain soft verbalizers by integrating prompt-based learning and contrastive learning. In the second stage, we use the verbalizers learned in the first stage, together with prompt tuning, to train the entire model. In experiments on several text classification datasets, our method outperforms existing mainstream methods, demonstrating its effectiveness.
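To make the two-stage recipe concrete, here is a minimal PyTorch sketch under stated assumptions: a RoBERTa backbone, a fixed cloze template ("It was [MASK]."), soft verbalizers represented as learnable class vectors in the PLM's hidden space, and an InfoNCE-style contrastive objective. The template, the temperature `TAU`, the exact loss terms, and names such as `mask_embedding` are all illustrative assumptions, not the authors' implementation.

```python
# Sketch of the two-stage idea described in the abstract.
# All design choices below are assumptions for illustration only.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "roberta-base"       # assumed PLM backbone
NUM_CLASSES, TAU = 4, 0.07   # assumed class count and InfoNCE temperature

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)

# Soft verbalizer: one learnable vector per class in the PLM's hidden space.
hidden = model.config.hidden_size
verbalizer = torch.nn.Parameter(torch.randn(NUM_CLASSES, hidden) * 0.02)

def mask_embedding(texts):
    """Encode '<text> It was <mask>.' and return the [MASK] hidden states."""
    prompts = [f"{t} It was {tokenizer.mask_token}." for t in texts]
    batch = tokenizer(prompts, return_tensors="pt",
                      padding=True, truncation=True)
    out = model(**batch, output_hidden_states=True)
    h = out.hidden_states[-1]                          # (B, L, H)
    mask_pos = batch["input_ids"] == tokenizer.mask_token_id
    return h[mask_pos]                                 # (B, H)

def stage1_contrastive_loss(anchor_texts, positive_texts, labels):
    """Stage 1: an anchor's [MASK] embedding is pulled toward its positive
    sample and toward its class's verbalizer vector; in-batch examples
    serve as negatives (InfoNCE-style)."""
    a = F.normalize(mask_embedding(anchor_texts), dim=-1)
    p = F.normalize(mask_embedding(positive_texts), dim=-1)
    v = F.normalize(verbalizer, dim=-1)
    # Instance-level term: each anchor matches its own positive.
    inst = F.cross_entropy(a @ p.T / TAU, torch.arange(len(a)))
    # Class-level term: each anchor matches its class's verbalizer vector.
    cls = F.cross_entropy(a @ v.T / TAU, labels)
    return inst + cls

def stage2_logits(texts):
    """Stage 2: classify by similarity between the [MASK] embedding and
    the verbalizer vectors learned in stage 1."""
    h = F.normalize(mask_embedding(texts), dim=-1)
    return h @ F.normalize(verbalizer, dim=-1).T / TAU
```

Under these assumptions, stage 1 would update `verbalizer` (and optionally the PLM) with `stage1_contrastive_loss`, while stage 2 would train the entire model with a standard cross-entropy over `stage2_logits`.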
Cite this paper
Yan, Z., Tang, Y., Liu, X. (2024). Prompt-Based and Two-Stage Training for Few-Shot Text Classification. In: Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., Du, B. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2023. Communications in Computer and Information Science, vol 2012. Springer, Singapore. https://doi.org/10.1007/978-981-99-9637-7_2