
Prompt-Based and Two-Stage Training for Few-Shot Text Classification

  • Conference paper
  • Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2012)

Abstract

Text classification is a crucial task in Natural Language Processing (NLP) that aims to predict the category to which a text belongs. Recently, prompt-based learning has emerged as a powerful approach to handling a wide variety of NLP tasks, effectively bridging the gap between pre-trained language models (PLMs) and downstream tasks. Verbalizers are key components of prompt-based tuning. Existing manual prompts rely heavily on domain knowledge, while automatically generated verbalizers, whether in discrete or continuous space, remain suboptimal. In this work, we propose a two-stage training strategy for few-shot text classification that combines prompt-based learning and contrastive learning to learn appropriate verbalizers. In the first stage, we construct positive and negative samples for each input text and obtain soft verbalizers by integrating prompt-based learning and contrastive learning. In the second stage, we leverage the verbalizer learned in the first stage, along with prompt tuning, to train the entire model. Experiments on several text classification datasets show that our method outperforms existing mainstream methods, demonstrating its effectiveness.
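
The full chapter is paywalled on this page, so only the abstract is available. Purely as illustration, here is a minimal PyTorch sketch of the two-stage idea described above. Everything in it is an assumption rather than the authors' implementation: the roberta-base backbone, the "It was <mask>." template, the use of class-level soft-verbalizer embeddings as the positives and negatives of an InfoNCE-style contrastive loss in stage one, and plain cross-entropy prompt tuning in stage two.

```python
# Minimal sketch of the two-stage idea from the abstract (illustrative only).
# Assumptions, not the paper's method: roberta-base backbone, a fixed
# "It was <mask>." template, class-level soft-verbalizer embeddings used
# as positives/negatives in stage one, cross-entropy prompt tuning in stage two.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "roberta-base"                 # assumed backbone
NUM_CLASSES = 4                             # e.g. a 4-class topic dataset

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
plm = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Soft verbalizer: one learnable vector per class, compared against the
# PLM's hidden state at the mask position of the prompted input.
verbalizer = torch.nn.Parameter(
    0.02 * torch.randn(NUM_CLASSES, plm.config.hidden_size)
)

def mask_hidden(texts):
    """Wrap each text in a simple template and return the last-layer
    hidden state at the mask position (assumes texts are short enough
    that the mask token survives truncation)."""
    prompts = [f"{t} It was {tokenizer.mask_token}." for t in texts]
    batch = tokenizer(prompts, padding=True, truncation=True,
                      return_tensors="pt")
    out = plm(**batch, output_hidden_states=True)
    h = out.hidden_states[-1]                            # (B, L, H)
    mask_pos = batch["input_ids"] == tokenizer.mask_token_id
    return h[mask_pos]                                   # (B, H)

def stage_one_loss(texts, labels, temperature=0.1):
    """Stage 1 (sketch): learn the soft verbalizer with an InfoNCE-style
    loss that pulls each mask representation toward its own class vector
    (positive) and away from the other class vectors (negatives)."""
    z = F.normalize(mask_hidden(texts), dim=-1)          # (B, H)
    v = F.normalize(verbalizer, dim=-1)                  # (C, H)
    logits = z @ v.t() / temperature                     # (B, C)
    return F.cross_entropy(logits, labels)

def stage_two_loss(texts, labels):
    """Stage 2 (sketch): keep the learned verbalizer and tune the whole
    model with ordinary cross-entropy over verbalizer scores."""
    logits = mask_hidden(texts) @ verbalizer.t()         # (B, C)
    return F.cross_entropy(logits, labels)
```

In practice each stage would sit inside a standard optimizer loop (e.g. AdamW over the verbalizer in stage one, and over both the verbalizer and the PLM parameters in stage two). How the paper actually constructs per-input positive and negative samples, weights the objectives, and phrases its templates is not recoverable from the abstract and is not reproduced here.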

Notes

  1. https://huggingface.co.

  2. https://github.com/thunlp/OpenPrompt.


Author information

Correspondence to Zexin Yan.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yan, Z., Tang, Y., Liu, X. (2024). Prompt-Based and Two-Stage Training for Few-Shot Text Classification. In: Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., Du, B. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2023. Communications in Computer and Information Science, vol 2012. Springer, Singapore. https://doi.org/10.1007/978-981-99-9637-7_2

  • DOI: https://doi.org/10.1007/978-981-99-9637-7_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-9636-0

  • Online ISBN: 978-981-99-9637-7

  • eBook Packages: Computer Science, Computer Science (R0)
