Abstract
Text classification is a crucial task in Natural Language Processing (NLP) that aims to predict the category to which a text belongs. Recently, prompt-based learning has emerged as a powerful approach to handling a wide variety of NLP tasks, effectively bridging the gap between pre-trained language models (PLMs) and downstream tasks. Verbalizers are key components in prompt-based tuning: manually designed verbalizers rely heavily on domain knowledge, while automatically generated ones, whether in discrete or continuous space, have proven suboptimal. In this work, we propose a two-stage training strategy for few-shot text classification that combines prompt-based learning with contrastive learning to learn appropriate verbalizers. In the first stage, we construct positive and negative samples for each input text and obtain soft verbalizers by integrating prompt-based learning and contrastive learning. In the second stage, we use the verbalizers learned in the first stage, together with prompt tuning, to train the entire model. In experiments on several text classification datasets, our method outperforms existing mainstream methods, demonstrating its effectiveness.
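To make the two-stage recipe concrete, here is a minimal PyTorch sketch under stated assumptions: a RoBERTa backbone, a fixed cloze template ("It was [MASK]."), soft verbalizers represented as learnable class vectors in the PLM's hidden space, and an InfoNCE-style contrastive objective. The template, the temperature `TAU`, the exact loss terms, and names such as `mask_embedding` are all illustrative assumptions, not the authors' implementation.

```python
# Sketch of the two-stage idea described in the abstract.
# All design choices below are assumptions for illustration only.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "roberta-base"       # assumed PLM backbone
NUM_CLASSES, TAU = 4, 0.07   # assumed class count and InfoNCE temperature

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)

# Soft verbalizer: one learnable vector per class in the PLM's hidden space.
hidden = model.config.hidden_size
verbalizer = torch.nn.Parameter(torch.randn(NUM_CLASSES, hidden) * 0.02)

def mask_embedding(texts):
    """Encode '<text> It was <mask>.' and return the [MASK] hidden states."""
    prompts = [f"{t} It was {tokenizer.mask_token}." for t in texts]
    batch = tokenizer(prompts, return_tensors="pt",
                      padding=True, truncation=True)
    out = model(**batch, output_hidden_states=True)
    h = out.hidden_states[-1]                          # (B, L, H)
    mask_pos = batch["input_ids"] == tokenizer.mask_token_id
    return h[mask_pos]                                 # (B, H)

def stage1_contrastive_loss(anchor_texts, positive_texts, labels):
    """Stage 1: an anchor's [MASK] embedding is pulled toward its positive
    sample and toward its class's verbalizer vector; in-batch examples
    serve as negatives (InfoNCE-style)."""
    a = F.normalize(mask_embedding(anchor_texts), dim=-1)
    p = F.normalize(mask_embedding(positive_texts), dim=-1)
    v = F.normalize(verbalizer, dim=-1)
    # Instance-level term: each anchor matches its own positive.
    inst = F.cross_entropy(a @ p.T / TAU, torch.arange(len(a)))
    # Class-level term: each anchor matches its class's verbalizer vector.
    cls = F.cross_entropy(a @ v.T / TAU, labels)
    return inst + cls

def stage2_logits(texts):
    """Stage 2: classify by similarity between the [MASK] embedding and
    the verbalizer vectors learned in stage 1."""
    h = F.normalize(mask_embedding(texts), dim=-1)
    return h @ F.normalize(verbalizer, dim=-1).T / TAU
```

Under these assumptions, stage 1 would update `verbalizer` (and optionally the PLM) with `stage1_contrastive_loss`, while stage 2 would train the entire model with a standard cross-entropy over `stage2_logits`.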
Cite this paper
Yan, Z., Tang, Y., Liu, X. (2024). Prompt-Based and Two-Stage Training for Few-Shot Text Classification. In: Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., Du, B. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2023. Communications in Computer and Information Science, vol 2012. Springer, Singapore. https://doi.org/10.1007/978-981-99-9637-7_2