Abstract
In few-shot text classification tasks with strict data privacy requirements or high labeling costs, the performance of pipeline methods, which directly encode text features and apply a linear classifier, is limited by the model's feature extraction ability. A growing number of studies have recognized the value of combining text features with label semantics and have achieved good results. However, these existing methods generalize poorly to classification tasks in which the class names are only weakly correlated with the instance texts. In this work, we address this problem through an effective fusion of text-label similarity and a redesign of the contrastive loss. First, text-text and text-label semantic similarity modules are fused to improve feature extraction. Second, we introduce DLSC, a contrastive loss over inter-class differences and label semantics that encourages instance embeddings to approximate the correct label semantics in vector space. Experimental results show that our approach substantially improves F1 scores on English and Chinese datasets from six classification tasks, even when label names are not strongly correlated with the texts.
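The core idea of pulling instance embeddings toward their correct label embeddings can be sketched as an InfoNCE-style objective over text-label similarities. This is a minimal illustrative sketch, not the paper's actual DLSC formulation: the function name, the temperature value, and the use of plain cosine similarity are all assumptions for illustration.

```python
import numpy as np

def label_semantics_contrastive_loss(text_emb, label_emb, labels, tau=0.1):
    """Hypothetical sketch of a label-semantics contrastive loss.

    Each L2-normalized text embedding is pulled toward the embedding of its
    correct label and pushed away from the other label embeddings via a
    softmax over cosine similarities (InfoNCE-style).
    """
    # L2-normalize so that dot products are cosine similarities.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    l = label_emb / np.linalg.norm(label_emb, axis=1, keepdims=True)
    sim = (t @ l.T) / tau                    # (batch, num_classes) logits
    sim -= sim.max(axis=1, keepdims=True)    # subtract row max for stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the correct label for each instance.
    return float(-log_prob[np.arange(len(labels)), labels].mean())
```

Under this sketch, instances whose embeddings align with their correct label embedding incur a low loss, while misaligned instances are penalized, which is the geometric effect the abstract attributes to DLSC.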
Acknowledgements
This research was supported by Sichuan Science and Technology Program, grant number 2022ZHCG0007.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Xie, X., Chen, R., Peng, T., Cui, Z., Chen, Z. (2023). Leveraging Inter-class Differences and Label Semantics for Few-Shot Text Classification. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_57
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4751-5
Online ISBN: 978-981-99-4752-2