
Leveraging Inter-class Differences and Label Semantics for Few-Shot Text Classification

  • Conference paper
  • In: Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Abstract

In few-shot text classification tasks with strong data-privacy constraints or costly labeling, the performance of pipeline methods, which directly encode text features and apply a linear classifier, is limited by the feature extraction ability of the model. A growing number of studies have recognized the value of combining text features with label semantics and have achieved good results. However, these existing methods do not generalize well to classification tasks in which class names are only weakly correlated with the instance texts. In this work, we address this problem through an effective fusion of text-label similarity and a redesigned contrastive loss. First, text-text and text-label semantic similarity modules are fused to improve feature extraction. We then introduce DLSC, a contrastive loss based on inter-class differences and label semantics that encourages instance embeddings to approach the semantics of their correct labels in vector space. Experimental results show that our approach substantially improves F1 scores on six English and Chinese classification datasets, even on tasks where label names are not strongly correlated with the texts.
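To make the two ingredients in the abstract concrete, the sketch below illustrates, in PyTorch, one plausible reading: a text-label contrastive term that pulls each instance embedding toward the embedding of its label name, plus a supervised text-text contrastive term that exploits inter-class differences. The abstract does not specify the exact DLSC formulation, so the temperature tau, the 0.5 mixing weight, and all function names here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only -- not the paper's exact DLSC loss.
import torch
import torch.nn.functional as F

def text_label_contrastive(text_emb, label_emb, labels, tau=0.1):
    """Pull each instance toward its label-name embedding (InfoNCE over classes).

    text_emb: (B, d) instance embeddings; label_emb: (C, d) label-name
    embeddings; labels: (B,) gold class indices; tau: assumed temperature.
    """
    z = F.normalize(text_emb, dim=-1)
    lab = F.normalize(label_emb, dim=-1)
    logits = z @ lab.t() / tau                    # (B, C) text-label similarities
    return F.cross_entropy(logits, labels)        # true label is the positive

def text_text_supcon(text_emb, labels, tau=0.1):
    """Supervised contrastive term over text-text similarities (SupCon-style)."""
    z = F.normalize(text_emb, dim=-1)
    sim = z @ z.t() / tau                         # (B, B) pairwise similarities
    B = z.size(0)
    eye = torch.eye(B, dtype=torch.bool)
    pos = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~eye  # same-class pairs
    logp = sim.masked_fill(eye, float("-inf")).log_softmax(dim=1)
    per_row = -(logp.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1))
    return per_row[pos.any(1)].mean()             # skip rows with no positive

# Toy usage with random tensors standing in for encoder outputs.
torch.manual_seed(0)
text_emb, label_emb = torch.randn(8, 16), torch.randn(3, 16)
labels = torch.randint(0, 3, (8,))
loss = text_label_contrastive(text_emb, label_emb, labels) \
       + 0.5 * text_text_supcon(text_emb, labels)  # 0.5: assumed mixing weight
print(float(loss))
```

Normalizing embeddings and applying a temperature-scaled softmax follows standard contrastive-learning practice (as in SimCSE and supervised contrastive learning); the authors' DLSC may differ in how the label-semantic and inter-class terms are defined and weighted.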



Acknowledgements

This research was supported by Sichuan Science and Technology Program, grant number 2022ZHCG0007.

Author information

Correspondence to Xinran Xie.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Xie, X., Chen, R., Peng, T., Cui, Z., Chen, Z. (2023). Leveraging Inter-class Differences and Label Semantics for Few-Shot Text Classification. In: Huang, D.-S., Premaratne, P., Jin, B., Qu, B., Jo, K.-H., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_57


  • DOI: https://doi.org/10.1007/978-981-99-4752-2_57

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4751-5

  • Online ISBN: 978-981-99-4752-2

  • eBook Packages: Computer Science, Computer Science (R0)
