Abstract
To address the scarcity of labelled samples and the resulting poor generalization in text classification, this paper studies text classification with few labelled samples and proposes a few-shot short-text classification method (Meta-FCS) that combines text semantic vector representation, meta-learning, fine-tuning and vector similarity measurement. The method transfers features shared across domains and, through fine-tuning, emphasizes features specific to the target domain. To support the downstream text classification task, a deep language representation model is proposed. On this basis, each query sample is compared with the class centroids of the support set, and the similarity determines its category. We evaluate the proposed method on a well-studied sentiment classification dataset, an entity-relationship classification dataset and a news topic dataset. The experimental results show that on all three datasets, the proposed method significantly outperforms existing state-of-the-art approaches. These results suggest that combining deep language representation, an episode training mechanism and similarity measurement is a promising solution for few-shot learning (FSL) in natural language processing (NLP) tasks.
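The centroid comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the text embeddings are already available as plain lists of floats, it assumes cosine similarity as the similarity measure (the abstract does not name one), and the function names `centroid`, `cosine` and `classify_by_centroid` are hypothetical.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; assumed here, the abstract does not name the metric."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_by_centroid(support_vecs, support_labels, query_vecs):
    """Label each query vector with the class of its most similar support centroid."""
    by_class = defaultdict(list)
    for vec, label in zip(support_vecs, support_labels):
        by_class[label].append(vec)
    # One centroid per class: the mean of that class's support embeddings.
    centroids = {label: centroid(vecs) for label, vecs in by_class.items()}
    return [
        max(centroids, key=lambda label: cosine(query, centroids[label]))
        for query in query_vecs
    ]


# Toy 2-way 2-shot episode with made-up 2-dimensional "embeddings".
support = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = ["pos", "pos", "neg", "neg"]
queries = [[0.8, 0.3], [0.0, 0.7]]
predictions = classify_by_centroid(support, labels, queries)
print(predictions)
```

In a real episode the vectors would come from the language representation model rather than being written by hand; only the centroid and similarity step is shown here.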
Notes
The short text studied in this paper refers to texts with a length of no more than 512 words.
Funding
This work was supported in part by the National Natural Science Foundation of China (No. 61802433) and the “Advanced Industrial Internet Security Platform” project of Zhijiang Laboratory (No. 2018FD0ZX01).
Ethics declarations
Conflict of Interests
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
About this article
Cite this article
Liu, W., Pang, J., Li, N. et al. Few-shot short-text classification with language representations and centroid similarity. Appl Intell 53, 8061–8072 (2023). https://doi.org/10.1007/s10489-022-03880-y
DOI: https://doi.org/10.1007/s10489-022-03880-y