Few-shot learning for short text classification

Multimedia Tools and Applications

Abstract

Short text classification is difficult because short texts are limited in length and have freely constructed sentence structures. This paper proposes a short text classification framework based on Siamese CNNs and few-shot learning. The Siamese CNNs learn a discriminative text encoding that helps classifiers distinguish obscure or informal sentences. The different sentence structures and different descriptions of a topic are viewed as ‘prototypes’, which are learned with a few-shot learning strategy to improve the classifier’s generalization. Experimental results show that the proposed framework achieves higher accuracy on Twitter classification tasks and outperforms several popular traditional text classification methods and a few deep network approaches.
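As a rough illustration of the two ideas named in the abstract, the Python sketch below pairs a shared encoder (the Siamese part, trained with a pairwise contrastive loss) with nearest-prototype classification from a handful of labelled examples. It is not the authors' implementation. The trigram-hashing `encode` function is a toy stand-in for the CNN encoder, and `DIM`, the margin value, the class-mean definition of a prototype, and the example sentences are all assumptions made for illustration.

```python
import math
import zlib
from collections import defaultdict

DIM = 64  # embedding size; an arbitrary choice for this sketch


def encode(text):
    """Toy stand-in for the shared (Siamese) encoder: hashed character trigrams.

    The paper uses a CNN here; this encoder only keeps the sketch runnable.
    """
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        # crc32 gives a hash that is stable across runs, unlike built-in hash()
        vec[zlib.crc32(padded[i:i + 3].encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """Pairwise loss often used to train Siamese encoders (assumed here)."""
    d = distance(a, b)
    return d * d if same_class else max(0.0, margin - d) ** 2


def build_prototypes(support):
    """Average the embeddings of the few labelled examples in each class."""
    grouped = defaultdict(list)
    for text, label in support:
        grouped[label].append(encode(text))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(text, prototypes):
    """Assign the label of the nearest prototype."""
    query = encode(text)
    return min(prototypes, key=lambda label: distance(query, prototypes[label]))


# Hypothetical few-shot support set: two examples per class.
support = [
    ("i love this phone", "positive"),
    ("love it so much", "positive"),
    ("i hate this phone", "negative"),
    ("hate it so much", "negative"),
]
prototypes = build_prototypes(support)
prediction = classify("i love this phone", prototypes)
```

In a trained system the encoder's weights would be fitted by minimizing a loss such as `contrastive_loss` over sentence pairs, so that sentences on the same topic land close together before the prototypes are computed.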

Acknowledgements

This work is supported by the Chinese National Natural Science Foundation (NSFC) [grant numbers 61772281, 61602254]; the National Social Science Foundation of China (No. 16ZDA054); Jiangsu Provincial 333 Project (BRA2017396); Six Major Talents Peak Project of Jiangsu Province (XYDXXJS-CXTD-005); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); and Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).

Author information

Corresponding author

Correspondence to Leiming Yan.

About this article

Cite this article

Yan, L., Zheng, Y. & Cao, J. Few-shot learning for short text classification. Multimed Tools Appl 77, 29799–29810 (2018). https://doi.org/10.1007/s11042-018-5772-4

  • DOI: https://doi.org/10.1007/s11042-018-5772-4
