DOI: 10.1145/3573942.3573945

A Simple Semi-Supervised Joint Learning Framework for Few-shot Text Classification

Published: 16 May 2023

Abstract

The lack of labeled data is the bottleneck restricting deep text classification algorithms. Most state-of-the-art deep text classification methods follow a two-step transfer learning paradigm: pre-training a large model on an auxiliary task and then fine-tuning it on labeled data. Their shortcoming is the high cost of training. To reduce training costs and alleviate the need for labeled data, we present a novel, simple Semi-Supervised Joint Learning (SSJL) framework for few-shot text classification that captures rich text semantics from large user-tagged data with noisy labels (referred to as weakly-labeled data) while also learning the correct category distributions from a small amount of labeled data. We refine the contrastive loss function to better exploit inter-class contrastive patterns, making contrastive learning more applicable to the weakly-labeled setting. In addition, an appropriate temperature hyper-parameter can improve model robustness under label noise. Experimental results on four real-world datasets show that our approach outperforms the other baseline methods. Moreover, SSJL significantly boosts deep models' performance with only 0.5% (i.e., 32 samples) of the labeled data, showing its robustness in data-sparsity scenarios.
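The joint objective the abstract describes — cross-entropy on a small labeled set combined with a temperature-scaled contrastive loss over weakly-labeled data — can be sketched roughly as below. This is an illustrative NumPy sketch, not the authors' implementation: the SupCon-style positive-pair formulation, the weighting `lam`, and the default `temperature=0.1` are all assumptions for demonstration.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss (assumed formulation): for each anchor, pull
    same-label samples together and push different-label samples apart.
    `temperature` scales the similarities; the paper reports that its
    choice affects robustness under label noise."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature            # pairwise scaled cosine similarity
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # never contrast a sample with itself
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    labels = np.asarray(labels)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    # mean negative log-probability over each anchor's positive pairs
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return per_anchor.mean()

def joint_loss(logits, y, embeddings, weak_labels, lam=0.5, temperature=0.1):
    """Joint objective (sketch): cross-entropy on the small labeled set
    plus a weighted contrastive term on the weakly-labeled set."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    ce = -np.log(probs[np.arange(len(y)), y]).mean()
    return ce + lam * supervised_contrastive_loss(embeddings, weak_labels, temperature)
```

With well-separated clusters and consistent labels, the contrastive term is small; shuffling the labels raises it, which is the signal the joint objective exploits on weakly-labeled data.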




    Published In

    AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
    September 2022
    1221 pages
    ISBN:9781450396899
    DOI:10.1145/3573942
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Contrastive learning
    2. Few-shot learning
    3. Joint learning
    4. Semi-supervised learning
    5. Text classification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AIPR 2022
