Abstract
In the Semi-Supervised Text Classification (SSTC) task, the performance of SSTC-based models relies heavily on the accuracy of the pseudo-labels for unlabeled data, which is hard to guarantee in real-world scenarios. Prompt-learning has recently proved effective in alleviating the low-accuracy problem caused by limited labeled data in SSTC. In this paper, we present a Pattern Exploiting Training with Unsupervised Data Augmentation (PETUDA) method to address SSTC under the limited-label setting. We first exploit the potential of pre-trained language models (PLMs) using prompt learning, converting the text classification task into a cloze-style task and using the masked-prediction ability of the PLMs to predict the categories. Then, we use a variety of data augmentation methods to enhance model performance with unlabeled data, and introduce a consistency loss into the training process to make full use of the unlabeled data. Finally, we conduct extensive experiments on three text classification benchmark datasets. Empirical results show that PETUDA consistently outperforms the baselines in all cases.
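As a minimal sketch of the cloze-style reformulation described in the abstract, the snippet below scores each class by the masked-LM logit of a label word at the mask position. The template, the verbalizer words, and the use of bert-base-uncased are illustrative assumptions for exposition; they are not the actual patterns or label words used in PETUDA.

```python
# Sketch of cloze-style text classification with a masked LM.
# Assumes Hugging Face transformers; pattern and verbalizer are hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def build_prompt(text: str) -> str:
    # Hypothetical cloze pattern; PETUDA's actual patterns may differ.
    return f"{text} This topic is about {tokenizer.mask_token}."

# Hypothetical verbalizer: one label word per class.
LABEL_WORDS = ["sports", "business", "science"]

def classify(text: str) -> int:
    inputs = tokenizer(build_prompt(text), return_tensors="pt", truncation=True)
    # Locate the [MASK] position in the tokenized prompt.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos[0]]  # (vocab_size,)
    # Score each class by the logit of its label word at the mask position.
    label_ids = tokenizer.convert_tokens_to_ids(LABEL_WORDS)
    return int(logits[label_ids].argmax())

print(LABEL_WORDS[classify("The team won the championship final.")])
```

This zero-shot scoring can then be fine-tuned on the limited labeled set, which is the usual pattern-exploiting-training setup.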
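The consistency term on unlabeled data can likewise be sketched as a KL divergence between the model's (sharpened, gradient-detached) prediction on the original text and its prediction on an augmented version, in the spirit of UDA-style consistency training. The temperature, the loss weight, and the combination with the supervised loss are assumptions for illustration, not the paper's exact settings.

```python
# Sketch of a UDA-style consistency loss on unlabeled data.
# Hyperparameters below are illustrative assumptions.
import torch
import torch.nn.functional as F

def consistency_loss(logits_orig: torch.Tensor,
                     logits_aug: torch.Tensor,
                     temperature: float = 0.4) -> torch.Tensor:
    # Sharpened target from the unaugmented view; no gradient flows
    # through the target, as is standard in consistency training.
    with torch.no_grad():
        target = F.softmax(logits_orig / temperature, dim=-1)
    log_pred = F.log_softmax(logits_aug, dim=-1)
    # KL(target || prediction), averaged over the unlabeled batch.
    return F.kl_div(log_pred, target, reduction="batchmean")

# Usage sketch: combine with the supervised loss on labeled data, where
# logits_u / logits_u_aug come from an unlabeled text and its augmented
# version (e.g., back-translation), and lambda_u is an assumed weight.
# loss = ce_loss + lambda_u * consistency_loss(logits_u, logits_u_aug)
```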
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant U1811263, the Science and Technology Program of Guangzhou under Grant 2023A04J1728, and the Talent Research Start-Up Foundation of Guangdong Polytechnic Normal University under Grant 2021SDKYA098.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Yuan, C., Zhou, Z., Tang, F., Lin, R., Mao, C., Teng, L. (2023). Prompt-Learning for Semi-supervised Text Classification. In: Zhang, F., Wang, H., Barhamgi, M., Chen, L., Zhou, R. (eds) Web Information Systems Engineering – WISE 2023. WISE 2023. Lecture Notes in Computer Science, vol 14306. Springer, Singapore. https://doi.org/10.1007/978-981-99-7254-8_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7253-1
Online ISBN: 978-981-99-7254-8
eBook Packages: Computer Science, Computer Science (R0)