Research Article · Free Access · DOI: 10.1145/3580305.3599456

Open-Set Semi-Supervised Text Classification with Latent Outlier Softening

Published: 04 August 2023

ABSTRACT

Semi-supervised text classification (STC) has been extensively studied because it reduces the cost of human annotation. However, existing research assumes that unlabeled data contains only in-distribution texts, which is unrealistic. This paper extends STC to a more practical Open-set Semi-supervised Text Classification (OSTC) setting, in which the unlabeled data may contain out-of-distribution (OOD) texts. The main challenge in OSTC is the false-positive inference problem caused by inadvertently including OOD texts during training. To address this problem, we first develop baseline models that use outlier detectors for hard OOD-data filtering in a pipeline procedure. We then propose a Latent Outlier Softening (LOS) framework that integrates semi-supervised training and outlier detection within probabilistic latent-variable modeling. LOS softens the impact of OOD texts through the Expectation-Maximization (EM) algorithm and weighted entropy maximization. Experiments on three newly created datasets show that LOS significantly outperforms the baselines.
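The softening idea described above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: it assumes per-example confidence scores are available as the outlier signal, fits a two-component Gaussian mixture to them with EM to obtain a soft in-distribution weight per unlabeled example, and then combines an ID-weighted pseudo-label loss with an OOD-weighted entropy-maximization term. All function names are hypothetical.

```python
import numpy as np

def em_soft_weights(scores, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture to per-example confidence
    scores with EM; return the posterior probability that each example
    belongs to the high-mean (in-distribution) component."""
    scores = np.asarray(scores, dtype=float)
    # Initialize one component near the lowest score, one near the highest.
    mu = np.array([scores.min(), scores.max()])
    var = np.array([scores.var() + 1e-6] * 2)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: component responsibilities for every score.
        dens = (pi / np.sqrt(2.0 * np.pi * var)
                * np.exp(-(scores[:, None] - mu) ** 2 / (2.0 * var)))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture weights, means, and variances.
        nk = resp.sum(axis=0)
        mu = (resp * scores[:, None]).sum(axis=0) / nk
        var = (resp * (scores[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / len(scores)
    return resp[:, np.argmax(mu)]  # soft in-distribution weight per example

def soft_unlabeled_loss(probs, weights):
    """Softened unlabeled objective: ID-weighted cross-entropy on the argmax
    pseudo-label, minus OOD-weighted entropy (i.e., entropy maximization)."""
    probs = np.clip(probs, 1e-12, 1.0)
    pseudo_ce = -np.log(probs.max(axis=1))          # pseudo-label confidence
    entropy = -(probs * np.log(probs)).sum(axis=1)  # prediction entropy
    return np.mean(weights * pseudo_ce - (1.0 - weights) * entropy)
```

The key design point is that no unlabeled example is hard-filtered: a likely-OOD example still contributes, but its pseudo-label term is down-weighted and its entropy term is encouraged to grow, which is the softening effect attributed to LOS in the abstract.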

Supplemental Material

video1355596953.mp4 (3.1 MB)


Published in

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023, 5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305
Copyright © 2023 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
