Towards Safe Semi-supervised Classification: Adjusted Cluster Assumption via Clustering

Neural Processing Letters

Abstract

Semi-supervised classification methods can, in some cases, perform worse than their supervised counterparts. This reduces confidence in them in real applications, so it is desirable to make semi-supervised classification safe, meaning that it never performs worse than its supervised counterpart. One possible cause of unsafe learning is that the cluster assumption may not reflect the real data distribution well. In this paper we therefore develop a safe semi-supervised support vector machine method that adjusts the cluster assumption (ACA-S3VM for short). Specifically, when samples from different classes overlap heavily, the real boundary does not lie in a low-density region, so the cluster assumption will not find it. An unsupervised clustering method, however, is able to detect the real boundary in this case. We therefore design ACA-S3VM by adjusting the cluster assumption with the help of clustering, taking into account the distances of individual unlabeled instances to the distribution boundary during learning. Empirical results show that ACA-S3VM is competitive with off-the-shelf safe semi-supervised classification methods.
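The intuition in the abstract can be illustrated with a small sketch. This is not the ACA-S3VM formulation, which the abstract does not give; it is a toy example under stated assumptions. Two heavily overlapping 1-D Gaussian classes (means -1 and 1, unit variance, chosen here for illustration) produce a mixture density with no low-density valley at the true boundary 0, yet a plain 2-means clustering still places its boundary near 0. The helper names `two_means_1d` and `boundary_distances` are hypothetical, and the returned distances only stand in for the "distances of individual unlabeled instances to the distribution boundary" that the method is said to use.

```python
import random


def two_means_1d(xs, iters=50):
    """Plain 2-means (Lloyd's algorithm) on 1-D data; returns (low, high) centroids."""
    lo, hi = min(xs), max(xs)  # assumption: initialise centroids at the data extremes
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        left = [x for x in xs if x <= mid]
        right = [x for x in xs if x > mid]
        if not left or not right:
            break  # degenerate split, keep the current centroids
        new_lo = sum(left) / len(left)
        new_hi = sum(right) / len(right)
        if new_lo == lo and new_hi == hi:
            break  # assignments are stable
        lo, hi = new_lo, new_hi
    return lo, hi


def boundary_distances(xs):
    """Cluster boundary (midpoint of the two centroids) and each point's distance to it."""
    lo, hi = two_means_1d(xs)
    boundary = (lo + hi) / 2.0
    return boundary, [abs(x - boundary) for x in xs]


# Toy unlabeled data: two overlapping classes whose true boundary is at 0.
rng = random.Random(0)
unlabeled = [rng.gauss(-1.0, 1.0) for _ in range(500)]
unlabeled += [rng.gauss(1.0, 1.0) for _ in range(500)]

boundary, dists = boundary_distances(unlabeled)
print("estimated boundary:", round(boundary, 3))
print("smallest distance to boundary:", round(min(dists), 3))
```

In a full method such distances would presumably enter the learning objective for the unlabeled instances; how ACA-S3VM does this is specified in the article body, not in the abstract.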

Notes

  1. http://archive.ics.uci.edu/ml/datasets.html.

  2. http://www.kyb.tuebingen.mpg.de/ssl-book/.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61300165, 61375057 and 61300164, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20133223120009, and the Introduction of Talent Research Foundation of Nanjing University of Posts and Telecommunications under Grant Nos. NY213033 and NY213031.

Corresponding author

Correspondence to Yunyun Wang.

About this article

Cite this article

Wang, Y., Meng, Y., Fu, Z. et al. Towards Safe Semi-supervised Classification: Adjusted Cluster Assumption via Clustering. Neural Process Lett 46, 1031–1042 (2017). https://doi.org/10.1007/s11063-017-9607-5

  • DOI: https://doi.org/10.1007/s11063-017-9607-5
