A robust self-training algorithm based on relative node graph


Abstract

The self-training algorithm is a well-known framework for semi-supervised learning, and selecting high-confidence samples is its key step: if high-confidence samples with incorrect labels are used to train the classifier, the error compounds over iterations. To improve the quality of high-confidence samples, we propose a novel data-editing technique termed Relative Node Graph Editing (RNGE). Specifically, mass estimation is used to compute the density and peak of each sample, from which a prototype tree is built to reveal the underlying spatial structure of the data. We then define a Relative Node Graph (RNG) for each sample. Finally, mislabeled samples in the candidate high-confidence set are identified by a hypothesis test based on the RNG. Combining the above, we propose a Robust Self-training algorithm based on the Relative Node Graph (STRNG), which uses RNGE to identify mislabeled samples and edit them. Experimental results show that the proposed algorithm improves the performance of self-training.
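The abstract only sketches the method at a high level. For illustration, the following is a minimal Python sketch of a generic self-training loop with graph-based data editing in the spirit of what the abstract describes. It is not the authors' STRNG implementation (their code is linked under Data Availability below): the mass-estimation densities, prototype tree, and Relative Node Graph are approximated here by a plain k-nearest-neighbor graph, and the paper's hypothesis test by a one-sided binomial test on label-disagreeing neighbors. All names and parameters (edit_by_neighbor_test, self_train, k, alpha) are hypothetical.

# Hedged sketch: self-training with neighborhood-graph data editing.
# Stand-in for RNGE: a candidate is rejected when significantly more of
# its k nearest neighbors disagree with its label than the class prior
# would predict (one-sided binomial test), in the spirit of SETRED-style
# editing. Not the paper's RNG construction.
import numpy as np
from scipy.stats import binomtest
from sklearn.neighbors import NearestNeighbors, KNeighborsClassifier


def edit_by_neighbor_test(X, y, k=5, alpha=0.05):
    """Return a boolean mask; False marks samples flagged as mislabeled."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                    # idx[:, 0] is the point itself
    priors = {c: np.mean(y == c) for c in np.unique(y)}
    keep = np.ones(len(y), dtype=bool)
    for i in range(len(y)):
        disagree = int(np.sum(y[idx[i, 1:]] != y[i]))
        p0 = 1.0 - priors[y[i]]                  # H0: neighbors follow the prior
        if binomtest(disagree, k, p0, alternative='greater').pvalue < alpha:
            keep[i] = False
    return keep


def self_train(X_l, y_l, X_u, rounds=10, per_round=20):
    """Label the most confident unlabeled samples, edit them, retrain."""
    clf = KNeighborsClassifier(n_neighbors=3).fit(X_l, y_l)
    for _ in range(rounds):
        if len(X_u) == 0:
            break
        proba = clf.predict_proba(X_u)
        take = np.argsort(-proba.max(axis=1))[:per_round]   # candidate set
        X_c = X_u[take]
        y_c = clf.classes_[proba[take].argmax(axis=1)]
        # Test only the candidates, in the context of the labeled data.
        keep = edit_by_neighbor_test(
            np.vstack([X_l, X_c]), np.concatenate([y_l, y_c])
        )[len(X_l):]
        X_l = np.vstack([X_l, X_c[keep]])
        y_l = np.concatenate([y_l, y_c[keep]])
        X_u = np.delete(X_u, take, axis=0)
        clf = KNeighborsClassifier(n_neighbors=3).fit(X_l, y_l)
    return clf

As a toy usage, one can take a few labeled points from sklearn's make_moons as (X_l, y_l), treat the remainder as X_u, and call self_train(X_l, y_l, X_u): only candidates whose neighborhoods pass the editing test enter the labeled set, which is the error-accumulation safeguard the abstract motivates.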



Data Availability

The datasets and experimental results generated and analyzed during the current study are available at https://github.com/511lab/STRNG.

Notes

  1. http://archive.ics.uci.edu/ml/index.php.

  2. http://archive.ics.uci.edu/ml/index.php.

  3. http://archive.ics.uci.edu/ml/index.php.

  4. http://www.uk.research.att.com/facedatabase.html.


Acknowledgements

The work was partially supported by the Gansu University Innovation Fund Project (2023B-94), the National Social Science Fund of China (Grant No. 20XTJ005), and the Central Government Funds for Guiding Local Science and Technology Development of China (Grant No. YDZX20216200001876).

Author information


Corresponding author

Correspondence to Jikui Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, J., Duan, H., Zhang, C. et al. A robust self-training algorithm based on relative node graph. Appl Intell 55, 1 (2025). https://doi.org/10.1007/s10489-024-06062-0

