Abstract
With the rapid development of computing and multimedia technology, the volume of data from web traffic, social networks, sensors and other sources is increasing rapidly. Most of these data are unlabeled, and only a very small percentage is labeled. In some cases, labels can be obtained for free or at low cost, but for many complex tasks such as speech recognition, data extraction, classification and filtering, labeling is usually difficult, time-consuming and expensive because it requires manual interpretation by human experts. Active learning addresses this issue by selecting the most informative samples for label queries, achieving high learning accuracy with a small number of labeled items. Existing data classification approaches have disadvantages such as low accuracy or high computational time. In this study, a new three-step method for sequential semi-supervised classification is presented, which combines covariance-based self-expressive correlation estimation with incremental active learning. First, a new covariance-based method is used to estimate the correlation between samples. Then, a semi-supervised active learning algorithm based on generative adversarial networks (GANs) is introduced to perform the learning process. Finally, a new incremental learning model based on the support vector machine is used for classification. We use the 20 Newsgroups, WebKB and ImageNet datasets to test the performance of the proposed method. The experimental results show the robustness and effectiveness of the proposed method in terms of accuracy, precision, recall and F-measure compared to other methods.
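To make the idea of querying only the most informative samples concrete, the following is a minimal sketch of margin-based uncertainty sampling, a generic active learning strategy. It is not the selection rule of the proposed method, which relies on self-expressive correlation and a GAN. The function names, the item identifiers and the class probabilities are all hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two highest class probabilities (small gap = uncertain)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def select_queries(pool_probs: Dict[str, Sequence[float]], budget: int) -> List[str]:
    """Return the ids of the `budget` unlabeled items with the smallest margin."""
    ranked = sorted(pool_probs, key=lambda item_id: margin(pool_probs[item_id]))
    return ranked[:budget]


# Hypothetical class probabilities predicted by some current classifier
# for four unlabeled items (illustrative values, not results from the paper).
pool = {
    "doc_a": [0.90, 0.05, 0.05],
    "doc_b": [0.40, 0.35, 0.25],
    "doc_c": [0.55, 0.44, 0.01],
    "doc_d": [0.34, 0.33, 0.33],
}

# With a labeling budget of two, the two most ambiguous items are queried.
queries = select_queries(pool, budget=2)
print(queries)
```

In a full active learning loop, the queried items would be labeled by an expert, added to the training set, and the classifier updated before the next round of selection.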
Availability of data and material
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Code availability
All code for data analysis associated with the current submission is available from the corresponding author upon reasonable request.
Funding
This paper is extracted from a PhD thesis and received no funding from any organization.
Author information
Contributions
EKAi took part in formal analysis, conceptualization, methodology, data curation and writing (original draft). RM was involved in data curation, methodology, validation, formulation, investigation, writing (review and editing) and supervision. SHY was involved in formal analysis, data curation, methodology, conceptualization and writing (review and editing). KB took part in formal analysis, conceptualization and writing (review and editing). HP took part in conceptualization, formulation and writing (review and editing).
Ethics declarations
Conflict of interest
All of the authors declare that they have no conflict of interest.
Ethical approval
This paper does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Khalili, E., Malekhosseini, R., Yaghoubyan, S.H. et al. Sequential semi-supervised active learning model in extremely low training set (SSSAL). J Supercomput 79, 6646–6673 (2023). https://doi.org/10.1007/s11227-022-04847-z
DOI: https://doi.org/10.1007/s11227-022-04847-z