Abstract
The Semi-Supervised Fuzzy C-Means (SSFCMeans) model incorporates additional knowledge about the true class of part of the training data. This partial supervision makes it possible to use the model as a classifier, in which case the main goal is to minimize the classification error, just as in the fully supervised setting. However, the usual concerns of minimizing the training error and the test error while avoiding overfitting must be considered with respect to the characteristics of the SSFCMeans model. In this work, we address this research gap and analyze how Semi-Supervised Fuzzy C-Means handles partial supervision and how this affects the aforementioned issues. We investigate this relationship experimentally using simulated data. We show that the training error is directly related to the scaling factor \(\alpha \) and is deterministically guaranteed to equal 0 in some cases. We further illustrate our main findings on real-life partially labeled data collected from the smartphones of patients with bipolar disorder, in the problem of predicting the phase of the disease.
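One way to see why \(\alpha \) can control the training error is the following minimal sketch. It rests on assumptions the abstract does not state: fuzzifier \(m = 2\), one cluster per class, crisp one-hot labels, fixed prototypes with finite non-zero distances, and a supervision term \(\alpha \sum_k (u_{ik} - f_{ik})^2 d_{ik}^2\) added to the standard fuzzy c-means objective for each labeled observation \(i\). Under these assumptions the membership of a labeled observation becomes \((u^{FCM}_{ik} + \alpha f_{ik}) / (1 + \alpha )\), so in this simplified setting any \(\alpha \ge 1\) gives the labeled class the largest membership. All function names are hypothetical, and this is not the authors' implementation.

```python
# Illustrative sketch only: membership update for ONE observation with fixed
# prototypes, assuming fuzzifier m = 2 and one cluster per class.
# All names here are hypothetical and chosen for this example.


def fcm_memberships(sq_dists):
    """Unsupervised FCM memberships (m = 2) from squared distances to each prototype."""
    inverse = [1.0 / d for d in sq_dists]  # assumes every distance is non-zero
    total = sum(inverse)
    return [v / total for v in inverse]


def ssfcm_memberships(sq_dists, label, alpha):
    """Semi-supervised memberships for one observation.

    label: index of the known class, or None when the observation is unlabeled.
    alpha: non-negative scaling factor that weights the supervision term.
    """
    u_fcm = fcm_memberships(sq_dists)
    if label is None:
        # The supervision term vanishes, so the plain FCM memberships remain.
        return u_fcm
    one_hot = [1.0 if k == label else 0.0 for k in range(len(sq_dists))]
    return [(u + alpha * f) / (1.0 + alpha) for u, f in zip(u_fcm, one_hot)]


def predict(memberships):
    """Assign the class with the largest membership."""
    return max(range(len(memberships)), key=memberships.__getitem__)


# An observation labeled as class 0 that lies closer to the prototype of class 1.
sq_dists = [9.0, 1.0]
weak = predict(ssfcm_memberships(sq_dists, label=0, alpha=0.5))    # label is overruled
strong = predict(ssfcm_memberships(sq_dists, label=0, alpha=1.0))  # label is recovered
```

In this toy case the plain FCM memberships are 0.1 and 0.9, so the labeled class wins only once \(\alpha \) exceeds 0.8. With \(\alpha = 0.5\) the observation is assigned to class 1 despite its label, while with \(\alpha = 1\) it is assigned to class 0.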
Supported by Small Grants Scheme (NOR/SGS/BIPOLAR/0239/2020-00) within the research project: “Bipolar disorder prediction with sensor-based semi-supervised Learning (BIPOLAR)” http://bipolar.ibspan.waw.pl/.
Notes
1. The study obtained the consent of the Bioethical Commission at the District Medical Chamber in Warsaw (agreement no. KB/1094/17).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kmita, K., Kaczmarek-Majer, K., Hryniewicz, O. (2023). Classification Error in Semi-Supervised Fuzzy C-Means. In: Massanet, S., Montes, S., Ruiz-Aguilera, D., González-Hidalgo, M. (eds) Fuzzy Logic and Technology, and Aggregation Operators. EUSFLAT AGOP 2023 2023. Lecture Notes in Computer Science, vol 14069. Springer, Cham. https://doi.org/10.1007/978-3-031-39965-7_60
DOI: https://doi.org/10.1007/978-3-031-39965-7_60
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39964-0
Online ISBN: 978-3-031-39965-7
eBook Packages: Computer Science, Computer Science (R0)