Classification Error in Semi-Supervised Fuzzy C-Means

  • Conference paper
Fuzzy Logic and Technology, and Aggregation Operators (EUSFLAT 2023, AGOP 2023)

Abstract

The Semi-Supervised Fuzzy C-Means (SSFCMeans) model allows additional knowledge about the true class of part of the training data to be included. This partial supervision makes it possible to use the model as a classifier, so the main goal should be to minimize the classification error, just as in the fully supervised setting. However, the usual concerns of minimizing the training error and the test error while avoiding overfitting must be considered carefully with respect to the characteristics of the SSFCMeans model. In this work, we fill this research gap by analyzing how partial supervision is handled in Semi-Supervised Fuzzy C-Means and how it affects these issues. We investigate this relationship experimentally using artificially simulated data. We show that the training error is directly related to the scaling factor \(\alpha \) and is deterministically guaranteed to equal 0 in some cases. We further illustrate our main findings on real-life, partially labeled data collected from the smartphones of patients with bipolar disorder, in the problem of predicting the phase of the disease.
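The link between the scaling factor \(\alpha \) and the training error can be illustrated with a minimal sketch. It assumes the partial-supervision membership update commonly attributed to Pedrycz and Waletzky with fuzzifier m = 2, which may differ in detail from the model analyzed in the paper. The function name `ssfcm_memberships`, the toy data, and the fixed cluster centers are hypothetical and chosen only for illustration. Under this assumed update, a labeled observation receives the membership (u_fcm + alpha * f) / (1 + alpha), so a large enough alpha forces the highest membership onto the supervised class.

```python
import numpy as np

def ssfcm_memberships(X, centers, F, b, alpha):
    """Membership update for fuzzifier m = 2 (assumed Pedrycz-Waletzky form).

    X: (n, p) data, centers: (c, p) cluster centers,
    F: (n, c) one-hot supervision (zero rows for unlabeled points),
    b: (n,) indicator, 1.0 if the point is labeled, else 0.0,
    alpha: non-negative scaling factor of the supervised term.
    """
    # Squared Euclidean distances between every point and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    inv = 1.0 / d2
    u_fcm = inv / inv.sum(axis=1, keepdims=True)  # unsupervised FCM part
    # 1 for labeled rows, 0 for unlabeled rows.
    supervised_mass = b * F.sum(axis=1)
    scale = 1.0 + alpha * (1.0 - supervised_mass)
    return (scale[:, None] * u_fcm + alpha * b[:, None] * F) / (1.0 + alpha)

# Toy example: two fixed centers, four points on a line.
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
X = np.array([[3.5, 0.0], [0.2, 0.0], [3.8, 0.0], [1.0, 0.0]])
# Only the first point is labeled, and its label (class 0) disagrees
# with its nearest center (class 1).
F = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
b = np.array([1.0, 0.0, 0.0, 0.0])

u_low = ssfcm_memberships(X, centers, F, b, alpha=0.1)
u_high = ssfcm_memberships(X, centers, F, b, alpha=2.0)
# Predicted class of the labeled point under a small and a large alpha.
print(u_low.argmax(axis=1)[0], u_high.argmax(axis=1)[0])
```

In this sketch the labeled point is assigned to its nearest center when alpha is small and to its supervised class when alpha is large, while unlabeled points keep their plain fuzzy c-means memberships for any alpha.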

Supported by the Small Grants Scheme (NOR/SGS/BIPOLAR/0239/2020-00) within the research project “Bipolar disorder prediction with sensor-based semi-supervised Learning (BIPOLAR)”, http://bipolar.ibspan.waw.pl/.

Notes

  1. The study obtained the consent of the Bioethical Commission at the District Medical Chamber in Warsaw (agreement no. KB/1094/17).

  2. https://www.audeering.com/research/opensmile/.

Corresponding author

Correspondence to Kamil Kmita.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kmita, K., Kaczmarek-Majer, K., Hryniewicz, O. (2023). Classification Error in Semi-Supervised Fuzzy C-Means. In: Massanet, S., Montes, S., Ruiz-Aguilera, D., González-Hidalgo, M. (eds) Fuzzy Logic and Technology, and Aggregation Operators. EUSFLAT 2023, AGOP 2023. Lecture Notes in Computer Science, vol 14069. Springer, Cham. https://doi.org/10.1007/978-3-031-39965-7_60

  • DOI: https://doi.org/10.1007/978-3-031-39965-7_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39964-0

  • Online ISBN: 978-3-031-39965-7

  • eBook Packages: Computer Science, Computer Science (R0)
