A Robust Fuzzy c-Means Clustering Algorithm for Incomplete Data

  • Conference paper
Intelligent Computing, Networked Control, and Their Engineering Applications (ICSEE 2017, LSMS 2017)

Abstract

Data sets with missing feature values are prevalent in clustering analysis. Most existing clustering methods for incomplete data rely on imputing the missing feature values. However, accurate imputations are usually hard to obtain, especially for small or highly corrupted data sets. To address this issue, this paper proposes a robust fuzzy c-means (RFCM) clustering algorithm that does not require imputation. The proposed RFCM represents each missing feature value by an interval, which can be easily constructed using the K-nearest neighbors method, and adopts a min-max optimization model to reduce the impact of noise on clustering performance. We give an equivalent tractable reformulation of the min-max optimization problem and propose an efficient solution method based on smoothing and gradient projection techniques. Experiments on UCI data sets validate the effectiveness of the proposed RFCM algorithm in comparison with existing clustering methods for incomplete data.
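
The abstract does not spell out how the intervals are built, so the following is only a minimal sketch of one plausible K-nearest-neighbors construction, not the authors' implementation. It assumes that missing values are marked with NaN, that neighbors are ranked by a partial distance computed over the features two samples both have observed, and that the interval for a missing entry is the range of that feature over its k nearest neighbors. The function name `knn_intervals` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_intervals(X, k=3):
    """Return (lower, upper) interval bounds for every entry of X.

    X is an (n_samples, n_features) array with np.nan marking missing values.
    Observed entries get the degenerate interval [x, x]. Each missing entry
    gets [min, max] of that feature over the k nearest samples that have the
    feature observed. If no such sample exists, the bounds stay NaN.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    observed = ~np.isnan(X)
    lower = X.copy()
    upper = X.copy()

    for i in range(n):
        missing = np.where(~observed[i])[0]
        if missing.size == 0:
            continue

        # Partial distance (an assumption of this sketch): squared differences
        # over the features observed in both samples, rescaled by d / count.
        shared = observed & observed[i]
        counts = shared.sum(axis=1)
        diff = np.where(shared, X - X[i], 0.0)
        with np.errstate(divide="ignore", invalid="ignore"):
            dist = np.where(counts > 0, (diff ** 2).sum(axis=1) * d / counts, np.inf)
        dist[i] = np.inf  # a sample is never its own neighbor

        order = np.argsort(dist, kind="stable")
        for j in missing:
            # Keep only comparable neighbors that actually observe feature j.
            candidates = [m for m in order if observed[m, j] and np.isfinite(dist[m])][:k]
            if candidates:
                lower[i, j] = X[candidates, j].min()
                upper[i, j] = X[candidates, j].max()

    return lower, upper


# Usage: the second sample is missing its second feature.
X_demo = np.array([[1.0, 2.0], [1.1, np.nan], [5.0, 6.0], [0.9, 2.2], [1.2, 1.8]])
lower_demo, upper_demo = knn_intervals(X_demo, k=2)
```

A wider interval reflects more disagreement among the neighbors, which is the uncertainty a min-max model would then guard against; the choice of `k` trades interval width against sensitivity to outlying neighbors.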

S. Song—This work was supported by the Major Program of the National Natural Science Foundation of China under Grant 41427806, the National Natural Science Foundation of China under Grants 61503211 and 9152002, and the Project of China Ocean Association under Grant DYXM-125-25-02.

Corresponding authors

Correspondence to Shiji Song or Yuli Zhang.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, J., Song, S., Zhang, Y., Li, K. (2017). A Robust Fuzzy c-Means Clustering Algorithm for Incomplete Data. In: Yue, D., Peng, C., Du, D., Zhang, T., Zheng, M., Han, Q. (eds) Intelligent Computing, Networked Control, and Their Engineering Applications. ICSEE 2017, LSMS 2017. Communications in Computer and Information Science, vol 762. Springer, Singapore. https://doi.org/10.1007/978-981-10-6373-2_1

  • DOI: https://doi.org/10.1007/978-981-10-6373-2_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6372-5

  • Online ISBN: 978-981-10-6373-2

  • eBook Packages: Computer Science, Computer Science (R0)
