A Robust Fuzzy c-Means Clustering Algorithm for Incomplete Data

Li, Jinhua; Song, Shiji; Zhang, Yuli; Li, Kang

doi:10.1007/978-981-10-6373-2_1


Part of the book series: Communications in Computer and Information Science (CCIS, volume 762)


Abstract

Data sets with missing feature values are prevalent in clustering analysis. Most existing clustering methods for incomplete data rely on imputation of the missing feature values. However, accurate imputations are usually hard to obtain, especially for small or highly corrupted data sets. To address this issue, this paper proposes a robust fuzzy c-means (RFCM) clustering algorithm that does not require imputation. RFCM represents each missing feature value by an interval, which can be easily constructed using the K-nearest neighbors method, and adopts a min-max optimization model to reduce the impact of noise on clustering performance. We give an equivalent tractable reformulation of the min-max optimization problem and propose an efficient solution method based on smoothing and gradient projection techniques. Experiments on UCI data sets validate the effectiveness of the proposed RFCM algorithm in comparison with existing clustering methods for incomplete data.
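The abstract does not spell out how the intervals are built or how the worst case inside them is evaluated, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that neighbors are ranked by a partial distance (mean squared difference over the features both samples observe) and that the interval for a missing entry is the range of that feature among the K nearest neighbors that observe it. The names `nn_intervals`, `worst_case_sq_dist`, and the parameter `k` are hypothetical.

```python
import numpy as np

def nn_intervals(X, k=3):
    """Return (lower, upper) bounds for every entry of X; NaN marks a missing value.

    Assumption: neighbors are ranked by partial distance, and a missing entry is
    bounded by the min/max of that feature over its k nearest neighbors.
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    observed = ~np.isnan(X)
    lower, upper = X.copy(), X.copy()  # observed entries become degenerate intervals
    for i in range(n):
        missing = np.where(~observed[i])[0]
        if missing.size == 0:
            continue
        shared = observed & observed[i]           # features observed in both samples
        counts = shared.sum(axis=1)
        diff = np.where(shared, X - X[i], 0.0)    # ignore features either sample lacks
        with np.errstate(divide="ignore", invalid="ignore"):
            dist = (diff ** 2).sum(axis=1) / counts
        dist[counts == 0] = np.inf                # no shared feature: not comparable
        dist[i] = np.inf                          # a sample is not its own neighbor
        for j in missing:
            candidates = np.where(observed[:, j] & np.isfinite(dist))[0]
            if candidates.size == 0:              # fall back to every observer of feature j
                candidates = np.where(observed[:, j])[0]
            if candidates.size == 0:              # feature j is never observed: leave NaN
                continue
            nearest = candidates[np.argsort(dist[candidates])[:k]]
            lower[i, j] = X[nearest, j].min()
            upper[i, j] = X[nearest, j].max()
    return lower, upper

def worst_case_sq_dist(lower, upper, center):
    """Largest squared Euclidean distance from `center` to any point in the box.

    The maximum separates by feature and is attained at an interval endpoint.
    """
    lower, upper, center = (np.asarray(a, dtype=float) for a in (lower, upper, center))
    return np.maximum((lower - center) ** 2, (upper - center) ** 2).sum(axis=-1)

X = np.array([[1.0, 2.0], [1.1, np.nan], [5.0, 9.0], [1.2, 2.4], [0.9, 1.8]])
lo, hi = nn_intervals(X, k=2)
worst = worst_case_sq_dist(lo[1], hi[1], [1.0, 2.1])
```

In this toy data the second sample lacks its second feature; its two nearest neighbors on the first feature are the first and fourth samples, so the missing value is bounded by their second-feature values. The second function illustrates why an interval model can stay tractable: for a fixed cluster center, the inner maximization over a box decomposes feature by feature.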

S. Song—This work was supported by the Major Program of the National Natural Science Foundation of China under Grant 41427806, the National Natural Science Foundation of China under Grants 61503211 and 9152002, and the Project of China Ocean Association under Grant DYXM-125-25-02.



Author information

Authors and Affiliations

Department of Automation, TNList, Tsinghua University, Beijing, 100084, People’s Republic of China
Jinhua Li, Shiji Song & Yuli Zhang
Department of Industrial Engineering, Tsinghua University, Beijing, 100084, People’s Republic of China
Yuli Zhang
School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast, UK
Kang Li


Corresponding authors

Correspondence to Shiji Song or Yuli Zhang.

Editor information

Editors and Affiliations

Nanjing University of Posts and Telecommunications, Nanjing, China
Dong Yue
Shanghai University, Shanghai, China
Chen Peng
Shanghai University, Shanghai, China
Dajun Du
Nanjing University of Posts and Telecommunications, Nanjing, China
Tengfei Zhang
Shanghai University, Shanghai, China
Min Zheng
Swinburne University of Technology, Melbourne, Victoria, Australia
Qinglong Han


About this paper

Cite this paper

Li, J., Song, S., Zhang, Y., Li, K. (2017). A Robust Fuzzy c-Means Clustering Algorithm for Incomplete Data. In: Yue, D., Peng, C., Du, D., Zhang, T., Zheng, M., Han, Q. (eds) Intelligent Computing, Networked Control, and Their Engineering Applications. ICSEE 2017, LSMS 2017. Communications in Computer and Information Science, vol 762. Springer, Singapore. https://doi.org/10.1007/978-981-10-6373-2_1


DOI: https://doi.org/10.1007/978-981-10-6373-2_1
Published: 23 August 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6372-5
Online ISBN: 978-981-10-6373-2
eBook Packages: Computer Science, Computer Science (R0)
