Cluster Center Initialization and Outlier Detection Based on Distance and Density for the K-Means Algorithm

He, Qi; Chen, Zhenxiang; Ji, Ke; Wang, Lin; Ma, Kun; Zhao, Chuan; Shi, Yuliang

doi:10.1007/978-3-030-16657-1_49

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 940)

Included in the following conference series:

International Conference on Intelligent Systems Design and Applications

Abstract

The K-means algorithm, the most classic partition-based clustering method, has well-known disadvantages. If the data set contains outliers, the cluster means can deviate seriously. In addition, random initialization makes the algorithm very sensitive to the input data parameters. In this paper, we propose initialization and outlier detection based on distance and density for the K-means algorithm (KMIDDO), an improved method that optimizes the initial center points and is especially effective when outliers are present. Moreover, we extend an outlier detection method to improve the clustering effect. Our goal is for the center points to be as far apart from one another as possible and for the density around each center point to be as large as possible. For initialization, we calculate the distance and density of points. For outlier detection, we treat the outliers as a single class based on distance and density. Experiments on several synthetic and real data sets illustrate the effectiveness and accuracy of the proposed algorithms.
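
The idea of picking centers that are both far apart and located in dense regions can be illustrated with a minimal sketch. The abstract does not give the formulas used by KMIDDO, so everything below is an assumption made for illustration only: density is taken to be the number of neighbors within a hypothetical `radius`, each new center maximizes density multiplied by the distance to the nearest center already chosen, and a point is flagged as an outlier when it has fewer than a hypothetical `min_neighbors` neighbors. The function names are likewise invented here and are not taken from the paper.

```python
import math


def pairwise_distances(points):
    """Full Euclidean distance matrix for a small list of points."""
    return [[math.dist(p, q) for q in points] for p in points]


def local_density(dist, radius):
    """Assumed density measure: number of other points within `radius`."""
    return [
        sum(1 for j, d in enumerate(row) if j != i and d <= radius)
        for i, row in enumerate(dist)
    ]


def init_centers(points, k, radius):
    """Pick k initial centers that are dense and far from each other.

    Illustrative scoring rule (an assumption, not the paper's formula):
    score(i) = density(i) * distance from i to its nearest chosen center.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    dist = pairwise_distances(points)
    dens = local_density(dist, radius)
    # Start from the densest point (ties go to the lowest index).
    chosen = [max(range(len(points)), key=lambda i: dens[i])]
    while len(chosen) < k:
        candidates = [i for i in range(len(points)) if i not in chosen]
        best = max(
            candidates,
            key=lambda i: dens[i] * min(dist[i][c] for c in chosen),
        )
        chosen.append(best)
    return [points[i] for i in chosen]


def flag_outliers(points, radius, min_neighbors):
    """Assumed outlier rule: fewer than `min_neighbors` points within `radius`."""
    dens = local_density(pairwise_distances(points), radius)
    return [d < min_neighbors for d in dens]


# Two tight groups of four points and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
sample_centers = init_centers(sample, k=2, radius=1.5)
sample_outliers = flag_outliers(sample, radius=1.5, min_neighbors=1)
```

On this sample, the isolated point has zero density under the assumed measure, so its score is zero and it is never selected as a center even though it is the farthest point from everything else. A purely distance-based initialization would tend to pick it, which is the failure mode the abstract describes. The flagged points could then be kept in a separate class before running standard K-means on the rest.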

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grants No. 61672262, No. 61573166 and No. 61702218, the Shandong Provincial Key R&D Program under Grant No. 2016GGX101001, and the CERNET Next Generation Internet Technology Innovation Project under Grant No. NGII20160404.

Author information

Authors and Affiliations

University of Jinan, Jinan, 250022, China
Qi He, Zhenxiang Chen, Ke Ji, Lin Wang, Kun Ma & Chuan Zhao
Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, 250022, China
Qi He, Zhenxiang Chen, Ke Ji, Lin Wang, Kun Ma & Chuan Zhao
Shandong University, Jinan, 250013, China
Yuliang Shi

Corresponding author

Correspondence to Zhenxiang Chen.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Auburn, WA, USA
Ajith Abraham
School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
Aswani Kumar Cherukuri
Tijuana Institute of Technology, Tijuana, Mexico
Patricia Melin
Machine Intelligence Research Labs, Auburn, WA, USA
Niketa Gandhi

About this paper

Cite this paper

He, Q. et al. (2020). Cluster Center Initialization and Outlier Detection Based on Distance and Density for the K-Means Algorithm. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2018. Advances in Intelligent Systems and Computing, vol 940. Springer, Cham. https://doi.org/10.1007/978-3-030-16657-1_49

DOI: https://doi.org/10.1007/978-3-030-16657-1_49
Published: 12 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16656-4
Online ISBN: 978-3-030-16657-1
eBook Packages: Intelligent Technologies and Robotics (R0)
