A Novel on Altered K-Means Algorithm for Clustering Cost Decrease of Non-labeling Big-Data

Jung, Se-Hoon; So, Won-Ho; You, Kang-Soo; Sim, Chun-Bo

doi:10.1007/978-981-13-1328-8_48

Conference paper
First Online: 29 November 2018

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 518)

Abstract

Machine learning for Big Data is attracting attention as a way to retrieve useful knowledge inherent in multi-dimensional information and to discover new knowledge in fields concerned with storing and retrieving the massive multi-dimensional information that is continually being produced. Machine learning techniques can be divided into supervised and unsupervised learning according to whether the data are labeled. Unsupervised learning, which classifies and analyzes unlabeled data, is used in various ways in the analysis of multi-dimensional Big Data. The present study therefore analyzes the problems of the conventional K-means algorithm and proposes an altered K-means algorithm that determines the number of clusters automatically. The study also proposes optimizing the number of clusters by applying principal component analysis, as a pre-processing step, to the input data for clustering. The performance evaluation results confirm that the cluster validity index (CVI) of the proposed algorithm was superior in accuracy to that of the conventional K-means algorithm.
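The abstract does not give the details of the altered algorithm or say which CVI was used, so the following is only a generic sketch of the pipeline it describes: principal component analysis as pre-processing, K-means run for several candidate cluster counts, and the count chosen by a validity index. The mean silhouette coefficient stands in for the CVI, a deterministic farthest-first initialization stands in for whatever the authors use, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def pca_project(X, n_components):
    """Project X onto its top principal components (computed via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def farthest_first_init(X, k):
    """Assumed initialization: start at X[0], then repeatedly take the point
    farthest from all centroids chosen so far."""
    idx = [0]
    d = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(d.argmax())
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return X[idx].copy()

def kmeans(X, k, n_iter=100):
    """Standard Lloyd iterations; returns (labels, centroids)."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def silhouette(X, labels):
    """Mean silhouette coefficient, used here as a stand-in CVI."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton clusters score 0 by convention
        a = D[i, own].sum() / (n_own - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

def choose_k(X, k_range=range(2, 7), n_components=2):
    """Reduce with PCA, then pick the k with the highest silhouette score."""
    Z = pca_project(X, n_components)
    scores = {k: silhouette(Z, kmeans(Z, k)[0]) for k in k_range}
    return max(scores, key=scores.get), scores

# Synthetic, unlabeled 5-dimensional data with three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0, 0], [10, 10, 0, 0, 0], [-10, 10, 0, 0, 0]], dtype=float)
X = np.vstack([c + 0.5 * rng.standard_normal((40, 5)) for c in centers])

best_k, scores = choose_k(X)
print(best_k, scores)
```

The candidate range, the number of retained components, and the choice of validity index are all free parameters in this sketch; a different CVI may favor a different number of clusters on less cleanly separated data.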

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2017R1D1A3B03035379).

Author information

Authors and Affiliations

Department of Multimedia Engineering, Sunchon National University, Suncheon, Republic of Korea
Se-Hoon Jung
Department of Computer Education, Sunchon National University, Suncheon, Republic of Korea
Won-Ho So
School of Liberal Arts, Jeonju University, Jeonju, Republic of Korea
Kang-Soo You
School of Information Communication and Multimedia Engineering, Sunchon National University, Suncheon, Republic of Korea
Chun-Bo Sim

Corresponding author

Correspondence to Chun-Bo Sim.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea (Republic of)
James J. Park
Department of Business Science, University of Salerno, Fisciano, Italy
Vincenzo Loia
Department of Information Systems and Cyber Security, The University of Texas at San Antonio, San Antonio, TX, USA
Kim-Kwang Raymond Choo
Department of Multimedia Engineering, Dongguk University, Seoul, Soul-t’ukpyolsi, Korea (Republic of)
Gangman Yi

About this paper

Cite this paper

Jung, SH., So, WH., You, KS., Sim, CB. (2019). A Novel on Altered K-Means Algorithm for Clustering Cost Decrease of Non-labeling Big-Data. In: Park, J., Loia, V., Choo, KK., Yi, G. (eds) Advanced Multimedia and Ubiquitous Engineering. MUE FutureTech 2018 2018. Lecture Notes in Electrical Engineering, vol 518. Springer, Singapore. https://doi.org/10.1007/978-981-13-1328-8_48

DOI: https://doi.org/10.1007/978-981-13-1328-8_48
Published: 29 November 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1327-1
Online ISBN: 978-981-13-1328-8
eBook Packages: Engineering, Engineering (R0)