Identifying stable objects for accelerating the classification phase of k-means

Mexicano, A.; Cervantes, S.; Rodríguez, R.; Pérez, J.; Almanza, N.; Jiménez, M. A.; Azuara, A.

doi:10.1007/978-3-319-49109-7_88

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 1)

Included in the following conference series:

International Conference on P2P, Parallel, Grid, Cloud and Internet Computing

Abstract

This work presents an improved version of the K-Means algorithm. The improvement is a simple heuristic: objects that remain in the same group between the current and the previous iteration are identified and excluded from the calculations of the classification phase in subsequent iterations. To evaluate the improved version against the standard one, three synthetic and seven well-known real instances from the specialized literature were used. Experimental results showed that the proposed heuristic takes less time than the standard algorithm. The best result was obtained when the Transactions instance was grouped into 200 clusters, achieving a time reduction of 90.1% with respect to the standard version, with a reduction in grouping quality of only 3.97%.
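The heuristic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is one plausible reading of the abstract, in which an object whose label is unchanged between two consecutive iterations is marked stable and skipped in every later classification step, while still contributing to the centroid update. The names `kmeans_stable`, `nearest`, and `sq_dist`, the explicit `init_centroids` argument, and the toy data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point (first one wins on ties)."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans_stable(points, init_centroids, max_iter=100):
    """K-means sketch that stops reclassifying stable objects.

    Assumption: an object is 'stable' once its label is the same in two
    consecutive iterations; it is then excluded from later classification
    steps but still counts toward the centroid of its group.
    """
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    # Initial classification: every object is compared with every centroid.
    labels = [nearest(p, centroids) for p in points]
    active = set(range(len(points)))  # objects still being reclassified

    for _ in range(max_iter):
        # Update step: recompute each centroid from all of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a group is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]

        # Classification step: only active (not yet stable) objects.
        changed = False
        for i in sorted(active):
            new_label = nearest(points[i], centroids)
            if new_label == labels[i]:
                active.discard(i)  # stable: skip in subsequent iterations
            else:
                labels[i] = new_label
                changed = True
        if not changed:
            break
    return labels, centroids


# Toy data: two well-separated groups of three points each (hypothetical).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans_stable(points, init_centroids=[(0, 0), (10, 10)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because stable objects are never compared with the centroids again, the number of distance computations per iteration shrinks as the run proceeds. The trade-off is that an object frozen too early can no longer move to a group that later becomes closer, which is consistent with the small loss in grouping quality reported in the abstract.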

Author information

Authors and Affiliations

Technological Institute of Ciudad Victoria, Cd. Victoria, Tamaulipas, Mexico
A. Mexicano, M. A. Jiménez & A. Azuara
University Center of Ciudad Valles, Guadalajara University, Ciudad Valles, Jalisco, Mexico
S. Cervantes
Autonomous University of Ciudad Juarez, Ciudad Juarez, Chihuahua, Mexico
R. Rodríguez
National Centre of Research and Technological Development, Cuernavaca, Morelos, Mexico
J. Pérez & N. Almanza

Corresponding author

Correspondence to S. Cervantes.

Editor information

Editors and Affiliations

Technical University of Catalonia, Campus Nord, Ed. Omega (Room 109), Barcelona, Spain
Fatos Xhafa
Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli
Università degli Studi di Napoli Federico II, Napoli, Italy
Flora Amato

About this paper

Cite this paper

Mexicano, A. et al. (2017). Identifying stable objects for accelerating the classification phase of k-means. In: Xhafa, F., Barolli, L., Amato, F. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2016. Lecture Notes on Data Engineering and Communications Technologies, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-49109-7_88

DOI: https://doi.org/10.1007/978-3-319-49109-7_88
Published: 22 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49108-0
Online ISBN: 978-3-319-49109-7
eBook Packages: Engineering, Engineering (R0)
