Analyzing and Enhancing Processing Speed of K-Medoid Algorithm Using Efficient Large Scale Processing Frameworks

Jaiswal, Ayshwarya; Dwivedi, Vijay Kumar; Yadav, Om. Prakash

doi:10.1007/978-3-030-49336-3_14

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1179)

Included in the following conference series:

International Conference on Hybrid Intelligent Systems

Abstract

The k-medoid algorithm has recently become a highly active research topic. It is more robust than k-means and less sensitive to outliers, but it has drawbacks of its own: the number of medoids must be given in advance, which is hard to determine, and the initial k cluster centers are chosen at random.
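The robustness claim can be illustrated with a small sketch. It assumes one-dimensional data and absolute distance, and the helper names `mean` and `medoid` are illustrative, not taken from the chapter. A medoid must be an actual data point, so a single outlier pulls it far less than it pulls the mean.

```python
def mean(points):
    # Arithmetic mean: the cluster centre used by k-means.
    return sum(points) / len(points)

def medoid(points):
    # Medoid: the data point with the smallest total distance to all others.
    return min(points, key=lambda c: sum(abs(c - p) for p in points))

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier

centre_mean = mean(data)      # 22.0, dragged towards the outlier
centre_medoid = medoid(data)  # 3.0, stays inside the bulk of the data
```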

This article focuses on a modified k-medoid++ algorithm, proposed to increase the processing speed and efficiency of the k-medoid algorithm.

However, modifying the algorithm is not the only way to increase processing speed; selecting an appropriate framework to run the algorithm efficiently has benefits of its own.

Apache Hadoop and Spark provide effective open-source solutions for big data. Many researchers misinterpret the performance and efficiency of these frameworks.

In this paper, the performance of the two frameworks is compared by implementing the simple k-medoid algorithm, and the more appropriate tool is then selected for the modified k-medoid++ algorithm. It was also observed, on implementing the k-medoid algorithm, that selecting the initial medoids randomly produced random results.
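The abstract does not spell out how k-medoid++ chooses its initial medoids. The sketch below is therefore only an assumption: it borrows the distance-weighted seeding of k-means++ and pairs it with a plain alternating k-medoids loop, so that random seeding and weighted seeding can be compared on toy data. All function names and the sample points are hypothetical and are not the authors' implementation.

```python
import random

def dist(a, b):
    # Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def total_cost(points, medoids):
    # Sum of each point's distance to its nearest medoid.
    return sum(min(dist(p, m) for m in medoids) for p in points)

def random_init(points, k, rng):
    # Plain random seeding: k distinct points chosen uniformly.
    return rng.sample(points, k)

def plus_plus_init(points, k, rng):
    # Assumed seeding (k-means++ style): each further medoid is drawn with
    # probability proportional to its squared distance to the nearest
    # medoid chosen so far. Assumes at least k distinct points.
    medoids = [rng.choice(points)]
    while len(medoids) < k:
        weights = [min(dist(p, m) for m in medoids) ** 2 for p in points]
        medoids.append(rng.choices(points, weights=weights, k=1)[0])
    return medoids

def k_medoids(points, medoids, max_iter=50):
    # Alternate between assigning points to the nearest medoid and
    # re-selecting each cluster's medoid, until the medoids stop changing.
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for p in points:
            clusters[min(medoids, key=lambda m: dist(p, m))].append(p)
        new_medoids = [
            min(members, key=lambda c: sum(dist(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return medoids

# Toy data: three well-separated 3x3 grids of points.
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (5, 9)]
          for dx in (0, 0.5, 1)
          for dy in (0, 0.5, 1)]

if __name__ == "__main__":
    for seed in range(5):
        rng = random.Random(seed)
        rand = total_cost(points, k_medoids(points, random_init(points, 3, rng)))
        plus = total_cost(points, k_medoids(points, plus_plus_init(points, 3, rng)))
        print(seed, round(rand, 2), round(plus, 2))
```

Running the loop over several seeds prints the final cost for each seeding strategy, which makes any run-to-run variation from random initialization directly visible.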

Author information

Authors and Affiliations

UCER, Prayagraj, Uttar Pradesh, India
Ayshwarya Jaiswal, Vijay Kumar Dwivedi & Om. Prakash Yadav

Corresponding author

Correspondence to Vijay Kumar Dwivedi.

Editor information

Editors and Affiliations

Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR), Auburn, WA, USA
Ajith Abraham
School of Computer Science and Engineering, VIT Bhopal University, Bhopal, Madhya Pradesh, India
Shishir K. Shandilya
Area of Project Engineering, University of Cordoba, Córdoba, Spain
Laura Garcia-Hernandez
Escola de Engenharia, Universidade do Minho, Guimarães, Portugal
Maria Leonilde Varela

Cite this paper

Jaiswal, A., Dwivedi, V.K., Yadav, O.P. (2021). Analyzing and Enhancing Processing Speed of K-Medoid Algorithm Using Efficient Large Scale Processing Frameworks. In: Abraham, A., Shandilya, S., Garcia-Hernandez, L., Varela, M. (eds) Hybrid Intelligent Systems. HIS 2019. Advances in Intelligent Systems and Computing, vol 1179. Springer, Cham. https://doi.org/10.1007/978-3-030-49336-3_14

DOI: https://doi.org/10.1007/978-3-030-49336-3_14
Published: 13 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49335-6
Online ISBN: 978-3-030-49336-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
