Abstract
With the explosive growth of multimedia objects represented as high-dimensional vectors, clustering techniques for such objects have received much attention in recent years. However, clustering methods usually incur a high computational cost because they must calculate distances between many pairs of objects. In this paper, to accelerate the greedy K-medoids clustering algorithm with \(L_1\) distance, we propose a new method that combines fast first-medoid selection, lazy evaluation, and pivot pruning, where pivot construction is made more efficient by our new pivot generation method, called PGM2. In experiments on real image datasets, in which each object is represented as a high-dimensional vector and \(L_1\) distance is the recommended dissimilarity measure, we show that the proposed method achieves reasonably high acceleration.
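To make the greedy selection and the lazy evaluation idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the objective is the sum of \(L_1\) distances from each object to its nearest medoid. The names `l1` and `greedy_k_medoids` are hypothetical. The first medoid is found here by brute force rather than by the fast selection technique, and pivot pruning and PGM2 are omitted entirely. Lazy evaluation relies on the fact that the gain of a candidate can only shrink as medoids are added, so a gain computed in an earlier round is a valid upper bound.

```python
import heapq


def l1(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def greedy_k_medoids(points, k):
    """Greedy K-medoids with lazy gain re-evaluation (illustrative sketch)."""
    n = len(points)

    # First medoid: the object with the smallest total distance to all objects.
    # (Brute force here; the paper's fast selection technique is not shown.)
    totals = [sum(l1(points[c], x) for x in points) for c in range(n)]
    first = min(range(n), key=totals.__getitem__)
    medoids = [first]

    # nearest[i] = distance from object i to its closest selected medoid.
    nearest = [l1(points[first], x) for x in points]

    def gain(c):
        # Reduction of the objective if candidate c were added as a medoid.
        return sum(max(0, nearest[i] - l1(points[c], points[i]))
                   for i in range(n))

    # Max-heap (via negated gains) of (-upper_bound, candidate, round_computed).
    heap = [(-gain(c), c, 1) for c in range(n) if c != first]
    heapq.heapify(heap)

    while len(medoids) < k and heap:
        _, c, stamp = heapq.heappop(heap)
        if stamp == len(medoids):
            # The gain is current and beats every other upper bound: select c.
            medoids.append(c)
            for i in range(n):
                nearest[i] = min(nearest[i], l1(points[c], points[i]))
        else:
            # Stale upper bound: recompute only this candidate and push it back.
            heapq.heappush(heap, (-gain(c), c, len(medoids)))

    return medoids, sum(nearest)


if __name__ == "__main__":
    data = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]]
    print(greedy_k_medoids(data, 2))
```

In each round only the candidates that reach the top of the heap are re-evaluated, which is where the saving over a full scan comes from. A pivot-based lower bound on the distance could additionally skip individual distance computations inside `gain`, but that step is left out of this sketch.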
Acknowledgments
This work was supported by JSPS Grant-in-Aid for Scientific Research (No. 16K16154).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Fushimi, T., Saito, K., Ikeda, T., Kazama, K. (2017). Accelerating Greedy K-Medoids Clustering Algorithm with \(L_1\) Distance by Pivot Generation. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_9
DOI: https://doi.org/10.1007/978-3-319-60438-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60437-4
Online ISBN: 978-3-319-60438-1
eBook Packages: Computer Science, Computer Science (R0)