Abstract
Clustering has numerous applications and has been studied extensively; it is particularly important in bioinformatics and data mining. Although many parallel clustering algorithms have been designed, most of them use the CRCW-PRAM or CREW-PRAM model of computation. This paper proposes a deterministic parallel algorithm for hierarchical clustering on the EREW model. Building on algorithms for the complete graph and the Euclidean minimum spanning tree, the proposed algorithm clusters \(n\) objects with \(O(p)\) processors in \(O(n^2/p)\) time, where \(1 \le p \le \frac{n}{\log n}\). Performance comparisons show that our algorithm is the first that is both free of memory conflicts and adaptive.
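The abstract does not describe the algorithm itself, so the following is only a minimal sequential sketch of the general idea it names: building a hierarchy from the Euclidean minimum spanning tree of the complete graph on the objects. It relies on the standard fact that sorting the MST edges by weight gives the merge order of single-linkage clustering. The function names (`prim_mst`, `single_linkage_merges`, `cut_into_clusters`) are hypothetical, and this is not the paper's EREW procedure. Prim's method on the complete graph takes \(O(n^2)\) time sequentially, which matches the total work \(p \cdot O(n^2/p) = O(n^2)\) implied by the stated bound.

```python
import math


def prim_mst(points):
    """Prim's algorithm on the complete Euclidean graph: O(n^2) time, O(n) space.

    Returns a list of MST edges as (weight, u, v) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge from the tree to each vertex
    parent = [-1] * n       # tree endpoint of that cheapest edge
    if n:
        best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest vertex not yet in the tree (O(n) scan).
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax distances from the new tree vertex (O(n) scan).
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def single_linkage_merges(points):
    """MST edges in increasing weight order, i.e. the single-linkage merge order."""
    return sorted(prim_mst(points))


def cut_into_clusters(points, k):
    """Apply the first n - k merges and return the resulting k clusters."""
    n = len(points)
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, u, v in single_linkage_merges(points)[: n - k]:
        root[find(u)] = find(v)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
    print(single_linkage_merges(pts))
    print(cut_into_clusters(pts, 2))
```

In this sketch the two \(O(n)\) scans inside the main loop are the natural candidates for splitting across \(p\) processors; how the paper does so without concurrent reads or writes is not stated in the abstract.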
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Z., Li, K., Xiao, D., Yang, L. (2007). An Adaptive Parallel Hierarchical Clustering Algorithm. In: Perrott, R., Chapman, B.M., Subhlok, J., de Mello, R.F., Yang, L.T. (eds) High Performance Computing and Communications. HPCC 2007. Lecture Notes in Computer Science, vol 4782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75444-2_15
DOI: https://doi.org/10.1007/978-3-540-75444-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75443-5
Online ISBN: 978-3-540-75444-2
eBook Packages: Computer Science, Computer Science (R0)