Parallel Fuzzy c-Means Cluster Analysis

Modenesi, Marta V.; Costa, Myrian C. A.; Evsukoff, Alexandre G.; Ebecken, Nelson F. F.

doi:10.1007/978-3-540-71351-7_5

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4395)

Included in the following conference series:

International Conference on High Performance Computing for Computational Science

Abstract

This work presents a parallel implementation of a fuzzy c-means cluster analysis tool that covers both aspects of cluster investigation: computing the cluster centers together with the degrees of membership of records in each cluster, and determining the optimal number of clusters for the data, using the PBM validity index to evaluate the quality of each partition.
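The two steps named in this paragraph can be illustrated with a minimal serial sketch in pure Python. It assumes the standard Bezdek update rules for fuzzy c-means and the fuzzy form of the PBM index as published by Pakhira et al. (2004); the abstract does not say which variant the paper uses. The function names (`fcm`, `pbm_index`, `best_partition`), the fuzzifier `m = 2`, the random initialisation, and the stopping rule are illustrative assumptions, not the authors' code.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fcm(data, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Fuzzy c-means: returns (centers, memberships) for c clusters."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Assumed initialisation: random memberships, each row summing to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([r / total for r in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the records weighted by u^m.
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * data[k][d] for k in range(n)) / total
                            for d in range(dim)])
        # Membership update from the relative distances to all centers.
        shift = 0.0
        for k in range(n):
            d = [max(dist(data[k], v), 1e-12) for v in centers]
            new = [1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                   for i in range(c)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[k])))
            u[k] = new
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u

def pbm_index(data, centers, u, m=2.0):
    """Fuzzy PBM index: ((1/c) * (E1 / Ec) * Dc)^2, larger is better."""
    n, dim, c = len(data), len(data[0]), len(centers)
    grand = [sum(x[d] for x in data) / n for d in range(dim)]
    e1 = sum(dist(x, grand) for x in data)                # one-cluster scatter
    ec = sum(u[k][i] ** m * dist(data[k], centers[i])     # fuzzy within-cluster scatter
             for k in range(n) for i in range(c))
    dc = max(dist(a, b) for a in centers for b in centers)  # largest center separation
    return ((1.0 / c) * (e1 / ec) * dc) ** 2

def best_partition(data, c_values, m=2.0):
    """Run FCM for each candidate c and keep the one with the highest PBM."""
    scored = [(pbm_index(data, *fcm(data, c, m), m), c) for c in c_values]
    return max(scored)[1]
```

Calling `best_partition(data, range(2, 8))` on a list of numeric records would run one clustering per candidate number of clusters and return the candidate whose partition scores highest. Every candidate needs a full FCM run, so the search multiplies the clustering cost by the number of candidates.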

The work makes two main contributions. The first is the implementation of the entire cluster analysis process, an approach new to the literature, which integrates the calculation of the clusters with the search for the best natural pattern present in the data. The second is the parallel implementation of the tool, which allows the approach to be used with very large volumes of data, an increasing need in today's industrial and business databases, and makes cluster analysis a feasible tool to support specialists' decisions in all fields of knowledge.
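The abstract does not describe how the work is divided among processors. One common way to parallelise the fuzzy c-means center update is to partition the records, let each worker accumulate partial weighted sums over its partition, and then combine the partial sums in a reduction step. The sketch below illustrates that pattern only; the names `partial_sums`, `reduce_centers`, and `parallel_centers` are hypothetical, and the thread pool is a stand-in for the processes or cluster nodes a real deployment would use (CPython threads do not speed up this arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sums(chunk, memberships, m=2.0):
    """Numerators and denominators of the center update for one partition."""
    c, dim = len(memberships[0]), len(chunk[0])
    num = [[0.0] * dim for _ in range(c)]
    den = [0.0] * c
    for x, u_row in zip(chunk, memberships):
        for i, u in enumerate(u_row):
            w = u ** m
            den[i] += w
            for d in range(dim):
                num[i][d] += w * x[d]
    return num, den

def reduce_centers(parts):
    """Reduction step: add the partial sums, then divide to get the centers."""
    c, dim = len(parts[0][1]), len(parts[0][0][0])
    num = [[sum(p[0][i][d] for p in parts) for d in range(dim)] for i in range(c)]
    den = [sum(p[1][i] for p in parts) for i in range(c)]
    return [[num[i][d] / den[i] for d in range(dim)] for i in range(c)]

def parallel_centers(data, u, workers=2, m=2.0):
    """Split records and memberships into contiguous blocks, one per worker."""
    size = -(-len(data) // workers)  # ceiling division
    blocks = [(data[s:s + size], u[s:s + size]) for s in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda b: partial_sums(b[0], b[1], m), blocks))
    return reduce_centers(parts)
```

Under this scheme only the per-cluster sums travel between workers, not the records themselves, so the communication volume depends on the number of clusters and dimensions rather than on the number of records.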

The results presented in the paper show that the approach is scalable and that parallel processing reduces the processing time of cluster analysis.

Topics of Interest: Unsupervised Classification, Fuzzy c-Means, Cluster and Grid Computing

Author information

Authors and Affiliations

COPPE/Federal University of Rio de Janeiro, P.O. Box 68506, 21945-970 Rio de Janeiro RJ, Brazil
Marta V. Modenesi, Myrian C. A. Costa, Alexandre G. Evsukoff & Nelson F. F. Ebecken

Editor information

Michel Daydé, José M. L. M. Palma, Álvaro L. G. A. Coutinho, Esther Pacitti, João Correia Lopes

About this paper

Cite this paper

Modenesi, M.V., Costa, M.C.A., Evsukoff, A.G., Ebecken, N.F.F. (2007). Parallel Fuzzy c-Means Cluster Analysis. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2006. VECPAR 2006. Lecture Notes in Computer Science, vol 4395. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71351-7_5

DOI: https://doi.org/10.1007/978-3-540-71351-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71350-0
Online ISBN: 978-3-540-71351-7
eBook Packages: Computer Science, Computer Science (R0)
