Knowledge-Based Clustering in Computational Intelligence

Pedrycz, Witold

doi:10.1007/978-3-540-71984-7_12

Part of the book series: Studies in Computational Intelligence (SCI, volume 63)

Summary

Clustering is commonly regarded as synonymous with unsupervised learning aimed at discovering structure in high-dimensional data. The large number of existing algorithms gives the area a wide diversity of approaches, each with its own underlying features and potential applications. With the inclusion of fuzzy sets, fuzzy clustering became an integral component of Computational Intelligence (CI) and is now broadly used in fuzzy modeling, fuzzy control, pattern recognition, and exploratory data analysis. Many pursuits of CI are human-centric in the sense that they are either initiated or driven by domain knowledge, or that the results generated by CI constructs are made easily interpretable. To follow this tendency toward human-centricity, so visible in the CI domain, the very concept of fuzzy clustering needs to be carefully revisited. We propose a paradigm shift toward knowledge-based clustering, in which the development of information granules (fuzzy sets) is governed by data as well as by domain knowledge supplied through interaction with developers, users, and experts. In this study, we elaborate on the concepts and algorithms of knowledge-based clustering by taking the well-known Fuzzy C-Means (FCM) scheme as an operational model with which a number of essential developments can be easily explained. The fundamental concepts discussed here involve clustering with domain knowledge articulated through partial supervision and proximity-based knowledge hints.
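
For orientation, the FCM scheme named in the summary can be sketched as follows. This is a minimal illustration of the textbook algorithm (alternating prototype and membership updates under a squared Euclidean distance), not the chapter's own formulation, which is not visible in this preview. The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made for the example, and the knowledge-based extensions (partial supervision, proximity hints) are not included.

```python
import random


def fuzzy_c_means(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook FCM sketch (hypothetical helper, not from the chapter).

    data: list of equal-length lists of floats (N patterns).
    c: number of clusters; m > 1 is the fuzzification coefficient.
    Returns (prototypes, U), where U[i][k] is the membership of
    pattern k in cluster i and every column of U sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial partition matrix, columns normalised to sum to 1.
    U = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= col

    prototypes = []
    for _ in range(max_iter):
        # Prototype update: mean of the data weighted by u_ik ** m.
        prototypes = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            prototypes.append(
                [sum(w[k] * data[k][j] for k in range(n)) / total
                 for j in range(dim)]
            )

        # Membership update from squared Euclidean distances.
        change = 0.0
        for k in range(n):
            d2 = [sum((data[k][j] - prototypes[i][j]) ** 2
                      for j in range(dim)) for i in range(c)]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            if zeros:
                # Pattern coincides with one or more prototypes:
                # share its membership evenly among them.
                new = [1.0 / len(zeros) if i in zeros else 0.0
                       for i in range(c)]
            else:
                inv = [d2[i] ** (-1.0 / (m - 1.0)) for i in range(c)]
                s = sum(inv)
                new = [v / s for v in inv]
            for i in range(c):
                change = max(change, abs(new[i] - U[i][k]))
                U[i][k] = new[i]

        # Stop once no membership moved by more than tol.
        if change < tol:
            break
    return prototypes, U


if __name__ == "__main__":
    # Toy data (assumed for illustration): two well-separated groups.
    points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.0],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
    centres, memberships = fuzzy_c_means(points, c=2)
    print(centres)
```

Because memberships are graded rather than crisp, a scheme of this shape leaves room for extra terms that pull selected memberships toward externally supplied values, which is presumably where the partial-supervision and proximity-based variants mentioned in the summary attach.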

Author information

Authors and Affiliations

Department of Electrical & Computer Engineering, University of Alberta, T6R 2G7, Edmonton, Canada
Witold Pedrycz

Editor information

Editors and Affiliations

Department of Informatics, Nicolaus Copernicus University, Ul. Grudziadzka 5, 87-100, Torun, Poland
Włodzisław Duch
Division of Computer Science, School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Włodzisław Duch
Faculty of Mathematics and Information Sciences, Warsaw University of Technology, Plac Politechniki 1, 00-661, Warsaw, Poland
Jacek Mańdziuk

About this chapter

Cite this chapter

Pedrycz, W. (2007). Knowledge-Based Clustering in Computational Intelligence. In: Duch, W., Mańdziuk, J. (eds) Challenges for Computational Intelligence. Studies in Computational Intelligence, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71984-7_12

DOI: https://doi.org/10.1007/978-3-540-71984-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71983-0
Online ISBN: 978-3-540-71984-7
eBook Packages: Engineering, Engineering (R0)
