MembershipMap: A Data Transformation for Knowledge Discovery Based on Granulation and Fuzzy Membership Aggregation

Frigui, Hichem

doi:10.1007/978-3-540-79474-5_3

Part of the book series: Studies in Computational Intelligence (SCI, volume 137)

Abstract

In this chapter, we describe a new data-driven transformation that facilitates many data mining, interpretation, and analysis tasks. This approach, called MembershipMap, strives to granulate and extract the underlying sub-concepts of each raw attribute. The orthogonal union of these sub-concepts is then used to define a new membership space. The sub-concept soft labels of each point in the original space determine the position of that point in the new space. Since sub-concept labels are prone to the uncertainty inherent in the original data and in the initial extraction process, a combination of labeling schemes based on different measures of uncertainty is presented. In particular, we introduce the CrispMap, the FuzzyMap, and the PossibilisticMap. We outline the advantages and disadvantages of each mapping scheme, and we show that the three transformed spaces are complementary. We also show that, in addition to improving the performance of clustering by taking advantage of the richer information content, the MembershipMap can be used as a flexible pre-processing tool to support tasks such as sampling, data cleaning, and outlier detection.
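The mapping described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how sub-concepts are extracted or how labels are computed, so everything below is an assumption made for illustration: a plain 1-D k-means stands in for the granulation step, the fuzzy labels use the standard fuzzy c-means membership formula, and the possibilistic labels use a Krishnapuram-Keller style typicality with a per-center scale. The function names (`kmeans_1d`, `scales`, `soft_labels`, `membership_map`) are hypothetical and are not taken from the chapter.

```python
def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means; an assumed stand-in for the granulation step."""
    s = sorted(values)
    n = len(s)
    # Spread the initial centers across the sorted values.
    centers = [s[min(n - 1, int((j + 0.5) * n / k))] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center when a group is empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers


def scales(values, centers, eps=1e-12):
    """Assumed per-center scale: mean squared distance of the nearest points."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for v in values:
        d2 = [(v - c) ** 2 for c in centers]
        j = d2.index(min(d2))
        sums[j] += d2[j]
        counts[j] += 1
    return [sums[j] / counts[j] + eps if counts[j] else 1.0
            for j in range(len(centers))]


def soft_labels(v, centers, kind="fuzzy", m=2.0, etas=None, eps=1e-12):
    """Labels of one attribute value with respect to that attribute's centers."""
    d2 = [(v - c) ** 2 + eps for c in centers]
    if kind == "crisp":
        # One-hot label: 1 for the nearest sub-concept, 0 elsewhere.
        best = d2.index(min(d2))
        return [1.0 if j == best else 0.0 for j in range(len(centers))]
    if kind == "fuzzy":
        # Fuzzy c-means memberships: they sum to 1 over the sub-concepts.
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        return [w / total for w in inv]
    if kind == "possibilistic":
        # Typicality values: each lies in (0, 1), with no sum-to-one constraint.
        return [1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0)))
                for d, eta in zip(d2, etas)]
    raise ValueError("unknown kind: " + kind)


def membership_map(rows, k=2, kind="fuzzy"):
    """Map each row to the concatenation of its per-attribute labels."""
    columns = [list(col) for col in zip(*rows)]
    centers = [kmeans_1d(col, k) for col in columns]
    etas = [scales(col, c) for col, c in zip(columns, centers)]
    mapped = []
    for row in rows:
        point = []
        for a, v in enumerate(row):
            point.extend(soft_labels(v, centers[a], kind, etas=etas[a]))
        mapped.append(point)
    return mapped


if __name__ == "__main__":
    data = [[0.0, 10.0], [0.1, 10.2], [5.0, 20.0], [5.1, 20.1]]
    for kind in ("crisp", "fuzzy", "possibilistic"):
        print(kind, membership_map(data, k=2, kind=kind)[0])
```

With two attributes and two sub-concepts per attribute, each point moves from a 2-D raw space to a 4-D membership space. Under these assumptions the three label types differ as the abstract suggests: crisp labels discard how close a value is to a sub-concept boundary, fuzzy labels express relative closeness, and possibilistic labels express absolute typicality, so a value far from every sub-concept receives low values in all coordinates, which is one plausible basis for outlier detection.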

Author information

Authors and Affiliations

Multimedia Research Lab, Computer Engineering & Computer Science Dept., University of Louisville
Hichem Frigui

Editor information

Lakhmi C. Jain, Mika Sato-Ilic, Maria Virvou, George A. Tsihrintzis, Valentina Emilia Balas, Canicious Abeynayake

Cite this chapter

Frigui, H. (2008). MembershipMap: A Data Transformation for Knowledge Discovery Based on Granulation and Fuzzy Membership Aggregation. In: Jain, L.C., Sato-Ilic, M., Virvou, M., Tsihrintzis, G.A., Balas, V.E., Abeynayake, C. (eds) Computational Intelligence Paradigms. Studies in Computational Intelligence, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79474-5_3

DOI: https://doi.org/10.1007/978-3-540-79474-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79473-8
Online ISBN: 978-3-540-79474-5
eBook Packages: Engineering, Engineering (R0)
