MembershipMap: A Data Transformation for Knowledge Discovery Based on Granulation and Fuzzy Membership Aggregation

  • Chapter
Computational Intelligence Paradigms

Part of the book series: Studies in Computational Intelligence (SCI, volume 137)

Abstract

In this chapter, we describe a new data-driven transformation that facilitates many data mining, interpretation, and analysis tasks. This approach, called MembershipMap, granulates each raw attribute and extracts its underlying sub-concepts. The orthogonal union of these sub-concepts is then used to define a new membership space. The sub-concept soft labels of each point in the original space determine the position of that point in the new space. Because sub-concept labels are subject to the uncertainty inherent in the original data and in the initial extraction process, we present a combination of labeling schemes based on different measures of uncertainty. In particular, we introduce the CrispMap, the FuzzyMap, and the PossibilisticMap. We outline the advantages and disadvantages of each mapping scheme, and we show that the three transformed spaces are complementary. We also show that, in addition to improving clustering performance by taking advantage of the richer information content, the MembershipMap can be used as a flexible pre-processing tool to support tasks such as sampling, data cleaning, and outlier detection.
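
The idea of mapping each point to its per-attribute sub-concept labels can be illustrated with a minimal sketch. The chapter's actual extraction procedure is not available in this preview, so everything below is an assumption made for illustration: sub-concepts are found with a plain 1-D k-means, the number of sub-concepts `k` is fixed by hand, the fuzzy labels use the standard fuzzy c-means membership form, the possibilistic labels use the standard possibilistic c-means form with the attribute variance as the scale `eta`, and all function names are hypothetical.

```python
def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means (an assumed stand-in for sub-concept extraction)."""
    s = sorted(values)
    # Initial centers: evenly spaced order statistics of the attribute.
    centers = [s[int((j + 0.5) * len(s) / k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center when a group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def crisp_labels(v, centers):
    """1 for the nearest sub-concept, 0 for all others."""
    nearest = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
    return [1.0 if j == nearest else 0.0 for j in range(len(centers))]


def fuzzy_labels(v, centers, m=2.0, eps=1e-12):
    """Fuzzy c-means style memberships; they sum to 1 over the sub-concepts."""
    w = [((v - c) ** 2 + eps) ** (-1.0 / (m - 1.0)) for c in centers]
    total = sum(w)
    return [wj / total for wj in w]


def possibilistic_labels(v, centers, eta, m=2.0):
    """Possibilistic c-means style typicalities; each lies in (0, 1]."""
    return [1.0 / (1.0 + ((v - c) ** 2 / eta) ** (1.0 / (m - 1.0)))
            for c in centers]


def membership_map(rows, k=2, kind="fuzzy", m=2.0):
    """Map each row to the concatenation of its per-attribute labels."""
    mapped = [[] for _ in rows]
    for a in range(len(rows[0])):
        column = [float(r[a]) for r in rows]
        centers = kmeans_1d(column, k)
        mean = sum(column) / len(column)
        # Assumption: one scale per attribute, taken as its variance.
        eta = max(sum((v - mean) ** 2 for v in column) / len(column), 1e-12)
        for out, v in zip(mapped, column):
            if kind == "crisp":
                out.extend(crisp_labels(v, centers))
            elif kind == "fuzzy":
                out.extend(fuzzy_labels(v, centers, m))
            elif kind == "possibilistic":
                out.extend(possibilistic_labels(v, centers, eta, m))
            else:
                raise ValueError("unknown kind: " + kind)
    return mapped


if __name__ == "__main__":
    data = [[0.0, 10.0], [0.1, 10.2], [5.0, 20.0], [5.1, 20.1]]
    for kind in ("crisp", "fuzzy", "possibilistic"):
        print(kind, membership_map(data, k=2, kind=kind)[0])
```

With two attributes and `k=2`, each point is mapped to a four-dimensional vector: two labels per attribute. In this sketch the fuzzy labels are relative (they sum to one within each attribute), while the possibilistic labels are absolute, so a point far from every sub-concept receives uniformly low values, which is one way low-typicality points could be flagged as outlier candidates.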


Editor information

Lakhmi C. Jain, Mika Sato-Ilic, Maria Virvou, George A. Tsihrintzis, Valentina Emilia Balas, Canicious Abeynayake

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Frigui, H. (2008). MembershipMap: A Data Transformation for Knowledge Discovery Based on Granulation and Fuzzy Membership Aggregation. In: Jain, L.C., Sato-Ilic, M., Virvou, M., Tsihrintzis, G.A., Balas, V.E., Abeynayake, C. (eds) Computational Intelligence Paradigms. Studies in Computational Intelligence, vol 137. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79474-5_3

  • DOI: https://doi.org/10.1007/978-3-540-79474-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79473-8

  • Online ISBN: 978-3-540-79474-5

  • eBook Packages: Engineering, Engineering (R0)
