Abstract
Although a large variety of data analysis tools are available on the market today, none of them is perfect; they all have their strengths and weaknesses. It is therefore important that users can extend a data analysis tool with their own favourite methods to compensate for shortcomings of the shipped version. However, only a few commercial products offer this possibility. A rare exception is DataEngine™, which provides a well-documented interface for user-defined function blocks (plug-ins). In this paper we describe three plug-ins we implemented for this well-known tool: an advanced fuzzy clustering plug-in that extends the fuzzy c-means algorithm (a built-in feature of DataEngine™) with other, more flexible algorithms; a decision tree classifier plug-in that fills a serious gap, since DataEngine™ lacks a native module for this highly important technique; and a naive Bayes classifier plug-in that makes available an old and time-tested statistical classification method.
© 2000 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Borgelt, C., Timm, H. (2000). Advanced Fuzzy Clustering and Decision Tree Plug-Ins for DataEngine™. In: Azvine, B., Nauck, D.D., Azarmi, N. (eds) Intelligent Systems and Soft Computing. Lecture Notes in Computer Science, vol 1804. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10720181_8
DOI: https://doi.org/10.1007/10720181_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67837-3
Online ISBN: 978-3-540-44917-1
eBook Packages: Springer Book Archive