Abstract
A Cartesian granule feature is a multidimensional feature formed over the cross product of words drawn from the linguistic partitions of the constituent input features. Systems can be described quite naturally in terms of Cartesian granule features incorporated into additive models (if-then rules with weighted antecedents), where each Cartesian granule feature focuses on modelling the interactions of a subset of input variables. This can often lead to models that reduce, if not eliminate, decomposition error, while enhancing the model's generalisation powers and transparency. Within a machine learning context, the system identification of good, parsimonious additive Cartesian granule feature models is an exponential search problem. In this paper we present the G_DACG constructive induction algorithm as a means of automatically identifying additive Cartesian granule feature models from example data. G_DACG combines the powerful optimisation capabilities of genetic programming with a novel, computationally cheap fitness function that relies on the semantic separation of concepts, expressed in terms of Cartesian granule fuzzy sets, to identify these additive models. G_DACG helps avoid many of the problems, such as local minima, that arise from feature selection and feature abstraction in traditional approaches to system identification. G_DACG has been applied to the system identification of additive Cartesian granule feature models on a variety of artificial and real-world problems. Here we present a sample of those results, including those for the benchmark Pima Diabetes problem. A classification accuracy of 79.7% was achieved on this dataset, outperforming the previous best of 78% (generally obtained with black-box modelling approaches such as neural nets and oblique decision trees).
Supported by European Community Marie Curie Fellowship Program.
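As an illustration of the definition given in the abstract, the following minimal sketch builds the granules of a two-dimensional Cartesian granule feature as the cross product of the words of two linguistic partitions. It is not the G_DACG implementation: the uniform triangular partitions, the product used to combine per-feature memberships, the feature names, ranges and words, and the helper names `triangular_partition`, `cartesian_granule_memberships` and `count_feature_subsets` are all assumptions made for this example. The last helper only hints at why the search is exponential, by counting the non-empty subsets of input features from which a single feature could be built.

```python
from itertools import product


def triangular_partition(lo, hi, words):
    """Return a function mapping x to its membership in each word.

    Assumption: a uniform triangular fuzzy partition of [lo, hi] with one
    fuzzy set per word; the paper may use other partition shapes.
    """
    step = (hi - lo) / (len(words) - 1)
    centres = [lo + i * step for i in range(len(words))]

    def membership(x):
        return {w: max(0.0, 1.0 - abs(x - c) / step)
                for w, c in zip(words, centres)}

    return membership


def cartesian_granule_memberships(partitions, point):
    """Membership of a data point in each Cartesian granule (tuple of words).

    Assumption: per-feature memberships are combined with a product;
    granules with zero membership are omitted from the result.
    """
    per_feature = [p(x) for p, x in zip(partitions, point)]
    granules = {}
    for combo in product(*(m.items() for m in per_feature)):
        words = tuple(w for w, _ in combo)
        degree = 1.0
        for _, mu in combo:
            degree *= mu
        if degree > 0.0:
            granules[words] = degree
    return granules


def count_feature_subsets(n_features):
    """Number of non-empty subsets of n input features (2^n - 1)."""
    return 2 ** n_features - 1


# Hypothetical features, ranges and words, chosen only for illustration.
age = triangular_partition(20.0, 80.0, ["young", "middle", "old"])
glucose = triangular_partition(50.0, 200.0, ["low", "normal", "high"])

result = cartesian_granule_memberships([age, glucose], (35.0, 162.5))
for granule, degree in sorted(result.items()):
    print(granule, degree)

print(count_feature_subsets(8))
```

In this sketch the point (35.0, 162.5) lies halfway between two words on each axis, so it activates four of the nine possible granules. Counting subsets understates the real search space, because the granularity of each partition and the number of features combined in the additive model are also open choices.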
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Baldwin, J.F., Martin, T.P., Shanahan, J.G. (1999). System identification of fuzzy cartesian granule feature models using genetic programming. In: Ralescu, A.L., Shanahan, J.G. (eds) Fuzzy Logic in Artificial Intelligence. FLAI 1997. Lecture Notes in Computer Science, vol 1566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095073
DOI: https://doi.org/10.1007/BFb0095073
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66374-4
Online ISBN: 978-3-540-48358-8
eBook Packages: Springer Book Archive