Abstract
The importance of input representation has long been recognised in machine learning. This paper discusses the application of genetic-based feature construction methods to generate input data for a data summarisation method called Dynamic Aggregation of Relational Attributes (DARA). Feature construction is applied in order to improve the descriptive accuracy of the DARA algorithm. DARA summarises data stored in non-target tables by clustering the records into groups, where multiple records in the non-target tables correspond to a single record in the target table. The paper addresses the question of whether the descriptive accuracy of the DARA algorithm benefits from the feature construction process. Answering it involves using a genetic-based algorithm to construct a relevant set of features for DARA. The work also evaluates several scoring measures used as fitness functions to find the best set of constructed features.
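The idea of searching for constructed features with a genetic algorithm, guided by a scoring measure as the fitness function, can be illustrated with a minimal sketch. It is not the DARA implementation. It assumes that a constructed feature is simply the combination of a subset of attributes (encoded as a bit mask) and that information gain with respect to a class label serves as the fitness. The function names `construct_feature`, `information_gain` and `evolve`, and all parameter defaults, are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def construct_feature(rows, mask):
    """Combine the attributes selected by `mask` into one constructed value per row.

    Assumption: construction is plain concatenation of the selected values.
    """
    return [tuple(v for v, keep in zip(row, mask) if keep) for row in rows]


def information_gain(rows, labels, mask):
    """Assumed fitness: drop in label entropy after grouping rows by the constructed feature."""
    if not any(mask):
        return 0.0
    groups = defaultdict(list)
    for value, label in zip(construct_feature(rows, mask), labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def evolve(rows, labels, generations=30, pop_size=8, mutation_rate=0.1, seed=0):
    """Tiny genetic algorithm over attribute masks (requires at least two attributes)."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])

    def fitness(mask):
        return information_gain(rows, labels, mask)

    population = [tuple(rng.randint(0, 1) for _ in range(n_attrs)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, keeps the best masks
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_attrs - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation: each bit is flipped with probability `mutation_rate`
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    # Toy data: the label is the XOR of the first two attributes; the third is noise.
    rows = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
            (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
    labels = [a ^ b for a, b, _ in rows]
    print(evolve(rows, labels))
```

In the toy data neither of the first two attributes is informative alone, but their combination determines the label, which is the kind of attribute interaction feature construction aims to capture. Note that information gain never decreases when more attributes are combined, so a practical scoring measure would also need to penalise overly specific features.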
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alfred, R. (2008). Dynamic Aggregation of Relational Attributes Based on Feature Construction. In: Atzeni, P., Caplinskas, A., Jaakkola, H. (eds) Advances in Databases and Information Systems. ADBIS 2008. Lecture Notes in Computer Science, vol 5207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85713-6_2
DOI: https://doi.org/10.1007/978-3-540-85713-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85712-9
Online ISBN: 978-3-540-85713-6
eBook Packages: Computer Science, Computer Science (R0)