Abstract
In data mining, one of the steps of the Knowledge Discovery in Databases (KDD) process, using concept hierarchies as background knowledge makes it possible to express the discovered knowledge at a higher level of abstraction, more concisely, and usually in a more interesting form. However, mining for high-level concepts is more complex because the search space is generally much larger. Some data mining systems require the database to be pre-generalized to reduce this space, which makes it difficult to discover knowledge at arbitrary levels of abstraction. To efficiently induce high-level rules at different levels of generality without pre-generalizing the database, fast access to concept hierarchies and fast query evaluation methods are needed.
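To make the idea of evaluating high-level conditions without pre-generalizing the data more concrete, here is a minimal Python sketch. It is not the NETUNO-HC implementation: the hierarchy, the rows, and the names PARENT, ancestors, matches, and coverage are all hypothetical, and the data is held in memory rather than in a relational database. Each raw value is generalized on demand by walking up the hierarchy when a condition is tested.

```python
# Hypothetical concept hierarchy for a "city" attribute, stored as child -> parent.
PARENT = {
    "Sao Paulo": "Brazil",
    "Rio de Janeiro": "Brazil",
    "Lisbon": "Portugal",
    "Brazil": "ANY",
    "Portugal": "ANY",
}

# Hypothetical raw rows; the stored values are never rewritten to higher-level concepts.
ROWS = [
    {"city": "Sao Paulo", "buys": "yes"},
    {"city": "Rio de Janeiro", "buys": "yes"},
    {"city": "Lisbon", "buys": "no"},
]


def ancestors(value):
    """Return the value followed by all of its ancestors, up to the root."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def matches(raw_value, concept):
    """True if the raw value equals the concept or is one of its descendants."""
    return concept in ancestors(raw_value)


def coverage(attribute, concept, rows):
    """Count the rows whose raw attribute value falls under the given concept."""
    return sum(1 for row in rows if matches(row[attribute], concept))


print(coverage("city", "Lisbon", ROWS))
print(coverage("city", "Brazil", ROWS))
print(coverage("city", "ANY", ROWS))
```

Because generalization happens at query time, the same rows can be tested against a condition at any level of the hierarchy, which is the property that pre-generalization gives up.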
This work presents the NETUNO-HC system, which induces classification rules using concept hierarchies for the attribute values of a relational database, without pre-generalizing them or relying on another tool to represent the hierarchies. It is shown how the abstraction level of the discovered rules can be affected by the adopted search strategy and by the relevance measures considered during the data mining step. Moreover, a series of experiments demonstrates that NETUNO-HC is efficient in the data mining process, owing to the following techniques: (i) an SQL primitive to efficiently execute database queries using hierarchies; (ii) the construction and encoding of numerical hierarchies; (iii) the use of a beam search strategy; and (iv) the indexing and encoding of rules in a hash table to avoid mining already-discovered rules.
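The combination of a beam search with a hash table of already-visited rules can be illustrated with a short Python sketch. This is a toy under stated assumptions, not the algorithm used by NETUNO-HC: the data, the hierarchy, the scoring by (accuracy, coverage), and all function names are hypothetical, rows are held in memory instead of being queried through SQL, and a Python set of frozensets stands in for the hash table of encoded rules.

```python
# Hypothetical hierarchy (child -> parent) and training rows.
PARENT = {"Sao Paulo": "Brazil", "Rio de Janeiro": "Brazil", "Lisbon": "Portugal"}
ROWS = [
    {"city": "Sao Paulo", "age": "young", "buys": "yes"},
    {"city": "Rio de Janeiro", "age": "old", "buys": "yes"},
    {"city": "Lisbon", "age": "young", "buys": "no"},
    {"city": "Lisbon", "age": "old", "buys": "no"},
]


def generalizations(value):
    """Return the value followed by its ancestors in the hierarchy."""
    out = [value]
    while out[-1] in PARENT:
        out.append(PARENT[out[-1]])
    return out


def covers(rule, row):
    """A rule is a frozenset of (attribute, concept) pairs; all must hold."""
    return all(concept in generalizations(row[attr]) for attr, concept in rule)


def evaluate(rule, rows, target):
    """Return (accuracy, coverage) of the rule for the target class."""
    covered = [r for r in rows if covers(rule, r)]
    if not covered:
        return (0.0, 0)
    correct = sum(1 for r in covered if r["buys"] == target)
    return (correct / len(covered), len(covered))


def beam_search(rows, attributes, target, beam_width=2, max_len=2):
    """Specialize rules level by level, keeping only the best beam_width rules."""
    conditions = sorted(
        {(a, c) for r in rows for a in attributes for c in generalizations(r[a])}
    )
    seen = set()  # stands in for the hash table of already-evaluated rules
    beam = [frozenset()]
    best = ((0.0, 0), frozenset())
    for _ in range(max_len):
        scored = []
        for rule in beam:
            used = {attr for attr, _ in rule}
            for attr, concept in conditions:
                if attr in used:
                    continue
                new_rule = rule | {(attr, concept)}
                if new_rule in seen:  # skip rules reached by another path
                    continue
                seen.add(new_rule)
                scored.append((evaluate(new_rule, rows, target), new_rule))
        # Higher accuracy first, then higher coverage, then a fixed order for ties.
        scored.sort(key=lambda p: (-p[0][0], -p[0][1], sorted(p[1])))
        beam = [rule for _, rule in scored[:beam_width]]
        if scored and scored[0][0] > best[0]:
            best = scored[0]
    return best


score, rule = beam_search(ROWS, ["city", "age"], "yes")
print(score, sorted(rule))
```

In this toy data set, ranking candidates by coverage after accuracy makes the search prefer the condition city = Brazil over the equally accurate but narrower city-level conditions, which gives a small illustration of how the relevance measure can influence the abstraction level of the selected rule.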
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Di Beneditto, M.E.M., de Barros, L.N. (2004). Using Concept Hierarchies in Knowledge Discovery. In: Bazzan, A.L.C., Labidi, S. (eds) Advances in Artificial Intelligence – SBIA 2004. SBIA 2004. Lecture Notes in Computer Science, vol 3171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28645-5_26
DOI: https://doi.org/10.1007/978-3-540-28645-5_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23237-7
Online ISBN: 978-3-540-28645-5
eBook Packages: Springer Book Archive