Abstract
In this paper, we investigate data abstractions for mining association rules whose conditions are numerical and whose consequents are boolean, taken as a target class. Our abstraction joins consecutive primitive intervals of a numerical attribute: if the interclass variance of two adjacent intervals is less than a given admissible upper bound ε, they are combined into an extended interval. Intuitively, a low variance means that the two intervals provide almost the same posterior class distributions, so few properties or characteristics of the class are lost by combining them. We discuss a bottom-up method for finding maximally extended intervals, called a maximal appropriate abstraction. Based on such an abstraction, we can reduce the number of extracted rules while preserving almost the same quality as the rules extracted without abstraction. The usefulness of our abstraction method is shown by preliminary experimental results.
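The bottom-up merging described in the abstract can be sketched as follows. This is only an illustrative sketch under assumptions: each primitive interval is represented by its example count and positive-class count, and `interclass_variance` uses a standard weighted between-group variance of the positive-class posterior, which may differ from the paper's exact definition. The names `interclass_variance` and `abstract_intervals` are hypothetical.

```python
def interclass_variance(a, b):
    """Weighted between-group variance of the positive-class posterior
    for two adjacent intervals a, b = (count, positive_count).
    Assumed form; the paper's exact definition may differ."""
    n1, pos1 = a
    n2, pos2 = b
    p1, p2 = pos1 / n1, pos2 / n2
    p = (pos1 + pos2) / (n1 + n2)          # pooled posterior
    return (n1 * (p1 - p) ** 2 + n2 * (p2 - p) ** 2) / (n1 + n2)

def abstract_intervals(intervals, eps):
    """Greedy bottom-up abstraction: repeatedly merge the adjacent pair
    with the lowest interclass variance, as long as it stays below the
    admissible upper bound eps."""
    ivs = [tuple(iv) for iv in intervals]
    while len(ivs) > 1:
        scores = [interclass_variance(ivs[i], ivs[i + 1])
                  for i in range(len(ivs) - 1)]
        i = min(range(len(scores)), key=scores.__getitem__)
        if scores[i] >= eps:               # no pair is similar enough
            break
        n1, pos1 = ivs[i]
        n2, pos2 = ivs[i + 1]
        ivs[i : i + 2] = [(n1 + n2, pos1 + pos2)]   # join the pair
    return ivs

# Two intervals with posteriors 0.9 and 0.8 merge under eps = 0.01,
# while the third (posterior 0.1) stays separate:
print(abstract_intervals([(10, 9), (10, 8), (10, 1)], 0.01))
# → [(20, 17), (10, 1)]
```

Merging the most similar pair first, rather than scanning left to right, is one natural way to realize "maximally extended" intervals; the paper's own procedure may order the merges differently.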
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Narita, M., Haraguchi, M., Okubo, Y. (2002). Data Abstractions for Numerical Attributes in Data Mining. In: Yin, H., Allinson, N., Freeman, R., Keane, J., Hubbard, S. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2002. IDEAL 2002. Lecture Notes in Computer Science, vol 2412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45675-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44025-3
Online ISBN: 978-3-540-45675-9