Abstract
In this paper, we investigate data abstractions for mining association rules whose conditions are numerical and whose consequents are boolean, taken as a target class. Our abstraction joins consecutive primitive intervals of a numerical attribute: if the interclass variance of two adjacent intervals is less than a given admissible upper bound ε, they are combined into an extended interval. Intuitively, a low variance means that the two intervals provide almost the same posterior class distributions, so few properties or characteristics of the class are lost by combining them. We discuss a bottom-up method for finding maximally extended intervals, called a maximal appropriate abstraction. Based on such an abstraction, we can reduce the number of extracted rules while preserving almost the same quality as the rules extracted without abstraction. The usefulness of our abstraction method is shown by preliminary experimental results.
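The bottom-up merging described in the abstract can be sketched as follows. This is only an illustrative sketch under assumptions: each primitive interval is represented by its example count and positive-class count, and `interclass_variance` uses a standard weighted between-group variance of the positive-class posterior, which may differ from the paper's exact definition. The names `interclass_variance` and `abstract_intervals` are hypothetical.

```python
def interclass_variance(a, b):
    """Weighted between-group variance of the positive-class posterior
    for two adjacent intervals a, b = (count, positive_count).
    Assumed form; the paper's exact definition may differ."""
    n1, pos1 = a
    n2, pos2 = b
    p1, p2 = pos1 / n1, pos2 / n2
    p = (pos1 + pos2) / (n1 + n2)          # pooled posterior
    return (n1 * (p1 - p) ** 2 + n2 * (p2 - p) ** 2) / (n1 + n2)

def abstract_intervals(intervals, eps):
    """Greedy bottom-up abstraction: repeatedly merge the adjacent pair
    with the lowest interclass variance, as long as it stays below the
    admissible upper bound eps."""
    ivs = [tuple(iv) for iv in intervals]
    while len(ivs) > 1:
        scores = [interclass_variance(ivs[i], ivs[i + 1])
                  for i in range(len(ivs) - 1)]
        i = min(range(len(scores)), key=scores.__getitem__)
        if scores[i] >= eps:               # no pair is similar enough
            break
        n1, pos1 = ivs[i]
        n2, pos2 = ivs[i + 1]
        ivs[i : i + 2] = [(n1 + n2, pos1 + pos2)]   # join the pair
    return ivs

# Two intervals with posteriors 0.9 and 0.8 merge under eps = 0.01,
# while the third (posterior 0.1) stays separate:
print(abstract_intervals([(10, 9), (10, 8), (10, 1)], 0.01))
# → [(20, 17), (10, 1)]
```

Merging the most similar pair first, rather than scanning left to right, is one natural way to realize "maximally extended" intervals; the paper's own procedure may order the merges differently.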
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Narita, M., Haraguchi, M., Okubo, Y. (2002). Data Abstractions for Numerical Attributes in Data Mining. In: Yin, H., Allinson, N., Freeman, R., Keane, J., Hubbard, S. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2002. IDEAL 2002. Lecture Notes in Computer Science, vol 2412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45675-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44025-3
Online ISBN: 978-3-540-45675-9