ABSTRACT
We discuss data mining based on association rules for two numeric attributes and one Boolean attribute. For example, in a database of bank customers, "Age" and "Balance" are two numeric attributes, and "CardLoan" is a Boolean attribute. Taking the pair (Age, Balance) as a point in two-dimensional space, we consider an association rule of the form ((Age, Balance) ∈ P) ⇒ (CardLoan = Yes), which implies that bank customers whose ages and balances fall in a planar region P tend to use card loans with high probability. We consider two classes of regions: rectangles and admissible (i.e., connected and x-monotone) regions. For each class, we propose efficient algorithms for computing the regions that give optimal association rules for gain, support, and confidence, respectively. We have implemented the algorithms for admissible regions and constructed a system for visualizing the rules.
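To make the three measures concrete, the following is a minimal sketch that evaluates one fixed rectangular region P. It is not the paper's optimization algorithm. The abstract does not define the measures, so the sketch assumes the usual definitions: support is the fraction of all customers falling in P, confidence is the fraction of customers in P with CardLoan = Yes, and gain is the number of such customers minus theta times the number of customers in P, where theta is a chosen confidence threshold. The names `Customer`, `rule_stats`, and `theta`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical record layout mirroring the abstract's example attributes.
    age: float
    balance: float
    card_loan: bool


def rule_stats(customers, age_range, balance_range, theta):
    """Return (support, confidence, gain) for the rectangle P.

    Assumed definitions (not stated in the abstract):
      support    = |customers in P| / |all customers|
      confidence = |customers in P with a card loan| / |customers in P|
      gain       = |customers in P with a card loan| - theta * |customers in P|
    """
    lo_age, hi_age = age_range
    lo_bal, hi_bal = balance_range
    inside = [
        c for c in customers
        if lo_age <= c.age <= hi_age and lo_bal <= c.balance <= hi_bal
    ]
    hits = sum(1 for c in inside if c.card_loan)
    support = len(inside) / len(customers) if customers else 0.0
    confidence = hits / len(inside) if inside else 0.0
    gain = hits - theta * len(inside)
    return support, confidence, gain


# Made-up sample data, for illustration only.
customers = [
    Customer(25, 1000, False),
    Customer(32, 5000, True),
    Customer(38, 7000, True),
    Customer(41, 6500, False),
    Customer(60, 20000, False),
]

# Evaluate the rule ((Age, Balance) in [30, 45] x [4000, 8000]) => (CardLoan = Yes).
support, confidence, gain = rule_stats(customers, (30, 45), (4000, 8000), theta=0.5)
print(f"support={support:.2f} confidence={confidence:.2f} gain={gain:.2f}")
```

An optimizer in the sense of the abstract would search over candidate regions (rectangles or admissible regions) for the one maximizing one of these measures; this sketch only scores a single given rectangle.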
Index Terms
- Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization