Abstract
We describe an approach and a tool for discovering subgroups within the framework of distribution rule mining. Distribution rules are a kind of association rule particularly suited to the exploratory study of numerical variables of interest. Because the technique is exploratory, a distribution rule mining process typically yields a very large number of patterns. Exploring such results is therefore a complex task, which limits the use of the technique. To overcome this shortcoming, we developed a tool, written in Java, that supports subgroup discovery as a post-processing step. The tool engages the analyst in an interactive process of subgroup discovery through a graphical interface with well-defined statistical grounds, in which domain knowledge can be used to identify subgroups within the population. We present a case study that analyzes the results of students in a large-scale university admission examination.
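To make the notion of a distribution rule concrete, the following is a minimal sketch, not the tool's actual Java code. It assumes that a rule pairs an antecedent (a set of attribute conditions) with the distribution of the numerical property of interest among the records the antecedent covers, and that the rule's interest is measured by how far that distribution lies from the distribution over the whole population. The two-sample Kolmogorov-Smirnov statistic is an assumption chosen for illustration, since the abstract does not name a test. The names `DistributionRule`, `ks_statistic`, and `make_rule`, as well as the toy data, are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class DistributionRule:
    """Hypothetical structure: antecedent -> distribution of the property of interest."""
    antecedent: dict      # attribute name -> required value
    values: tuple         # property-of-interest values of the covered records
    ks_statistic: float   # distance between subgroup and population distributions


def ks_statistic(sample, reference):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample), sorted(reference)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


def make_rule(records, antecedent, target):
    """Build a rule from the records that satisfy every condition of the antecedent."""
    covered = [
        r for r in records
        if all(r.get(k) == v for k, v in antecedent.items())
    ]
    if not covered:
        raise ValueError("antecedent covers no records")
    population = [r[target] for r in records]
    values = tuple(r[target] for r in covered)
    return DistributionRule(antecedent, values, ks_statistic(values, population))


# Toy admission-exam records, invented purely for illustration.
records = [
    {"school": "public", "shift": "day", "score": 52.0},
    {"school": "public", "shift": "night", "score": 48.0},
    {"school": "public", "shift": "day", "score": 55.0},
    {"school": "private", "shift": "day", "score": 71.0},
    {"school": "private", "shift": "day", "score": 68.0},
    {"school": "private", "shift": "night", "score": 74.0},
]

rule = make_rule(records, {"school": "private"}, "score")
print(rule.antecedent, rule.values, round(rule.ks_statistic, 3))
```

In this sketch a subgroup is simply the set of records covered by a rule's antecedent. An interactive tool of the kind described could rank or filter many such rules by the statistic before presenting them to the analyst.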
Supported by POCI/TRA/61001/2004 Triana Project (Fundação Ciência e Tecnologia), FEDER and Programa de Financiamento Plurianual de Unidades de I & D.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lucas, J.P., Jorge, A.M., Pereira, F., Pernas, A.M., Machado, A.A. (2007). A Tool for Interactive Subgroup Discovery Using Distribution Rules. In: Neves, J., Santos, M.F., Machado, J.M. (eds) Progress in Artificial Intelligence. EPIA 2007. Lecture Notes in Computer Science, vol 4874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77002-2_36
DOI: https://doi.org/10.1007/978-3-540-77002-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77000-8
Online ISBN: 978-3-540-77002-2
eBook Packages: Computer Science (R0)