Abstract
One strategy for increasing the efficiency of rule discovery in data mining is to target a restricted class of rules, such as exact or almost exact rules, rules with a limited number of conditions, or rules in which each condition, on its own, eliminates a competing outcome class. An algorithm is presented for the discovery of rules in which each condition is a distinctive feature of the outcome class on its right-hand side in the subset of the data set defined by the conditions, if any, which precede it. Such a rule is said to be characteristic for the outcome class. A feature is defined as distinctive for an outcome class if it maximises a well-known measure of rule interest or is unique to the outcome class in the data set. In the special case of data mining which arises when each outcome class is represented by a single instance in the data set, a feature of an object is shown to be distinctive if and only if no other feature is shared by fewer objects in the data set.
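The special case in the last sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each object is a set of (attribute, value) features, and it reads "no other feature" as "no other feature of the same object". The function names, the greedy rule-building loop, the alphabetical tie-breaking and the toy data are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Feature = Tuple[str, str]  # (attribute, value); a representation assumed here
Objects = Dict[str, FrozenSet[Feature]]


def sharing_count(feature: Feature, objects: Objects) -> int:
    """Number of objects in `objects` that have `feature`."""
    return sum(1 for feats in objects.values() if feature in feats)


def distinctive_features(target: str, objects: Objects,
                         exclude: FrozenSet[Feature] = frozenset()) -> List[Feature]:
    """Features of `target` (minus `exclude`) shared by the fewest objects."""
    candidates = objects[target] - exclude
    if not candidates:
        return []
    counts = {f: sharing_count(f, objects) for f in candidates}
    fewest = min(counts.values())
    return sorted(f for f, c in counts.items() if c == fewest)


def characteristic_rule(target: str, objects: Objects) -> List[Feature]:
    """Greedily add conditions, each distinctive in the subset left by earlier ones."""
    subset = dict(objects)
    conditions: List[Feature] = []
    while len(subset) > 1:
        best = distinctive_features(target, subset, frozenset(conditions))
        # Stop if nothing is left to try or the best feature excludes no object.
        if not best or sharing_count(best[0], subset) == len(subset):
            break
        chosen = best[0]  # ties broken alphabetically (an arbitrary choice)
        conditions.append(chosen)
        subset = {name: feats for name, feats in subset.items() if chosen in feats}
    return conditions


# Hypothetical toy data: one instance per outcome class.
TOY_OBJECTS: Objects = {
    "sparrow": frozenset({("flies", "yes"), ("legs", "2"), ("coat", "feathers")}),
    "penguin": frozenset({("flies", "no"), ("legs", "2"), ("coat", "feathers")}),
    "bat": frozenset({("flies", "yes"), ("legs", "2"), ("coat", "fur")}),
    "dog": frozenset({("flies", "no"), ("legs", "4"), ("coat", "fur")}),
}

print(characteristic_rule("penguin", TOY_OBJECTS))
print(characteristic_rule("dog", TOY_OBJECTS))
```

In this toy data, ("legs", "4") is unique to the dog, so the rule for that class needs a single condition. The penguin has no unique feature, so a second condition is chosen within the subset of objects that satisfy the first.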
Cite this article
McSherry, D., Roantree, D. Characteristic Rule Discovery in Aurum-3. Applied Intelligence 11, 297–304 (1999). https://doi.org/10.1023/A:1008343110906