An Adjustable Description Quality Measure for Pattern Discovery Using the AQ Methodology

Kaufman, Kenneth A.; Michalski, Ryszard S.

doi:10.1023/A:1008787919756

Published: March 2000

Journal of Intelligent Information Systems, Volume 14, pages 199–216 (2000)

Kenneth A. Kaufman¹ & Ryszard S. Michalski²,³

Abstract

In concept learning and data mining tasks, the learner is typically faced with a choice of many possible hypotheses or patterns characterizing the input data. If one can assume that training data contain no noise, then the primary conditions a hypothesis must satisfy are consistency and completeness with regard to the data. In real-world applications, however, data are often noisy, and the insistence on the full completeness and consistency of the hypothesis is no longer valid. In such situations, the problem is to determine a hypothesis that represents the best trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a rule quality criterion that combines the rule coverage (a measure of completeness) and training accuracy (a measure of consistency). These factors are combined into a single rule quality measure through a lexicographical evaluation functional (LEF). The method has been implemented in the AQ18 learning system for natural induction and pattern discovery, and compared with several other methods. Experiments have shown that the proposed method can be easily tailored to different problems and can simulate different rule learners by modifying the parameter of the rule quality criterion.
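
The trade-off described in the abstract can be illustrated with a small sketch. The abstract does not state the formula, so everything below is an assumption made for illustration: the function name `rule_quality`, the weight parameter `w`, and the choice of a weighted geometric mean of coverage and training accuracy are hypothetical and should not be read as the AQ18 definition.

```python
def rule_quality(p, n, total_pos, w=0.5):
    """Illustrative quality score for a single rule (not the AQ18 formula).

    p         -- positive training examples covered by the rule
    n         -- negative training examples covered by the rule
    total_pos -- all positive training examples in the data
    w         -- assumed trade-off weight in [0, 1]; w = 1 scores by
                 coverage only, w = 0 by training accuracy only
    """
    if total_pos <= 0:
        raise ValueError("total_pos must be positive")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if p + n == 0:
        return 0.0  # a rule that covers nothing has no quality

    coverage = p / total_pos   # completeness side of the trade-off
    accuracy = p / (p + n)     # consistency side of the trade-off
    # Weighted geometric mean: an assumed way to combine the two factors.
    return coverage ** w * accuracy ** (1.0 - w)


# Two hypothetical rules over a data set with 50 positive examples.
broad_rule = dict(p=40, n=10)   # covers many positives, some negatives
narrow_rule = dict(p=10, n=0)   # covers few positives, no negatives

for w in (0.0, 0.5, 1.0):
    q_broad = rule_quality(total_pos=50, w=w, **broad_rule)
    q_narrow = rule_quality(total_pos=50, w=w, **narrow_rule)
    print(f"w={w}: broad={q_broad:.3f} narrow={q_narrow:.3f}")
```

Under these assumptions, moving `w` toward 1 favors the broad rule and moving it toward 0 favors the narrow, fully consistent one, which is the kind of adjustment the abstract attributes to the parameter of the rule quality criterion.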

Author information

Authors and Affiliations

Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA, 22030
Kenneth A. Kaufman
Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA
Ryszard S. Michalski
Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
Ryszard S. Michalski

About this article

Cite this article

Kaufman, K.A., Michalski, R.S. An Adjustable Description Quality Measure for Pattern Discovery Using the AQ Methodology. Journal of Intelligent Information Systems 14, 199–216 (2000). https://doi.org/10.1023/A:1008787919756

Issue Date: March 2000
DOI: https://doi.org/10.1023/A:1008787919756
