Abstract
Semi-automatic data mining approaches often yield better results than purely automatic methods because they integrate the user's goals early. In the medical domain, for example, experts are likely to favor simpler models over more complex ones. The accuracy of the discovered patterns is therefore often not the only criterion to consider. The simplicity of the discovered knowledge is of prime importance as well, since it directly relates to the understandability and interpretability of the learned knowledge.
In this paper, we present quality measures that consider both the understandability and the accuracy of (learned) rule bases. We describe a unifying quality measure that can trade off small losses in accuracy against increased simplicity. Furthermore, we introduce a semi-automatic data mining method for learning understandable and accurate rule bases. The presented work is evaluated using cases from a real-world application in the medical domain.
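To make the idea of trading accuracy against simplicity concrete, the following is a minimal sketch under stated assumptions, not the measure defined in the paper. The abstract does not give the formula, so the `RuleBase` summary, the rule-count-based `simplicity` score, the weighted sum in `unified_quality`, and the `weight` and `max_rules` parameters are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleBase:
    """Hypothetical summary of a learned rule base (not taken from the paper)."""
    name: str
    accuracy: float  # fraction of cases diagnosed correctly, in [0, 1]
    n_rules: int     # number of rules, used here as a crude complexity proxy


def simplicity(rb: RuleBase, max_rules: int) -> float:
    """Assumed simplicity score: 1.0 for no rules, 0.0 at max_rules or more."""
    if max_rules <= 0:
        raise ValueError("max_rules must be positive")
    return 1.0 - min(rb.n_rules, max_rules) / max_rules


def unified_quality(rb: RuleBase, max_rules: int, weight: float = 0.7) -> float:
    """Assumed unifying measure: weighted sum of accuracy and simplicity.

    weight = 1.0 ranks by accuracy alone; lower values reward simpler rule bases.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * rb.accuracy + (1.0 - weight) * simplicity(rb, max_rules)


def best(candidates, max_rules: int, weight: float = 0.7) -> RuleBase:
    """Return the candidate with the highest assumed unified quality."""
    return max(candidates, key=lambda rb: unified_quality(rb, max_rules, weight))


# Illustrative, made-up candidates: a large accurate rule base and a small one
# that is slightly less accurate.
complex_rb = RuleBase("complex", accuracy=0.92, n_rules=80)
simple_rb = RuleBase("simple", accuracy=0.89, n_rules=20)

print(best([complex_rb, simple_rb], max_rules=100, weight=0.7).name)
print(best([complex_rb, simple_rb], max_rules=100, weight=1.0).name)
```

With these made-up numbers, the smaller rule base is preferred once simplicity carries some weight, while ranking by accuracy alone (`weight=1.0`) selects the larger one. In a semi-automatic setting, the weight would plausibly be set by the domain expert.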
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Atzmueller, M., Baumeister, J., Puppe, F. (2005). Quality Measures and Semi-automatic Mining of Diagnostic Rule Bases. In: Seipel, D., Hanus, M., Geske, U., Bartenstein, O. (eds) Applications of Declarative Programming and Knowledge Management. INAP/WLP 2004. Lecture Notes in Computer Science, vol 3392. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11415763_5
DOI: https://doi.org/10.1007/11415763_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25560-4
Online ISBN: 978-3-540-32124-8
eBook Packages: Computer Science (R0)