Abstract
The major problem in the evaluation of expert systems is the selection of appropriate statistical measures of performance that are consistent with the parameters of the system domain. The objective of this paper is to develop the statistical evaluation methodology needed to assess the performance of medical expert systems, including MEDAS, the Medical Emergency Decision Assistance System. The measures of performance are selected so that they have an operational interpretation and also reflect the predictive diagnostic capacity of a medical expert system. Certain summary measures are used that represent the sensitivity, specificity, and system response of a medical expert system. Measures of agreement, such as the kappa statistic and the measure of conditional agreement, are used to measure the agreement between the medical expert system and the physician. Goodman and Kruskal's lambda and tau measures of predictive association are introduced to evaluate the predictive capacity of a medical expert system. This methodology has been partially implemented in the performance evaluation of MEDAS.
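The measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of sensitivity, specificity, Cohen's kappa, and Goodman and Kruskal's lambda and tau, which may differ in detail from the formulation in the paper itself. The function names and the 2x2 table are hypothetical and chosen only for illustration; the counts are not MEDAS results. Rows are assumed to hold the expert system's diagnosis and columns the physician's diagnosis.

```python
# Illustrative sketch only: standard textbook definitions, made-up counts.
# table[i][j] = number of cases where the system said class i and the
# physician said class j (assumed layout, not taken from the paper).

def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

def _totals(table):
    rows = [sum(r) for r in table]
    cols = [sum(r[j] for r in table) for j in range(len(table[0]))]
    return rows, cols, sum(rows)

def cohen_kappa(table):
    """Chance-corrected agreement between the two raters (square table)."""
    rows, cols, n = _totals(table)
    p_obs = sum(table[i][i] for i in range(len(table))) / n
    p_exp = sum(r * c for r, c in zip(rows, cols)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

def gk_lambda(table):
    """Goodman-Kruskal lambda for predicting the column from the row."""
    _, cols, n = _totals(table)
    best_without_row = max(cols)
    best_with_row = sum(max(r) for r in table)
    return (best_with_row - best_without_row) / (n - best_without_row)

def gk_tau(table):
    """Goodman-Kruskal tau for predicting the column from the row."""
    rows, cols, n = _totals(table)
    within = sum(x * x / rt for r, rt in zip(table, rows) for x in r)
    col_sq = sum(c * c for c in cols)
    return (n * within - col_sq) / (n * n - col_sq)

# Hypothetical counts: class 0 = disorder present, class 1 = absent.
example = [[40, 10],
           [5, 45]]
sens, spec = sensitivity_specificity(tp=40, fn=5, fp=10, tn=45)
kappa = cohen_kappa(example)
lam = gk_lambda(example)
tau = gk_tau(example)
```

Here the physician's diagnosis is treated as the reference standard for sensitivity and specificity, which is an assumption of this sketch. The measure of conditional agreement mentioned in the abstract is not covered.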
We want to thank Dr. Daniel Woodard of Bionetics and Dr. Paul Buchanan of NASA for their advice and support.
References
M. Ben-Bassat, R.W. Carlson, U.K. Puri, M.D. Davenport, J.A. Shriver, M. Latif, R. Smith, L.R. Portigal, E.H. Lipnick, and M.H. Weil. Pattern-Based Interactive Diagnosis of Multiple Disorders: The MEDAS System. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2, no. 2, March, 1980, pp. 148–160.
M. Ben-Bassat, D. Campell, A. MacNeil, and M.H. Weil. Evaluating Multimembership Classifiers: A Methodology and Application to the MEDAS Diagnostic System. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-5, no. 2, March, 1983, pp. 225–229.
W.G. Cochran. Sampling Techniques. John Wiley & Sons, New York, 1977.
J.A. Reggia. Evaluation of Medical Expert Systems: A Case Study in Performance Analysis. Proceedings of the Ninth Annual Symposium on Computer Applications in Medical Care, Baltimore, MD, 1985, pp. 287–291.
J. Cohen. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 20, 1960, pp. 37–46.
Y. Bishop, S. Fienberg, and P. Holland. Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge, MA, 1984.
R. Light. Analysis of Variance for Categorical Data, with Applications to Agreement and Association. Ph.D. Dissertation, Department of Statistics, Harvard University, 1969.
D.C. Georgakis, R. Rosenthal, D.A. Trace, and M. Evens. Measures of Performance of the MEDAS System. Proceedings of the Fourth Annual Artificial Intelligence and Advanced Computer Technology Conference, Long Beach, CA, May, 1988, pp. 50–65.
L. Goodman and W. Kruskal. Measures of Association for Cross-Classifications. Springer-Verlag, New York, 1979.
B.S. Everitt. The Analysis of Contingency Tables. Halsted Press, John Wiley & Sons, New York, NY, 1977.
L. Goodman and W. Kruskal. Measures of Association for Cross-Classifications, Part I. Journal of the American Statistical Association, 49, 1954, pp. 732–764.
© 1991 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Georgakis, D.C., Evens, M., Naeymi-Rad, F., Trace, D.A. (1991). Performance evaluation of medical expert systems. In: Sherwani, N.A., de Doncker, E., Kapenga, J.A. (eds) Computing in the 90's. Great Lakes CS 1989. Lecture Notes in Computer Science, vol 507. Springer, New York, NY. https://doi.org/10.1007/BFb0038475
DOI: https://doi.org/10.1007/BFb0038475
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-97628-0
Online ISBN: 978-0-387-34815-5
eBook Packages: Springer Book Archive