Abstract
Assuring the quality of an expert system is critical. A poor-quality system may make costly errors that cause considerable damage to its user or owner, such as financial loss or human suffering. Verification and validation, the methods and techniques aimed at ensuring quality, are therefore fundamentally important.
This paper surveys the issues, methods and techniques for verifying and validating expert systems. Approaches to defining the quality of a system are discussed, drawing upon work in both computing and the model building disciplines, which leads to definitions of verification and validation and the associated concepts of credibility, assessment and evaluation. An approach to verification based upon the detection of anomalies is presented, and related to the concepts of consistency, completeness, correctness and redundancy. Automated tools for expert system verification are reviewed.
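As a rough illustration of anomaly-based verification, the sketch below scans a small propositional rule base for pairwise redundancy, subsumption and conflict. It is a minimal sketch under stated assumptions, not the paper's method or any reviewed tool: the `Rule` type, the `"not "` prefix convention for negation, the `find_anomalies` function and the sample rules are all hypothetical, and only rules with identical conditions are checked for conflict.

```python
from itertools import combinations
from typing import FrozenSet, List, NamedTuple, Tuple


class Rule(NamedTuple):
    """Hypothetical rule format: IF all conditions THEN conclusion."""
    name: str
    conditions: FrozenSet[str]  # conjunction of propositional literals
    conclusion: str             # one literal; "not x" is the negation of "x"


def negate(literal: str) -> str:
    """Negate a literal using the assumed 'not ' prefix convention."""
    return literal[4:] if literal.startswith("not ") else "not " + literal


def find_anomalies(rules: List[Rule]) -> List[Tuple[str, str, str]]:
    """Return (kind, rule, other_rule) triples for pairwise anomalies."""
    found = []
    for a, b in combinations(rules, 2):
        if a.conclusion == b.conclusion:
            if a.conditions == b.conditions:
                # Same conditions, same conclusion: one rule is a duplicate.
                found.append(("redundant", a.name, b.name))
            elif a.conditions < b.conditions:
                # b needs strictly more conditions for the same conclusion.
                found.append(("subsumed", b.name, a.name))
            elif b.conditions < a.conditions:
                found.append(("subsumed", a.name, b.name))
        elif a.conclusion == negate(b.conclusion) and a.conditions == b.conditions:
            # Same conditions, contradictory conclusions.
            found.append(("conflict", a.name, b.name))
    return found


# Illustrative rule base (made-up symbols, not taken from the paper).
rules = [
    Rule("r1", frozenset({"p", "q"}), "x"),
    Rule("r2", frozenset({"p", "q"}), "x"),
    Rule("r3", frozenset({"p", "q", "s"}), "x"),
    Rule("r4", frozenset({"p", "q"}), "not x"),
]
anomalies = find_anomalies(rules)
for kind, first, second in anomalies:
    print(kind, first, second)
```

A pairwise scan like this only flags candidate anomalies; whether a flagged pair is a genuine error still has to be judged against the intended behaviour of the system.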
Considerable attention is then given to the issues in structuring the validation process, particularly the establishment of the criteria by which the system is judged, the need to maintain objectivity, and the concept of reliability. This is followed by a review of methods for validating both the components of a system and the system as a whole, including examples of some useful statistical methods. Management of the verification and validation process is then considered, and it is seen that the location of methods for verification and validation in the development life-cycle is of prime importance.
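One statistic often used when a system's conclusions are compared with an expert's on the same test cases is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The abstract does not say which statistical methods the paper covers, so the choice of kappa here is an assumption, and the function name `cohen_kappa` and the sample labels are hypothetical. With observed agreement $p_o$ and chance agreement $p_e$, kappa is $(p_o - p_e) / (1 - p_e)$.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(system: Sequence[Hashable], expert: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same cases."""
    if len(system) != len(expert) or not system:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(system)
    # Proportion of cases on which the two raters agree.
    observed = sum(s == e for s, e in zip(system, expert)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    sys_counts = Counter(system)
    exp_counts = Counter(expert)
    expected = sum(sys_counts[c] * exp_counts[c] for c in sys_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1.0 - expected)


# Made-up test cases: "A" and "R" are two possible conclusions.
system_labels = ["A", "A", "A", "A", "A", "A", "R", "R", "R", "R"]
expert_labels = ["A", "A", "A", "A", "R", "R", "R", "R", "R", "A"]
kappa = cohen_kappa(system_labels, expert_labels)
print(round(kappa, 3))  # observed 0.7, expected 0.5, so kappa is about 0.4
```

Raw agreement here is 70%, but half of that would be expected by chance given the marginal frequencies, which is why kappa is noticeably lower than the raw figure.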
Abbreviations
- D.2.4 [Software Engineering]: Program Verification - Validation
- D.2.5 [Software Engineering]: Testing and Debugging
- I.2 [Artificial Intelligence]: Applications and Expert Systems
- K.6.1 [Management of Computers and Information System]: Project and People Management - Life Cycle
Cite this article
O'Keefe, R.M., O'Leary, D.E. Expert system verification and validation: a survey and tutorial. Artif Intell Rev 7, 3–42 (1993). https://doi.org/10.1007/BF00849196
DOI: https://doi.org/10.1007/BF00849196