Benchmark investigation/identification project

Machine Translation

Abstract

Under the Benchmark Investigation/Identification (I/I) program, an evaluation methodology is being developed for determining the linguistic competence of natural language processing (NLP) systems. The goal is a procedure that is, insofar as possible, independent of application, domain, and system type, and that produces descriptive profiles of NLP systems. The methodology is embodied in an evaluation procedure that is based on a detailed classification of linguistic phenomena and currently includes over 400 test items. The Benchmark Procedure has been applied to three natural language database query systems and two MUC-3 systems. This paper focuses on the methodology and the lessons learned.

Author information

Additional information

This research is supported by Rome Laboratory under Contract No. F30602-90-C-0034.

Jeannette G. Neal has been a Principal Scientist at Calspan Corporation since earning her Ph.D. in Computer Science from the State University of New York at Buffalo in 1985. Her research has focused on natural language processing, intelligent multi-media interfaces, and evaluation of natural language processing systems.

Elissa Feit received her Master's degree in Computer Science from the State University of New York at Buffalo in 1990. She is currently at SUNY Buffalo working on her Ph.D. dissertation, tentatively titled “A Cognitive Linguistic Approach to Natural Language Understanding.”

Christine A. Montgomery, President of Language Systems, Inc., is a linguist whose particular NLP interest is text understanding. Her NLP career began with Russian-English MT research, and current work includes a project on machine-aided voice translation. Other work on evaluation includes comparisons of automated vs. human text extraction.

About this article

Cite this article

Neal, J.G., Feit, E.L. & Montgomery, C.A. Benchmark investigation/identification project. Mach Translat 8, 77–84 (1993). https://doi.org/10.1007/BF00981245

  • DOI: https://doi.org/10.1007/BF00981245
